<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Roland's Blog</title>
	<atom:link href="http://digitalvampire.org/blog/index.php/feed/" rel="self" type="application/rss+xml" />
	<link>http://digitalvampire.org/blog</link>
	<description>Linux hacker, recovering mathematician, former athlete</description>
	<lastBuildDate>Tue, 08 Mar 2011 04:58:07 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Do you know everything about RDMA?</title>
		<link>http://digitalvampire.org/blog/index.php/2011/03/07/do-you-know-everything-about-rdma/</link>
		<comments>http://digitalvampire.org/blog/index.php/2011/03/07/do-you-know-everything-about-rdma/#comments</comments>
		<pubDate>Tue, 08 Mar 2011 04:58:07 +0000</pubDate>
		<dc:creator>roland</dc:creator>
				<category><![CDATA[infiniband]]></category>
		<category><![CDATA[rdma]]></category>

		<guid isPermaLink="false">http://digitalvampire.org/blog/?p=175</guid>
		<description><![CDATA[As I mentioned on Twitter (by the way, are you following @rolanddreier?), I&#8217;ll be speaking at the Linux Foundation Collaboration Summit in San Francisco on April 7.  My general mandate is to give an introduction to RDMA and InfiniBand on Linux, and to talk about recent developments and what might be coming next in the [...]]]></description>
			<content:encoded><![CDATA[<p>As I mentioned on <a href="http://twitter.com/#!/rolanddreier/status/44186901346910208">Twitter</a> (by the way, are you following <a href="http://www.twitter.com/rolanddreier">@rolanddreier</a>?), I&#8217;ll be speaking at the <a href="http://events.linuxfoundation.org/events/collaboration-summit">Linux Foundation Collaboration Summit</a> in San Francisco on April 7.  My general mandate is to give an introduction to RDMA and InfiniBand on Linux, and to talk about recent developments and what might be coming next in the area.  However, I&#8217;d like to make my talk a little less boring than my usual talks, so I&#8217;d be curious to hear about specific topics you&#8217;d like me to cover.  And if you&#8217;re at the summit, stop by and say hello.</p>
]]></content:encoded>
			<wfw:commentRss>http://digitalvampire.org/blog/index.php/2011/03/07/do-you-know-everything-about-rdma/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Want to work with me?</title>
		<link>http://digitalvampire.org/blog/index.php/2011/02/16/want-to-work-with-me/</link>
		<comments>http://digitalvampire.org/blog/index.php/2011/02/16/want-to-work-with-me/#comments</comments>
		<pubDate>Thu, 17 Feb 2011 00:40:45 +0000</pubDate>
		<dc:creator>roland</dc:creator>
				<category><![CDATA[linux]]></category>

		<guid isPermaLink="false">http://digitalvampire.org/blog/?p=169</guid>
		<description><![CDATA[Why? More seriously, in the past few weeks, my new employer (Pure Storage) has said a little more, and I can now link to a real jobs page.  As you can see from the listings, we&#8217;re looking for both kick-ass developers and people with more QA/tools/scripting skills.  And we definitely are willing to help [...]]]></description>
			<content:encoded><![CDATA[<p>Why?</p>
<p>More seriously, in the past few weeks, my new employer (<a title="Pure Storage" href="http://www.purestorage.com/">Pure Storage</a>) has said a little more, and I can now link to a real <a title="Pure Storage jobs" href="http://www.purestorage.com/company/jobs.php">jobs page</a>.  As you can see from the listings, we&#8217;re looking for both kick-ass developers and people with more QA/tools/scripting skills.  And we definitely are willing to help people fresh out of school learn, as long as they have some experience with Linux.</p>
<p>If you&#8217;re interested, you can let me know or just apply directly from the jobs page.  Good luck!</p>
]]></content:encoded>
			<wfw:commentRss>http://digitalvampire.org/blog/index.php/2011/02/16/want-to-work-with-me/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>New testbed installed</title>
		<link>http://digitalvampire.org/blog/index.php/2011/02/11/new-testbed-installed/</link>
		<comments>http://digitalvampire.org/blog/index.php/2011/02/11/new-testbed-installed/#comments</comments>
		<pubDate>Fri, 11 Feb 2011 22:56:00 +0000</pubDate>
		<dc:creator>roland</dc:creator>
				<category><![CDATA[hacking]]></category>
		<category><![CDATA[infiniband]]></category>
		<category><![CDATA[rdma]]></category>

		<guid isPermaLink="false">http://digitalvampire.org/blog/?p=163</guid>
		<description><![CDATA[Since I changed jobs, I left behind a lot of my test systems, but I now have a couple of test systems set up. Here is the rather crazy set of non-chipset devices I now have in one box: $ lspci -nn&#124;grep -v 8086: 03:00.0 InfiniBand [0c06]: Mellanox Technologies MT25208 [InfiniHost III Ex] [15b3:6282] (rev [...]]]></description>
			<content:encoded><![CDATA[<p>Since I changed jobs, I left behind a lot of my test systems, but I now have a couple of test systems set up.  Here is the rather crazy set of non-chipset devices I now have in one box:</p>
<pre>$ lspci -nn|grep -v 8086:
03:00.0 InfiniBand [0c06]: Mellanox Technologies MT25208 [InfiniHost III Ex] [15b3:6282] (rev 20)
04:00.0 Ethernet controller [0200]: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] [15b3:6750] (rev b0)
05:00.0 InfiniBand [0c06]: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] [15b3:673c] (rev b0)
84:00.0 Ethernet controller [0200]: NetEffect NE020 10Gb Accelerated Ethernet Adapter (iWARP RNIC) [1678:0100] (rev 05)
85:00.0 Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030]
86:00.0 InfiniBand [0c06]: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] [15b3:6274] (rev 20)</pre>
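<p>The <code>grep -v 8086:</code> filter above drops any line containing Intel&#8217;s PCI vendor ID.  As a rough sketch only, the same filtering can be done in Python by reading the <code>[vendor:device]</code> pair that <code>lspci -nn</code> prints near the end of each line.  The function name and the Intel host bridge line in the sample are made up for illustration; the other sample lines are copied from the output above.</p>
<pre>
```python
import re

# Matches the [vendor:device] ID pair that `lspci -nn` prints for each device.
# The 4-digit class code (e.g. [0c06]) has no colon, so it does not match.
ID_PAIR = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]")


def non_intel_devices(lspci_output, skip_vendor="8086"):
    """Return (slot, vendor_id, device_id) for each device not from skip_vendor."""
    devices = []
    for line in lspci_output.splitlines():
        match = ID_PAIR.search(line)
        if match is None:
            continue  # not a device line
        vendor, device = match.groups()
        if vendor != skip_vendor:
            slot = line.split()[0]
            devices.append((slot, vendor, device))
    return devices


# The first line is a hypothetical Intel device added so the filter has
# something to drop; the remaining lines come from the output above.
SAMPLE = """\
00:00.0 Host bridge [0600]: Intel Corporation Example Host Bridge [8086:0000]
03:00.0 InfiniBand [0c06]: Mellanox Technologies MT25208 [InfiniHost III Ex] [15b3:6282] (rev 20)
84:00.0 Ethernet controller [0200]: NetEffect NE020 10Gb Accelerated Ethernet Adapter (iWARP RNIC) [1678:0100] (rev 05)
85:00.0 Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030]
"""

for slot, vendor, device in non_intel_devices(SAMPLE):
    print(slot, vendor, device)
```
</pre>
<p>Unlike the shell one-liner, this matches on the vendor field itself rather than on any occurrence of <code>8086:</code> in the line.</p>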
<p>(I do have a couple of open slots if you have some RDMA cards that I&#8217;m missing to complete my collection <img src='http://digitalvampire.org/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> )</p>
]]></content:encoded>
			<wfw:commentRss>http://digitalvampire.org/blog/index.php/2011/02/11/new-testbed-installed/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Aloha Means Goodbye</title>
		<link>http://digitalvampire.org/blog/index.php/2011/01/21/aloha-means-goodbye/</link>
		<comments>http://digitalvampire.org/blog/index.php/2011/01/21/aloha-means-goodbye/#comments</comments>
		<pubDate>Fri, 21 Jan 2011 18:00:00 +0000</pubDate>
		<dc:creator>roland</dc:creator>
				<category><![CDATA[infiniband]]></category>
		<category><![CDATA[personal]]></category>
		<category><![CDATA[history]]></category>
		<category><![CDATA[job]]></category>
		<category><![CDATA[topspin]]></category>

		<guid isPermaLink="false">http://digitalvampire.org/blog/?p=135</guid>
		<description><![CDATA[Today is my last day at Cisco. A little more than 10 years ago, in January 2001, I joined a small startup called Topspin Communications.  We weren&#8217;t saying much publicly about what we were doing, but the idea when I joined was to build a super-high-performance box for dynamic web serving, with web app blades, [...]]]></description>
			<content:encoded><![CDATA[<p>Today is my last day at Cisco.</p>
<p><a href="http://digitalvampire.org/blog/wp-content/uploads/2011/01/topspin.jpg"><img class="aligncenter size-full wp-image-155" title="topspin" src="http://digitalvampire.org/blog/wp-content/uploads/2011/01/topspin.jpg" alt="Topspin logo" width="300" height="72" /></a></p>
<p>A little more than 10 years ago, in January 2001, I joined a small startup called Topspin Communications.  We weren&#8217;t <a href="http://web.archive.org/web/20010518162329/http://www.topspin.com/">saying much publicly</a> about what we were doing, but the idea when I joined was to build a super-high-performance box for dynamic web serving, with web app blades, TCP offload blades, storage blades and SSL blades.  I was in charge of the SSL blade.  However, early 2001 was when it became clear that the bubble was well and truly bursting, and we started to realize that we weren&#8217;t going to have enough customers if we actually built our box, so we abandoned that product.  Shortly after this decision, I got a call from the salesman from the company whose encryption chip we had selected for the SSL blade, telling me that they had decided not to build the encryption chip after all.  I remember thinking how upset I would have been if he had called a week earlier, when we were still planning a product around the chip.</p>
<p>After a few months of flailing around searching for a product direction (<em>not</em> the most fun time in Topspin&#8217;s history), we decided to <a href="http://web.archive.org/web/20010923053422/http://www.topspin.com/">focus on InfiniBand</a> networking gear.  Initially, we focused on connections from servers on an InfiniBand fabric to existing Ethernet and Fibre Channel networks, and thus was born the IGR &#8212; InfiniBand Gateway Router &#8212; aka Buzz (&#8220;To InfiniBand and Beyond&#8221;):</p>
<p style="text-align: center;"><a href="http://digitalvampire.org/blog/wp-content/uploads/2011/01/buzz.jpg"><img class="aligncenter size-full wp-image-139" title="Topspin IGR" src="http://digitalvampire.org/blog/wp-content/uploads/2011/01/buzz.jpg" alt="Buzz" width="540" height="360" /></a></p>
<p>This first chassis was pretty far from being a shippable product: it had only 1X IB ports (2 Gbps!) and was built using Mellanox MT21108 &#8220;Gamla&#8221; chips.  Heroic hardware reworks and software hacks were done just to get the system booting; for example, somehow I added enough IB support to PPCBoot for the line cards to load a kernel from the controller over InfiniBand directed route MADs.</p>
<p>Still, it was enough to get companies like Dell and Microsoft to take us seriously (which helped us raise another $30 million in the summer of 2002).  Keep in mind that this was during the time that everyone thought InfiniBand was going to be huge, and Microsoft was planning on having IB drivers in Windows Server 2003.  In fact we lugged some prototypes and emulators built on PCs up to Washington State to do interoperability testing and debugging with the Windows driver developers, and even watch Windows kernel developers at work.</p>
<p>When we were designing the next version of this box, one big decision was what 4X IB adapter chip to use inside.  The choices were to play it safe with IBM Microelectronics, or to gamble on a startup, Mellanox, who was making bold performance promises.  Luckily, we chose Mellanox, since the &#8220;safe&#8221; choice, IBM, canceled their IB products after struggling to make them work at all.  Mellanox&#8217;s first spin of their chip worked &#8212; it was an amazing experience to have a real 4X adapter that &#8220;just worked&#8221; in our lab after all the screwing around with half-baked 1X products that we had gone through (although we did spend plenty of time debugging the driver and firmware for that Tavor adapter).</p>
<p>We worked hard on getting to a real product, and in November 2002, we were able to introduce the Topspin 360, which had 24 4X IB ports, 12 standard IB module slots (each could hold either a 4-port 1G Ethernet gateway or a 2-port Fibre Channel gateway) as well as one very cool bezel design:</p>
<p><a href="http://digitalvampire.org/blog/wp-content/uploads/2011/01/360.jpg"><img class="aligncenter size-full wp-image-142" title="360" src="http://digitalvampire.org/blog/wp-content/uploads/2011/01/360.jpg" alt="Topspin 360" width="540" height="360" /></a></p>
<p>In engineering, we followed the 360 with the &#8220;90 in 90&#8221; challenge and built the Topspin 90 in only 90 days.  I was able to get IPoIB working on the 90&#8217;s controller, in spite of having only a primitive IB switch and no host adapter available.  The Topspin 90 was introduced in January 2003:</p>
<p><a href="http://digitalvampire.org/blog/wp-content/uploads/2011/01/90.jpg"><img class="aligncenter size-full wp-image-154" title="Topspin 90" src="http://digitalvampire.org/blog/wp-content/uploads/2011/01/90.jpg" alt="Topspin 90" width="484" height="438" /></a></p>
<p>The engineering team spent the rest of 2003 building the Topspin 120 24-port switch (another switch chip to get IPoIB working on), and a new 6-port Ethernet gateway.  The Ethernet gateway was pretty cool &#8212; for the first 4-port Ethernet gateway, we used a PowerPC 440GP along with a Mellanox HCA and some Intel NICs and did all the forwarding between Ethernet and IPoIB in software.  Between PCI-X and CPU bottlenecks, we were a bit performance limited.  The 6-port gateway used a Xilinx Virtex 2 FPGA with our own InfiniBand logic, and did all the forwarding in hardware, so we were able to handle full line rate of minimum-sized packets in both directions on all 6 Ethernet ports&#8211;and in 2003, 12 Gbps of traffic was an awful lot!</p>
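<p>For a sense of scale, the 12 Gbps figure and the matching packet rate can be worked out as below.  This is a back-of-the-envelope sketch that assumes each of the six ports is 1 Gbps Ethernet (which the 12 Gbps total implies) and uses standard Ethernet framing overhead; the constant and function names are illustrative only.</p>
<pre>
```python
# Assumption: six 1 Gbps Ethernet ports, counted in both directions.
PORTS = 6
LINE_RATE_BPS = 1_000_000_000
DIRECTIONS = 2

# Standard Ethernet framing: a minimum 64-byte frame is preceded by 8 bytes
# of preamble/SFD and followed by a 12-byte inter-frame gap.
MIN_FRAME_BYTES = 64
OVERHEAD_BYTES = 8 + 12


def aggregate_bps(ports=PORTS, rate=LINE_RATE_BPS, directions=DIRECTIONS):
    """Total traffic when every port runs at line rate in every direction."""
    return ports * rate * directions


def min_size_packets_per_second(rate=LINE_RATE_BPS):
    """Whole minimum-sized frames per second on one port in one direction."""
    wire_bits = (MIN_FRAME_BYTES + OVERHEAD_BYTES) * 8
    return rate // wire_bits


print(aggregate_bps() // 10**9, "Gbps")                      # 12 Gbps
print(min_size_packets_per_second())                         # 1488095
print(min_size_packets_per_second() * PORTS * DIRECTIONS)    # 17857140
```
</pre>
<p>Under these assumptions, full line rate with minimum-sized packets means forwarding close to 18 million packets per second across the gateway.</p>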
<p>Somewhere along the way, it became clear that operating systems (aside from borderline-irrelevant proprietary Unixes like Solaris and HP-UX) would not include InfiniBand drivers out of the box; Microsoft dropped their plans for IB drivers, and the <a href="http://infiniband.sourceforge.net/">open source Linux project</a> stalled.  It became clear that if we wanted anyone to buy InfiniBand networking gear, we would have to take care of the server side of things too, and so we started working on a host driver stack.  Luckily, at the very beginning of our InfiniBand development in 2001, we made the decision to use Linux on PowerPC rather than VxWorks as our embedded OS.  That meant we had a lot of Linux InfiniBand driver code from our switch systems that we could adapt into host drivers.</p>
<p>At first, we distributed our drivers as proprietary binary blobs, which meant a lot of pain for us building our drivers for every different kernel flavor on every distribution our customers used, and which also meant a lot of pain for our customers who wanted to mix and match IB gear from different vendors.  Clearly, for IB to work everyone had to agree on an open source stack, and after a lot of arguing and political wrangling that I&#8217;ll skip over here, the OpenIB Alliance was formed, and we started working on InfiniBand drivers for upstream inclusion in Linux.</p>
<p><a href="http://digitalvampire.org/blog/wp-content/uploads/2011/01/openib.png"><img class="aligncenter size-full wp-image-156" title="openib" src="http://digitalvampire.org/blog/wp-content/uploads/2011/01/openib.png" alt="OpenIB Alliance" width="410" height="205" /></a></p>
<p>The starting point of all the different vendor stacks that got released as open source was not particularly good, and although a lot of the community was in denial about it, it was clear to me that we would have to start from scratch to get something clean enough to go upstream.  Around February 2004, I was trying to optimize IPoIB performance, and I got so frustrated trying to wade through all the abstraction layers of the Mellanox HCA driver that I decided I would try to write my own drastically simpler driver, and I started working on something I called &#8220;mthca&#8221;.</p>
<p>By May 2004, I had mthca working enough to run IPoIB and I decided to <a href="http://osdir.com/ml/drivers.openib/2004-05/msg00377.html">announce it publicly</a>.  This led to another series of flamewars but also enough encouragement from people I considered sane that I continued working on a stack built around mthca, and by December 2004 we had something <a href="http://lkml.org/lkml/2004/12/28/37">good enough to go upstream</a>.  That was really the start of a lot of great things, and I&#8217;m really proud of my role helping to maintain the Linux stack; today we have iWARP support, eight different hardware drivers, IPoIB, storage protocols, network file protocols, RDS; InfiniBand is used in more than half of the Top 500 supercomputers, etc.  And I don&#8217;t think any of that happens without IB support being upstream.</p>
<p>On the hardware side of things, we continued building things like the Topspin 270 96-port switch (1.5 Tbps of switch capacity!), switches for IBM BladeCenter, and so on.  In April 2005, Cisco bought Topspin, and when the deal closed in May 2005, I officially became a Cisco employee.  The Topspin IB products became the Cisco SFS product line, and for a brief glorious time, Cisco sold IB gear.</p>
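<p>The headline numbers in this post line up with InfiniBand&#8217;s original signaling rate of 2.5 Gbps per lane with 8b/10b encoding.  The sketch below is only a plausible reconstruction: it assumes the Topspin 270&#8217;s ports were 4X links at that rate and that &#8220;switch capacity&#8221; counts data rate in both directions.  The function names are illustrative.</p>
<pre>
```python
# Original (SDR) InfiniBand signaling rate per lane, in Mbps.
SDR_LANE_MBPS = 2500


def data_rate_mbps(lanes):
    """Usable data rate of a link: 8b/10b encoding carries 8 data bits per 10 line bits."""
    return SDR_LANE_MBPS * lanes * 8 // 10


def switch_capacity_mbps(ports, lanes, directions=2):
    """Aggregate capacity, assuming every port runs at full data rate in each direction."""
    return ports * data_rate_mbps(lanes) * directions


print(data_rate_mbps(1))                                  # 2000, the 2 Gbps of a 1X port
print(data_rate_mbps(4))                                  # 8000, a 4X port
print(switch_capacity_mbps(96, 4) / 1_000_000, "Tbps")    # 1.536 Tbps
```
</pre>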
<p>Unfortunately (for the SFS product line, at least), the IB market didn&#8217;t grow fast enough to become the billion-dollar market that Cisco looks for, and so Cisco decided to stop selling IB gear.  We went from <a href="http://newsroom.cisco.com/dlls/2007/prod_111307d.html">announcing new products</a> to announcing that we <a href="http://www.cisco.com/en/US/prod/collateral/ps6418/ps6419/ps6421/eol_c51_489153.html">wouldn&#8217;t sell those products</a> (and I don&#8217;t think an SFS 3504 ever actually shipped to a customer).  In fact, I personally gummed up the works a bit by putting in an internal order for an SFS 3504 as soon as it was orderable; a year later, the guy responsible for winding down the SFS product line had to track me down and have me cancel the order, which was the last one still on the books.</p>
<p>After we stopped working on InfiniBand stuff, we were bounced around between a few Cisco business units until we ended up working on x86 servers for the Cisco UCS product line.  For the past few years, I&#8217;ve been helping Cisco build rack servers while continuing to be the InfiniBand/RDMA maintainer for Linux.  I&#8217;ve helped build cool products such as the <a href="http://www.cisco.com/en/US/products/ps10922/index.html">Cisco C460 server</a> (some amusing things about the C460 project were debugging UEFI/BIOS  that made memtest86+ insta-reboot at a certain memory location, and figuring out why Linux wouldn&#8217;t boot on an x86 system with 1TB of RAM).  Cisco is a fun, rewarding place to work, and it&#8217;s amazing to still work every day with so many people from the old-school Topspin team, who have taught me so much over the years and become good friends along the way.</p>
<p>But since the Cisco acquisition, I&#8217;ve always missed the rush of working at a startup (hence my <a href="http://digitalvampire.org/blog/index.php/2010/12/23/missing-the-point-on-startups/">cri de coeur</a> defending startups), and starting on Monday I&#8217;ll finally get back to that.  My new company is using InfiniBand, and continuing to maintain the upstream stack is part of my official job description, so nothing should be changing about my free software activities.  If my next job is half as good as Topspin, it should be an awesome ride.</p>
<p>My new company is still trying to keep things on the down-low, so I&#8217;m not going to put a link on my blog.  I can say that we still want to hire more great Linux developers, so if you&#8217;re interested, please get in touch with me!  We&#8217;re looking for people to work in-person in downtown <a href="http://en.wikipedia.org/wiki/Mountain_View,_California">Mountain View, CA</a> (really downtown&#8211;not off in the Shoreline wilderness near the Googleplex, but actually in the same building as the Mozilla Foundation, near the train station, restaurants, etc).  As I said, working remotely isn&#8217;t an option, but if you aren&#8217;t currently in the area and want to move to Silicon Valley, we can help with relocation and visas (if you&#8217;re good enough, of course <img src='http://digitalvampire.org/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> ).</p>
]]></content:encoded>
			<wfw:commentRss>http://digitalvampire.org/blog/index.php/2011/01/21/aloha-means-goodbye/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Missing the point on startups</title>
		<link>http://digitalvampire.org/blog/index.php/2010/12/23/missing-the-point-on-startups/</link>
		<comments>http://digitalvampire.org/blog/index.php/2010/12/23/missing-the-point-on-startups/#comments</comments>
		<pubDate>Fri, 24 Dec 2010 04:18:26 +0000</pubDate>
		<dc:creator>roland</dc:creator>
				<category><![CDATA[hacking]]></category>

		<guid isPermaLink="false">http://digitalvampire.org/blog/?p=111</guid>
		<description><![CDATA[I&#8217;ve been thinking about Ted Ts&#8217;o&#8217;s recent posts about whether it&#8217;s possible to do engineering or work on technology at startups. I&#8217;m not going to argue that you can&#8217;t work on technology at Google or another big company (although articles like these do point out the difficulties). It would be easy to pick on Google&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been thinking about Ted Ts&#8217;o&#8217;s <a href="http://thunk.org/tytso/blog/2010/11/29/google-has-a-problem-retaining-great-engineers-bullcrap/">recent</a> <a href="http://thunk.org/tytso/blog/2010/12/06/working-on-technology-at-startups/">posts</a> about whether it&#8217;s possible to do engineering or work on technology at startups.  I&#8217;m not going to argue that you <em>can&#8217;t</em> work on technology at Google or another big company (although articles <a href="http://www.nytimes.com/2010/11/29/technology/29google.html">like</a> <a href="http://techcrunch.com/2010/12/17/google-takes-another-big-step-to-retain-employees-autonomous-business-units/">these</a> do point out the difficulties).  It would be easy to pick on Google&#8217;s failures and point out how many of their successes were actually acquired by buying a startup, but what I really wanted to talk about is how (IMHO) Ted is misunderstanding startups.</p>
<p>Ted&#8217;s central point seems to be:</p>
<blockquote><p>But if your primary interest is to doing great engineering work, then you want go to company that has a proven business model.</p></blockquote>
<p>Phrased so broadly, that&#8217;s bad advice.  The reasoning that leads Ted to that bad advice starts with two contradictory misunderstandings of startups:</p>
<blockquote><p>These days, the founder or founders will have a core idea, which they will hopefully patent, to prevent competitors from replicating their work, just as before. [...] most of the technology developed in a typical startup will tend to be focused on supporting the core idea that was developed by the founder.</p></blockquote>
<p>and</p>
<blockquote><p>Because if you talk to any venture capitalist, a startup has one and only one reason to exist: to prove that it has a scalable, viable business model.</p></blockquote>
<p>In my experience, startups typically start with the founders deciding they&#8217;ve found a problem they can solve better, cheaper or faster &#8212; but it&#8217;s rare for founders to have an idea that&#8217;s developed enough to patent the whole thing. Ted, I think, implies that at a startup, the founders have figured everything out and everyone else is just filling in the details of the idea.  To me, that seems completely backwards: if you go to a big company with an established business model, then almost certainly you&#8217;ll be working within the outline of that model (<a href="http://en.wikipedia.org/wiki/Disruptive_technology">Innovator&#8217;s Dilemma</a> and all that); at a startup, you&#8217;ll have to help the founders figure out just what the hell your company is supposed to be doing. And that gets to the second quote: a startup is an exercise in adapting the technology you&#8217;re building until you find the right business model.  In other words, nearly every startup will get it wrong to start with and have to change plans repeatedly; the hope is that the technology you build along the way is valuable enough that you can survive until you find the right way to make money.</p>
<p>To give one example from personal experience, when I was at Topspin working on InfiniBand products, early in the InfiniBand hype cycle (around 2001 or so), we thought that every OS would soon ship with InfiniBand drivers, so we focused on building switches and other networking gear, without worrying about the hosts that would be connected to the network.  It turned out that the <a href="http://infiniband.sf.net/">first open source project</a> for a Linux InfiniBand stack fizzled, and Windows also gave up on InfiniBand, so we ended up having to build an InfiniBand host stack &#8212; fortunately the embedded software from our switches already had most of the ingredients, and so we were able to pull it off by reusing our embedded work.  (That Topspin host stack ended up getting released as free software, and it became one of the ingredients that went into the current Linux InfiniBand stack &#8212; and I ended up as the InfiniBand maintainer for the Linux kernel, while working for a startup.)</p>
<p>So as I said before, I think it&#8217;s bad advice to suggest to someone that &#8220;real&#8221; engineering can only be done at a large company.  Certainly there are huge differences between working at a big company and a small company, and I do believe that there are &#8220;big company people&#8221; and &#8220;small company people.&#8221;  If your goal is to spend nearly all your time making incremental improvements in ext4, sure, it&#8217;s probably easier to do that at a company that is a big enough ext4 user for that work to pay off; on the other hand if you&#8217;d rather work on something that you&#8217;re making up as you go along and where your decisions shape the whole future of the company, then a startup is probably a better place for you. Similarly, Ted&#8217;s assertion</p>
<blockquote><p>For most startups, though, open source software is something that they will <em>use</em>, but not necessarily develop except in fairly small ways.</p></blockquote>
<p>misses the real distinction.  There are plenty of startups where open source is the main focus (<a href="http://www.cloudera.com/">Cloudera</a>, <a href="http://www.riptano.com">Riptano</a> and <a href="http://www.strobecorp.com/">Strobe</a> are just a few that spring to mind; and I don&#8217;t mean to dis all of the others that I&#8217;m not namechecking here), and there are gazillions of big technology companies that are actively hostile to open source.  So really, if you want to get paid to work on open source, make sure you go to an open source company; the size of the company is a completely orthogonal issue.</p>
<p>To summarize my advice: if you think you might be a small company person, don&#8217;t let Ted scare you away from startups.  Oh, and happy holidays!</p>
]]></content:encoded>
			<wfw:commentRss>http://digitalvampire.org/blog/index.php/2010/12/23/missing-the-point-on-startups/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Transition to Linode complete</title>
		<link>http://digitalvampire.org/blog/index.php/2010/12/07/transition-to-linode-complete/</link>
		<comments>http://digitalvampire.org/blog/index.php/2010/12/07/transition-to-linode-complete/#comments</comments>
		<pubDate>Wed, 08 Dec 2010 05:36:52 +0000</pubDate>
		<dc:creator>roland</dc:creator>
				<category><![CDATA[personal]]></category>
		<category><![CDATA[vps hosting]]></category>

		<guid isPermaLink="false">http://digitalvampire.org/blog/?p=114</guid>
		<description><![CDATA[I recently moved the VPS that hosts this blog from Slicehost to Linode.  Both are very nice hosting providers that give you full control over a Xen virtual machine, including root access to the distribution of your choice and a slick web control panel, but right now at least, Linode gives you roughly twice the [...]]]></description>
			<content:encoded><![CDATA[<p>I recently moved the VPS that hosts this blog from <a href="http://www.slicehost.com/">Slicehost</a> to <a href="http://www.linode.com/?r=0bef5ba60ece1ea681ccf5c82f59cca72ee19fcb">Linode</a>.  Both are very nice hosting providers that give you full control over a Xen virtual machine, including root access to the distribution of your choice and a slick web control panel, but right now at least, Linode gives you roughly twice the RAM as well as substantially more storage and bandwidth for the same price as Slicehost.</p>
<p>The main point of this post is really just to include my <a href="http://www.linode.com/?r=0bef5ba60ece1ea681ccf5c82f59cca72ee19fcb">Linode referral link</a> &#8212; if you&#8217;re going to sign up for Linode anyway, why not use my link and save me a few bucks on hosting?</p>
]]></content:encoded>
			<wfw:commentRss>http://digitalvampire.org/blog/index.php/2010/12/07/transition-to-linode-complete/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Two notes on IBoE</title>
		<link>http://digitalvampire.org/blog/index.php/2010/12/06/two-notes-on-iboe/</link>
		<comments>http://digitalvampire.org/blog/index.php/2010/12/06/two-notes-on-iboe/#comments</comments>
		<pubDate>Tue, 07 Dec 2010 06:33:48 +0000</pubDate>
		<dc:creator>roland</dc:creator>
				<category><![CDATA[infiniband]]></category>
		<category><![CDATA[rdma]]></category>
		<category><![CDATA[iboe]]></category>

		<guid isPermaLink="false">http://digitalvampire.org/blog/?p=107</guid>
		<description><![CDATA[I want to mention two things about IBoE.  (I&#8217;m using the term InfiniBand-over-Ethernet, or IBoE for short, for what the IBTA calls RoCE for reasons already discussed) First, we merged IBoE support on mlx4 devices into the upstream kernel in 2.6.37-rc1, so IBoE will be in upstream kernel for the 2.6.37 release &#8212; one fewer [...]]]></description>
			<content:encoded><![CDATA[<p>I want to mention two things about IBoE.  (I&#8217;m using the term InfiniBand-over-Ethernet, or IBoE for short, for what the IBTA calls RoCE for reasons <a title="Rocky Roads" href="http://digitalvampire.org/blog/index.php/2010/04/19/rocky-roads/">already discussed</a>)</p>
<p>First, we merged IBoE support on mlx4 devices into the upstream kernel in 2.6.37-rc1, so IBoE will be in the upstream kernel for the 2.6.37 release &#8212; one fewer reason to use OFED.  (And by the way, we used the term IBoE in the kernel.)  The requisite libibverbs and libmlx4 patches are not merged yet, but I hope to get to that soon and release new versions of the userspace libraries with IBoE support.</p>
<p>Second, a while ago I promised to detail some of my specific critiques of the IBoE spec (more formally, &#8220;Annex A16: RDMA over Converged Ethernet (RoCE)&#8221; to the &#8220;InfiniBand Architecture Specification Volume 1 Release 1.2.1&#8221;; if you want to follow along at home, you can download a copy from <a title="Specification Download" href="http://infinibandta.org/content/pages.php?pg=technology_download">the IBTA</a>).  So here are two places where I think it&#8217;s really obvious that the spec is a half-assed rush job, to the detriment of trying to create interoperable implementations.  (Fortunately everyone will just copy what the Linux stack does if they don&#8217;t actually just reuse the code, but still it would have been nice if the people writing the standards had thought things through instead of letting us just make something up and hope there are no corner cases that will bite us later.)</p>
<ul>
<li>The annex has this to say about address resolution in A16.5.1, &#8220;ADDRESS ASSIGNMENT AND RESOLUTION&#8221;:<br />
<blockquote><p>The means for resolving a GID to a local port address (i.e. SMAC or DMAC) are outside the scope of this annex. It is assumed that standard Ethernet mechanisms, such as ARP or Neighbor Discovery are used to maintain an appropriate address cache for RoCE ports.</p></blockquote>
<p>It&#8217;s easy to say that something is &#8220;outside the scope&#8221; but, uh, who else is going to specify how to turn an IB GID into an Ethernet address, if not the spec about how to run IB over Ethernet packets?  And how could ARP conceivably be used, given that GIDs are 128-bit IPv6 addresses and ARP is only used to resolve IPv4 addresses?  If we&#8217;re supposed to use neighbor discovery, a little more guidance about how to coordinate the IPv6 stack and the IB stack might be helpful.  In the current Linux code, we finesse all this by assuming that (unicast) GIDs are always link-local IPv6 addresses with the Ethernet address encoded in them, so converting a GID to a MAC is trivial (cf. <code>rdma_get_ll_mac()</code>).</li>
<li>This leads to the second glaring omission from the spec: nowhere are we told how to send multicast packets.  The spec explicitly says that multicast should work in IBoE, but nowhere does it say how to map a multicast GID to the Ethernet address to use when sending to that MGID.  In Linux we just used the standard mapping from multicast IPv6 addresses to multicast Ethernet addresses, but this is a completely arbitrary choice not supported by the spec at all.</li>
</ul>
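<p>To make the two mappings above concrete, here is a minimal Python sketch.  It is not the kernel code, just an illustration under two assumptions: that unicast GIDs are link-local addresses whose interface ID is the usual modified EUI-64 form of the MAC (as in RFC 4291), and that multicast GIDs use the standard IPv6-over-Ethernet mapping from RFC 2464 (33:33 followed by the low 32 bits of the address).  The function names are made up for this example.</p>

```python
LINK_LOCAL_PREFIX = bytes.fromhex("fe80000000000000")  # fe80::/64


def mac_to_link_local_gid(mac: bytes) -> bytes:
    """Build a 16-byte link-local GID from a 6-byte MAC (modified EUI-64)."""
    if len(mac) != 6:
        raise ValueError("MAC must be 6 bytes")
    # Flip the universal/local bit and insert ff:fe in the middle.
    iface_id = bytes([mac[0] ^ 0x02, mac[1], mac[2], 0xFF, 0xFE,
                      mac[3], mac[4], mac[5]])
    return LINK_LOCAL_PREFIX + iface_id


def link_local_gid_to_mac(gid: bytes) -> bytes:
    """Recover the MAC from a GID built as above; no lookup is needed."""
    if len(gid) != 16:
        raise ValueError("GID must be 16 bytes")
    if gid[:8] != LINK_LOCAL_PREFIX or gid[11:13] != b"\xff\xfe":
        raise ValueError("not a link-local GID with an EUI-64 interface ID")
    # Undo the bit flip and drop the ff:fe filler.
    return bytes([gid[8] ^ 0x02, gid[9], gid[10], gid[13], gid[14], gid[15]])


def mgid_to_mcast_mac(mgid: bytes) -> bytes:
    """Map a multicast GID to an Ethernet multicast MAC (RFC 2464 style)."""
    if len(mgid) != 16:
        raise ValueError("MGID must be 16 bytes")
    if mgid[0] != 0xFF:
        raise ValueError("not a multicast GID")
    # 33:33 followed by the last four bytes of the multicast address.
    return b"\x33\x33" + mgid[12:16]
```

<p>For example, <code>mac_to_link_local_gid(bytes.fromhex("0002c9010203")).hex()</code> gives <code>fe800000000000000202c9fffe010203</code>, and <code>link_local_gid_to_mac()</code> turns that back into the original MAC.  The point is that none of this comes from the annex; it only works between implementations that happen to make the same assumptions.</p>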
<p>You may hear people defending these omissions from the IBoE spec by saying that these things should be specified elsewhere or are out of scope for the IBTA.  This is nonsense: who else is going to specify these things?  In my opinion, what happened is simply that (for non-technical reasons) some members of the IBTA wanted to get a spec out very quickly, and this led to a process that was too short to produce a complete spec.</p>
]]></content:encoded>
			<wfw:commentRss>http://digitalvampire.org/blog/index.php/2010/12/06/two-notes-on-iboe/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Was it something I said?</title>
		<link>http://digitalvampire.org/blog/index.php/2010/06/03/was-it-something-i-said/</link>
		<comments>http://digitalvampire.org/blog/index.php/2010/06/03/was-it-something-i-said/#comments</comments>
		<pubDate>Thu, 03 Jun 2010 21:42:27 +0000</pubDate>
		<dc:creator>roland</dc:creator>
				<category><![CDATA[hacking]]></category>

		<guid isPermaLink="false">http://digitalvampire.org/blog/?p=100</guid>
		<description><![CDATA[I saw that OpenBSD 4.7 was released a couple of weeks ago.  I tried to help, I really did. I used to have a fanless 600MHz VIA system with a cheapie Airlink 101 Wi-Fi card that I used as a home wireless router.  I ran OpenBSD on it for a few reasons &#8212; at the [...]]]></description>
			<content:encoded><![CDATA[<p>I saw that <a href="http://openbsd.org/47.html">OpenBSD 4.7</a> was released a couple of weeks ago.  I tried to help, I really did.</p>
<p>I used to have a fanless 600MHz VIA system with a cheapie Airlink 101 Wi-Fi card that I used as a home wireless router.  I ran OpenBSD on it for a few reasons &#8212; at the time I started, the OpenBSD wireless stack was ahead of Linux; their security obsession appealed to me; and not using Linux <em>everywhere</em> seemed like a fun thing to do.  It all worked pretty well, except that the wireless interface sometimes got stuck while forwarding heavy traffic.  For quite a while, I survived with hacks similar to <a href="http://marc.info/?l=openbsd-misc&amp;m=124087086023018&amp;w=2">this nutty crontab entry</a>.</p>
<p>Eventually, though, I said to myself, &#8220;Self, you&#8217;re a kernel hacker.  You should be able to fix this driver.&#8221;  And indeed, after a couple of evenings of hacking, I figured out what was wrong and came up with a <a href="http://marc.info/?l=openbsd-misc&amp;m=125895269930106&amp;w=2">patch</a> that improved things immensely for me.  The problem was that the driver was not written with a system as slow as mine in mind, and it got confused if more than one interrupt happened before it got a chance to service the first interrupt &#8212; you can read the patch description for full details.  Of course, being a good free software citizen, I sent my patch to the OpenBSD mailing lists so that it could be applied upstream.</p>
<p>Here&#8217;s where things went wrong.  I never heard from the author of this driver &#8212; I got no reply when I reported the original bug, and no replies to any mail I sent about my patch.  I did get several reports from other users who had the same problem and found that my patch fixed things for them as well, and finally another OpenBSD committer wrote, &#8220;<a href="http://marc.info/?l=openbsd-tech&amp;m=126377050811426&amp;w=2">Then if no one objects I&#8217;ll commit it tomorrow.</a>&#8221;  Unfortunately, at this point the original driver author did seem to get interested &#8212; he sent private email to this committer (not copying the mailing list or me) objecting, and so we ended up with, &#8220;<a href="http://marc.info/?l=openbsd-tech&amp;m=126420338617838&amp;w=2">Objections were made.  Apparently this patch only works for AP and does funky stuff to the hardware.  So back to the drawing board on this one.</a>&#8221;  As I said, all of my attempts to work directly with the driver author to find out what those objections were or how to improve the patch were ignored.</p>
<p>At this point I gave up on getting my patch upstream (and when I upgraded my wireless network to 802.11n, I chose a MIPS box running OpenWrt).</p>
]]></content:encoded>
			<wfw:commentRss>http://digitalvampire.org/blog/index.php/2010/06/03/was-it-something-i-said/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Rocky roads</title>
		<link>http://digitalvampire.org/blog/index.php/2010/04/19/rocky-roads/</link>
		<comments>http://digitalvampire.org/blog/index.php/2010/04/19/rocky-roads/#comments</comments>
		<pubDate>Mon, 19 Apr 2010 23:16:42 +0000</pubDate>
		<dc:creator>roland</dc:creator>
				<category><![CDATA[infiniband]]></category>
		<category><![CDATA[rdma]]></category>
		<category><![CDATA[dcb]]></category>
		<category><![CDATA[iboe]]></category>

		<guid isPermaLink="false">http://digitalvampire.org/blog/?p=97</guid>
		<description><![CDATA[I saw that the InfiniBand Trade Association announced the &#8220;RDMA over Converged Ethernet (RoCE)&#8221; specification today.  I&#8217;ve already discussed my thoughts on the underlying technology (although I have a bit more to say), so for now I just want to say that I really, truly hate the name they chose.  There are at least two [...]]]></description>
			<content:encoded><![CDATA[<p>I saw that the InfiniBand Trade Association announced the &#8220;<a title="RoCE" href="http://infinibandta.org/content/pages.php?pg=press_room_item&amp;rec_id=663">RDMA over Converged Ethernet (RoCE)</a>&#8221; specification today.  I&#8217;ve <a title="RDMA on Converged Ethernet" href="http://digitalvampire.org/blog/index.php/2009/03/25/rdma-on-converged-ethernet/">already discussed</a> my thoughts on the underlying technology (although I have a bit more to say), so for now I just want to say that I really, truly hate the name they chose.  There are at least two things that suck about the name:</p>
<ol>
<li>Calling the technology &#8220;RDMA over&#8221; instead of &#8220;InfiniBand over&#8221; is overly vague and intentionally deceptive.  We already have &#8220;RDMA over Ethernet&#8221; &#8212; except we&#8217;ve been calling it iWARP.  Choosing &#8220;RoCE&#8221; is somewhat like talking about &#8220;Storage over Ethernet&#8221; instead of &#8220;Fibre Channel over Ethernet.&#8221;  Sure, FCoE is storage over ethernet, but so is iSCSI.  As for the intentionally deceptive part: I&#8217;ve been told that &#8220;InfiniBand&#8221; was left out of the name because the InfiniBand Trade Association felt that InfiniBand is viewed negatively in some of the markets they&#8217;re going after.  What does that say about your marketing when you are running away from your own main trademark?</li>
<li>The term &#8220;Converged Ethernet&#8221; is also pretty meaningless.  The actual technology has nothing to do with &#8220;converged&#8221; ethernet (whatever that is, exactly); the annex that was just released simply describes how to stick InfiniBand packets inside a MAC header and Ethernet FCS, so simply &#8220;Ethernet&#8221; would be more accurate.  At least the &#8220;CE&#8221; part is an improvement over the previous try, &#8220;Converged Enhanced Ethernet&#8221; or &#8220;CEE&#8221;; not only does the technology have nothing to do with CEE either, &#8220;CEE&#8221; was an IBM-specific marketing term for what eventually became Data Center Bridging or &#8220;DCB.&#8221;  (At Cisco we used to use the term &#8220;Data Center Ethernet&#8221; or &#8220;DCE.&#8221;)</li>
</ol>
<p>So neither the &#8220;R&#8221; nor the &#8220;CE&#8221; of &#8220;RoCE&#8221; is a very good choice.  It would be a lot clearer and more intellectually honest if we could just call InfiniBand over Ethernet by its proper name: IBoE.  And explaining the technology would be a bit simpler too, since the analogy with FCoE becomes a lot more explicit.</p>
]]></content:encoded>
			<wfw:commentRss>http://digitalvampire.org/blog/index.php/2010/04/19/rocky-roads/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>First they laugh at you&#8230;</title>
		<link>http://digitalvampire.org/blog/index.php/2009/11/20/first-they-laugh-at-you/</link>
		<comments>http://digitalvampire.org/blog/index.php/2009/11/20/first-they-laugh-at-you/#comments</comments>
		<pubDate>Fri, 20 Nov 2009 20:44:27 +0000</pubDate>
		<dc:creator>roland</dc:creator>
				<category><![CDATA[infiniband]]></category>
		<category><![CDATA[rdma]]></category>

		<guid isPermaLink="false">http://digitalvampire.org/blog/?p=94</guid>
		<description><![CDATA[I found this article in &#8220;Network Computing&#8221; pretty interesting, although not exactly for the content.  Just the framing of the whole article, with Microsoft touting the fact that they&#8217;ve managed to achieve performance parity with Linux on some HPC benchmarks as an achievement (and putting up a graph that shows they are still [...]]]></description>
			<content:encoded><![CDATA[<p>I found <a title="Microsoft Windows HPC Beta On Par with Linux" href="http://www.networkcomputing.com/servers-storage/microsoft-windows-hpc-beta-on-par-with-linux.php">this article in &#8220;Network Computing&#8221;</a> pretty interesting, although not exactly for the content.  Just the framing of the whole article, with Microsoft touting the fact that they&#8217;ve managed to achieve performance parity with Linux on some HPC benchmarks as an achievement (and putting up a graph that shows they are still at least a few percent behind), shows how dominant Linux is in HPC.  Also, the article says:</p>
<blockquote><p>The beta also reportedly includes optimizations for new processors and can deploy and manage up to 1,000 nodes.</p></blockquote>
<p>So in other words Microsoft is stuck at the low end of the HPC market, only usable on small clusters.</p>
]]></content:encoded>
			<wfw:commentRss>http://digitalvampire.org/blog/index.php/2009/11/20/first-they-laugh-at-you/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

