WORKSTATION AND SERVER VENDOR Silicon Graphics International (SGI) represents how a once fiercely proprietary company has been able to leverage open source for High Performance Computing (HPC), much to its benefit.
Following multiple bankruptcies, a change of its iconic logo and the replacement of 'Incorporated' with 'International', SGI has learned the hard way that the time for going it alone is long gone. In many ways it realised long before some larger companies that, rather than fight a losing battle against the open source movement, it should embrace it.
Perhaps it is this acceptance of open source that led SGI's CTO, Dr Eng Lim Goh, to ditch a fancy conference setting and instead talk with a bunch of technologists at a Greater London Linux User Group (GLLUG) meeting held at University College London (UCL).
Goh's belief in his firm's technology and that of the open source community was given away by the title of his talk, "Linux is supercomputing". Underlying that claim was the fact that SGI has managed to get the standard Linux kernel, available from the kernel.org repository, to work on systems having 4,096 cores.
Jokingly, Goh referred to the system as a "standard PC, just bigger". While his liberal interpretation of what is a 'standard PC' is debatable, it is a nod towards how HPC vendors, even the company that bought HPC icon Cray back in 1996, have been forced to embrace standard off-the-shelf hardware components in order to remain competitive.
The 'standard PC' Goh was referring to is the firm's Altix UV system, which is a collection of Nehalem EX blade servers. To show off the system, which was located at the company's offices back in California, Goh dispensed with unconvincing PowerPoint slides and meaningless graphs, instead using Secure Shell (SSH) to log in to the system remotely and demonstrate it in real time.
Once logged in, no mean feat given the restrictions on Internet access at UCL's campus, Goh showed the expectant crowd the configuration of his system. As it was running Linux (188.8.131.52), Goh had to forgo 3DMark, SYSmark, Sandra and the usual suspects, instead opting to type 'more /proc/cpuinfo'. As the system had 2,048 cores, the output was somewhat longer than that produced by a typical desktop PC. To show the audience that it was indeed a 'standard PC', Goh then typed 'lspci' to list the many PCI buses and the devices connected to them. After 10 seconds, he decided to interrupt the output, to the amusement of the enthralled audience.
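For readers who want to try the same check at home, here is a minimal Python sketch of counting the logical processors that /proc/cpuinfo reports. The function name and the two-core sample text are our own illustrations, not anything shown at the talk, and the sketch assumes the usual Linux layout of one 'processor' line per logical CPU.

```python
def count_processors(cpuinfo_text: str) -> int:
    """Count the 'processor : N' entries in /proc/cpuinfo-style text."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key, _, _ = line.partition(":")
        if key.strip() == "processor":
            count += 1
    return count


# Hypothetical two-core excerpt, not output from the SGI machine.
SAMPLE = (
    "processor\t: 0\n"
    "model name\t: Example CPU\n"
    "\n"
    "processor\t: 1\n"
    "model name\t: Example CPU\n"
)

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        # Fall back to the sample on systems without /proc/cpuinfo.
        text = SAMPLE
    print(count_processors(text))
```

On a desktop this prints a single-digit or low double-digit number; on the machine Goh logged in to, the same count would be in the thousands.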
Goh paused for a moment for questions, with one chap asking how long the machine took to boot up. Laughing, Goh admitted that when SGI initially designed the machine it took around 30 minutes because four terabytes of RAM "take a while to check". However, SGI has since managed to bring that time down to a mere 15 minutes, though Goh wouldn't say how.
After Goh revelled for a moment in the mix of gasps and laughter from the audience, he suddenly proclaimed that playtime was over and that it was time to run some benchmarks. Bringing up a remote desktop viewer to display a simple resource usage meter showing CPU and RAM utilisation, Goh compiled some trivial code. While the code consisted of little more than four nested 'for' loops, its aim was to show the allocation of memory, all four terabytes of it. To do this, the code generated a four-dimensional array, a structure commonly used in research for simulation.
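Goh's actual program was not published, so the following is only a toy Python sketch of the idea: four nested loops that write to every element of a four-dimensional array. The function name and the tiny dimension are our assumptions; the real demonstration used a compiled program and an array large enough to occupy terabytes.

```python
def fill_4d(n: int) -> list:
    """Build an n x n x n x n nested list and write to every element."""
    grid = [[[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for m in range(n):
                    # Writing to each cell ensures the whole array is touched.
                    grid[i][j][k][m] = float(i + j + k + m)
    return grid


if __name__ == "__main__":
    grid = fill_4d(4)
    cells = sum(len(row) for cube in grid for plane in cube for row in plane)
    print(cells)
```

The memory needed grows with the fourth power of the dimension, which is why such a short program can fill even a very large machine.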
At this point most demonstrations would leap to a video showing how this worked perfectly in a controlled environment, but Goh asked the crowd, who understood the purpose of the code, to dare him to compile and run it. Surprisingly, no one took up the challenge, leaving Goh to simply go ahead and do it regardless.
Pointing to the resource meter, Goh explained that the code was essentially a memory allocation exercise to show how the firm can use a maximum of 16TB of RAM as a single memory space. Why stop at 16TB? Goh answered his own question by stating it was the limitation of the 44-bit virtual memory addressing in Intel's x86-64 Xeon chip. He claimed that the chipmaker will be increasing the virtual address length to 46 bits in 2012, which will allow for even larger physical memory configurations.
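The arithmetic behind the 16TB figure is easy to check. The short Python sketch below assumes byte-addressable memory and treats a terabyte as 2^40 bytes; the function and constant names are ours.

```python
TIB = 1 << 40  # bytes in one binary terabyte (an assumption about the unit)


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 1 << address_bits


if __name__ == "__main__":
    for bits in (44, 46):
        print(bits, "bits:", addressable_bytes(bits) // TIB, "TB")
```

A 44-bit address reaches 16TB, and each extra bit doubles the reach, so 46 bits would allow 64TB.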
After making a quip that the hard disk capacity would not allow core dumps, Goh decided to show how Linux enables fine-grained control over processes, allowing particular cores to be engaged. The impressive aspect of the demonstration wasn't that the tasks completed without any trouble but rather the simplicity with which Goh was able to apply basic tools and code, in real time, to harness the power of a supercomputer.
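The talk did not record which tools Goh used to steer work onto particular cores, so the following Python sketch is only one way to get a feel for it. It relies on the Linux-only os.sched_setaffinity and os.sched_getaffinity calls in Python's standard library; the helper name pin_to_cpus is our own.

```python
import os


def pin_to_cpus(cpus):
    """Restrict the calling process to the given CPU ids and return the result."""
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity calls are only available on Linux")
    os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        allowed = os.sched_getaffinity(0)     # CPUs we may currently run on
        print(pin_to_cpus({min(allowed)}))    # pin to the lowest-numbered one
        os.sched_setaffinity(0, allowed)      # restore the original mask
```

The same mechanism works whether the machine has two cores or several thousand; only the range of valid CPU ids changes.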
Goh was keen to mention that whatever SGI does to the Linux kernel, it does only after making sure that its changes will be accepted by the Linux kernel maintainers. Apparently, getting the core limit increased to 4,096 was something that he and his colleagues had to personally convince Linux founder Linus Torvalds to accept. According to Goh, convincing Torvalds was "tougher than designing the hardware".
Strip away all the demonstrations and geeky remarks and you're left with Goh's belief that Linux represents supercomputing on the desktop. What Goh evangelises is not SGI's HPC machine but rather the ability for anyone to freely download and install, on a £200 netbook, the same operating system that his firm is loading onto its 4,096 core, 16 terabyte beast. There are few operating systems that can claim such versatility, and most of those that can are open source.
Loading a machine with a Linux distribution won't make it a supercomputer. Nonetheless, it provides a great insight into how a free, open source, community-led project has outshone well-funded competitors. Some talk about how Linux should try to emulate closed source operating systems such as Microsoft's Windows and Apple's Mac OS X, but that simply invites the question of "why?", as neither of those systems can boast acceptance on such a wide variety of hardware.
The history of HPC is littered with names and companies, some more exotic than others, from the staid DEC and IBM to the leading edge Cray and SGI. Perhaps it is a sad state of affairs that the former fashionistas of the HPC world ended up as mere footnotes, only mentioned with nostalgia, but that underscores the power of open source software run on commodity hardware.
SGI has realised that the days of film studios paying big bucks for its hardware are long gone. In a bid to stave off a third bankruptcy, the company has finally embraced commodity hardware and become a fully fledged member of the open source community.
Like any company, it has to generate income, and using open source software and commonly available hardware to do that might be controversial in some quarters of the computing world. For the moment at least, however, it is keen to give back to the community and abide by its rules, and so its relationship with open source seems to be one of mutual convenience. µ