SAN JOSE: DESPITE CHIP DESIGNER Nvidia's insistence that its GPU Technology Conference (GTC) was really all about CUDA and research applications, the conference was punctuated by hardware announcements.
Nvidia's GTC is an important showcase for the firm, a venue where its CEO Jen-Hsun Huang can spend 150 minutes on stage talking about his company and how great it is. However, for all of the firm's speeches and the many interesting talks given by its engineers, as well as those delivered by academics and industry figures, this GTC will be remembered for the 15-minute roadmap presentation during which Huang talked about the firm's future GPUs and system on chip (SoC) processors.
Huang's roadmap revelations didn't contain many deep technical details, but they did reveal how the firm intends to address two key issues. The first of Huang's admissions was that Nvidia's 'Logan' Tegra SoC will finally become a GPGPU and run CUDA, and the second was that memory bandwidth to feed the GPU is the biggest hurdle facing improvements in application performance.
Earlier this week I wrote about how Nvidia needs to bring GPGPU computing to its Tegra SoC processors if it wants to move from the research and high performance computing (HPC) realm into the far larger and more profitable consumer market. As if in response, Nvidia's Kayla test board is set to appear in March and will allow developers to use a Tegra 3 SoC processor to drive a GPGPU of choice.
Seco, the company that will manufacture the boards, already had a development Kayla board on its stand connected to a Geforce GTX Titan graphics card. Of course, developers can use more modestly priced graphics silicon, including AMD Firepro graphics cards if they want.
However, what Nvidia's Kayla testbed and, more specifically, its Logan chips show is the GPU's growing independence from big, powerful CPUs. Effectively, what Nvidia is saying is that you can get access to a Geforce GTX Titan's 4.5 TFLOPS of single precision floating point computing power with a relatively 'wimpy' ARM CPU core.
After Huang talked to the public for 150 minutes, he rushed to tell analysts and journalists that the firm is investing $600m a year in its Tegra SoC product line, compared to $1bn for the rest of its products combined. Nvidia's skewed investment strategy shows how important Tegra is to its future, because it will bring the firm's GPUs to three of its most important markets - consumer electronics, automotive applications and commercial datacentres.
Nvidia's presence in the consumer electronics market is already well known, and while some might argue that Tegra hasn't cornered the market, the firm has scored some impressive wins, most notably Google's Nexus 7 tablet. However, the company has done very well in the automotive market, where the Audi Group and BMW have chosen Tegra chips.
While the idea of Nvidia's Tegra chips powering car entertainment systems might sound like overkill, it is easy to underestimate the impact of high resolution Google Maps and Google Streetview coupled with impressive heads-up display graphics. I spent 30 minutes driving around San Jose in a BMW 745i with all the bells and whistles that Nvidia's Tegra chip enabled. If you have $106,000 to spend on one, you will get the same 'infotainment' experience as you might on a tablet, which makes most existing in-car information and entertainment systems look decidedly Soviet Bloc.
Nvidia also told me that it will slowly move into other automotive market areas such as driver assistance and active safety management, which means it will sell more chips with longer product cycles to recover investments.
As for servers, just like AMD, Nvidia stands to gain a lot from having an ARM SoC that has a full-fledged GPU, with the only missing link being a memory controller that supports ECC memory - a relatively minor challenge in the grand scheme of things.
Huang's admission that memory bandwidth to the GPU is limiting performance reflects a reality that many software developers already know about. The firm is taking a gamble with stacking DRAM on the same substrate as Volta but the benefits will be worth the risk if Nvidia, SK Hynix and TSMC come together to pull it off.
Huang claimed Nvidia is shooting for memory bandwidth in the region of 1TB/s, which is roughly four times what the Kepler-based Tesla K20 GPGPU card achieves now. Should Huang's figures not turn out to be pie in the sky, that will be a very impressive improvement and one that should help the firm gain a marked advantage in the HPC and server markets, especially if it can increase memory capacity at the same time.
For this reporter, Nvidia's GTC this year was an interesting procession of talks on how to optimise code to take advantage of Nvidia's hardware and on some of the pitfalls and challenges of GPGPU computing. It should be noted that Nvidia's GPUs are not the only ones to suffer from memory bandwidth limitations, as GPGPU accelerators from AMD and Intel as well as FPGA boards all face the same problems.
Nevertheless, understanding software developer challenges is not what GTC 2013 will be remembered for. Rather, Huang's 15-minute roadmap presentation is what the industry will look at when it wants to account for Nvidia's steadily growing industry momentum. And for Nvidia, that momentum will come in the form of Tegra. µ