AT ISC 2012, Intel said specifications for its Xeon Phi accelerator card are yet to be announced because it is still working with its customers to "understand what the final production configuration will be".
Intel announced that its Many Integrated Core (MIC) architecture will be rebranded as Xeon Phi and even showed off a working board running Linpack. However, the firm would not be drawn on the card's specifications, instead opting for vague terms such as "greater than 50 cores" and "8GB plus of GDDR5 memory", while not even hinting at clock speeds.
When The INQUIRER asked Rajeeb Hazra, VP of Intel's Architecture Group, why details were so vague given the firm's claim that it will put Xeon Phi into production by the end of the year, Hazra said the company is still trying to work out a balance between cost, performance and memory. "Part of the reason for us is we are still working with the customers to understand what the final product configuration will be. Whether it is 52 core or 64 cores or 59 cores or 53 cores, there are multiple factors that determine everything that has to do with performance, but not just confined to it. Things like yields become important, we're looking at memory capacity, cost, performance needs and what the memory ecosystem at that point is going to provide. So it's about keeping options open and not misinforming people early if we were to make changes, we are working continuously with our customers."
Intel said its Xeon Phi boards will have at least 8GB of GDDR5 memory, which is a third more than the 6GB found on current generation Nvidia Tesla cards. Hazra told The INQUIRER that local memory will be what determines the overall performance of Xeon Phi, which makes it all the more surprising that the firm didn't say exactly how much local memory the cards will have or how fast it will run.
Hazra said, "The amount of local memory for Knights Corner [Xeon Phi] is going to be critical in the efficiency of that part. At the end of the day if you don't have enough memory locally on the card, you cannot fit data and workloads into it and you're going to need to have the communication going back and forth between the host. Providing the right amount of memory, more memory, and clearly you don't want to make it infinitely large because of cost, and eventually you block into that memory and more does not help you. But we do think local memory capacity and the reliability of the local memory is going to be pretty important."
Intel's decision to choose GDDR5 memory, said Hazra, came down to a combination of price, performance and capacity. "For now GDDR was the right solution both from a capacity standpoint and perhaps more important from a performance standpoint. When you have these many cores and FLOPS packed into a chip you have to feed it or you are going to waste the compute power. And compared to DDR, GDDR was the right bandwidth solution for that part. [...] It was a technical decision based on giving it the right memory subsystem and right now for the higher memory bandwidth that this card needs, GDDR was the right solution," said Hazra.
Although Hazra said GDDR offered the right capacity, last week Sumit Gupta, senior director of Tesla GPU Computing at Nvidia, said the firm will put more memory on its accelerator boards once it can get hold of higher density memory chips, adding that its Kepler GPGPUs can address 1TB of memory. As for how Intel could go beyond the 6GB found on Nvidia's Tesla boards, Hazra said, "We have a long deep relationship with memory vendors and we work with them to assure ourselves there will be enough capacity of parts at the right memory density we need for what we think would be a very exciting product. We were driven by how much memory bandwidth and power you need to make Knights Corner an effective compute engine for the class of workload that it's targeted at. That was driving concern, less so than what the competition is going to pack."
Hazra's point that memory is an important determinant of performance is something that AMD and Nvidia have long had to deal with, and it highlights one of the major problems with accelerator boards for workloads that deal with large datasets. His comments make it all the more surprising that Intel didn't come out with more details of the capacity and bandwidth of the GDDR memory on the Xeon Phi, but if Intel can get its hands on higher memory densities, AMD and Nvidia shouldn't be too far behind. µ