WE'VE ALL talked a lot about Barcelona delays, under-deliveries and all the plagues including the infamous "TLB errata".
Try to find Barcelona-based servers around your corporate neighbourhood - you may as well look for a needle in a haystack. Even with these mounting headaches, AMD still managed to keep a substantial presence, if not a lead over Intel, at the top of the top: the large-scale supercomputing arena.
When you look, for instance, at the more recent US government projects, you'll see that the proportion of AMD-based systems is still high, and not just because of the HyperTransport and memory bandwidth scaling with each added CPU. The previous Opteron generation had an advantage over the Intel platform in 64-bit code when using large page sizes - like in Global Linpack benchmarks, for instance. The Nehalem generation might solve this problem, though.
Also, according to our best friends amongst that user base, there are some known issues with the SSE3+ instructions not adopted by AMD, which cause system crashes with large page files - the US Gov't has a fix, but I'm not so sure that Chinese and Russian supercomputer users relying on the same chip have detected or solved that problem, for that matter. In a few cases, when using a given HPC code on both Intel and AMD 64-bit machines, the AMD64 code had to be tweaked to run on EM64T, even though, in say generic Windoze or Linux, there are usually no problems.
Finally, when, at the end of 2008, you look at the top of the Top 500 list during the next SuperComputing conference, you may see several near-petaflop-class AMD-based US government systems appearing there - names like "Roadrunner", "Baker" and "Ranger" should mean something to those in the know.
However, the accumulated delays and performance target misfirings did affect the confidence of the rest of the world, it seems. We talked recently about AMD's Asian supercomputing wins and their delayed fulfilment. One of those was a large, 150+ TFLOPs central Korean research supercomputer at their famous KISTI institute nestled among the hot springs.
The original AMD-based spec that Sun Microsystems won the deal with some 9 months ago revolved around many 2.5+ GHz Barcelona quad core CPUs. As you know, a 2.5 GHz Barcelona would, with 10 GFLOPs peak per core, provide 40 GFLOPs peak per chip, so it would take some 4,000 chips (over 500 eight-socket systems) to come close to the performance goal. The above mentioned HPC performance benefits with total bandwidth, SMP scaling, large pages and so on would have been a factor in the decision, too - especially since all that happened before Intel's 45 nm stuff was out there.
This would have been the largest supercomputer in Asia at the time, and a major statement of confidence in the AMD platform, keep in mind.
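For anyone wanting to check the chip-count arithmetic for the original Barcelona spec, here is a rough sketch in Python. The clock speed, core count, socket count and 150 TFLOPs target come from the figures above; the 4 flops per core per clock is our own assumption, being simply what 10 GFLOPs at 2.5 GHz implies, and the function names are made up for illustration.

```python
import math

# Figures quoted in the article.
CLOCK_GHZ = 2.5           # Barcelona clock in the original spec
CORES_PER_CHIP = 4        # quad core
SOCKETS_PER_SYSTEM = 8    # eight-socket boxes
TARGET_TFLOPS = 150.0     # lower bound of the "150+ TFLOPs" goal

# Assumption (not stated in the article): 4 double-precision flops per
# core per clock, which is what 10 GFLOPs per core at 2.5 GHz implies.
FLOPS_PER_CYCLE = 4


def peak_gflops_per_chip(clock_ghz, flops_per_cycle=FLOPS_PER_CYCLE,
                         cores=CORES_PER_CHIP):
    """Theoretical peak of one chip in GFLOPs."""
    return clock_ghz * flops_per_cycle * cores


def chips_needed(target_tflops, gflops_per_chip):
    """Smallest whole number of chips whose combined peak meets the target."""
    return math.ceil(target_tflops * 1000.0 / gflops_per_chip)


def systems_needed(chips, sockets=SOCKETS_PER_SYSTEM):
    """Smallest whole number of boxes that can hold that many chips."""
    return math.ceil(chips / sockets)


if __name__ == "__main__":
    per_chip = peak_gflops_per_chip(CLOCK_GHZ)
    chips = chips_needed(TARGET_TFLOPS, per_chip)
    print("Peak per chip (GFLOPs):", per_chip)
    print("Chips for target:", chips)
    print("Eight-socket systems:", systems_needed(chips))
```

On these assumptions, exactly 150 TFLOPs peak works out to 3,750 chips in 469 boxes, while a round 4,000 chips in 500 boxes would give 160 TFLOPs peak, which squares with the "150+" target. These are theoretical peak numbers only, not sustained Linpack results.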
We all know what happened next, both with the performance and the deliveries. And the inevitable followed - the last I heard was that Sun staff were busily replacing AMD with new 45 nm Intel specs, and the Koreans will now have Intel "eight-brain chips" - to use BBC Teletubby-IQ wonderspeak - "thinking" inside their, and probably Asia's, largest supercomputer within the next few months.
To me, this loss, or should I say a fateful AMD-to-Intel "switch", is more than just a few thousand chips going the Intel way rather than AMD's. The perception that AMD just went overboard with the CPU screwups, and a site switch of this proportion, are bound to have a snowball effect among the many high-teraflop and near-petaflop projects under way in North Asia, the world's fastest growing and already second biggest HPC market. So, unless Barcelona B3 changes this suddenly, this is going to be the first switch among many.
Also, this win may embolden Intel to fix the remaining performance drawbacks they had vs AMD in the supercomputing arena. Nehalem or its 32 nm successor, Westmere, may have that done, just like the interconnect and the integrated memory controller were solved. If worst comes to worst, just resurrect the EV9 Alpha as a QuickPath-based "computation accelerator" for those Nehalems, and voila - you've got a perfect flat memory model, superb code performance and all the bandwidth in the world, coupled with a proven X86 front end.
Once (not "if") these are solved, what is there to stop Intel from aggressively attacking this last niche of AMD's domination? Yeah, there isn't much real hardware profit in there, but the exposure, the "mine is bigger/faster than yours" statement of strength, and, of course, the government user influence worldwide, are, to paraphrase the MasterCard ads, priceless.
Not to forget, South Korea was one of the sites of investigation of alleged Intel malpractices, and, in this situation, AMD getting a deal and misdelivering on it will not help their case ultimately.
Actually, Sun could have stuck with AMD till the end, and possibly kept the deal for the DAAMIT gang, if AMD had listened to that same Sun and offered a Barcelona MP version with all four originally planned HT links at HT3 speed, rather than dumbing it down to socket-compatible three HT1 links as "other" (read: more Intel-centric) Opteron vendors requested. The added bandwidth and scaling efficiency in those 8-socket Galaxy boxen might just have convinced the users to wait a little more.
This way, by the time "Montreal" and such CPUs are supposed to come out from AMD sometime in 2009, Intel will already have 4-link QPI on Beckton 8-core Nehalems ready - and the advantage AMD could have had for a year will be nowhere...
As for Sun Microsystems, they are investing big time in supercomputing this year, including hiring tons of people and designing more HPC-centric systems. The problem is, many more of those may be Intel-based this time. µ
How much did AMD lose from all this? What was the original deal worth?
The above mentioned computers could possibly run Crysis at decent framerates.
Doesn't Intel still use a memory controller that consumes 20 watts, engaged or not?
This is not good news for AMD, that is for certain. I am an AMD fan (well, more about having some competition so it isn't just an Intel world). After reading so many "articles" here that read more like blog posts by teenagers, it was really refreshing to read a professional, well written article on this subject. Kudos.
FYI Intel has gone a long ways in closing the gap with AMD with the 5400 series Xeons and chipset. The 1600 MHz dual FSB closes the bandwidth gap, and low latency FB-DIMMs close the latency gap. Combined with high clock speeds (3.2 GHz+), large caches, and SSE4, a 5472 or 5482 easily clobbers the best AMD has to offer in all but the most latency sensitive, tightly coupled applications.

These latency sensitive applications are keeping AMD treading water, in that many of the high end clusters use esoteric low latency connections like IB hanging off an HT link for their MPI network - and Intel still doesn't have a real solution to this. PCIe 2.0 gives us the bandwidth, but the latency is still far worse.

If you take the 5400's performance advantage and add in QuickPath to allow a low latency MPI network, AMD has no hope to compete - even with a Barcelona at nice clocks. And HT3 is not enough to swing it back.

Look at the render farms - they are all jumping to Intel as theirs are not tightly coupled applications - each frame is rendered separately from the others - so they are CPU bound and Intel gives them a big boost.

In our render farm a single 5482 outperforms 2 AMD 2222SEs!

Hope this helps

SG
Great read....well written
I don't really think it should surprise us anymore if AMD loses a major contract. AMD has failed to execute properly now for several years. Their new core should have been out years ago instead of just arriving. It seems, strictly from an observational point of view, that instead of continuing to innovate and stay ahead, they rested on their laurels and somehow didn't expect Intel to ever catch up. I think everyone can agree that AMD has made some very large mistakes in strategic planning and that they are now paying for it heavily.
It's clear from re-reading Nebojsa's article of 27 Sept last, "AMD Barcelona: HT3 Turn Off?", that Sun is responding to a betrayal of trust on the part of AMD. Their HPC architecture may be such that a hobbled 3xHT2 Barcelona has little or no advantage over a DIB Xeon in the overall scheme. Sun has a contract commitment to fulfill and they might just as well institute a platform switch now rather than wait for CSI, which may also have teething problems.