RIGHT NOW, as we all know, the CPU landscape looks pretty bleak for AMD if you compare the current products: Core 2, whether 65 nm or 45 nm, wipes the floor with the corresponding AMD offerings, whether in the mobile, desktop, or workstation and small server areas.
Oh yeah, I didn't mention the big MP boxen with four or more CPU sockets: that's where Intel either doesn't have an offering, or the existing Caneland platform isn't exactly tuned for maximum memory and interprocessor performance compared to Opteron's multiple HyperTransport channels and dedicated dual DDR2 per CPU.
In fact, the AMD HT2 platform, with three links per Opteron 8xxx series processor, has sufficient scalability to implement an eight-socket box, although it does get saturated at that point with multiple hops needed between "far" CPUs, accentuating the NUMA remote memory access penalty.
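To put a rough number on those extra hops, here is a minimal sketch that counts link hops between sockets with a breadth-first search. The topology is an assumption for illustration only: a 2 x 4 "ladder" in which no socket uses more than three links. It is not claimed to be the actual wiring of any shipping eight-socket board, and real designs also need some links for I/O. The names `ladder_links`, `hops_from` and `hop_stats` are hypothetical helpers.

```python
from collections import deque

def ladder_links(columns=4):
    """Build a hypothetical 2 x `columns` ladder of sockets.

    Sockets 0..columns-1 form the top row, the rest the bottom row.
    ASSUMPTION: illustrative topology, not a real board layout.
    """
    links = {s: set() for s in range(2 * columns)}

    def connect(a, b):
        links[a].add(b)
        links[b].add(a)

    for c in range(columns):
        connect(c, c + columns)                     # rung between the two rows
        if c + 1 < columns:
            connect(c, c + 1)                       # top-row neighbour
            connect(c + columns, c + columns + 1)   # bottom-row neighbour
    return links

def hops_from(links, start):
    """Breadth-first search: minimum link hops from `start` to every socket."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for peer in links[node]:
            if peer not in dist:
                dist[peer] = dist[node] + 1
                queue.append(peer)
    return dist

def hop_stats(links):
    """Return (worst-case hops, average hops) over all distinct socket pairs."""
    all_hops = [h for s in links
                for t, h in hops_from(links, s).items() if t != s]
    return max(all_hops), sum(all_hops) / len(all_hops)

links = ladder_links()
worst, average = hop_stats(links)
print(f"links per socket (max): {max(len(p) for p in links.values())}")
print(f"worst-case hops: {worst}, average hops: {average:.2f}")
```

In this assumed layout the two sockets at opposite corners sit four hops apart, which is the kind of "far" CPU access that makes the remote memory penalty bite.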
If, as originally expected, AMD had gone ahead and enabled all four HT links in the Barcelona at full HT3 speed - yes, requiring the new socket early - we could have had some really naughty eight-way boxen with incredible scalability by now, leaving Intel's Caneland platform in the dust on many things despite the anaemic 2+ GHz AMD CPUs.
For whatever reason, we are still stuck in the "compatibility mode" here. The good news is that, besides Sun Micro and a small gang of Taiwanese vendors, another Tier 1 vendor is now offering an 8-way, 32-core Barcelona box: the very top Tier 1, HP.
Take a look at the DL785 - it is a big 7U rack machine (the prefix 7 is for the height in U, for instance the 4-socket DL585 is 5U or 8.5+ inches high). The modular board system, not unlike the one pioneered in the old AlphaServer ES40 a decade ago, manages to squeeze all the CPUs and a LOT of memory in there: each processor has eight DDR2 DIMM sockets, each taking up to an 8 GB DIMM. Multiply 8 x 8 x 8 and you get a 512 GB maximum memory capacity, so large it can even hold the complete genomes of many animals and plants right in memory - the only machine in this class to pack that much RAM.
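For the record, the capacity sum is easy to check. The sketch below just multiplies out the figures quoted above; the constant and function names are made up for illustration.

```python
# Figures quoted in the article for the DL785.
SOCKETS = 8            # CPU sockets in the box
DIMMS_PER_SOCKET = 8   # DDR2 DIMM slots hanging off each processor
GB_PER_DIMM = 8        # largest DIMM supported, in GB

def max_memory_gb(sockets=SOCKETS, dimms=DIMMS_PER_SOCKET, gb=GB_PER_DIMM):
    """Total memory with every slot filled with the largest DIMM."""
    return sockets * dimms * gb

def local_memory_gb(dimms=DIMMS_PER_SOCKET, gb=GB_PER_DIMM):
    """Memory attached directly to one processor (its local NUMA node)."""
    return dimms * gb

print(f"total: {max_memory_gb()} GB, local per CPU: {local_memory_gb()} GB")
```

Note that only one eighth of that memory is local to any given processor; the rest sits behind one or more HT hops.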
Add to that up to 16 SAS drives, 11 PCI-e slots (three of them full x16 and another three x8, so you could do a CrossFireX 3-D setup here, too), redundant PSUs and fans, and remote diagnostics.
The price to pay for this extra capacity? Well, the box is much larger than Sun's older 8-way Opteron unit, the Andy Bechtolsheim-designed "Galaxy" masterpiece, which fits in just 4U of height.
Of course, if the HT link scarcity is not a problem, you could use this thing to build larger supercomputer clusters with fewer nodes, i.e. simpler interconnects, to reach a specific TFLOPS performance target.
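A back-of-the-envelope sketch of that node-count argument follows. It uses theoretical peak numbers only, and it assumes four double-precision FLOPs per core per clock for Barcelona; that figure, the 10 TFLOPS target, and the function names are assumptions for illustration, not anything HP or AMD quoted here. Sustained performance would be lower.

```python
import math

CLOCK_GHZ = 2.3        # Opteron 8356 clock, from the article
CORES_PER_SOCKET = 4   # quad-core Barcelona (8 sockets = 32 cores)
FLOPS_PER_CYCLE = 4    # ASSUMPTION: peak double-precision FLOPs per core per clock

def node_peak_gflops(sockets):
    """Theoretical peak of one node, in GFLOPS."""
    return sockets * CORES_PER_SOCKET * CLOCK_GHZ * FLOPS_PER_CYCLE

def nodes_needed(target_tflops, sockets):
    """Whole nodes required to reach a peak target given in TFLOPS."""
    return math.ceil(target_tflops * 1000 / node_peak_gflops(sockets))

TARGET_TFLOPS = 10.0   # hypothetical cluster target
for sockets in (2, 4, 8):
    print(f"{sockets}-socket nodes: {nodes_needed(TARGET_TFLOPS, sockets)}")
```

Fewer, fatter nodes mean fewer switch ports and cables for the same peak figure, which is the "simpler interconnects" point.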
Nice machine overall, but what limits it? Besides, of course, the 2.3 GHz Barcelona 8356 Opterons' clock speed, which only a "Shanghai" CPU upgrade would address at year end? And why only now, and not two years ago, when Opterons were at their performance leadership peak?
I had a quick chat with the HP people in charge here in the Far East about the above, and about why they only recently decided to get the big box out. The answer I got was that the box is a performance and feature leader in its class, and that many commercial and virtualisation consolidation clients want it now - no mention of HPC here, where those extra inter-CPU hops can affect performance negatively.
Faced with the "Beckton" 8-core MP Nehalems some five quarters from now, HP will still give a level playing field to both Intel and AMD in this high end space. Of course, it does assume smooth "Shanghai", "Montreal", "Istanbul" and other major city delivery by AMD in the meantime.
They also believe that, even with minor hitches - and those current AMD hitches weren't exactly small - most AMD-based clients will still prefer to stay in that same AMD comfort zone. Of course, those AMD customers who needed leadership performance for their use would more likely be in a serious "discomfort zone" by now.
My point is simple: this HP box shows that AMD still has some five quarters of performance and memory capacity leadership in large 4-to-8 socket boxes in certain areas (HPC FP in those cases where the shortage of HT links doesn't cause slowdowns, as well as some memory-bound commercial apps), but that is most likely to be over, full stop, once the Becktons come out.
I really want to believe that AMD will use this time to speed up delivery of a 45 nm "Shanghai" product and maximise its presence in this small but profitable segment. µ