As we mentioned in both the earlier articles, AMD's big potential screwup is 65nm, Intel's is core delivery. For this article, we will assume they execute perfectly, meaning they do what they say. Depending on your mastery of the ancient art of fanboism, you may think one is more likely to face plant than the other, but factoring that in is an exercise for the reader.
2006 dawns with AMD in a commanding lead in the server space in terms of power use and computational ability. Its chips are simply in a different class than Intel's woeful offerings. Intel uses double or more power, and routinely gets stomped in benchmarks. It is so bad that AMD's server Dual Core Duel, handily won by AMD, was greeted with a yawn because it was obvious that in a fair fight, Intel didn't stand a chance.
Intel is ahead on a lot of platform features, Integrated Lights Out (ILO) is pretty much standard on any real Intel server, but fairly rare on AMD. Same with things like service processors, and a lot of RAS features. The people that need these things, and it is not an insubstantial market, will almost have to go Intel, but Sun is adding features and closing the gap fast. Don't expect this lead to last for long.
So, if wattage and number crunching are your thing, AMD is your chip. If features are your desired goal, well, AMD is still so far ahead that it is probably worth it to risk not having them unless you simply can't live without a certain thing.
The end of 2005 sees AMD with the rather creaky E-Step (ES) Opterons and Intel with 'Round Rock' (Paxville) CPUs as the flag bearers. Paxville has one purpose in life, to keep Dell in the fold and allow the clueless to buy a dual core from Intel if their checklists demand it. Anyone with a brain or the ability to run a benchmark will pick AMD here. It isn't a surprise that Intel didn't supply review units, they knew the results.
The main weakness of Intel is power use, Poxville is around 150W while AMD has a booming business in 89W parts, and 68W are becoming more popular every day. For rack dense configs, you can pack twice the CPU density if you go AMD, and AMD does more per core as well. No contest.
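To put rough numbers on the density claim, here is a minimal sketch that counts how many CPUs fit under a fixed power budget. The 150W, 89W and 68W figures are the ones quoted above. The 6000W budget is a made-up number for illustration only, and the sketch counts CPU power alone, ignoring memory, disks, fans and everything else in the box.

```python
# Rough density comparison under a fixed CPU power budget.
# The 150 W, 89 W and 68 W figures come from the article; the 6000 W
# budget is a hypothetical number chosen purely for illustration.

def cpus_within_budget(budget_watts: int, watts_per_cpu: int) -> int:
    """Whole CPUs that fit in the budget, counting CPU power only."""
    return budget_watts // watts_per_cpu

BUDGET_WATTS = 6000  # hypothetical per-rack CPU power budget

PARTS = {"Paxville": 150, "Opteron 89W": 89, "Opteron 68W": 68}

counts = {name: cpus_within_budget(BUDGET_WATTS, w) for name, w in PARTS.items()}
baseline = counts["Paxville"]

for name, count in counts.items():
    print(f"{name}: {count} CPUs, {count / baseline:.1f}x the Paxville count")
```

Under those assumptions the 68W parts come out at a bit over twice the Paxville count and the 89W parts somewhat under, so "twice" is a fair round number rather than a precise one.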
The first salvo in the 2006 server wars will be Dempsey and Blackford. This is the one that will bring Intel back into the game, briefly. Blackford takes the dual independent busses of the older Twin Castle 4S chipsets to the 2S level. Basically, this doubles the available bandwidth to each socket. Dempsey takes this a step further by adding bus arbitration logic to the chips themselves, so they present one load to the bus rather than two.
Blackford will take the woefully bus bound Xeons and pretty much remove that as a problem. Could they use more bandwidth? Sure, but 2x 1066 is so much better than 1x 800 that it is laughable. Add in the cache snoop eliminating filter that will probably not see the light of day till Blackford II later in the year, and you have a winner.
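For a sense of scale, the sketch below compares peak theoretical bus bandwidth for the two setups. It assumes a 64-bit (8 byte) data path per bus and treats 800 and 1066 as millions of transfers per second. Those are my assumptions for the arithmetic, not figures from Intel, and real-world throughput will be lower than these peaks.

```python
# Peak theoretical front side bus bandwidth, old platform vs Blackford.
# Assumption: each bus moves 8 bytes (64 bits) per transfer.

BUS_WIDTH_BYTES = 8  # assumed data path width per bus


def peak_mb_per_s(buses: int, mega_transfers: int) -> int:
    """Aggregate peak bandwidth in MB/s across all buses."""
    return buses * mega_transfers * BUS_WIDTH_BYTES


old = peak_mb_per_s(buses=1, mega_transfers=800)   # one shared 800 bus
new = peak_mb_per_s(buses=2, mega_transfers=1066)  # two independent 1066 buses

print(f"1x 800:  {old} MB/s")
print(f"2x 1066: {new} MB/s")
print(f"ratio:   {new / old:.1f}x")
```

Note that the ratio does not depend on the assumed bus width, only the absolute numbers do.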
Remember those RAS features I was talking about? Blackford brings an absolutely huge one into the mix, memory retry. If you get a transient memory error, Blackford will not just flag it and crash like older chipsets, it can retry until it works. Damn nice feature to have, and from what I am told, it works very well. Add FB-DIMMs in, and you have capacity to burn, speed and lower pin counts.
Blackford will be a winner for sure, not the self-inflicted wounds that characterised everything that touched a Xeon in 2005. It will quite simply get Intel right back into the game, not a win, but really close.
One of the most important things this will do is protect the bottom line of Dell. In case you haven't noticed, Dell has not been doing all that well of late in the server market, and AMD is gaining marketshare in chunks that must give Intel execs sleepless nights. Most of that pound of flesh is coming from its best customer. To combat this, Dell has a program that will subsidise Intel based servers when bidding against HP Opterons, and probably other makers as well. It works like this: Dell will subsidise the servers to the price/performance level of an AMD rack, not just price. This will eliminate pretty much all profits and then some from the sale, but it keeps the customer. I wonder where that loss is being made up, and by whom?
Now, if Dell had to do this for more than a few months, Intel Inside would be history, and I mean more than just the logo. Shipping boxes out the door with $100 bills taped to the top is not a good way to please investors. This is where any further Intel slip will cost both sides dearly, neither can afford the current state of affairs, metaphorically or financially.
So, the Dempsey and Blackford combo will ease this when it hits the street in late March or so, and it can't come soon enough. Intel went to the extraordinary effort of sending out seed units weeks after the Paxville launch to deflect criticism and show there is a light at the end of the tunnel. The Dempsey systems showed quite clearly that even six months early, they are in at least as good shape as I was hearing.
This euphoria will last a few short weeks, because AMD will come out with the F-Step (FS) Opterons in early Q2. We already mentioned that they will be lower power and have DDR2 in the desktop articles, both of these will play a big role on the server front.
Lower power comes from a tweaked core, not a smaller process. The FS cores won't set any records for lower power, but it will be a noticeable decrease. How much? Current cores are not near the family TDP numbers for power consumed, and there is enough yield headroom to put out a line of 68W parts at real volumes. Picture the peak of the current yield curve a bit above the 68W limit. FS will probably move that down to the point where the 68W parts could become the largest bin, so think more towards 5W of savings rather than 20.
DDR2 is also interesting, it lowers power use dramatically, which is where the FS probably makes a good chunk of the savings. It also increases bandwidth, a good thing, while increasing latency, a bad thing. To top it off, it allows AMD to add precious DIMM slots to the server mobos, and coupled with increased density of the memory itself, you can potentially have much larger memory configurations. As icing on the cake, the voltage differential between the memory controller and the rest of the chip will be lessened, easing one of AMD's worst frequency scaling headaches.
The other FS features are the things that server people care about. 1400MHz HT will be a big one that is useful across the board, from 2 to 8 sockets. It will ease the scaling problems AMD faces between 4 and 8 sockets, and help a lot on non-local memory accesses. There is nothing really bad to say here.
The next big one is a revamped crossbar with 4 CPU ports, can you say quad core? I knew you could. They have been shown behind closed doors for a while, and you should look for a public debut at CeBit with 90nm parts, but I don't think they will productize it until 65nm after mid-year. In fact, they probably will sit on it until needed in the real world rather than pulling a Paxville. Unfair comparison really, AMD QC parts will be the real thing, Paxville wasn't. If Woodcrest lives up to the hype, look for earlier QC parts from AMD. It is a finance choice, not a technical one: do they want to burn the wafer area when they are capacity constrained?
One way they will burn more wafer area is 4MB L3 caches, in this case shared between cores on the die. Yup, AMD is going to take the plunge and after three or so years, up the caches. About $(*&$% time, people. That will once again ease the bus burden and boost performance quite a bit, not to mention helping a bit with scalability.
There will also be new RAS extensions to the architecture, but they may lag the FS cores by a bit. These will not be a simple clone of the things Intel has done, they will be a simple clone of the things IBM has done with Power, and a bit more. AMD has shown that they can extend x86 in a meaningful way, and they are about to do it again, but this time it will be in a much more focused manner.
Why would you need more RAS and greater scalability, surely X86 doesn't play in the markets where that matters? Well, they didn't until now, AMD is shooting for 32 sockets gluelessly this time around, and to play there, you need some pretty reliable parts. A year and a half ago, I was told RAS would come when it was needed, and in the middle of 2006, it will be needed. Coincidentally, that is when it will come.
By the end of 2006, the X86 invasion of the high end will be pretty well complete. The chipping away at segments where people don't take them seriously will be about complete, and AMD will have a solution for everything from $100 laptops to 2048 socket servers. Yes, there is at least one 2000+ socket single system image box imminent using Opterons, won't that be good news for the HPC set.
With FS, AMD will regain the lead on pretty much all fronts. The 2.6GHz dual cores will be at 2.8 or so by then, and they will simply outclass the aging P4 based Xeons. The honeymoon will only last for a month or two, and then comes Woodcrest.
Woodcrest is the Merom based Xeon, and it will have all of the inherent goodness of that core, saddled to the well past its prime platform of the current chips. A revamped Blackford will be just about enough to feed this 2.66GHz beast with its 2x 1333MHz FSB. If they don't make it and stick with 2x 1066, the chip will be far less competitive.
If it wasn't saddled to this platform, it would annihilate AMD, but in this case, it will be simply very competitive. In a weird twist, because of the radical architectural change, it will be an Int monster, and a slightly lesser FP monster. The P4 based Xeons were the other way around, losing to AMD on Int, winning on FP, at least until AMD walked away from Intel on all fronts. That said, performance should meet or beat AMD on most benchmarks at the two socket level, but 4 is another matter entirely.
At two socket four core (2S 4C), the Q3/06 lineup will be a 3.0GHz FS vs 2.66 Woodcrest. If you take a Sun Fire X4200 with dual 2.4GHz Opterons as a base, you start with 71.9 SpecINT_Rate and 64.7 for SpecFP_Rate. Linear scaling of the scores by 3.0/2.4 gets you about 90 for Int and 81 for FP. Add in 10% for DDR2, HT speeds and cache, and you are at 99 for Int, 89 for FP. It probably won't scale linearly, so expect a bit less than that.
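The back-of-the-envelope projection above can be reproduced with a few lines. This is only a sketch of the same arithmetic: the base scores, clocks and the 10% uplift are the article's figures, while the function and variable names are mine, and linear clock scaling is the same optimistic assumption already flagged in the text.

```python
# Reproduce the article's rough Opteron projection:
# scale the 2.4 GHz scores linearly to 3.0 GHz, then add 10%
# for DDR2, HT speed and cache.

BASE_GHZ = 2.4
TARGET_GHZ = 3.0
PLATFORM_UPLIFT = 0.10  # the article's 10% allowance

# Sun Fire X4200, dual 2.4GHz Opterons, as quoted in the article
BASE_SCORES = {"Int_Rate": 71.9, "FP_Rate": 64.7}


def project(score: float) -> tuple[float, float]:
    """Return (clock-scaled score, clock-scaled score plus platform uplift)."""
    clock_scaled = score * TARGET_GHZ / BASE_GHZ
    return clock_scaled, clock_scaled * (1 + PLATFORM_UPLIFT)


projections = {name: project(score) for name, score in BASE_SCORES.items()}

for name, (clock_scaled, with_uplift) in projections.items():
    print(f"{name}: {clock_scaled:.1f} clock-scaled, {with_uplift:.1f} with uplift")
```

Treat the results as a ceiling, since real scores rarely scale linearly with clock.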
Woodcrest, if it comes out at 2.66/1333, will beat that handily on Int, and most likely at least tie it on FP. I would expect low hundreds in Int_Rate and high 80s on FP_Rate. If Intel comes out with the cores on time, they will at least be caught up, and when AMD goes to 3.2 as expected in Q4 and Intel hits 3.0, the gap will widen in Intel's favor.
The trick is that not all workloads mirror Spec, and Intel compilers are far and away better at producing Spec numbers than anything AMD can bring to the table. The Q3 parts will be more or less a tie, Q4 will be Intel by a bit on the 2S 4C front, and will probably remain in Intel's favor until K8L hits in 2007. It won't be a huge lead though, and there are enough variables at play to make choosing a server more of a case by case thing.
At the 4S level, well, there is still no hope for Intel. The platform they are playing with scales in a woefully inadequate fashion, as do the memory controllers. 4+ sockets are quite simply a write-off for Intel, and will remain so for the entirety of 2006, and most of 2007. The flag bearer for this wing of Intel will be Tulsa, a dual core P4 derivative with 16MB of cache. It will be hot, 160W+, bound to the same old bus, and outdated.
Q3 also brings 65nm parts to AMD, and this is where the cache will probably be boosted to the full 4MB, but that has been taken into account above. The shrink will once again be 'dumb', so the benefits to AMD will be slightly lessened power consumption and the ability to ramp clock. It shouldn't affect the balance of power in a meaningful way.
Q4 takes Opterons to 3.2, Woodcrest to 3.0, and as I said, Intel to the lead once again. The only noteworthy happening will be the introduction of Cloverton. (Please note, I spelled it Clovertown originally, but the correct spelling is Cloverton. If you didn't realize it, the -on ending is for 4S chips, -own is 2S in Intel-speak.) I think this will be a mistake of near epic proportions, or is it not nearly EPIC proportions? Either way, its only mission in life is to prevent AMD from winning the PR race to quad cores, something that they can do on a whim anyway.
To accomplish this, they will sacrifice performance for cheap headlines, something that cost them way too much credibility in 2005. It will work well in a few workloads, but for most it will likely be pretty shatteringly bad. It looks like management needs a reboot on this issue, but if it hasn't come after this many years, there is no hope.
The close of 2006 will see Intel in the lead by a little on most workloads at the 2S 4C level. At the 2S 8C level, if there ends up being a race, AMD will crush Intel, something that will continue at the 4S 8C and get downright abusive at 4S 16C. If I had to pigeonhole the year for a soundbite, I would say AMD almost all the way. Should they have anything in reserve that is not on the roadmaps I have seen, Intel will be in trouble.
Intel will probably try to claim the lead, but reality will say otherwise, and the bigger the box, the more they will lag. The first half of the year will be a cakewalk for AMD, especially if you add power use into the equation. The second half will be a fight though, and by that time, AMD will be seen as a serious player by even the most jaded observers. That more than any number is what AMD needed on the server front, and they have finally earned it. µ