The problem comes when someone tells you one of those way-out things, and they are backed up by others. Blackshirts travel in packs. When the Blueshirted Throatwarbler (Intellus midmanagus) backs it up, you know you are onto something. The slide in the keynote was only icing on the cake.
That slide showed Tanglewood, the Itanium chip about three cores hence, as having 8 cores, and we were told it had 7x the performance of the current Itanium 2. I got a good laugh out of this one, and speculated that the slide preparation team was hitting the crack pipe a little hard; four cores (see here) seemed much more reasonable. The 16 cores speculated by CNet were so far out it wasn't funny.
Well, those ex-Alpha boys are actually doing it: Tanglewood will have 8 cores. The slidemakers must have had inside information or something, because they were dead on. The weird part is what follows, and with the sheer number of people at IDF who told me the same story, I tend to believe it.
The weird part is that the cores on that chip are all cut down from the current one, losing a pipeline or two. The current architecture seems a bit FP heavy and light on integer units, so my guess is that each core will be down an FP pipe, or maybe an FP and an Int pipe. Either way, there will be fewer paths for instructions to follow.
Those cores will all share a cache, and a relatively small one at that. The thinking now is that they will share 16MB, or 2MB per core, with 32MB not being out of the realm of possibility. Not enough in my opinion, but what do I know?
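For anyone who wants to check the per-core numbers, here is a quick sketch of the arithmetic. It assumes the shared cache is divided evenly across the 8 cores, which is only a way of comparing sizes, not a claim about how the cache would actually be partitioned. The function name is made up for illustration.

```python
def cache_per_core_mb(shared_cache_mb: float, cores: int) -> float:
    """Even split of a shared cache across cores (an assumption for comparison only)."""
    return shared_cache_mb / cores


# The two rumored shared-cache sizes, spread over the rumored 8 cores.
for total_mb in (16, 32):
    per_core = cache_per_core_mb(total_mb, 8)
    print(f"{total_mb}MB shared across 8 cores is {per_core:g}MB per core")
```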
Not weird enough for you? How about slowing down the cores when you use them all? If the chip is supposed to run in the 4+GHz range with 2 cores active, and you turn them all on, the 8-core beastie will shuffle along at half that, or just over 2GHz. Estimates say that if the 2-core configuration scores 500 on a theoretical benchmark, the 8-core one will double that to 1000. Heat is said to be among the primary reasons for the throttling. Another songbird says main memory bandwidth and low cache are the culprits.
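The rumored figures hang together if you assume throughput scales linearly with both core count and clock speed. That is a big assumption, and the round 4.0GHz and 2.0GHz figures below stand in for the "4+" and "just over 2" numbers, but as a back-of-the-envelope sketch it looks like this (the function name is hypothetical):

```python
def ideal_score(base_score: float, base_cores: int, base_ghz: float,
                cores: int, ghz: float) -> float:
    """Scale a benchmark score linearly with core count and clock (idealized)."""
    return base_score * (cores / base_cores) * (ghz / base_ghz)


# Rumored baseline: 2 cores at roughly 4GHz score 500.
two_core = 500.0
# Rumored full configuration: 8 cores at roughly 2GHz.
eight_core = ideal_score(two_core, base_cores=2, base_ghz=4.0, cores=8, ghz=2.0)

print(f"8-core estimate: {eight_core:g}")
print(f"per-core score, 2 cores on: {two_core / 2:g}")
print(f"per-core score, 8 cores on: {eight_core / 8:g}")
```

Four times the cores at half the clock gives twice the total, which matches the 500-to-1000 estimate, while each individual core does half the work it did before.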
Want more out of the 'what are you talking about?' stuff? Try HP wanting to incorporate their advanced math libraries into the silicon. Instead of compilers doing the heavy lifting, the silicon will. Isn't that exactly the opposite of the main reason they developed this VLIW architecture variant in the first place? The sources were beginning to squabble here, one even getting catty and calling a Blueshirt a 'bitch', but one of the thoughts along this line was to use the multiple processors as a vector engine instead of underperforming discrete units. The shared cache would do well here, as would those HP libraries.
That, in a nutshell, is the current version of Tanglewood, or at least the version as it was last week. Since it is still three years or so out, much may change; there is no silicon yet. Tanglewood is going to be an interesting thing to watch when it gets nearer. If there is one thing we can be sure of, those Alpha guys and their spinoffs are not ones to take the timid, safe route.