We won't be pushed out of the ring by a Sumo wrestler - AMD's Jerry Sanders III
CHIP FIRM Intel gave details of its Nehalem, Larrabee and Dunnington in a set of briefings to hacks in a conference call today. But the firm remains coy about Larrabee, the great threat to Nvidia and to Dammit, for crying out loud.
Stephen Smith, VP of Digital Enterprise, and Ronak Singhal, principal engineer of the same group, presided over the calls.
Nehalem will ship this year, in the fourth quarter, said Smith, and be a scalable architecture supporting between two and eight cores, with simultaneous multithreading providing 4 to 16 thread capabilities. He said: "This is the most scalable design Intel has created."
The firm claimed that it will give four times the bandwidth of the highest performing Xeon, support for 8MB of level 3 cache, and Quickpath interconnects screaming at 25.6GB per second. It also includes an integrated memory controller and optional integrated graphics and will be used in an entire range of machines from notebooks to servers.
Smith claimed its design is better than its competitors, that is to say AMD, and before that the doomed DEC Alpha chip which Intel is still forced to fab.
Other capabilities include support for DDR-3, 1066MHz and 133MHz memory, SSE4.2 instructions, 32KB instruction cache, 32KB data cache and two instructions for translation lookaside hierarchies.
So what’s Quickpath? The Itanium Tukwila and Nehalem architectures will give scaled shared memory, Numa, with a memory controller in each CPU and connecting processors and other components with the high speed interconnect we’ve been rabbiting on about called CSI (common system interface). Goodbye to the front side bus. It’s a point to point system, as used by AMD chips and by the nearly dead Alpha chip.
Singhal said the processor, with its out of order window, can look at 128 instructions to decide which one to execute at any one time. It will give faster caches which will benefit graphics applications. SMT lets each core process two threads simultaneously.
Intel is not as close to producing Larrabee products as we anticipated, however. It said that the multicore architecture will include a high performance wide SIMD vector processing unit which will support a set of vector instructions including floating point arithmetic, vector memory ops and conditional instructions. It will demonstrate products based on Larrabee later in the year. More significantly, it will support industry APIs including OpenGL and DirectX with Larrabee products.
Smith said that as Larrabee does have the ability to produce a discrete graphics card, it will compete with products from Nvidia and AMD/ATI.
Smith reckons that we think of graphics as the common concept of a rasterisation-based approach to displaying graphics, with an underlying graphics pipeline. "It's not too good if you have to tackle larger problems," he said.
The firm said Larrabee will scale to multiple teraFLOPS, and has a freshly designed cache architecture. Intel is investing in software to take advantage of Larrabee's features. Larrabee, said Smith, is based on the concept of having many smaller IA cores, each using the X86 instruction set but with additional vector instructions. There is global cache per core, which can be shared across the multiple cores. Products will probably start shipping in 2009.
Intel is calling all of this “Visual Computing”, applicable to gaming, high def video and audio. Intel is suggesting that there will be new forms of gaming controllers that will understand humans – that’s us folks – to let people become characters in games.
Dunnington is to be made available for expandable MP servers, supporting six cores, based on 45 nanometre high-k tech, and a single virtualisation pool that supports live virtual machines for both 65nm and 45nm chips and servers.
As usual, Intel had some futureware to announce in the shape of “advanced vector extensions”, available in 2010 in the microarchitecture codenamed Sandy Bridge, and with specs made public this April at the firm’s Developer Forum. AVX will have energy efficient features, will be backward compatible, and support vectors from 128 bits to 256 bits wide. µ
* Cough!
"DDR-3, 1066MHz and 133MHz"

That's a bit of a jump ;-)
Nice design. I'm sure the good 'people' at M$oft are busily working to make sure their newest OS will drag these processors down to pentium 90mhz performance while at idle. Maybe they will include two onscreen clocks in the widgets, and these will each use 2000Mhz to make sure they never show the same time as the one on the taskbar.
25.6 GBps (B reads "byte") means 25.6 x 8 = 204.8 Gbit/s. That would really be a lot of data, in fact about 20 times what a typical backbone optical fiber is carrying. Shurely you meant 25.6 Gbit/s in the first place?

AT adds: Intel says GBps...
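
For anyone who wants to check the byte-versus-bit arithmetic in the comment above, here is a minimal Python sketch. The function names are made up for illustration, and it only converts units; it says nothing about what the link actually delivers.

```python
# Unit conversion only: 1 byte = 8 bits. Function names are hypothetical.
BITS_PER_BYTE = 8


def gbytes_to_gbits(gbytes_per_s: float) -> float:
    """Convert gigabytes per second to gigabits per second."""
    return gbytes_per_s * BITS_PER_BYTE


def gbits_to_gbytes(gbits_per_s: float) -> float:
    """Convert gigabits per second to gigabytes per second."""
    return gbits_per_s / BITS_PER_BYTE


if __name__ == "__main__":
    quoted = 25.6
    # Reading the figure as gigabytes, as the article does.
    print(f"{quoted} GB/s   = {gbytes_to_gbits(quoted):.1f} Gbit/s")
    # Reading the same figure as gigabits, as the commenter suggests.
    print(f"{quoted} Gbit/s = {gbits_to_gbytes(quoted):.1f} GB/s")
```

The two readings differ by a factor of eight, which is why the capital B matters so much in this thread.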
I didn't see any mention of flash on these cores. That would really be something and a big surprise, but the inq is posting fud!! Only mention of 16MB of cache, which is a lot but it's not flash!!! I was so excited to read this article but alas it's just crap!!! Imagine 8 cores, 16 threads, 16 megs of L1, L2, L3 cache and 1 gig of flash on a processor!!!!!
Nope, he's right about the 25,6 GB/s number.
Remember, we're talking about very fast connections between a CPU and its North bridge here.

Currently, I believe the 12,8 GB/s is the norm with two DDR2 PC6400 800MHz modules in dual-channel mode.

So it's a nice x2 jump.
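
The 12.8 GB/s figure quoted above can be reproduced with a rough peak-bandwidth calculation. This is a sketch under stated assumptions: each memory channel is taken to be 64 bits wide (the usual DIMM data width), GB means 10^9 bytes, and the function name is hypothetical.

```python
def ddr_peak_gbytes_per_s(mega_transfers_per_s: int,
                          bus_width_bits: int = 64,
                          channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s (decimal), ignoring all overheads.

    Assumes a 64-bit data bus per channel unless told otherwise.
    """
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_s * bytes_per_transfer * channels / 1000


if __name__ == "__main__":
    single = ddr_peak_gbytes_per_s(800)            # one PC6400 module
    dual = ddr_peak_gbytes_per_s(800, channels=2)  # dual-channel mode
    print(f"DDR2-800 single channel: {single:.1f} GB/s")
    print(f"DDR2-800 dual channel:   {dual:.1f} GB/s")
    print(f"25.6 GB/s is {25.6 / dual:.1f}x the dual-channel figure")
```

These are theoretical peaks only; sustained throughput on real hardware will be lower.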
Compared with today's Opteron & Xeon processors, these produce 4X higher server scores. Remember there are 3 channels of faster memory, plus HT, plus picture shows 2up. It creams AMD by a full 1/3 faster, if AMD 4 core even works by next year.
Thomas Drashek
The Intel powerpoint included in the article CLEARLY says

"25.6 Gb/sec max bandwidth"

NOT 25.6GB as stated in the article.
Himbeerkuchen: 25.6 GByte/s is correct, this is the max aggregate theoretical bandwidth of a full-width (2 quadrants) QuickPath link:

6.4 (GTransfer/s) * 20 (bits) * 0.8 (accounts for 8b/10b encoding) / 8 (bits/bytes) * 2 (directions) = 25.6 GB/s (per link)

Or 12.8 GB/s per direction because QuickPath links are dual-simplex. FYI full-width (32x32 bits) HyperTransport 3.0 links have an even higher bandwidth of 41.6 GB/s per link or 20.8 GB/s per direction.
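
The link arithmetic in this comment can be replayed in a few lines of Python. This is only a sketch that takes the commenter's figures at face value, including the 0.8 encoding factor. The 5.2 GT/s rate used for HyperTransport 3.0 is an assumption that does not appear in the comment (it corresponds to a 2.6GHz clock at double data rate), and the function name is made up.

```python
def link_bandwidth_gbytes_per_s(gigatransfers_per_s: float,
                                width_bits: int,
                                encoding_efficiency: float = 1.0,
                                directions: int = 2) -> float:
    """Aggregate theoretical link bandwidth in GB/s.

    transfers/s * bits per transfer * efficiency / 8 bits per byte,
    multiplied by the number of directions.
    """
    per_direction = gigatransfers_per_s * width_bits * encoding_efficiency / 8
    return per_direction * directions


if __name__ == "__main__":
    # Figures as given in the comment above.
    qpi_total = link_bandwidth_gbytes_per_s(6.4, 20, encoding_efficiency=0.8)
    qpi_one_way = link_bandwidth_gbytes_per_s(6.4, 20, 0.8, directions=1)
    # ASSUMPTION: 5.2 GT/s for a full-width HyperTransport 3.0 link.
    ht3_total = link_bandwidth_gbytes_per_s(5.2, 32)
    ht3_one_way = link_bandwidth_gbytes_per_s(5.2, 32, directions=1)

    print(f"QuickPath: {qpi_one_way:.1f} GB/s per direction, {qpi_total:.1f} GB/s per link")
    print(f"HT 3.0:    {ht3_one_way:.1f} GB/s per direction, {ht3_total:.1f} GB/s per link")
```

Under those assumptions the output lines up with the 25.6, 12.8, 41.6 and 20.8 GB/s figures quoted in the comment.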