The Inquirer-Home

POWER7 vs Nehalem-EX

Preview: The 8-core battle at the high end
Mon Sep 07 2009, 13:23

LAST WEEK we covered some details about the upcoming IBM POWER7 processor, which is expected to be the second shipping 8-core general purpose server CPU after Intel's Nehalem-EX.

And no, Sun's Niagara with its ultralight cores is not a general purpose CPU, so it doesn't count.

Just like Intel's ultra high-end server offering, POWER7, IBM's flagship CPU for 2010, is a huge-die, large-cache monster, immensely powerful on its own yet capable of being very well connected to many of its siblings to compose very large, well-scaled multiprocessor systems.

How do these two processors compare? Well, both are 45nm process behemoths with 8 cores per die, each with out-of-order execution and some degree of internal multithreading. The Nehalem-EX is expected to have 8 cores with 2 threads each, running at anywhere between 2.66GHz and 3GHz at launch in the next 4 months, while the POWER7 will have 8 cores with 4 threads each, running at up to 4GHz at launch sometime in mid-2010. So, POWER7 should be faster and more powerful from the raw hardware resources point of view, but at the cost of being half a year later to market.

Looking at each core, the Nehalem-EX core can process up to 4 instructions - some simple, some complex - per cycle, and 4 floating-point (FP) operations per cycle. Not bad at all for what is the most powerful X86 core in the business today. POWER7 can do up to 6 simple instructions per cycle, and up to 8 FP operations per cycle if running 4 fused multiply-adds. Again, the raw power of the POWER7 core is somewhat higher. But then, so was the POWER6's, yet it fared badly in benchmarks.
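To put rough numbers on those per-cycle figures, here is a back-of-the-envelope sketch in Python. It simply multiplies cores by FP operations per cycle by clock speed, using only the core counts, per-cycle rates and launch clocks quoted in this article. The function name peak_gflops is made up for illustration, and the results are theoretical peaks, not benchmark scores.

```python
def peak_gflops(cores, fp_ops_per_cycle, clock_ghz):
    """Theoretical peak GFLOPS per chip: cores x FP ops per cycle x clock in GHz."""
    return cores * fp_ops_per_cycle * clock_ghz


# Figures as quoted in the article (expected launch specs, not measurements).
nehalem_ex_low = peak_gflops(cores=8, fp_ops_per_cycle=4, clock_ghz=2.66)
nehalem_ex_high = peak_gflops(cores=8, fp_ops_per_cycle=4, clock_ghz=3.0)
power7 = peak_gflops(cores=8, fp_ops_per_cycle=8, clock_ghz=4.0)

print(f"Nehalem-EX: {nehalem_ex_low:.1f} to {nehalem_ex_high:.1f} GFLOPS peak")
print(f"POWER7:     {power7:.1f} GFLOPS peak")
print(f"POWER7 advantage over the fastest Nehalem-EX: {power7 / nehalem_ex_high:.2f}x")
```

On paper that gives POWER7 well over twice the peak FP throughput per chip, which is exactly the kind of raw advantage that, as the POWER6 showed, does not automatically turn into benchmark wins.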

The caches? Both are really cache-rich, so to speak. Nehalem-EX's 8 cores have a shared pool of 24MB L3 SRAM cache with a fast kilobit-wide ring bus between the different cache segments to speed up access. On the other hand, POWER7 has 32MB of L3 eDRAM cache for its 8 cores. In either case, each processor core has its private low-latency 256KB L2 cache too.

How about memory? Nehalem-EX has 4 buffered DDR3 channels per chip, where, using on-board buffers, every channel splits into two actual 64-bit DDR3-1333 DRAM paths. If the buffers had abilities like those of FBD AMB (Advanced Memory Buffer) chips, you might be able to do simultaneous read and write transactions on each channel, effectively doubling the bandwidth. Either way, you're looking at some 50GBps of memory bandwidth per CPU chip, not bad at all.

In the case of POWER7, though, there are two 4-channel DDR3 memory controllers, for a total of 8 channels of memory and a claimed 100GBps total memory bandwidth. Now, this definitely cannot fit into the rumoured common G34 socket with AMD's Magny-Cours or Bulldozer CPUs, as those only have 4 memory channels.

Neither would POWER7's proprietary multipath 360GBps (yes, gigabytes, not gigabits) connections to neighbouring CPUs, up to 32 of them, fit into the nearly 4 times slower 4-channel HyperTransport 3 setup on the AMD G34 socket. The Nehalem-EX 4-channel QPI interconnect, if running at 6.4GTps, would give you above 100GBps of bandwidth to the other 4 neighbouring CPUs - yes, roughly three times slower than the POWER7, but still far from slow in reality. Also, the Nehalem-EX's symmetrical north-south-east-west QPI arrangement can scale to hundreds of sockets without extra glue logic. Look at the SGI - sorry, Rackable - UltraViolet and such systems coming soon.
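The "above 100GBps" QPI figure can be reproduced with a little arithmetic, sketched below in Python. The sketch assumes each QPI link moves 2 bytes of payload per transfer in each direction and counts both directions together; that per-link width is an assumption not stated in this article, and the function name qpi_link_gbps is hypothetical.

```python
def qpi_link_gbps(transfer_rate_gtps, bytes_per_transfer=2, directions=2):
    """Aggregate bandwidth of one QPI link in GB/s.

    Assumes 2 payload bytes per transfer per direction and counts both
    directions; these widths are assumptions, not figures from the article.
    """
    return transfer_rate_gtps * bytes_per_transfer * directions


QPI_LINKS = 4             # 4-channel QPI, as quoted in the article
POWER7_FABRIC_GBPS = 360  # POWER7 interconnect figure quoted in the article

per_link = qpi_link_gbps(6.4)
qpi_total = QPI_LINKS * per_link

print(f"Per QPI link:   {per_link:.1f} GB/s")
print(f"All four links: {qpi_total:.1f} GB/s")
print(f"POWER7 fabric is {POWER7_FABRIC_GBPS / qpi_total:.1f}x the QPI total")
```

Under those assumptions the four links add up to just over 100GBps, and the quoted POWER7 figure works out to about three and a half times that.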

Now, last but not least, the instruction set architecture, probably the most important point. POWER7 continues on the old POWER ISA path, including the PowerPC-specific Altivec extensions that were in the POWER6. While the PowerMac is no more, IBM still has sizable markets in mainframes, minicomputers and of course servers and clusters for the new CPU.

On the other hand, Nehalem-EX is, simply, 64-bit X86. A straight win there, whether you like the X86 or not. Everything runs, all the vendors have to use it, and there'll be a myriad of support chipsets, peripherals, software, drivers, apps, and of course every operating system out there, minus AIX and VMS, I guess. You'll even have dual-processor extreme workstations, some overclockable, with the dual "Beckton" Nehalem-EX CPUs for 16-core Skulltrail follow-on monsters to appease gamers' wet dreams and engineers' complex visualisations. Just like their server counterparts, many of these will be easily upgradeable to the expected "Eagleton" 12-core 32nm chips with 36MB cache a year or so later. Unfortunately, I don't think that we'll ever see a POWER7 workstation.

Why not? Well, I think workstations are important to enable access to a given architecture to as many developers as possible, resulting in more optimised and tuned code, and of course more apps at the end. Whatever raw performance gains POWER7 has, there will always be more effort put into X86 chip code tuning and optimisation.

Finally, the price. It's too early to talk about POWER7 prices, but, if the current trends are anything to go by, expect a Nehalem-EX to be at least 3 times cheaper than the POWER7 per total system CPU unit. I won't be surprised to see an even larger price differential.

That's all for now. As more details emerge, look for more coverage here. µ

POWER7 pros:
- absolute raw performance - CPU, memory, I/O
- immense scalability within the 32 socket limit
- a committed large vendor behind it, despite a mostly single-platform environment (Power Linux didn't take off as expected).

Nehalem-EX pros:
- it is the fastest X86 chip at launch, and it is X86 so everything runs, workstation or server
- near-limitless scalability without custom wizardry, most of it easy to reach even with Windows
- much cheaper and comes out half a year earlier.

 

 


Comments
Let us see

Wow, if only I had the money to purchase either one of those for my web server. Let us see how these chips play out though, specs aren't everything. BTW my site is www.wiseserpent.com/tech, little bit of free advertising heh.

posted by : dekoy, 30 December 2009
POWER7 and mac??

I don't think the Intel chips offer glueless scalability. The IBM/NCSA Blue Waters project can perhaps scale up to 65536 processor chips or 524288 cores (if they have more money).

How would Mac OS X 10.6 perform on POWER7? Unless Apple has a POWER7 system in its secretive labs, we will never know. But you can run some benchmarks between a Mac and a POWER7 system. And not just some specmarks or linpack benchmarks. Run a real application such as a web server, Java app, database app, KDE desktop, etc.

posted by : John, 26 December 2009
Nehalem-EX is 1000W

Nehalem EX platforms are 1000 watts.
http://blogs.techrepublic.com.com/itdojo/?p=987

Ouch!

High utilization of all processors and associated support chips will be critical for making this an energy efficient platform. Today, even with virtualization, platform utilization remains low at 25% to 35%.

Rack heat/power density is going to be an issue.

posted by : RSP, 09 September 2009
I wonder

I wonder how the POWER7 chip would have performed on Snow Leopard if Apple had stayed with that architecture.

posted by : Regulas, 08 September 2009
Maximum physical memory?

How much physical memory can you (architecturally) get on each of the chip families? I.e. ignoring the number of slots on the motherboard, just based on the architecture and the number of pins (or equivalent) on the chipset?

For each of Power7, Nehalem, AMD's latest, and the latest Itanium?

Maximum physical memory is still important to some people. Not many, but to them if a system can't have the physical memory they need, it just isn't a player, so long as someone else's system *does* offer that amount of memory.

That's one reason Itanium is still around. Once that selling point goes away...

posted by : Mr N. O'Body, 08 September 2009
Arbitrary criterion for omission

That's a bit arbitrary not counting the Niagara. Admittedly the T1 shared one FP unit, but the T2 had separate FP units per core (and, indeed, two integer processors per core).

Whatever others might think, the Niagara is most certainly a general purpose 8 core processor albeit one with relatively poor single thread performance. It's not suitable where single thread performance matters, but it's most certainly a general purpose CPU.

posted by : Steve Jones, 08 September 2009
Nehalem not glueless scalability

Sorry that is wrong. Nehalem won't be gluelessly scalable to thousands of sockets as you claim. That's totally ridiculous. Firstly, the hop count of such a simply connected system will grow significantly, and links will become saturated quickly past tens of sockets. Secondly, in glueless form it supports a broadcast snoop cache coherency protocol which does not scale past tens of sockets either.

posted by : Nick, 08 September 2009
I remember before AMD...

I remember the days before AMD made a stir. Back when you'd pay $600 for a Pentium 200 with MMX instructions. That was in 1998 dollars. Then AMD scared Intel into submission with its glory days of the first generation Athlons. $250 for highest-end desktop processors. That was great.

But AMD and ATI can never lead, not for long. The two companies always squander their gains and it wasn't a surprise that their combined value was less than the sum of the parts. Now AMD can only barely compete at the low end and we're stuck with paying $200 for Intel motherboards because of Intel's notorious licensing costs.

I do wish AMD would become a viable contender, but I'm not going to be the one to bankroll it. Having your flagship processor barely comparable to low-end few-year-old processors is not encouraging. I wish you luck AMD and I want you to do well, but I won't buy from you until you get your act together.

posted by : BB, 08 September 2009
U Intel sucker...

My dear Intel sucker...oh I am sorry lover.

No wonder you are an Intel fan. Don't be stupid and brainless. How can you imagine this world without AMD today? You should give every due credit to AMD for keeping Intel at bay. Else Intel would charge 1000$ for a stupid P4...

posted by : Not an AMD Fan tho, 08 September 2009
At mr Intel lover

you are a true loser wake up and think just how much you would be paying if intel dominated the market, as for pc's I jumped off the upgrade wagon years ago, my next upgrade will be a console (ps3 Slim), these days I only own an old Sempron 2600+ socket A, but I know that when I do upgrade it won't be intel
c'mon guys how many of you really only use your computers for games, anyway upgrading just to play a game seems a waste of time, thats why I'm buying a console they have heaps more titles!

posted by : Hotdog3c, 08 September 2009
@intel lover

No, you win the prize for the most stupid, least thought-through comment of the month.

If you don't like AMD then JUST BUY INTEL, and thank AMD for the fact you can buy an Intel CPU for a 5th of the price it would cost if Intel did not have a competitor. Without AMD I bet you would be able to buy a brand new just released 1.7GHz Pentium 4 now for the sweet price of 1000$.

No competition no incentive to innovate and milk the crowd for all its money.

I hope your acne clears with age when you become 18.... In other words grow up and grow a brain to think with...

posted by : Rob, 07 September 2009
Intel is building block company

Intel doesn't compete with IBM per se. In fact IBM will be launching its own Nehalem-EX systems! Intel is a building block company and I guess AMD is as well. That means IBM can use POWER7, Nehalem-EX and AMD's Magny-Cours for different purposes and markets. The fact that these 3 CPUs fit into completely different price brackets makes it a no-brainer.

posted by : Tomas, 07 September 2009
who uses this type of chip?

I guess POWER7 is more for things like supercomputers that crunch numbers for scientific computing. Things like meteorological, geological, particle physics, molecular dynamics, physical simulations, and other various hard science applications. All these applications have very proprietary software running, not some standard commercial package that anyone can get. I question the usefulness of generic benchmarks on a processor like this.

posted by : jason, 07 September 2009
Nice, but dual CPU not needed for gamers

Perhaps a few insane wealthy gamers will look for a Nehalem EX, but that'll only be due to single core performance rather than multitasking ability.

As it is, no games use more than about three cores, and the majority manage with two. Unless they start implementing real time raytracing games, it's no use.

Of course, for visualisation and modeling it's a completely different situation. The fact that it'll be possible to get a 16 core machine off the shelf will be incredible.

posted by : Peter Kay, 07 September 2009
wellthen

No amd = $1000 for a cpu...

posted by : wellthen, 07 September 2009
Mr.

I read somewhere that the POWER7 is a 250W power-consuming monster, while the Nehalem-EX will consume much less than that.

posted by : Ravindra, 07 September 2009