
Yet more silicon revealed from the Microprocessor Forum

Part One: Desktop and Server CPUs
Thursday, 16 October 2003, 07:55
THIS YEAR the Forum, held as usual at the Fairmont Hotel in boring old San Jose, lasted for 4 days from 13 to 17 October, with the main event on just 14 and 15 October.

With fewer attendees than usual (the conference hall and lunches were not as crowded as before) and with the main players, Intel and AMD, not announcing anything this time round, why bother going there? Well, in the absence of big cats, there were other vendors strutting some very interesting stuff and, while there are mice among these, there are elephants too. Here's what happened on the first day, focusing on desktop and server CPUs (at least that's what I focused on).

Greg Papadopoulos, Sun's Exec VP and CTO, opened the conference with a keynote about the evolution of single-chip microprocessors into single-chip microsystems, as well as 'Throughput Computing', which focuses on aggregate performance when handling multiple threads rather than on single-thread performance (at which Sun is NOTORIOUSLY poor, of course; just look at any scientific or technical benchmarks, huh). The overall idea is that perennially trying to increase individual CPU core speed and complexity brings diminishing returns (obviously, SPARC is the one with plenty of such problems, to say the least): much higher power, price and heat, as well as more time wasted waiting for memory.

So, many small cores running 'New Age' parallelized workloads with many small threads will basically use this parallelism to hide the memory latency. Intel's Tanglewood Itanium, with some Alpha spice (hopefully that combo doesn't produce a tasteless swill), might be looking in the same direction. Interestingly, both windowed, complicated in-order architectures (SPARC and Itanium) seem to be giving up on further per-CPU performance improvements and focusing on many small cores instead. Anyway, these cores would use some kind of very fast interconnect and be surrounded by a sea of shared DRAM.
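To put rough numbers on the latency-hiding idea, here is a minimal sketch of the textbook multithreading model. It is not anything Sun presented; the function names and the cycle counts in the example are made up for illustration. It assumes each thread computes for a fixed number of cycles, then stalls on memory for a fixed number of cycles, and that the core switches to another ready thread for free.

```python
import math

def core_utilization(threads, compute_cycles, stall_cycles):
    """Fraction of cycles the core does useful work under the simple model.

    Assumption: every thread alternates compute_cycles of work with
    stall_cycles of waiting on memory, and thread switching costs nothing.
    """
    if threads < 1 or compute_cycles <= 0 or stall_cycles < 0:
        raise ValueError("need threads >= 1, compute_cycles > 0, stall_cycles >= 0")
    return min(1.0, threads * compute_cycles / (compute_cycles + stall_cycles))

def threads_to_hide_latency(compute_cycles, stall_cycles):
    """Smallest thread count that keeps the core busy all the time."""
    return math.ceil((compute_cycles + stall_cycles) / compute_cycles)

if __name__ == "__main__":
    # Hypothetical workload: 20 cycles of work per 180-cycle memory stall.
    for n in (1, 2, 4, 8, 10):
        print(n, "threads:", core_utilization(n, 20, 180))
    print("threads needed:", threads_to_hide_latency(20, 180))
```

Under these assumed figures a single thread keeps the core busy only a tenth of the time, which is the case for many simple cores (or many threads per core) rather than one ever-faster core.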

Kevin Krewell from Microprocessor Report covered the current state of CPUs, from the desktop to the server side, including things which many INQUIRER readers know already, like Prescott's 'light bulb' power consumption and static leakage currents, or the move toward large caches and dual-core or multi-core CPUs.

Here's a quick summary of each CPU announcement. More detailed stories on each of them over the next few days.

Efficeon
Transmeta has completely rearchitected its CPU, creating a 256-bit, 8-instruction-per-cycle, ultra-low power VLIW device that supposedly even beats the unbeatable Centrino ULV in power consumption figures!

Finally supporting the full Pentium 4 instruction set including SSE2, Efficeon has an integrated single-channel DDR400 controller, integrated AGP 4X and integrated HyperTransport for a choice of south bridges (the same ones as on the AMD Athlon64-M platform - Nvidia Nforce 3 Go comes to mind).

A further evolved 4-step code morphing software supports the Efficeon, with several levels of profiling and optimization. The L2 cache is now at 1 MB, while the L1 instruction and data caches are at 128 and 64 KB respectively. The baby fits on just 119 mm2 in 0.13 um (the TM8600) or 68 mm2 in 0.09 um (the TM8800).

The claimed 7 W Thermal Design Power limit on Efficeon will accommodate an 1100 MHz CPU, compared to 900 MHz on Pentium M, and Transmeta also claims better overall performance per cycle for Efficeon vs Pentium M (not to mention Pentium 4-M). That seems to be valid only for specific routines like Linpack DP MFLOPS, or RSA and AES integer jobs. In generic, Intel-optimised benchmarks like the ***Mark ones, the two are pretty much tied.

Finally, the claimed 9 times lower WinXP idle standby power vs Pentium M (the quoted 0.18 W vs 1.45 W works out closer to 8 times), as well as a choice of standard or ultrasmall packages, round out the Efficeon feature set. While TSMC makes it in the 0.13 um process, it looks like Fujitsu is the one to produce the next 0.09 um shrink of Efficeon, expected to reach 2GHz next year.

VIA
VIA is firmly on its 'low-cost, low-power' fanless desktop CPU path and continues with the new C5P Nehemiah CPU - a simple 1.4GHz CPU with a sped-up 200MHz version of the Pentium III bus, dual-CPU enablement, plus, watch out, hardware security acceleration including AES encryption in hardware.

Smaller than a US 1-cent coin, the C5P package enables dual-CPU Mini-ITX integrated PC boards for the first time - well, at least now you can handle two threads at the same time, properly, without HyperThreading?

Again, the claimed power figures show this thingie consumes roughly 30% less power than Pentium M at the same clock. C5P will be followed by C5I Esther, a 2GHz CPU with both Pentium M and VIA FSB options, SSE2, better instruction execution performance, and SHA security in hardware, to tape out early next year.

IBM POWER5
Expected to be the next performance leader for a while, POWER5 was publicly described in more detail for the first time. The frequency and memory bus bandwidth were not talked about (INQ readers already know that POWER5 should top out at roughly 2GHz next year, and that memory bandwidth per chip may come close to or beyond 20GB/s), but the stuff shown was reasonably impressive.

The new CPU adds SMT with software thread priority control, a larger cache (1.92MB on-chip L2), and a host of other improvements, for instance an on-chip memory controller and separate L3 cache and memory buses, which improve both bandwidth and latency. The net execution rate per cycle is also quite a bit higher.

The MCMs are now more integrated - a single POWER5 MCM has four chips (8 CPUs) plus four 36 MB L3 cache chips, and can more easily be placed back-to-back with another MCM to make a very fast and compact 16-way system at full bandwidth systemwide. In fact, it is so compact that you could fit 16 of those in a single rack, and connect them with something like, say, Quadrics, for a nice little 2 TFLOPS supercomputer with 2TB RAM in that same rack. Finally, one more MP link on chip now allows for a 64-way single SMP system in the same footprint as the current 32-way POWER4+.
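The rack figure can be sanity-checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the chip, MCM and rack counts and the roughly 2GHz clock come from the text above, while the 4 floating-point operations per cycle per CPU (two fused multiply-add units) is an assumption not stated by IBM at the Forum, and the constant names are invented here.

```python
CHIPS_PER_MCM = 4        # from the article
CPUS_PER_CHIP = 2        # 4 chips = 8 CPUs per MCM, per the article
MCMS_PER_SYSTEM = 2      # two MCMs back-to-back make a 16-way system
SYSTEMS_PER_RACK = 16    # from the article
CLOCK_GHZ = 2.0          # "roughly 2GHz", per the article
FLOPS_PER_CYCLE = 4      # ASSUMPTION: two FMA units per CPU
RACK_RAM_GB = 2 * 1024   # 2TB in the rack, per the article
L3_MB_PER_CHIP = 36      # one 36 MB L3 chip per CPU chip, per the article

def cpus_per_system():
    return CHIPS_PER_MCM * CPUS_PER_CHIP * MCMS_PER_SYSTEM

def rack_peak_tflops():
    cpus = cpus_per_system() * SYSTEMS_PER_RACK
    return cpus * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000.0

def ram_gb_per_system():
    return RACK_RAM_GB / SYSTEMS_PER_RACK

def l3_mb_per_mcm():
    return CHIPS_PER_MCM * L3_MB_PER_CHIP

if __name__ == "__main__":
    print("CPUs per system:", cpus_per_system())
    print("Peak TFLOPS per rack:", rack_peak_tflops())
    print("RAM per system (GB):", ram_gb_per_system())
    print("L3 per MCM (MB):", l3_mb_per_mcm())
```

With those inputs the rack comes to 256 CPUs and a theoretical peak just over 2 TFLOPS, with 128 GB of RAM per 16-way system. Peak is of course not the same as what Linpack would actually deliver.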

A biggie article about POWER5 is being readied - watch out in a couple of days!

Fujitsu SPARC64 VI vs Sun UltraSPARC IV
Just look at the Fujitsu SPARC64 VI, and if you're Sun, you'll wonder what your design team (and the TI process team, by the way) is doing! It is a fantastic CPU, with dual 2.4GHz cores, 6MB of ultra high bandwidth on-chip cache, fast buses, proper out-of-order execution with fast FP as well, and scalability that far exceeds that of the UltraSPARC IV, which is in itself just a patch-up of USIII: two USIII cores, without any speedup (still at 1.2GHz) and without memory speedup either, put together on a single chip.

The Fujitsu chip is a real Godzilla of a CPU (or call it Chipzilla, maybe, the real one?), with 690 million transistors done in a 0.09um copper process; a similar process geometry is claimed for the Sun UltraSPARC IV as well. In reality, the dual-core, multithreaded (each core is 2-way SMT too) SPARC64 VI should be at least twice as fast as the dual-core UltraSPARC IV.

Anyway, more on that and POWER5 on the way, gotta catch the plane! µ
