
Intel faces performance struggle for two hard years

The Roadmap to Recovery: Part I. 10GHz was dreaming the impossible dream
Friday, 15 October 2004, 12:49
THE FIRST STEP on the road to recovery is admitting you have a problem. Intel is not admitting anything, and until the end of 2006, it has a big problem. There are several factors, both technical and managerial, that will make Intel relatively uncompetitive over the next two years in all areas apart from mobile parts.

Intel has officially confirmed to the INQUIRER that it won't ramp clock speeds at the expense of features. But the real question is why AMD can do both, and Intel can't.

Scales fall off Intel's Eyes
Intel said that the Pentium 4, also known as the Netburst core, was all about one thing: clock speed. In a modern computer, you can either design for many instructions executed per clock (IPC) and low clock speeds, or few instructions per clock and high clock speeds. Most companies took a middle ground, balancing the two. If you cannot get high clocks or high IPC, your product probably will never see the light of day. If you can do both, you will be very rich, but no one has done this in the X86 world yet.

Years ago, when the architecture of the P4 was being developed, someone, somewhere, made the decision to prioritise clock speeds over everything else. The design goal was to deliver MHz numbers that no one could possibly match even if they wanted to. If anyone could hit this goal, it was Intel. It had the best semiconductor engineering and manufacturing capabilities of any company.

The engineers were, very likely, given impossible goals, and told to meet them. The Pentium 4 was meant to push every technical boundary there was, and push it hard. The design philosophy was meant to last from just over 1GHz to 10GHz, but not on a single core. There were at least three cores planned: the first, codenamed Willamette; the current core, Prescott; and a now cancelled core, Tejas. Undoubtedly there were others, such as Nehalem, but they were never destined to see the light of day, and will vanish under the waves of history.

Just over a year ago, it became apparent that Intel was not meeting its internal goals. There were problems, and it did not take much digging to find them. Intel is a company based on extensive and meticulous planning and it doesn't react well to sudden changes.

Moore's Law of Diminishing Returns
The first signs of trouble started with the 90 nanometre Pentium M CPU called Dothan. In the middle of last year the roadmaps went from showing Dothan as a 21 Watt part with a 533MHz FSB (front side bus) to a 31 Watt part. Just before its initial introduction date last autumn, it reverted to a 21 Watt part, but slower, and without the 533MHz front side bus. The 533MHz parts are now slated to consume 27 Watts. The chip was delayed by two quarters until May this year. Intel swept the bus change under the rug, and successfully defined the chip by its cache.

The other sign of trouble was more subtle, and concerned the release schedule of the Pentium 4 variants. For the Willamette and Northwood cores, there was a clock bump about every quarter. Moving from 1.8GHz to 3.2GHz took two years and happened in around seven steps, or an average of just under one release a quarter. If you count increases in the FSB, Intel made just over one a quarter.
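The cadence figures in this paragraph are simple averages and can be checked in a few lines. This is a minimal sketch in Python using only the numbers quoted above (1.8GHz to 3.2GHz, around seven steps, two years). The function name is ours, for illustration only.

```python
def release_cadence(start_mhz, end_mhz, steps, years):
    """Return (releases per quarter, average MHz gained per release)."""
    quarters = years * 4
    return steps / quarters, (end_mhz - start_mhz) / steps


# Figures quoted in the article for the Willamette/Northwood era.
per_quarter, mhz_per_step = release_cadence(1800, 3200, steps=7, years=2)

print(f"{per_quarter:.3f} releases per quarter")
print(f"{mhz_per_step:.0f} MHz per release on average")
```

Seven steps over eight quarters is the "just under one release a quarter" in the text, with each step worth about 200MHz.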

But things changed with the jump from 3.2GHz to 3.4GHz. Officially, that took over seven months, and there was a problem with availability. The 3.4GHz P4 was unavailable except in very small quantities until the summer of 2004, nearly a year since the last release. The 3.6GHz Pentium 4 was officially launched in late June 2004 and was not available until September of the same year. The upgrade interval went from three months to about three quarters, and 3.6GHz parts are still not exactly overflowing on the shelves.

The 3.8GHz parts are theoretically due next month and the 4.0GHz parts are now cancelled. That puts the clock growth on the Pentium 4 line at 800MHz in two years, or two speed bumps a year. If the 3.8GHz part does not come out then, that's a 20% speed increase in two years.
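The percentages here depend on where you start counting. The sketch below assumes a 3.0GHz baseline two years ago, which is our inference from the quoted 800MHz of growth ending at 3.8GHz and is not stated in the text. The 200MHz step size is likewise inferred from the 3.2, 3.4, 3.6 and 3.8GHz parts mentioned.

```python
def percent_growth(start_mhz, end_mhz):
    """Percentage increase in clock speed from start to end."""
    return (end_mhz - start_mhz) / start_mhz * 100.0


BASELINE_MHZ = 3000   # assumption: 3.8GHz minus the quoted 800MHz
STEP_MHZ = 200        # assumption: spacing of the parts named in the article
YEARS = 2

with_3800 = percent_growth(BASELINE_MHZ, 3800)      # if the 3.8GHz part ships
without_3800 = percent_growth(BASELINE_MHZ, 3600)   # if it does not
bumps_per_year = (3800 - BASELINE_MHZ) / STEP_MHZ / YEARS

print(f"with 3.8GHz:    {with_3800:.1f}% in two years")
print(f"without 3.8GHz: {without_3800:.1f}% in two years")
print(f"speed bumps per year: {bumps_per_year:.0f}")
```

On those assumptions, the 800MHz of growth is four 200MHz bumps, or two a year, and losing the 3.8GHz part leaves a 20% gain over the period.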

There are other factors that chip away at the viability of the whole Netburst concept. Semiconductor physics plays a nasty role. With each new process, the window of speeds shrinks. If you were able to get a given design to go from three to nine clock units on the older process, the newer ones only allow four to seven; the window is narrowing. Each process tightens the noose a little. While there is some leeway, the ability to release a new chip at a higher clock rate is decreasing. If you add in heat, and less time to fix problems, that makes things harder.

While Moore's Law specifies transistor count, it is commonly considered to be about clock speed too. Doubling that every year and a half means 20% is less than half a year's growth. Something is desperately wrong at Intel.
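The comparison with an 18-month doubling can be put in numbers with compound growth arithmetic. This is a rough sketch that treats the doubling period as exactly 18 months and applies it to clock speed, which is the popular reading described above and not a law of physics. The function names are ours.

```python
import math

DOUBLING_MONTHS = 18  # "every year and a half", taken literally


def growth_factor(months, doubling_months=DOUBLING_MONTHS):
    """Multiplicative growth after `months` on a steady doubling curve."""
    return 2 ** (months / doubling_months)


def months_to_grow(factor, doubling_months=DOUBLING_MONTHS):
    """Months needed on the same curve to grow by `factor`."""
    return doubling_months * math.log2(factor)


quarter_pct = (growth_factor(3) - 1) * 100
two_year_factor = growth_factor(24)
months_for_20pct = months_to_grow(1.20)

print(f"growth in one quarter: {quarter_pct:.1f}%")
print(f"growth in two years:   {two_year_factor:.2f}x")
print(f"time to gain 20%:      {months_for_20pct:.1f} months")
```

On that curve a quarter is worth about 12%, two years would be roughly a 2.5x increase, and a 20% gain corresponds to a little under five months, against the two years it took here.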

Marchitecturally Challenged
So what's wrong? We think we have most of the answers, but there are undoubtedly more hidden away.

The main problems are managerial, competitive and technical. Management is the most to blame, competition has exacerbated the problem, and a perfect storm of technical problems combined to finish off the architecture and, for the next two years, Intel's competitiveness.

Many years ago Intel made the decision to emphasise CPU clock rate over everything else. Rather than taking a balanced approach to performance and listening to the engineers, it let marchitecture triumph over engineering.

Management at Intel was known in the past for doing the right thing for the right technical reasons. Things were done on merit, and not to win slideshow beauty contests. For over twenty years, Intel did the right thing technically, and it worked out in the marketplace because it was the best. It is easy to sell when you have the best thing going.

But the Pentium 4 represented a sea-change. Technical merit took the back seat to other concerns. From the outside, it looks to me like the Pentium 4 was designed to hit a number that sold well, not to be the best. This was the critical failure that is going to devastate Intel. Mike Magee came up with the term marchitecture, meaning marketing driving architecture, for a reason.

This mistake set in motion a series of goals that proved unattainable even for the brilliant engineering teams at Intel. There was no backup architecture, and more management decisions put the incredibly good Pentium M out of the running. Now emergency steps are being taken to take advantage of the Pentium M, but they are too little, too late.

AMD stops shooting itself in the foot
The other Intel problem is AMD. It has recovered from the series of self-inflicted wounds that were Palomino and Thoroughbred A, and is once again pushing Intel. When the Athlon came out years ago, Intel was pushed to the wall and the Pentium III did not have what it takes. The Pentium 4 in Northwood guise did, and Intel grabbed the ball and ran so fast that AMD didn't realise what was happening. AMD had a long-standing habit of tripping over its own feet when it tried to run, and Intel just strolled on, laughing all the way to the bank.

But the K8 core gave AMD a processor that worried Intel. When AMD ramped raw MHz faster than Intel with a core that was not supposed to ramp fast, it was clear that something was very wrong.

Intel no longer had the luxury of time, and the engineers had to produce and do it right the first time. There was no time for a plan B. If there were problems, it would mean slipped launch dates, and no rabbit to pull out of the hat to make marketing look good.

The technical problems are the real killer. The Willamette and Northwood cores had several problems, most notably that they were probably the most aggressive circuit designs ever attempted. Elements on the bleeding edge that theoretically shouldn't have worked were made to work well. Northwood was an incredible success, and allowed Intel to claw back market share.

The Domino Theory
The cost to make such parts was immense. A big problem was the use of self-resetting domino circuits, which are very timing sensitive. There are pulses that have to arrive at a certain point at a certain time for the circuit to work. That in turn drives the next one, and the next. If one fails, they all go, and since it is not a function of clock speed but more of trace lengths, it does not cause the chip to have a low maximum frequency, it just makes it fail.

If you want to make it work, you have to change the trace lengths between the transistors, pretty much a manual job. Part of the problem is finding the parts to change. Most test equipment changes the characteristics of the circuit enough to make the reading nonsensical, so bug hunting is more black magic than science. Then you have to move the transistors a little bit, a nip here, and a tuck there.

Multiply this by a few million transistors and you have gainful employment for a lot of engineers. Move one too much in one direction, and you have problems with the surrounding transistors. It kept a lot of people very busy. By most accounts, the team size for the Netburst cores was three to four times that of a Pentium M core team.

This labour intensive, fragile and cutting edge process that succeeded so brilliantly in the past was not the way of the future, and it was a potentially huge impediment to progress.

Prescott was designed to use a more relaxed and robust circuit methodology and ceded some performance for a lot of forgiveness. Part of the change was a breathtakingly long pipeline built for speed. On the low end, it would take a larger penalty for a branch mispredict, and instruction throughput had potentially a 50% longer latency, but the design would scale to immensely high clock rates.

There were other problems with this architecture: it had a huge transistor count, consumed vastly more power, and needed twice the cache to keep up with its predecessor. But it was easier to design for, and it would ramp, boy would it ramp. All was forgiven because the light at the end of the tunnel, immense clocks to satisfy the marketing boys and girls, looked feasible.

90 Nano Engineers
Despite what Intel said at the time, there were problems with the 90 nanometre process. Many were solved, but even at the coming out party for it at the Fall 2003 IDF, it was clear that done did not mean done right. Some problems did not surface at the press conference. When the Dothan chips came out and Intel hit the quoted 21 Watt envelope, many took that as a sign that the 90nm process was back on track. Prescott's ravenous power consumption was blamed on the transistor count, the pipeline stages, star alignment, or some other crackpot theory of the day. These things all contributed, but a management decision was the biggest problem.

It did the marchitecture thing, and gave at least one other group a lot of say in the process. Instead of a 90nm process finely tuned to putting out the best CPU in existence, a compromise was forged and that compromised the microprocessors.

Excessive leakage and power consumption made the chips less attractive to potential buyers, especially for Xeons where density, not destiny, is a real problem in server rooms. Prescott used too much power.

A self-imposed power cap was put in place. Gone were the days of picking a clock and letting that determine the power that was used. Power was set in stone, and you had to get creative and do the engineering to fit the MHz into those limits. This can be done, but the problem is that AMD won't give Intel time to work things out. No Plan B this time.

Another narrowing of the frequency box is the design of the Pentium 4. The multipliers on the clock are fixed as are the FSB settings. The steps you have are the steps you have got. Changing them means a lot of work on the controlling PLLs, a long, hard and unpleasant process.
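With fixed multipliers and a fixed bus clock, the available CPU clocks are just the products of the two, and only those that fall inside the workable window can ship. The sketch below is illustrative only: the 200MHz base clock, the multiplier range and both frequency windows are our assumptions, not figures from Intel or from this article.

```python
def available_clocks(bus_mhz, multipliers, window_mhz):
    """CPU clocks reachable with fixed multipliers that fit inside a window."""
    low, high = window_mhz
    return [bus_mhz * m for m in multipliers if low <= bus_mhz * m <= high]


BUS_MHZ = 200                 # assumption: base clock of a quad-pumped 800MHz FSB
MULTIPLIERS = range(14, 21)   # hypothetical fixed multiplier settings, 14x to 20x

# Hypothetical windows: a wider one on an older process, a narrower one on a newer.
wide = available_clocks(BUS_MHZ, MULTIPLIERS, (2800, 4000))
narrow = available_clocks(BUS_MHZ, MULTIPLIERS, (3200, 3800))

print(len(wide), "usable steps:", wide)
print(len(narrow), "usable steps:", narrow)
```

The same set of multipliers yields fewer shippable parts as the window shrinks, and adding finer steps would mean reworking the PLLs.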

That means a finite and inadequate number of steps at which you can design a CPU, within an ever-decreasing window of workable clock speeds. Replacing the PLLs gives a little more play, but takes time to do. µ

Part II of this article, which covers future road maps, will appear later today
