The only problem [Nvidia has] is that at some point your eyes don't get any better - Bob Colwell, former chief architect, Intel
AT THE INQ we have recently discovered the joys of Twitter, especially when it leads us to interesting snippets we can really tear into. Case in point, a post twittered by AMD's Senior Veep and CMO, Nigel Dessau, concerning ACP vs TDP power measurements.
It is not the first time AMD has tried to convince the world its ACP measurement (or 'fake-a-watt' as we here at the INQ fondly call it) is the way to go, but after reading Nigel's blog, we decided the discussion needed some INQput.
Dessau is right when he starts off by saying tools should reflect real-world conditions, but we tend to disagree when he continues that ACP (Average CPU Power) is the real test of these conditions.
In many ways, ACP is an arbitrary definition conjured up by AMD which no other player in the industry has accepted. Instead, the big industry players like Intel, Sun, HP and IBM have settled on the SPECpower benchmark suite, run by a committee of industry players hailing from all those firms and even AMD. This, for the most part, promotes TDP (Thermal Design Power), a measurement Intel favours and one AMD considers inherently biased.
But before we get into things, it's important to point out there is a difference between Intel's and AMD's version of the benchmarketing tool.
AMD TDP shows the worst case power draw a particular chip can experience when it's operating at max voltage.
A chip can easily draw a lot of power, but usually only for very short periods of time (several microseconds, say). If enough power isn't supplied during those spikes, bits and bobs get lost along the way and calculation errors start cropping up, which is really bad news. So the power delivery has to be able to feed the CPU that much at any given instant, even though a CPU can't sustain maximum current for any extended period, not even, say, 1/1000th of a second. Over that 1/1000th of a second, the CPU could bounce between 75 and 150 watts, yet the average draw might be only 110W.
When a firm is designing heat sinks, it only really cares about those longer stretches of time, while anyone interested in the actual instantaneous power draw really cares about every microsecond.
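The peak-versus-average distinction above can be sketched in a few lines of Python. The 75-150W range and the millisecond window come from the figures above, but the trace itself is synthetic, invented purely for illustration:

```python
import random

random.seed(0)

# Hypothetical microsecond-level power samples over 1/1000th of a second
# (1,000 one-microsecond readings). The 75-150W range is from the article;
# the actual trace is made up for illustration.
samples_w = [random.uniform(75.0, 150.0) for _ in range(1000)]

peak_w = max(samples_w)                       # what the power delivery must survive
average_w = sum(samples_w) / len(samples_w)   # roughly what the heat sink sees

print(f"peak draw:    {peak_w:.1f} W")
print(f"average draw: {average_w:.1f} W")
```

The heat sink designer sizes for something near `average_w`, because thermal mass smooths out microsecond spikes; the power delivery circuitry has to be sized for `peak_w`, because a single starved spike is enough to corrupt a calculation.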
Intel publishes a spec for the maximum power of a CPU; it also has TDP for its heat sink and cooling guys to worry about, and it adds a thermal diode to shut the CPU down if it starts overheating.
AMD, which only recently began using thermal diodes, has had to be more conservative in designing heat sinks, because the chip could actually overheat. Thus the firm has had to keep its TDP figures more conservative than Intel's, which is why AMD would rather not talk about TDP at all and use a different metric instead.
AMD blends a mix of different workloads to get ACP, whereas what anybody really cares about is the average power draw on their own workload, plus the peak power draw and cooling needs. ACP is just an average, and one that depends on process technology, the temperature the CPU is operating at, ambient temperature and more, making it mighty difficult for anyone outside AMD to calculate.
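To make the distinction concrete, here's a toy sketch of how a blended-average figure like ACP differs from a worst-case figure like TDP. The workload names, weights and wattages are all invented, since AMD doesn't publish its actual mix, which is rather the point:

```python
# Hypothetical average power draw (watts) measured on a few workloads;
# names and numbers are invented for illustration.
workload_avg_w = {"idle": 30.0, "integer": 85.0, "floating_point": 95.0, "memory": 70.0}

# Weights for the blend (also invented; AMD's real mix isn't public).
weights = {"idle": 0.25, "integer": 0.25, "floating_point": 0.25, "memory": 0.25}

# An ACP-style number: a weighted average across the chosen workloads.
acp_like_w = sum(workload_avg_w[w] * weights[w] for w in workload_avg_w)

# A TDP-style number instead tracks worst-case sustained draw, not the blend.
tdp_like_w = 125.0  # invented worst-case figure

print(f"blended 'ACP-style' figure:   {acp_like_w:.1f} W")
print(f"worst-case 'TDP-style' figure: {tdp_like_w:.1f} W")
```

Change the weights and the ACP-style number moves, while the TDP-style number stays put, which is why a blended average is easy to flatter and hard for outsiders to reproduce.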
Meanwhile, SPECpower_ssj2008 is fairly unambiguous, although AMD dislikes it because it feels the benchmark favours Intel. Well, the truth is, it does favour Intel a bit, but it wasn't purposefully set up that way. The reason is that the benchmark really loves cache, and Intel has always had larger, faster caches than AMD. Shanghai, with its big 6MB L3 cache, has gone a long way towards catching up, but the fact is, it still lags Nehalem.
SPECjbb2005, as well as being one of the better SPEC benchmarks (it focuses on server workloads), is also relatively easy to run, so SPECpower_ssj2008 took a similar workload as its starting point. AMD may feel this is unfair, but there really is no other benchmark it would make sense to use as a starting point for power measurement: TPC-C uses too much disk; SAP 2-tier uses disks and is fairly complex (and requires SAP); SPECcpu would be nice except it only stresses the CPU rather than the whole system; and SPECweb is much too complex to set up and run.
Like it or not, there is already an industry standard benchmark for power, and it's not clear ACP adds any useful information above and beyond SPECpower_ssj2008. It would be more helpful if AMD supported that benchmark (improving it where it saw fit), rather than defining yet another confusing power acronym.
David Kanter from Real World Tech told the INQ, "It's absolutely true that AMD's TDP is a more conservative measure than Intel's TDP and the two cannot be compared. But ACP definitely cannot be compared to TDP". He went on to note, "AMD's ACP might be more realistic than TDP, but there is already an industry standard benchmark for average server power consumption from SPEC, which is more relevant to end-users".
To Dessau's credit, he does say that "testing a workload at various utilisations and at idle while measuring at the wall is really the way to go to determine how much power a system will truly draw". But he also quotes Wikipedia (that bastion of all ACCURATE knowledge): "TDP values between different manufacturers cannot be accurately compared". Oh? And AMD's ACP, used only by AMD, can be accurately compared, because it only has itself to compare against! Makes sense to us.
Also, if AMD really wants to argue 'fair', it should look back to the good old P4 days when it started using goofy performance ratings, because it felt that Intel's higher frequency was a marketing problem.
At the end of the day, the ideal is to have a 'power number' which can be easily reproduced and is relevant to the end-user.
Unfortunately for AMD, the most relevant is how many watts a system draws from the wall, and not some blend of calculations no one else officially recognises. µ
Nigel Dessau's Blog