Deliberate understatement or a real slowdown?
Nvidia vs AMD GPU war - June 2008 battle round
AFTER SOME 18 MONTHS of GPU doldrums - no, I don't consider an April 2008 GeForce 9800GTX a real improvement over the November 2006 GeForce 8800GTX, and the same goes for the HD3870 over the 2900 series - we're finally about to experience a quantum leap in 3-D card performance. As you all know by now, the middle of next month should see the rollouts of both the ATI HD4870 and the Nvidia GTX280, the new top-end cards for both vendors.
Compared to the ~10% real 3-D speedup for either Nvidia or ATI lines over the past year, these two should each bring upwards of 50 per cent speedup compared to their previous brethren, with the HD4870 bringing even more percentage-wise: simply, it compares against a slower predecessor.
Talking about the HD4870, a deep throat says it seems to almost double the performance of HD3870 in some cases: no wonder when you combine a ~10% faster clock, above 850 MHz with 50 per cent more shaders clocked independently this time, twice as much texture throughput and upwards of 80 per cent higher memory bandwidth using the GDDR5. Not bad at all - this is a much-welcomed medicine, knowing that ATI has suffered a black eye in the performance battles against Nvidia for quite a while now.
In the computational matters that matter, whether HPC supercomputing simulations or home multimedia Photoshop filters, users will be glad to know that, for some $350 with a single HD4870, they can call their machine a "teraflop PC". Yeah, it is 1008 peak GFLOPs on the GPU - but keep in mind, it's a single precision FP figure, and real IEEE double precision FP will more likely perform at around a third of this peak. Even then, it will work only if most of the data is in the graphics memory, rather than going 10x slower "abroad" to the system memory via high-latency PCI-E.
Now, Nvidia's entry seems even more advanced in some ways: 512 bit memory to give huge bandwidth even when using the current GDDR3, 240 complex shaders, CUDA programming with IEEE double precision FP. With Nvidia's huge marketing machine and such, I expected them to ramrod the ATI entry.
Yet that Editor's Day a few days ago seems to have put a bit of a damper on the GTX280. Instead of the more competitive ~700 MHz GPU clock I expected, where it would pretty much have no competition, the final ~600 MHz GPU clock, taken together with the other performance parameters, seems to put the part dangerously close to the HD4870. Still faster, of course, but possibly not by much.
I'll refrain here from DX10, OpenGL, 3DMark Vantage and so on, but look at the computation again. If the clock speeds are right, we'd end up at around 933 GFLOPs for the GTX280 in single precision FP math (no double precision numbers yet). Slower peak number than the HD4870? Yes!
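For the curious, the 933 GFLOPs figure can be reproduced with a back-of-envelope sum. This is only a sketch: it assumes the shader count and shader clock quoted in the reader-posted specs further down (240 shaders at 1,296 MHz), and it assumes each shader is counted at 3 single precision operations per clock (a multiply-add plus a multiply), which is the usual way such peak numbers are quoted. The function names are made up for illustration.

```python
def peak_gflops(shaders, shader_clock_mhz, flops_per_clock):
    """Theoretical single precision peak in GFLOPs.

    shaders          -- number of shader (stream) processors
    shader_clock_mhz -- shader clock in MHz (not the core clock)
    flops_per_clock  -- operations counted per shader per clock (assumed)
    """
    return shaders * shader_clock_mhz * flops_per_clock / 1000.0


def clock_for_target_mhz(target_gflops, shaders, flops_per_clock):
    """Shader clock in MHz needed to reach a given peak GFLOPs figure."""
    return target_gflops * 1000.0 / (shaders * flops_per_clock)


# Assumed GTX280 figures: 240 shaders, 1,296 MHz shader clock, 3 ops/clock.
gtx280_peak = peak_gflops(240, 1296, 3)

# Shader clock that the same chip would need to cross 1,000 GFLOPs.
teraflop_clock = clock_for_target_mhz(1000, 240, 3)

print(f"GTX280 peak: {gtx280_peak:.2f} GFLOPs")
print(f"Shader clock for 1 TFLOP: {teraflop_clock:.0f} MHz")
```

Under those assumptions the card lands at about 933 GFLOPs, and it would need a shader clock of roughly 1,390 MHz, some 7 per cent higher, to wear the "teraflop" badge.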
Does it matter in actual performance? Probably not, since the library and application optimisation will determine which GPU steams ahead. Does it matter ego-wise, when you're speccing and creating visualisation or GPU computation clusters, where "teraflop per card" means a lot to impress the ministers signing off the budget? Most likely, yes.
Now, all the explanations for the comparatively slow clock rate - large die of nearly a square-inch, 65nm process instead of 55nm, complex structure with a zillion shader processors and plenty of on-chip memory with wide outside memory controller - do make sense on their own.
However, I feel there could be another dimension to it - those slides could also be a deliberate understatement and the card might end up faster than the early numbers say. First, it would accomplish lulling the ATI gang into a bit of "I strike luck" complacency when the HD4870 launch comes, both performance and driver tuning wise. A couple of days later, Nvidia would announce the GTX280, and, well, maybe there'd suddenly be another higher clock speed there - one that would not only break the teraflop barrier, but also up all the rest of the competitive benchmark results on the first day?
Now, we don't think it will end up MUCH faster anyway. There could always be an "Ultra" edition with 256 shaders and a higher clock, before the 55nm shrink comes in. But then, the HD4870X2 is not far away, another two months or less, and it is expected to have roughly the same power and price as the GTX280. So, it may be good for Nvidia to do that bit of extra speed-tuning push for the GTX280 anyway, pre or post launch. This time, the competition won't back off so easily...

Comments
Arrrgh, Baby Great Whites.
As I await My Exploding Harpoon for Great White Next year w/ two Thousand Stream processors, These havfa Do. Arrggggh. Vivisecting great white Babies is all Fun. First I felt GTX280 would be clober band. Then Specs started changing with time of Day. Of Course they're powerful. PhysX is still big question & every photo group is of different game card, So reserve opinion till it gets here.
It seems ATI 4870X2 being X2 has ability to double its numbers on Vantage. Not too Shabby. Teraflopper is Long Way from 100-200 GB of bandwidth presently touted. So it's ludicrous increase, Not to be dismissed lightly, if it all works. These baby great whites will grow & soon ocean will be Full of Plunder. Arrrgh. Expose Em & weep, Fellows. Expose Em & Weep.
Drashek
Observation and speculation.
Perhaps the new GTX280 will still have a lower than anticipated stock clock but overclock well at the factory for those 3rd party companies who build them. This allows the 3rd party companies to charge even more for a card that can perhaps easily outclock its stock specs, especially with better than stock cooling.
If that's true, I think Nvidia would be doing a huge favor for all the different 3rd party suppliers out there.
Nvidia apparently treats their suppliers well, they certainly have enough of them. At least when compared to ATI.
Double precision
Double precision on CUDA would be very nice indeed. So would a Linux SDK for 38x0/48x0 from AMD.
Lousy Propaganda
After reading this article, it is quite obvious that the author is trying to cast some doubt in the reader's mind. Obviously, ATI's upcoming cards are going to walk all over the new Nvidia's, but according to this stupid guy, Nvidia is not showing its true performance. That is another load of cr@p coming from this joker. Don't you think it would be smarter for Nvidia to show the real performance of their card if it was indeed faster? They might come to market late, but hey, people will wait one month, even two months, for the better card. Mr. Author, nice try, but you ain't fooling no one with your lousy article.
-NoFool
Correction: I violated NDA Completely.
Once again numbers change to rational: GTX260 now is definitely 896 MB of memory. Isn't that great, Reporters: How Can Ultie_Tom pretend to invent everything when everything is being reported helter skelter?
The big chip called the GTX 280 works at 602MHz for the core, and it comes with 1GB of GDDR3 paired up with a 512-bit memory interface and clocked at 1,107MHz. This big chip has 240 stream processors and 32 ROPs with Shaders running at 1,296MHz. This big boy's rated TDP stands at 236 Watt, and it should be priced at US$600+ at launch.
The GTX 260 works at 576MHz for the core, which is surprisingly close to the GTX 280 clock. This card will have a 448-bit memory interface and it will come packed with 896MB of GDDR3 memory clocked at 896MHz. This chip will have 192 stream processors and 28 ROPs, with Shaders clocked at 999MHz. This card will have a TDP of 182 Watts with a launch price tag of US$449.
Probably these specs are repeat now on most fronts, yet even author states NOT sure of ANY facts. So Go Figure.
Let Me Ask This: Why 999 MHz on shaders, wouldn't nice GHz sound more completed? 896 MB & 896 MHz? Funny. Maybe too much attention of too little importance, yet still wish someone find out about built in HDMI audio & PhysX "compatible" statement made last week. This is trouble when all I have is what is written by bunch o' imposters.
Every one of these stories since Friday comes on with due to NDA writer wasn't there. Something's NOT There, Bubas', probably ALL Around: Charlie or Rocky or Nebojsa or Slobadan or Andreas (all different details) whomever, or maybe they're all same Rat. Just commenting on all research being done to finalize this product. One new number 9800GTX is 41.6 GT/s and GTX280 is 48 GT/s or !15% better, SO THERE.
drashek
BTW I use drashek to get credit with search engines, due to "Dignity to Poor" requirements on My Time, at this time. I may change names about, yet facts remain same, at least. Hahaha, barf.
You need to invest in AMD-ATI while it´s cheap.
I´m running on fumes at the moment, but I´ll do my best to make sense. I have a friend who is a computer programmer and an amateur mathematician. He used his skills to make a prime number finding program that knocks the socks off the CPU-based ones. But where most people are programming for CPUs, he made his number crunching program to use GPUs. But here´s the thing...
While his program performs better than 4-5 Intel quad-cores (Q6600s) would if they had no latency (obviously not possible in real life), the number crunching part of the graphics card (I´m tired, so I´ll simply state I´m referring to the parts of the GPU that aren´t GDDR-based) is only busy about 1-2% of the time. Why? Because the GDDR3 (not sure if that´s the latest iteration) is so far behind what the rest of the card can do that there is almost no point in upgrading anything other than the GDDR3 for a lot of applications.
You basically have a supercomputer in your computer, but it can´t be utilized because the memory portion can´t keep up.
He´s still working on his research, but if his work ever gets fully published, the prime number research niche will go through a major upheaval.
He´s also, by sheer coincidence, discovered some stuff that sheds new light on the speed of light speed limit and how to deal with it. According to him, it can´t be beat, but you can take ¨shortcuts¨ similar to the whole Star Trek warp drive concept. It´s not precisely the same, but subspace is a real concept. And superspace, too, though I don´t think Star Trek deals with superspace. I mean, why would you want to go to a place where light is slower? ;)
Not bad
The new ATI cards do look impressive on paper, but it's hard to think back to a time when an ATI card didn't flat out beat the equivalent NV card on paper, yet didn't do nearly as well in games.
Still, nice to see a GPU related article on here that for once isn't full of Charlie's bias and offers some objectivity.
double precision
The CUDA guys were talking "dual precision" just around the corner 18 months ago. I've been waiting that whole time, but if Nvidia doesn't do something brash like software disable it on the consumer D10U, it's going to be truly devastating. The future is exciting!
Can't trust ATI
After spending $300 on a 512MB ATI X1950 Pro card, then having it Not Actually Work under Linux...ATI/AMD can go and firk themselves. With a Very Pointy Cactus. Repeatedly.
Next PC is Intel CPU and something-non-ATI in the graphics card department. Probably Nvidia.
Good one!
"He´s also, by sheer coincidence, discovered some stuff that sheds new light on the speed of light speed limit and how to deal with it. According to him, it can´t be beat, but you can take ¨shortcuts¨ similar to the whole Star Trek warp drive concept. It´s not precisely the same, but subspace is a real concept. And superspace, too, though I don´t think Star Trek deals with superspace. I mean, why would you want to go to a place where light is slower? ;)"
Hahahaha