EARLIER, WHEN DISCUSSING the possible integration of CPUs and GPUs on the system level, at least for high end systems, before the chip-level "sharing the die 'til we die" CPU-GPU on-die marriage happens, we looked into the most efficient ways of executing that integration.
Obviously, PCI Express, even in its v2 and upcoming v3 versions, has too much latency and protocol baggage to link CPUs and GPUs efficiently for working together on common tasks. Seamless coherent memory access alone is usually a big problem.
Now, AMD as a company does have one huge - albeit very temporary - advantage here right now: all the pieces of the puzzle have been in place for quite a while. Namely, a reasonably fast (but could be faster, please) CPU core in those 6-core Istanbul Opterons and their coming desktop equivalents, complemented by HyperTransport 3, a low latency, high speed interconnect protocol between CPUs and I/O that runs at up to 25.6GBps per link in version 3.1, as well as performance-leading GPUs in the ATI Radeon R800 family. Both the CPU core and HyperTransport are stable, proven old-timers, with stuff like FPGA accelerators and ultrafast supercomputing cluster network links already being hosted on the Opteron HyperTransport combo for years. There's even the HTX slot spec for I/O cards hitting directly into system memory.
As mentioned before, Nvidia lacks both the CPU and interconnect portions here. If it had an Alpha license that would have solved both issues, and as a bonus, propelled it to the top of the CPU performance pack fairly quickly, assuming it had a reliable CPU fab partner. A hint - IBM Microelectronics and Global Foundries did share technology that was once used by IBM Micro to fab the last batch of Alphas some 8 years ago.

Now, if AMD, for instance, released a HyperTransport flavour of the R800 family chip that adds an MMU or at least a very fast DMA (Direct Memory Access) mechanism to access the main memory and CPU resources directly via HTX, and maybe had another bunch of HyperTransport links to add other GPUs the same way, the GPU could - on top of the existing fast local GDDR5 memory - have far faster, nearly seamless access to the rest of the system resources and not just system memory. Even wrapping it as a kind of co-processor to the CPU could work, in the same fashion as the old 80287 was to the 80286 some 25 years ago.
With the HTX scalability and a sufficient number of links, each CPU could drive a bunch of GPUs, limited only by the total system bandwidth required. Remember, with each new GPU here, you add to the total interconnect bandwidth and total memory bandwidth, as you can see on the graph. Due to less contention between them, not only would CrossFire gaming framerates scale better, but so would OpenGL engineering apps as well as, of course, the computational codes.
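To put rough numbers on that scaling argument, here is a minimal back-of-the-envelope sketch in Python. It assumes one dedicated HyperTransport 3.1 link per GPU at the 25.6GBps peak quoted above, and it uses an illustrative 153.6GBps figure for each GPU's local GDDR5 bandwidth. That figure, the function name and the one-link-per-GPU layout are assumptions made for illustration, not details of any announced AMD part, and the results are theoretical peaks rather than sustained throughput.

```python
from dataclasses import dataclass

# Peak per-link figure for HyperTransport 3.1, as quoted in the article.
HT31_LINK_GBPS = 25.6
# Assumed, illustrative local GDDR5 bandwidth per GPU (not from the article).
GDDR5_LOCAL_GBPS = 153.6


@dataclass
class Totals:
    gpus: int
    interconnect_gbps: float   # sum of all GPU-to-system links
    local_memory_gbps: float   # sum of all GPUs' local memory bandwidth


def aggregate_bandwidth(gpus, links_per_gpu=1,
                        link_gbps=HT31_LINK_GBPS,
                        local_gbps=GDDR5_LOCAL_GBPS):
    """Hypothetical helper: theoretical peak totals when every GPU
    brings its own link(s) and its own local memory."""
    if gpus < 0 or links_per_gpu < 1:
        raise ValueError("need gpus >= 0 and links_per_gpu >= 1")
    return Totals(
        gpus=gpus,
        interconnect_gbps=gpus * links_per_gpu * link_gbps,
        local_memory_gbps=gpus * local_gbps,
    )


if __name__ == "__main__":
    for n in (1, 2, 4):
        t = aggregate_bandwidth(n)
        print(f"{t.gpus} GPU(s): {t.interconnect_gbps:.1f} GBps interconnect, "
              f"{t.local_memory_gbps:.1f} GBps local memory")
```

The point of the sketch is only that both totals grow linearly with the GPU count when each GPU has its own link, instead of every card fighting over one shared bus.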
All this looks beautiful, but you may ask, why temporary? Well, Intel did work hard to use the base of the former Alpha EV7 / EV8 interconnect to create its QuickPath Interconnect. While it is still far from HyperTransport's level of maturity, and even without any slot spec for card expansion, QPI has Intel and its market share behind it.
Then, Intel will - one day soon, hopefully - have Larrabee out there, up and running. If the original early Larrabee focus on workstation and computational graphics is kept, then Intel has all the more reason to enable even the initial Larrabees to talk to their CPU brethren directly via QPI, not just over the slower PCI Express. On top of that, since Larrabee has an X86 ISA front-end anyway, it'll be that much easier to treat it as a co-processor right away, and offload the code portions that it can run on its own.
I think that Intel will have all these running, one way or another, in about a year's time. Imagine the same graph as above, but with Intel Xeons and Larrabees instead. Whether Intel decides to go with the QPI link on Larrabee from the start or later, it's Intel's decision, but nothing prevents it technically. In that sense, before it risks being pushed into the corner again later, AMD should use the chance while it has it. When Intel gets it, it surely will. µ
Now that is some fine Inquirer journalism, not the usual ironical BS about nothing important. Keep up the good work.
They were thinking about this in the initial fusion tech discussion. Well, all kinds of accelerators, graphics / GPGPU must have slipped their minds. They later dumped this for a reason known probably only to them.
So I don't really expect AMD to do this even if it worked like you said. And that it would.
Much more coherent than article 1. Scan Line Interface may be missing a word.
Scalable Line Interface should be increased from 2 to 3 sections per screen, as 24 screens isn't practical, push that excess into one screen.
With each card running less pixels better, SLI might break thru ALL dumpy Intiiala & Really Change performance.
By Giving U of Minnesota to Sir Hector Ruiz Be same thing. Call it: Ruizopolis or San Ruiez.
drashek
In the current adolescent fear-of-incompatibility, what -will-people-say IT world this money-driven phobia is halting the previous almost childish groundbreaking concepts.
Where are the good few men that dare to go double or nothing, boom or bust? But no, it's Industry Standards (tm/r/c) that's behind the wheel.
With people like Hector at the rudder worrying more about making bucks on insider trading than technology, it's half a miracle Istanbul actually left the gates of Santa Clara.
Who cares if he's left the building; the damage has already been done; and that's why there are NV cards on PCIe on my dualsocketed Opteron board. With an Ageia PhysX card to boot; that was the only relevant development the last 36 months.
Blerch. Shame on you IT sector.
A nice quad core surrounded by 128 mini-me Pentiums, all with HT. I have a 6600 just now and it's fine; I'm going to break tradition and not upgrade till that thing arrives.
I wonder if Windows will recognise the Pentiums as CPU cores, and if so can windows display 264 cores in task manager?
Why is anyone taking Larrabee for granted? It is still pretty much vaporware and considering Intel's "superior abilities" in GPU engineering, its first generation is going to be a failure compared to existing offers from 'alternative vendors'. There are just too many ways to FCUK it and I bet Intel is not going to get it straight the first time Larrabee is around.
Taking this vaporware as competitive to even current GPUs is probably not very wise and that's why this INQ article is not right. At least the 2nd part. Should have used better judgement.
In BSN zine, interesting article, although 32 nm roadmap is ok, some question as to when DX11.1 will be ready. Over a year, maybe 1 & 1/2.
drashek
http://www.brightsideofnews.com/news/2009/11/2/amds-next-gen-gpu-manhattan-and-northern-islands-use-32nm-process.aspx
More articles like this please.
Not that other junk to make the various fan boy factions run amok.
I'm slowly losing my desire to come to this website.
The battle between CPU and GPU vendors is endless, as each wants to maximize its share of user spending. In the end, any efforts to harmonize are kind of like multifunction printers - i.e. not very good at anything. Even on-die integration is a mixed bag - less choice in CPU vs. GPU performance, competition for memory bandwidth, and extremely high power consumption.
As someone who has used Larrabee I can confirm it already is a QPI device, and uses a QPI to PCI-e bridge like the new i5/i7. This is Intel's silicon building block concept.