The Inquirer-Home

On Hammer, nails, pins and needles

Speeds, cubes, cabbages, kings
Mon Oct 22 2001, 18:07
WE HAD SOME PRETTY interesting letters stemming from an article we wrote yesterday about Hammer, Clawhammer, and the shape of AMD chips to come.

AMD's Clawhammer: the HT mystery

Nathan Brookwood, a highly respected semiconductor analyst at Insight 64, said the following:

"I'm pretty confident that Clawhammer will have two HT ports - one for I/O and the other to support a DP configuration. Remember, HT is targeted at desktop and workstation configurations, and DP capability is a requirement in the WS market.

"I doubt that Clawhammer will be just a de-spec'ed Sledgehammer. Sledgehammer is aimed at 4 and 8-way servers, and needs a large (1 MB) L2 cache in those configurations. Thus, I estimate there will be two versions:

"1) Sledgehammer: 1 MB L2 and 3 HT channels
2) Clawhammer: 256K or 512K L2 and 2 HT channels

"AMD hasn't revealed their thinking on this to me, so I feel free to speculate."

Nathan Brookwood also told the INQUIRER he thought that Hammer "looked like a potential winner. All AMD need to do is (1) make it work at the speeds Weber hinted at at MPF, and (2) find a credible server supplier willing to wrap a system around it and take it to market. Sounds like a piece of cake to me."

He added this very interesting snippet: "AMD hasn't provided any specific clock speed goals for the Hammer, but they illustrated the chip's timing and pipelines with a 2 GHz clock, and this led most observers (including me) to assume this is the target frequency when they introduce the product. That would give Hammer a faster clock than Athlon XP or Itanium (even McKinley), but not as fast as the likely P4 or Xeon clock frequency a year from now. All the more reason why we need to find a better yardstick than MHz to measure these devices."

Here's what another reader had to say:

"The Hammer has 3 HT connections, apparently to connect up to 4 CPUs in a square architecture (one of them connects to the rest of the world). Imagine a square; each corner connected to two other corners. In this configuration, no CPU is more than two links from another. For example, to send a message from one corner to the opposite corner, you would have to use an intermediary. If the corners are labelled A, B, C and D, then A connects to B and C, B connects to A and D, etc.

"The idea is that if one of these connections fails, the best you can have is a line or two CPUs. I imagine CPUs can fail in several ways, but as far as we know from popular lore, bad lines in the die make the CPU run slower. But as for deliberately disabling the HT link, that would be logical. I think it would all boil down to some statistical equation though as far as failed Hammers becoming Clawhammers.

"I think you also have to consider WHICH HT failed, as the pins which connect each would be in a physically different spot on the chip, and you would need a "Left hand" and "Right hand" CPU to be able to connect them together, or some such combination.

"Imagine the three HT's are on pins 1-8, 9-16, and 17-24. Let's say two way chips only use the first two ports. But what if the second port failed? You couldn't use the chip anymore. If the third port failed it wouldn't matter, since it's unused for two ways.

"But I think the original intent of the writer was that the chip failed in some other way as a full MP unit, therefore you disable a HT to make it a two way.

"So, my conclusion is, if it failed for MP use in the first place, why should it still be able to be used in a "lesser" MP application? The only apparent case is if it failed in a very specific way, that is, the correct HT area of the die failed, which seems unlikely.

"That's as far as the argument can be taken without knowing more about the chip making process."
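The square this reader describes can be sketched in a few lines of Python. This is only an illustration of the letter's A, B, C, D example, not anything AMD has published; the names `SQUARE`, `hops` and `without_link` are made up for the purpose.

```python
from collections import deque

# Links as the reader labels them: A-B, A-C, B-D, C-D (a square).
SQUARE = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}

def hops(links, start):
    """Breadth-first search: number of links from start to every reachable CPU."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for peer in links[node]:
            if peer not in dist:
                dist[peer] = dist[node] + 1
                queue.append(peer)
    return dist

def without_link(links, a, b):
    """Copy of the topology with the a-b link removed (a failed connection)."""
    return {n: [p for p in peers if {n, p} != {a, b}] for n, peers in links.items()}

print(hops(SQUARE, "A"))                          # intact square
print(hops(without_link(SQUARE, "A", "B"), "A"))  # square with one dead link
```

With all four links intact the opposite corner is two hops away; remove one link and the square degrades into a line, so the farthest CPU is three hops away.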

JC (not the one from the Website) wrote:

"Your reader recalls incorrectly. Each HT port uses a lot of pins, in excess of 200. Think about it, it is unidirectional, so 32 bits is actually 64, 32 in each direction. It is also differential, so each unidirectional bit is actually two pins, and the signal is the difference between the two pins.

"So that bumps the number to 128. In addition, each 32 bit port can be used in 4 bit groups, and there needs to be control lines and clocks for those groups. As a result, the 3 HT ports with 128 bit integrated memory controller likely has close to 900 pins, not a cheap package by any stretch of the imagination.

"No, the ClawHammer is likely to have a single HT port. 8 bits of that port would be used for I/O and 24 bits for SMP. One of the processors would hook to a south bridge like the nForce chip (it uses an 8 bit HT port), and the other processor would interface to a HT->AGP chip. Now nVidia has made noises about putting an HT port on their graphics chips, maybe this is why?"
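JC's pin arithmetic can be checked with a few lines of Python. This is only a sketch of the data-signal count he walks through; the function name and parameters are invented for illustration, and the control, clock, power and ground pins that push his estimate past 200 per port are left out because the letter gives no figures for them.

```python
def data_pins_per_port(width_bits=32, directions=2, wires_per_signal=2):
    """Data pins for one HT port, following JC's reasoning.

    width_bits       - port width in one direction
    directions       - separate send and receive paths
    wires_per_signal - differential signalling uses two pins per bit
    Control and clock lines are NOT counted here.
    """
    return width_bits * directions * wires_per_signal

per_port = data_pins_per_port()
print(per_port)      # data pins for one 32-bit port
print(3 * per_port)  # data pins for three such ports
```

That gives 128 data pins per port and 384 for three ports, before any of the per-group control and clock lines or the memory interface are added.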

Our original correspondent, CD, responding to the second of the letters above, said: "I should have put more detail into it. The one point I would pick on in his argument is that an HT chunk of the die that tests bad does not necessarily lead to a left or right CPU. I am thinking of something like a 'golden bridge' on the CPU package that determines which HT link is connected to what pins.

"That is, if bridge XYZ is open, then HT1 goes left, and HT2 goes right. I envision this more as an option that is read at boot and implemented in firmware rather than a mechanical link through that bridge. I don't know if this is feasible, or it is worth it to salvage a small # of chips, but it could work. If the HT link takes up 1% of the die, and you have an 80% yield, you are looking at 2% of 20% as potential 'downgradable' chips. Probably not worth it. I agree with the rest of it.
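CD's back-of-envelope figure can be worked through as follows. This small Python sketch takes his numbers at face value (2 per cent of the 20 per cent of dies that fail); the function name is hypothetical and real yield models are considerably more involved.

```python
from fractions import Fraction

def downgradable_share(failed_share, ht_only_share):
    """Share of ALL dies that fail, and fail only in an HT link (CD's rough estimate)."""
    return failed_share * ht_only_share

share = downgradable_share(Fraction(20, 100), Fraction(2, 100))
print(share, float(share))  # exact fraction and its decimal value
```

On those assumptions only one die in 250, or 0.4 per cent of production, would be a candidate for downgrading.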

"I would also like to publicly (well privately anyway) state in very ambiguous terms that if the hammer line differs in ways other than # of HT links (and the usual speed/cache/price) stuff, I will eat one. It would be stupid not to keep a single core and play with the outlying bits. The verification costs would be huge. If you verify the worst case, and then only use, or cut out the rest, you have a line for the cost of a chip. Why not, esp on a tight budget?

"Doing stupid things like limiting it to a 64b memory interface, like has been rumored on Ace's, would, err, must resist, nope, can't, hammer performance. This CPU is about bandwidth and low latency, if you fsck with the memory access, you are hitting yourself in the worst possible place. Look at the Celeron and what bus speed did to the 700+ MHz ones.

"P.S. When I first saw the diagram of the 8-way hammer, I thought 'Gray code'. Genius. What am I talking about? http://hissa.nist.gov/dads/HTML/graycode.html

This would kick ass, reduce latency a ton, and 'self route' the CPU-CPU signals. All it would take is one HT link for each dimension of the hypercube they constructed. Then I looked closer. Nope. Sigh. Maybe in hammer v1.1."
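For readers wondering what CD means, here is a minimal Python sketch of the idea he was hoping for. It is not Hammer's actual topology (as he says, the real diagram turned out not to be one). It assumes eight CPUs numbered 0 to 7 at the corners of a three-dimensional hypercube, with one link per dimension; the function names are invented for illustration.

```python
def gray(n):
    """Reflected binary Gray code: successive values differ in exactly one bit."""
    return n ^ (n >> 1)

def route(src, dst, dims=3):
    """Self-routing on a hypercube: flip each bit in which src and dst differ.

    Each flipped bit corresponds to one hop over the link for that dimension.
    """
    path = [src]
    cur = src
    for bit in range(dims):
        if (cur ^ dst) & (1 << bit):
            cur ^= 1 << bit
            path.append(cur)
    return path

print([format(gray(i), "03b") for i in range(8)])  # a walk along cube edges
print(route(0b000, 0b111))                         # path between opposite corners
```

The attraction is that a node needs no routing table: XORing its own number with the destination tells it which links to use, and no CPU in an eight-way cube is more than three hops from another.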
