WHEN WE TOLD YOU about the 'bad bumps' in the Apple Macbook Pro 15-inch models the other day, we expected it to end there.
But as luck would have it, Nvidia pointed us to a much deeper problem that not only affects at least some of the Macbook Pro notebooks, but likely every other high Temperature of Glassification (Tg) underfill chip Nvidia makes.
Technical Background
To understand this article, you really need to understand the problem, so please read the technical three part series (Part 1, Part 2 and Part 3) explaining what the problem is and where it occurs.
Nvidia's current problem stems from its half-hearted response to its earlier problem by only changing the underfill. Nvidia said that's what it did, both near the end of our initial Macbook article and in a later Cnet article here.
In that, Nvidia's Mike Hara said, "Intel has shipped hundreds of millions of chipsets that use the same material-set combo. We're using virtually the same materials that Intel uses in its chipsets." Note the word 'virtually'. The problem with this statement - other than his analogy being misleading and not addressing Nvidia's chip design problem - is that 'virtually' in this case means Nvidia missed a key coating component in its revised chip engineering design. It is NOT the same material-set technology as Intel, AMD, ATI and everyone else we talked with uses. Unfortunately for Nvidia, the coating material it left out is critical for the life of the chip.
Before we break out the electron microscope again, we feel the need to point out some of the things that Nvidia managed not to talk about in its purported explanation of the fix. It is sad to have to point this out, but underfill does not crack, bumps do. The bumps that cracked did so for a long chain of reasons that are explained in my earlier three-part article linked above.
Nvidia changed one of the steps in the chain, and seemingly none of the others. This might change the frequency of the bumps cracking, for either good or bad, or it might not. It might also introduce a new and much more serious failure mode, and that is what we believe Nvidia did.
Underfill is basically a glue that surrounds the bumps, keeps them from getting contaminated, and keeps them moisture free. It also provides significant mechanical support for the chip that is crucial for enabling it to withstand structural stresses, which are primarily caused by repeated heating and cooling cycles during operation.
Two properties of underfill matter here, Tg and stiffness. Tg is the Temperature of Glassification (more commonly called the glass transition temperature), which means the temperature at which it loses all stiffness. Instead of thinking about it melting, think about it turning to jelly. Stiffness is how hard it is below that temperature.
One unusual property of underfill is that its Tg is related to its stiffness. If you want it to glassify at a higher temperature, it will be stiffer to start with. Lower Tg, softer initial stiffness. When making a chip, you have to balance between making the underfill so elastic that it effectively does nothing and so hard that it rips the chip apart on first power up. If you do things right, you make it as stiff as you can, but not too stiff. If the underfill is too soft, it won't provide enough structural support to relieve the strain on the bumps; too hard, and it will damage the underside of the chip itself.
Passivation Layers
Let's move back to how a chip is made. You all know about a silicon wafer - it is a 300mm silicon disc that you essentially draw pretty patterns on. Modern chips have multiple layers of metal that make transistors drawn on the silicon, and on top of each other. You can see some of this in the microphotographs below.
Modern chips have multiple metal layers; eight is pretty common for devices like CPUs and GPUs. To prevent the layers from shorting each other out, there is a layer of insulation deposited between them - this is called the passivation layer. The resulting chip is a relatively thick hunk of silicon with a 16-layer or so sandwich on top that goes metal/passivation/metal/passivation and so on. It ends up looking like a Roman aqueduct in a cross-sectional view.
An Intel 90nm CPU sliced
In a very simplified explanation, the more insulating you make the passivation layer, the faster the chip can work. This means low-K materials like Black Diamond are really useful, but they are also very fragile. You might have eight of these layers, and they have holes punched through to allow communication between the layers. The structure isn't all that strong to begin with, and the holes don't help. On top of the sandwich, you have an outer coating, usually Silicon Nitride (SiN), which is basically a hard ceramic shell that protects things.
Remember, these devices are called flip-chips because when they come out of the fab, they are flipped over, and the bumps go on what was the top. This is then covered with underfill and soldered to the substrate, the green fiberglass thing that most people think of as a 'chip'. The former top during fabrication is then the bottom after packaging, and the underfill touches the substrate and the SiN layer.
Because the SiN layer is pretty stiff, any strain on it will be transferred into the layers of the chip itself fairly directly. If there is too much strain, the layers of the chip peel apart and you have what is called catastrophic inter-layer delamination, and that kills the chip even deader than cracked bumps.
This means you have to change the passivation material to a stronger substance to take the stress. Unfortunately, the passivation layer isn't just an option you can readily change out on an already designed chip. Different choices in the passivation layer have cascading effects in the chip design and manufacturing process. This is complicated by the fact that there aren't that many viable choices to begin with. What you end up with is a limit on the stiffness of the underfill. This is why Nvidia didn't just crank up the underfill Tg a year ago - it has very serious consequences, most of them fatal to the device, and there are limited underfill options for a given passivation layer material.
A good analogy is a light bulb and a steel plate - light bulbs are fragile, steel plates are not. If you hit a light bulb with a hammer, you get lots of little pieces, but a steel plate will shrug it off. If you put a steel plate on top of a light bulb, carefully, and hit it with a hammer, you will not damage the plate, but the bulb will shatter just as if you hit it directly. This is very similar to how the strain within a chip assembly gets transferred, and the chip is basically a multi-layer light bulb and steel plate sandwich.
Polyimide Layers
Luckily for chipmakers, there is a third option that allows you to have a fairly stiff underfill and not tear things apart. It is called a polyimide layer (PI), and it is a relatively thick - we are talking µm here - coating that you put on top of the last passivation layer. The PI layer is kind of rubbery. It absorbs some of the strain so the passivation layers don't have to, and it also distributes it over a wider area.
In essence, the PI layer simply protects the chip more. This allows you to use a stiffer underfill and not tear things apart. Notice I said stiffer, not solid steel. If you go too far with a stiffer underfill, you will transfer too much strain, and the chip will still die. The PI layer gives you a bit more leeway, taking more stress off the bumps, but you still have to choose very carefully and test the results to an amazingly high degree.
In the Cnet article, Hara said Nvidia changed the underfill, and we will assume that he meant Nvidia stiffened it, not made it softer. Softening it would only increase the problems they had with bump cracking, and while we may not hold Nvidia engineering in all that high regard, we can't assume they are abjectly stupid. So, Nvidia changed the underfill to a more 'robust' version, and didn't change anything else. We actually believe this story, mainly based on the parts we have dissected.
All is well, right? Ride off to the coffee shop in the sunset with your new Macbook happily working, Nvidia chips not dying in large numbers. However, there is only one tiny problem with that ending.
The Problem In Pictures
Remember when we said that Nvidia engineering wasn't abjectly stupid? Scratch that. Remember when we said we were going to break out the electron microscope? It's time. Remember the part about the PI layer being necessary for stiffer underfills? Guess what?
A test chip with a SiN layer
What you are seeing is the top of the bump, where it contacts the chip. The round light grey part on the bottom is the bump, the darker grey on the top is the silicon itself. The spotty stuff above the top yellow line is the transistor and passivation layer sandwich - the aqueduct - and the dark grey area on the right is the underfill.
This chip, a materials test part, has no PI layer, just a SiN coating. You can see that the SiN coating is not even 2µm thick - it is the dark line that crops the top of the bump and ends at the pad on the chip.
For those of you who have been paying attention, you may notice some clumping in the bump material - it is eutectic, not high lead, and the clumping is a result of enthalpy. This is a thermal test chip, not a production part, used for heat cycle testing. That is why the bump material clumped, repeated heat cycles.
A test chip with a PI layer

This next one has the same major components, but you will notice the SiN layer is much thicker, 5 or more times, almost 10µm. That is because it not only has the SiN layer, but it also has a PI layer to absorb stress. This chip is also a test vehicle, and has eutectic bumps and a higher Tg underfill. We can conclude from this that a typical PI layer is 5µm or more thick, and a SiN layer is visibly thinner. Things may change depending on the fab, materials used, and intended use, but the rough thicknesses won't change much.
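The rule of thumb from the two test chips can be written down as a quick check. This is only a sketch: the cut-offs (SiN under 2µm, PI 5µm or more) are the rough figures read off the pictures above, and the function name and the 'inconclusive' band are made up for illustration, not anyone's qualified measurement procedure.

```python
# Rough thicknesses taken from the cross-sections described above (assumed cut-offs).
SIN_MAX_UM = 2.0   # a bare SiN coating was "not even 2µm thick"
PI_MIN_UM = 5.0    # a typical PI layer is "5µm or more thick"


def classify_coating(thickness_um: float) -> str:
    """Guess what sits between bump and chip from the measured coating thickness."""
    if thickness_um < SIN_MAX_UM:
        return "SiN only"
    if thickness_um >= PI_MIN_UM:
        return "SiN + PI"
    # Between the two cut-offs the pictures give no basis for a call.
    return "inconclusive"


for label, measured in [("SiN test chip", 1.8), ("PI test chip", 10.0)]:
    print(label, "->", classify_coating(measured))
```

The sample thicknesses in the loop are illustrative values consistent with the descriptions above, not exact measurements.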
The bump from a Macbook Pro 15-inch 9600 GPU

Last up, we have a close-up of the bump from the Macbook Pro's G96/9600 GPU. It is a high lead bump with, according to Nvidia, a higher Tg underfill. This means that the SiN layer should be under 2µm thick. Check, it is. Then the PI layer should be another 5+µm or so. Che.... Hey, wait a minute, there is no PI layer! No, really, it is not there.
Yeah, you are thinking right, Nvidia simply forgot the one critical layer to make its much vaunted, and on the surface correct, high Tg underfill work. To that, all we can say is that it does indeed seem so. If anyone has a better explanation, and several packaging engineers I talked with did not, feel free to chime in, my email is at the top of every article.
What this looks like is that Nvidia traded a bump cracking problem for an inter-layer delamination problem. Both lead to a term that semiconductor people call catastrophic failure, something you don't need an engineering degree to understand.
According to multiple packaging people contacted about this story, all of whom want to remain anonymous, this is a much worse problem than bump cracking. Phrases like "abject stupidity" and "how the [fsck] did they miss that" were tossed around, but still, they did.
In these conversations, several scenarios were put forward to explain it. None of them posit that it won't be a problem, they all say that it will, they were simply grasping at straws to say how Nvidia missed this one.
The first scenario theorizes that Nvidia had a bunch of high lead wafers sitting in inventory. When it first learned about the problem, it stopped bumping the chips because it knew where the problem lay, just not why. When the engineers got the go-ahead to restart the line with high Tg underfill, they had to use up a few months worth of wafers. Because a PI layer can't be applied after the wafer is fabbed, they were stuck, so they crossed their fingers and hoped someone like me wouldn't notice. I did, and if everything we hear is true, Macbook Pro owners and a lot of others will eventually notice as well.
The next theory is slightly more plausible - that Nvidia didn't have time to properly test. A heat cycle test of packaging material takes about three months to do, and you can't really rush it. If the first new parts started rolling out of the fab on July 1, 2008, the first day of Q3, and it takes about three months to set up and qualify a new fab process, that means the fab had to go into production setup on the first day of Q2.
Subtract out a further three months to thermal stress test the solution and Nvidia had to have started that around the first day of Q1/08, meaning that its engineers would have had to flip the switch on testing with a New Year's hangover. If the bump cracking problem was discovered in the fall of 2007, maybe even late summer, there was only one quarter to figure out what the problem was, research alternatives, and make test structures. There could not have been time for a second round of tests unless Nvidia knew about the problem far in advance of what HP and Dell admitted to.
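The back-of-the-envelope schedule above can be checked by counting backwards from the ship date. The sketch below assumes exactly three months for each phase, as the text does; the helper name and the variable names are illustrative only.

```python
from datetime import date


def months_before(d: date, months: int) -> date:
    """Return the date `months` calendar months before d (d is assumed to be the 1st)."""
    total = d.year * 12 + (d.month - 1) - months
    return date(total // 12, total % 12 + 1, d.day)


first_parts = date(2008, 7, 1)                  # first new parts out of the fab, start of Q3
fab_setup = months_before(first_parts, 3)       # about three months to qualify the process
thermal_test = months_before(fab_setup, 3)      # about three months of heat cycle testing
research_start = months_before(thermal_test, 3) # the single quarter left for research

print("fab setup starts:   ", fab_setup)
print("thermal test starts:", thermal_test)
print("research starts:    ", research_start)
```

Working backwards this way lands fab setup on the first day of Q2, thermal testing on New Year's Day, and the research quarter in the fall of 2007, which is why there was no room for a second round of tests.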
The most likely way this would have played out is that Nvidia tested the structures, and none worked out well. Its engineers gritted their teeth and took the most promising option, no PI. The other scenario is that Nvidia didn't figure it out early, and was rushed to come out with a 'fix' because Jen-Hsun had to file an 8-K and let the public know. Not having an answer and a fix in hand would not have been compatible with executive egos, so the engineers came up with an answer, but they couldn't definitively say that it would work.
In either case, the length of testing time required is probably what bit them. It is a long and intricate process to stress test chips like this correctly. Nvidia has shown with the initial bad bumps problem that it botched that across multiple generations, so why should we give them the benefit of the doubt this time? The more interesting question is, when did it know what?
Next up, we have the long shot scenario, that Nvidia packaging engineers, if they actually have them rather than outsourcing everything, simply missed an entire branch of science. They all took a class on semiconductor engineering, but they all slept through that day. And didn't read the book.
One last thing to toss into the mix, cost. The PI layer is expensive, it adds about $50 to the cost of a wafer. Wafers from TSMC on a high end process cost about $3,000 to $5,000 depending on a lot of details. Adding the PI layer increases the cost of silicon by a noticeable amount, and adds to the defect rate.
For cards that sell to big OEMs for $30 or so, silicon can't be more than a few dollars of the total. Adding 25 cents to the cost of a chip is a big deal, it can mean the difference between profit and loss for the entire run. One engineer suggested that Nvidia might have shot down the PI layer on cost grounds, but we don't buy that. They weren't that desperate, were they?
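For what it is worth, the cost figures above hang together as simple arithmetic. The sketch below uses only the numbers quoted in the text; the chips-per-wafer count is not stated anywhere and is merely what the $50 and 25 cent figures imply, and the added defect rate is ignored.

```python
PI_COST_PER_WAFER = 50.0     # quoted cost of adding the PI layer, in dollars
PI_COST_PER_CHIP = 0.25      # quoted per-chip cost increase, in dollars
WAFER_COST_LOW = 3000.0      # quoted wafer cost range, in dollars
WAFER_COST_HIGH = 5000.0


def implied_chips_per_wafer(per_wafer: float, per_chip: float) -> float:
    """Chips per wafer implied by spreading a per-wafer cost over a per-chip cost."""
    return per_wafer / per_chip


def percent_increase(extra: float, base: float) -> float:
    """Extra cost as a percentage of the base cost."""
    return 100.0 * extra / base


chips = implied_chips_per_wafer(PI_COST_PER_WAFER, PI_COST_PER_CHIP)
print("implied chips per wafer:", chips)
print("wafer cost increase, cheap wafer: %.2f%%" % percent_increase(PI_COST_PER_WAFER, WAFER_COST_LOW))
print("wafer cost increase, pricey wafer: %.2f%%" % percent_increase(PI_COST_PER_WAFER, WAFER_COST_HIGH))
```

On these numbers the PI layer adds somewhere between one and two percent to the wafer bill, before counting any extra defects.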
Analysis
What does this mean? Unlike what Nvidia has been implying, we have never stated that the 'bad bumps' in the Macbook Pro 15-inch would cause a failure. We simply stated that it is using the same material that caused failures in the older Macbooks, several HP and Dell lines, and likely many more that Nvidia has not admitted to publicly. The consumer has a right to know this about the products they are buying, and Nvidia steadfastly refuses to tell them.
This time, we see a potentially much more serious problem, and no doubt it will be explained away with pseudo-science and sound bites. Tame journalists and bloggers won't bother to question the science, won't understand it, and will take the easy, canned explanation at face value. No problem will ever be admitted to, and the problems that Macbook and other computer owners encounter will be something else, a rare anomaly, a one-off, trust them. Really. Apple did.
Once again, this is not saying that the Macbooks will fail, or that the one you have will fail. We are simply stating that, according to all the packaging experts we talked with, none of them could come up with a scenario where this is not a massive problem. Once again, time will tell.
Rebuttal
In the best of half-hearted PR speak, the Nvidia rebuttal (see Cnet link above) claims my initial investigation of the 'bad bumps' was "already flawed." Nvidia won't say how my analysis was flawed, but it tosses that out in an attempt to tarnish the evidence. It also won't say what parts are affected, so there is no way to tell for sure. If I am so wrong, why cover it up?
As for all high lead bumps being bad, that is simply not true, not once did I say that. I stated that given a chain of engineering failures, bad choices, and inadequate testing, these parts are failing. There is a long chain of events that causes the failures. Read the three part technical explanation linked above for more.
Nvidia is claiming that it changed the underfill material, and had Dawn sprinkle a little green fairy dust on them, and all is better. Every engineer I talked with disagrees. It is clear that they missed a critical step in making these chips, so changing a single step in the chain will very possibly make matters worse.
If you look at what the higher Tg underfill does, it moves strain off the bumps, and puts it on the SiN layer, which transfers it to the fragile passivation layer. Nowhere has Hara said that Nvidia attempted to reduce the strain that causes the failures in the first place, much less accomplished that goal. In fact, he admits the opposite, unless I misinterpret the statement, "What we did was, we just simply went to a more robust underfill." This is a band-aid, applied by a fairy, sprinkled with pixie dust. Sadly, it does not appear to be a thoroughly engineered fix.
Hara said, "The material set (combination of underfill and bump) that is being used is similar to the material set that has been shipped in 100's of millions of chipsets by the world's largest semiconductor company (Intel)." In saying that, Nvidia was right, it is similar. Similar is NOT the same, and the devil truly is in the details. He is right that every semiconductor manufacturer that uses a high Tg underfill uses a similar recipe, but all of them that I talked with, every single one, also uses a PI layer. Period.
The Man Behind The Curtain
Last up, Nvidia is strongly hinting, like in this Gizmodo article, that there are some mysterious, nefarious forces behind my reporting, and that electron microscopes are hard to come by. The implication is that I couldn't pull The Big Picture Book of Science out of a paper bag with a map, flashlight and guide dog.
It may be true that I am not up on the latest techniques at the cutting edge of electron microscopy, but my years of college - going from chemical engineering, to chemistry, to biology, to genetics - weren't a total waste. Reading the output from a spectrograph isn't that tough when you have been holed up in a lab doing similar work with related devices for years.
That brings up the crack about electron microscope scarcity. They really aren't that uncommon, it's just that Hara probably doesn't know where to look for one. I live quite close to the University of Minnesota, and last time I attended courses there many years ago, there were lots of them sitting around, some better than others.
Every major semiconductor design house has at least one electron microscope, likely many. They are indispensable research tools. How many does Nvidia own? I don't have a clue, but stories like this don't seem to imply that they are all that uncommon. In fact, I have seen dozens in tours of companies around the valley. In defense of Mike, he is an investor relations executive, and the SEMs at Nvidia are probably on a floor without executive washrooms.
Hara blames Nvidia's competitors for being behind the story, and that is quite plausible on the surface. Really, Nvidia is cuddly, nice and honest, right? So who wouldn't like them? I mean, Nvidia openly declared war on Intel. It goes out of its way to antagonize AMD, treats the press like dirt, and plays its partners off against each other. A better question would be, at this point, who actually likes Nvidia? If you answer Joel Turnipseed, the guy in Iowa who lost all short term memory in a car accident in 2004, you might have the one.
One other thing that Hara doesn't appear to realize is that there are a few dozen teardown houses within an hour's drive of his office. Companies like Nvidia use them all the time when they want plausible deniability, a 'second opinion', or to dodge some trade secret laws. In fact, most semi companies use them regularly.
Some of them are public, others less so. A quick search for 'chip reverse engineering' should net you a dozen or so in very little time. To quote a friend from a large semi house, "The good ones don't have names."
What they do have, however, is a lot of expensive equipment, like the electron microscopes that are so craftily hidden at Nvidia headquarters. They also know how to use them well. One last thing, their business is quite 'peaky' - when a new chip comes out, they may tear it down, or tear down a few, and make a report. These reports sell for a lot of money, and that tides them over until a new part is released. In between busy times, some of them sit around bored, throwing darts at pictures of their former employers, while some stay busy 24/7. It simply depends.
What it comes down to in the end is that there is simply no shortage of companies, large and small, public and shadowy, that do teardown work. It really isn't all that hard. There is also no shortage of companies that dislike Nvidia - when a company sets out to piss everyone off, it often succeeds. The list of capable organizations with motives is not short, in fact it is very long.
Then again, it was my idea to begin with. When a company responds to an easy direct question with dodgy doublespeak, or answers another seemingly related question instead, alarm bells go off. Having solid information about the chips before you ask the question aids immensely in analyzing the PR/IR output. The bells went off this time, and the digging started. Several 'mad scientists' liked the idea, and agreed to help out as time permitted. It took two months, but the results were worth it.
Conclusion
In the end, what you have once again seems to be a massive engineering failure. This could, but not necessarily will, lead to inter-layer delamination failures. The Macbook Pro 15" GPU undoubtedly has the problem, and it is very likely that every Nvidia chip with high lead bumps and high Tg underfill does as well. We are still analyzing the eutectic bump parts, and will follow up with a report if we discover anything conclusive.
Nvidia is still stonewalling the first problem, and likely won't admit to this one unless they are forced by law to file an 8-K once again. Remember, the last admission was not voluntary. Once again, we will state the obvious: Nvidia has to come clean over this, admit what models are affected by the bump cracking, what computers the chips went into, and what chips are affected by this latest missing layer. Then the customer can decide. µ
Note: Apple was again called twice prior to publication and informed that there is a potential problem. Instead of calling us back to tell us that they knew about the issue, and had dealt with it, or would stand by their customers, Apple simply ignored us once again. Because of this, we award Apple the Steve Jobs Memorial Turtleneck for Pride and Arrogance (SJMTPA) for turning an opportunity to respond positively to this situation into mud. Own goal guys, zero for six!
Since you beat around the bush instead of answering the question, whose microscope did you use?
You started off trying to be objective and unbiased, but your article ended up being a rant... sounding like that old ex who just would not stop calling my home. Dude, you really need to get laid.
hmm, nvidia owns factories? I thought they are strictly a design house.
if i was AMD i would fire the marketing department and just hire Charlie - brilliant reporting and god knows how many people have changed their mind on what to buy based on this reporting.
i know i changed my mind - my current nVidia will be replaced by an AMD with the next opportunity.
This is called making a difference.
Good work Charlie - dont hold back
Excellent work Charlie. The consumers have a right to know what they are buying and when they are being tricked. You're doing a great job. Keep up the good work.
I was pretty much ignoring the posts about Nvidia chip failures, up until my GeForce 8600 card started showing artefacts in 3D mode a few days ago.
It is visible as tiles of corrupted pixels in 3D rendering. Even the 3D image quality preview window in the Nvidia control panel is affected.
I removed the card cooler and cleaned off the GPU and found that the GPU has a strange lighter spot evident under the chip coating.
So in my book this is quite as the articles predicted.
The card has never overheated, nor has it been heavily overclocked.
What i find interesting is that there is no more publicity to this issue since the only place i found the issue mentioned is the INQUIRER.
I will try to get the chip pictured post on the web.
I have been using mostly Nvidia cards - 7600GT and GeForce 4 Ti in my old rigs
and all are working non stop from day one.
It's quite unexpected failure to a card 9 months old.
and as a design house you have to have equipment to test and analyze your designs.
Even if one could accuse Charlie of ranting, or going after NV, and using an as yet undisclosed "partner" to provide the equipment to do so, it doesn't change the evidence before us! If NV is of the opinion that the evidence is false, untrue, incomplete or in any other way questionable, it is up to them to refute this. They need to support this by providing evidence to the contrary, though; just saying Charlie is wrong doesn't cut it. NV has made many enemies, but they are likely to anger the most dangerous of them all, shareholders! They are obliged to provide information that could potentially affect the results of the company, and withholding it might not be looked upon so kindly these days.
In the gentle art of making enemies, Nv does its job well.
Don't worry, that's exactly what they have done.
Thanks Charlie,
Was going to go with a MacBook Pro for Christmas but after reading your articles and having 2 x 8800 cards go tits-up on me, I went with an ATI based iMac instead.
I really think if he was pulling shit out of his ass and handing it to us like this he would have been sued for libel. Especially the claims he is making against Nvidia. Since that hasn't happened yet, I assume some of the stuff he is reporting must have some basis to them.
On related news my 8800gt failed after 10 months of service. Not really sure why it failed just all of a sudden it started showing tons of artifacts all over the place. Didn't overclock it, or anything for its entire life... I wouldn't think anything of it if it died within 30 days of owning such a product, but a 10 month old card dying makes me ponder if it was from the bad materials they used that were described earlier in his articles.
I'm a bit more convinced about this article than Charlie's previous rants....
- but it kindof negates the previous 'bad bumps' rant/article, as Charlie is finally admitting that the Bump aren't bad at all
- and that nVidia are using High Tg underfill
- and that the High-Lead + High Tg materials set is a workable solution (as used by Intel)
- but (according to Intel) it should have a polyimide layer to distribute the stress better.
- so I guess someone needs to show that the newer nVidia chips are suffering from catastrophic inter-layer delamination.
- or nVidia should come back & say that although they're not using a PI layer, their tests say it's ok ...
Hats off for the tenacity to do what you did. Made for a good read if nothing else. Does anyone see the resemblance between Jen-Hsun Huang and Kim Jong-il?
Love you man, keep it up, ignore the haters.
Ok, I'm loving to see you raising continously the level of your research to actually call what you do journalism. Investigative journalism is what we need. That was just bloody brilliant. WELL DONE!
We need people that have the guts to do what is right and get the truth out there.
Charlie, I know you're just doing your job but thank you for doing it so well!
Dear Nvidia, time to come clean before this gets worse & every non-techie knows about this.
Your comment about Black Diamond being fragile is right, but your comment about "holes" adding to that fragility is actually wrong.
Those "holes" are called vias and used for metal layer connections. TSMC and most vendors use a tungsten alloy for those vias, and that tungsten is far, far harder than the Cu or Al used for metalization.
If you're a smart chip maker, you generally use massive arrays of vias down several layers from top metal under your pads. This additional structure really helps low-k materials structurally (it's a pain for designers since it makes routing nastier). While this is most commonly a solution for wire bonded chips, it's also an issue in flip-chips with high thermal envelopes as a way of reducing thermal stress (been there, doing that).
We also don't generally call the insulating layers "passivation." That's reserved for the final SiN or other coating that's really DESIGNED to prevent moisture and other contaminate intrusion.
Finally, as to PI, you're right that's the best solution to the issue you can have, but it's also expensive. The Big Boys With Big Margins can use that and have for well over a decade (see IBM's C4 process back in the 1um days for an example). If you're more cost sensitive you find other solutions as I alluded to above. There are good ones, and then there are the ones Nvidia seems to have been using.
So, you're near the UofM, too? Small world ... or not, considering we're interacting via a system in the UK!
Great breakdown; hopefully, Apple wakes up and quits being a douche about the issue... I have had issues TWICE now with my Macbook, and I am just waiting for the third failure....
Just to say thank you for your work, it really means a lot when making hardware decisions.
I can reveal that ALL Dell M1330's (nVidia graphics) in our office have had their mainboards replaced now (100% failure rate); it's almost 1.5 years since we got them (we were among the first). How long before the replacement fails?
While Dell has publicly disclosed the problem, they still REFUSE to say whether the replacements are fixed. Just read their "blog". Is that the way to treat customers?
We're not buying any laptops with nV hardware, even (Intel) integrated is better than having 100% failure rates. The drivers are not as good but they are stable - it's not a gaming rig after all!
Vote with your wallet, lying companies do not deserve our money!
I'm not saying that charlie is making anything up, I'm sure it is all true, but I would care more about this if it was about anyone other than Nvidia. The reason being is that its all Charlie ever goes on about. He never takes the time to do this kind of reporting on anyone else and he only did it here because he appears to have some huge grudge against the company that he can't shake and seemingly won't be happy until he blows them up and destroys all the pieces left. Hey Charlie, I'm glad you finally knocked one out of the park on the Green Monster but it would have meant so much more if it had come from anyone other than you. Maybe it would help if you wrote about something other than Nvidia and how bad it is for once.
@ DarkElfa & ojeee, no matter how much evidence you gather and how many experts you consult, the Nvidia fanbois will always try and attack your credibility. Given the trail of shit that Nvidia has left behind in the last year, I'm glad I purchased a DAAMIT product.
It doesn't matter whose microscope he used, as indicated, they are readily available and he has access to one or several.
Good work Charlie, mark one for the consumer. You are a rare beast, and not many people are willing to put forth the effort and dedication that you do into your work. Keep it up.
Keep up the excellent work Charlie. I don't want to see nvidia fail. I just want them to fix the manufacturing problems. If they would just "do it" we could all put this behind us and nvidia could get back to making us all happy with speedy video cards.
But nvidia's constant denial and doublespeak is really pushing me to ATI.
Anyone have more info on why Charlie hates nVidia so much?! There's got to be more to it than his journalistic integrity (?), or nVidia's arrogance (which $B isn't?). I sense a deep and tragic story of hot Geforce3 cards, drunken sex, lost-love, and broken hearts!
Thanks to your investigative journalism, I will NOT be buying ANYTHING with an nVidia graphics chip in them, for the time being at least. I know that you said the 9400 chip uses eutectic bumps and not lead bumps, which seems like a step in the right direction; however, this appears to be a systemic problem that stems from too much greed. Greed is good, that's why these companies are in business - to make money - but it is short-sighted and moronic to cut costs and/or engineer a beast that's too power-hungry just to get a leg-up on the competition. I have gone with ATI (AMD) for my desktop video, I will go with ATI for my next laptop purchase as well.
All you people are taking this man "Charlie's" word like he is god or something. I have owned a Macbook Pro for over a year and I love it! Graphics are awesome and no problems. Nvidia beats all other companies in graphics so why would they put their reputation on the line by making crap...I would never go with anything other than an Nvidia and who am I? Just a hard core gamer who uses the products! Ask any real gamer who makes the best graphics cards and they will tell you its Nvidia...I bet they tell you this because the products work.
good investigative journalism, which you don't see much these days. i actually learned something from reading your article.
keep up the good work!
It's too bad political reporters in the MSM don't show this kind of tenacity. If they did, we probably wouldn't have had a war in Iraq.
Good Job Charlie. Keep up the great work.
Re: Mac Owner
You said:
"who am I? Just a hard core gamer who uses the products!"
yet, you use a Mac? Aren't hard-core gaming and Macs mutually exclusive?
Keep it up Charlie !!!
You're killing 2 hens with one hit in this article: exposing NV's bad engineering and raising the bar (in quality) of web articles.
Great article, excellent digging. I'm impressed.
@macowner
Hardcore gamer and Apple Laptop in the same sentence....sounds wrong mate. Very wrong. Hardcore gamers don't usually use laptops. I doubt my Crossfire could fit in the Air.
I have to say I'm impressed. Charlie, I always thought you were too busy pushing the political and ideological side of things with Microsoft to actually do good journalism.
It's obvious you've done tons of digging, checking, interviews, research and testing to find out the truth while being slapped by nVidia and some of its partners.
Kudos for that.
...artifacts horribly at any temp or speed.... terrible design. but i'm not going to rant on and on about it. chips happen, man. get with the times, dinosaur.
i might just have to find another tech site if i have to keep trudging through these rants to find informative articles. this does not inform me, it wastes my time.
So, Mac owner..... How's Far Cry 2 look on your Mac? What's the frame rate like?
How's that DX10 look on a Mac? I bet you get better fps on Crysis too, eh? BTW, how's the SLI system working in your Mac? I take it you can game on that beautiful 14ms response 30" cinema display! Go Gaming Macs!!!!
.......btw, Jobs doesn't like gamers.
A.) gamers upgrade components too much (do you think nvidia will dedicate a team to make products and drivers just for Macs...not till Mac has a bigger market share.)
B.) he can't justify charging you 5k for 3k worth of equipment. (gamers are "usually" intelligent and won't pay 2k for fanboi status)
Still waiting for those 1,000 layoffs you promised nvidia was having a couple months ago.
Hi Charlie,
thank you for giving us detailed information about this "problem". Your articles about the bumpgate helped me to convince my boss not to buy 150 pc with nvidia graphics next year. We go with DAAMIT 780G instead ;-)
I hope others are also double-checking their investment plans ...
Has anyone got a definitive list of the suspect parts?
Charlie is jealous that he does not have the Physx that Roy 'The Boy' 'Terrific' Taylor does. Maybe he's born with it? maybe its mabelline. Roy Taylor says Ntwitia is the world’s most technologically advanced chip; so it is goodbye Intel and AMD. TWIMTBP Crysis can't be played at its maximum settings even on top-line PCs today. Charlie, you're having a Blue Peter moment. I get the same pocket money, but it seems that this year I haven't got as much, as everything seems to be getting much more pricey. Expect to get less for your money. Quality discounts. Nvidia plan to replace legacy 3 chip design with a two chip marriage. Settle down! Charlie, why do wish to continue to hold the Stupendous Incompetence accountable to account? So without an itchy trigger finger during violent disorder of massive engineering failure, the villains can sometimes get away, isn't'it? Innit? So Crysis are averted; seek solace in a quiet pint. BTW, Charlie, using millilitres might encourage more people to drink less. How daft is that? Just who is the Driver and Vehicle Licensing Agency looking to bump off? Know what I mean? I'm off to play "Chase the Lady". Thank crunchy it's Crimbo!
Apple's been gobbling up the laptop market for the past couple of years, seemingly based on the high quality of their product, part of which is their hardware (an especially important consumer perception now that they've gone Intel.)
If for no other reason than to maintain their reputation with their user base, once the current Macs begin to fail in a year or so, Apple is going to need to sue the living snot out of nVidia.
This will have a chilling effect on others considering using nVidia hardware, and may well kill the company off during the recession/depression...
You've come a long way, but I hate to inform you that semiconductor packaging / die system engineering is much more complicated than you make it seem.
The material set that Nvidia is using has been used with hundreds of millions of other devices. The problem is, Nvidia has an EXTREMELY large die. The material set they are using wasn't designed for use in a 500mm sq die, but a 50mm sq die. These are the largest dies made by the foundries which make their products, and this is probably the first time they've seen the problem.
Your terminology is also way off. The layers of the chip are called dielectric layers or inter level dielectric (ILD). The passivation is a single, uppermost dielectric layer (normally nitride). Most people cover it with a polyimide as you have discussed.
hey, its time to test the desktop parts
here is an idea: buy 2 nvidia gpus, then stress test one -for a few months- and keep the other unused. cut and compare the 2 using that bad ass microscope of yours!!! good work!!!
\m/(^.^)\m/
Is that you Drashek? I am pretty sure it is you because I understood nothing in your comment heh :P
Kudos to Charlie, he is one of the few decent people willing to do their job for real, without copying everyone else's stories on the interweb.
Charlie, You're a madman, yes this is true. but let me say, i'm a firm believer that madmen dont go off on tangents unless they know something! most people write off what these madmen have to say, calling them crazy and everything else, when in the end, these madmen were right on the mark, usually only lacking the means to convey their knowledge to others in a useful rational manner. but today, one madman figured it out.
I've been behind you on this 100% Charlie, from the word go, the proof is right under everyone's noses, like someone already said above, if you were wrong, nVidia would have sued your ass all the way to broke street and back to skid row, but they haven't, because such lawsuit would have brought out in court the very points you are making here, so they chose the silent route. Silence, my dear friends, says more than their words ever could.
My hat goes off to you Charlie Demerjian, you're a real man and a class act, not afraid to stand up to the big green monster. You've got balls the size of church bells, made of solid steel! don't let up on those bastards until they're on their knees crying like a baby.
Very informative and exceptional work Charlie! You've really done your homework and done some REAL investigative Journalism. If you keep this up the Inquirer might actually have a reputation to live up to. Sorry, edit that, GOOD reputation to live up to ;)
I also noticed that the Nvidia fanboi protests are getting pretty weak.
As for the hardcore gamer on a MAC... that was the best laugh I've had all day!
Cheers to you all, Seasons Greetings!
So there are no other chips, chip companies that have ever in their life had this problem?
I was shopping for a laptop for the kids two months ago and ended up with a Gateway. Why? The answer is: Charlie. The laptop I bought has an ATI 2600 Mobile 512MB card. Sure, it is great for games, but when I sometimes use it for something myself I hate it to the point of sickness. Again, why? Because it lacks extended resolutions beyond the LCD's native one... Though I understand this is down to the lack of good ATI Vista drivers, and in future they may start to surface. Or the laptop might die before I ever see them.
First, I’d like to say thank you very much for a well researched and informative article. I would also like to thank you for your efforts to inform us (the consumers) about problems with popular products. I too have had the misfortune of dealing with nVidia’s aberrant attitude towards the press, and as such, your revealing articles incite a sense of vindictive pleasure.
To those among your commenters who claim that your credibility is lowered by your persistent reporting on nVidia and AMD/ATI, I would like to give this food for thought:
In journalism of all forms, it is not at all uncommon for a given journalist to specialize. I am led to understand that in the United States, people actually compete for the opportunity to do little more than report on the goings on of the White House. Others report on environmental issues, Big Oil, or any of a number of different topics. Many of these journalists report exclusively on their chosen topic for a large amount of, or indeed the entirety of, their professional careers. These are professional journalists who live, sleep, eat and breathe their branch of journalism their entire lives.
I am uncertain how the economics of being a journalist for an online magazine such as The Inquirer or The Register works, but in my understanding, with only a few exceptions, most “staff” at these online magazines have day jobs. Their reporting is something done on the side for whatever reasons matter to them. (Money, the desire to get published, a desire to inform people about whatever they have information about.)
Charlie has stated in this article that he has some fairly extensive academic and professional experience, and I am assuming at least one job-worthy degree to back that up. I admit it is an assumption on my part that Charlie makes use of that (probably very expensive) education in a nice environment that provides him with a steady (and I hope substantial) paycheque. It follows from that assumption that he writes for the Inq in his “spare time,” and does his investigation and checking with various sources on a similar schedule.
Given his stated academic and professional experience, it seems to me that there would have been a lot of opportunity to “network” with people who now work for semiconductor companies of various sorts. Given his nearly exclusive reporting on nVidia, ATI and AMD, one could make a reasonable assumption that he “knows people” inside one or more of those organizations.
That all said, there is thus at least one perfectly logical explanation why someone like Charlie could be writing for the Inq, and focusing almost exclusively on a narrow group of companies: he has some inside “poop,” and enough other things in his life to not write about everything else too. As to the “hate” his articles direct towards nVidia, I ask you: please, try to deal with that company as something other than a consumer before passing judgment. They are dicks.
Charlie, as a conclusion to this, please consider writing a brief “hi, I’m Charlie, and here’s why I write what I write at the Inq” article. It will never shut up the hardcore fanbois, but I am very curious to know if my wild guesses above are right!
Thank you once more for your excellent article, I only wish it had been published before I purchased my latest “gaming” notebook. I am rather dubious about the life expectancy of something with an 8600M…
Keep ‘em coming, sir.
"Ask any real gamer who makes the best graphics cards and they will tell you its Nvidia...I bet they tell you this because the products work."
Real gamers make graphics cards now? :o I didn't know that. Must contact DAAMIT now I suppose. I'm a real gamer. I must make the best graphic cards.
On another note, nV has the best graphic cards? Huh? I think you ought to use the internets a little bit. nV stopped being the best in ... what... July 2008? And had their 300$ card creamed by a 200$ part?
Re: Phil - "I'm a bit more convinced about this article than Charlie's previous rants.... - but it kindof negates the previous 'bad bumps' rant/article, as Charlie is finally admitting that the Bump aren't bad at all - and that nVidia are using High Tg underfill - and that the High-Lead + High Tg materials set is a workable solution (as used by Intel) - but (according to Intel) it should have a polyimide layer to distribute the stress better." ----- What Charlie said is that NV's FIX FOR THE BAD BUMPS may be as bad or worse - leading to delamination. And what do you mean "finally admitting"? He did no such thing - and only fanbois are waiting breathlessly for him to do so...
GREAT JOB CHARLIE!!!
Peace Out homies!
Thanks for the greatest hardware soap opera ever! LOL
Lets check my recent purchases for friends/family.
780g.... check
hd4870...check
Reason? Charlie!
Nvidia, time to put up or stfu! You have been pwned by a journalist! Wish I was a fly on the wall to hear all the expletives being outed in Charlies direction! Charlie, watch your back..... may be cheaper to out you, than fix their chips! LOL
I would like to know: does this affect or exclude the 9300 IGP and the 9400 IGP? Because these are really nice HTPC/CarPC microATX boards, and I need to know before investing in a whole motherboard and new Intel chip to go along with it. Otherwise I'll just have to suck it up and go with AMD/ATI, which the 9300 kills atm; still waiting on AMD/ATI to come up with an IGP to beat the 9300 IGP.
he's mad that NV is launching a dual chip card that charlie says was not possible...
"Ask any real gamer who makes the best graphics cards and they will tell you its Nvidia...I bet they tell you this because the products work."
You are using a Mac and call yourself a hardcore gamer... does playing World of Warcraft now warrant someone calling themself a hardcore gamer? Do Macs even have Counterstrike/Cod4 or any other half way decent FPS... I've not been down with gaming for over a year but as far as PC based gaming went the bar was always set by the FPS titles. Hell do you even need a powerful graphics card to run Wow?
Thanks again Charlie, I've owned a 8800GTS for almost a year now and it hasn't failed on me yet. The Asus Striker Extreme though is always a constant worry due to their high failure rate (I got it cheap from ebay and have been lucky so far... my friend has had to return 2 of them and he's not the only one). nVidia really have lost my faith in them with their last generation or so of their product lines and I regret now opting for the nforce 680i and geforce 8 combo.
Now if AMD/ATI can just get the price performance together on their phenom IIs I would happily switch over from my current setup to a PhenomII/790FX/4870 etc.
Good reporting Charlie. Sometimes I think your reporting on Nvidia is a bit harsh but here I think you have been performing a valuable service to the IT community.
Well, I've been burned twice by the bad bump issue. Got an HP Pavilion DV9000 laptop in June 2007. Dead in March 2008. Beep code suggests it's video card (Nvidia 7600 go). Sent back to HP (who in the meantime issued a warranty extension of 1 more year on said laptop for this specific reason and put out a new bios to use fans more and drop my battery time even more).
Guess what: the new system board and new bios worked worse. It lasted from May, when I got it back, to early December. 9 months to die since new and 6 months to die with new board and new bios.
Although I suspect HP's replacement board was not new, just an old one with resoldered chips. (I suspect that since around the southbridge there is a red glue-like substance.)
I could have it replaced since the extended warranty covers it, but I am thinking why bother. To have it fail on me in 3 months now? I'd rather buy something from an HP and nvidia competitor. Be it with intel or AMD graphics. Good thing that my last nvidia desktop card was the TNT back in 98. never touched them since until my laptop and now I know why.
If 'bad bumps' is taken as a catchy euphemism for 'bad material set', then 'bad bumps' is a fair journalistic shorthand. The previous article basically says, "High Lead Bumps = Bad", which isn't true, as it's only 33% of the story (and many companies, including Intel use High Lead). This article says "High Lead Bumps + High Tg, with no PI Layer = disaster". Is that true? - I've no idea. But if it is true, then Charlie will, indeed, have done a good job in highlighting nVidia's less-than-perfect solution to their "Bad Bump" problem. If it isn't true, then Charlie is wasting everybody's time. Personally, I suspect that the solution that nVidia has used here is likely to be somewhat better than that which they had before - just because my gut feeling is that nVidia wouldn't fix a problem with a worse one. Charlie obviously thinks different!
Generally, I'm a fan of Charlie's reporting, but I think he's a little one-sided at the moment....
This is what happens when PR people make the decisions instead of engineers.
We were all so wrong to say you can't be a Hardcore Mac Gamer! I found a list of some HARDCORE games, just for MAC! From the looks of it, The Sims 2 seems like the hardcore game of choice. Followed by QuakeWars.... o and coming soon... COD4: Modern warfare.... I wonder how full those MP servers will be with all those HC Mac gamers. Do mac gamers play with the one button mouse or that douche mouse pad.
http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=2083340580&page=1&bop=And&Pagesize=100
Charlie;
Nice work. We did stuff like that every day in my 16 years in the Intel Electron Microscopy Lab in Santa Clara. I don't recall ANYTHING Intel ever made that skipped the polyimide layer. Even flash memory chips that don't face heat and use wire bonding instead of bumps. I'm astounded.
Keep up the good work!
Please tell me why I have 9 dead cards in my house, 7 ATI and 2 nVidia.
Why my 5 Radeon 1950xtx died after 16 months, all in the same month.
Why the 10 8800GTX I bought the same week are working.
The other dead cards are Geforce FX 5500, Geforce 6200, laptop Radeon x700 and laptop Radeon x600.
I have 50 computers doing lan gaming, same work, same room, same motherboards, same power supplies, only different CPU RAM and GPU.
You have to follow the story before you post. This problem is only significant on certain models suspected of failure. many batches of 8600's and 9600's are verified bad at this time. No other series are VERIFIED to be defective. We can only speculate that they might be latently affected due to similar designs. Other than that, you can read the public statements, apple's RADAR and support bulletin reports, on top of what Anandtech posted from internal press releases to form your own conclusions. I have several Nvidia cards that died, but I'm sure most of them were not related to this particular problem.
I just lost an MSI factory overclocked 3870x2 bought in February 2008. The funny thing is, the old built by ATI Radeon 9800 Pro survived until I finally ditched AGP and went PCIe. It seems some ATI partners didn't get the resistors right on enough 3xxx series cards that there were problems. Not design problems of the caliber discussed in this expose of Nvidia, but a dead card's a dead card. Nothing should die under normal use after 10 months; not motherboards, not graphics cards. Thankfully, the HD3200's helped me survive until I get the card back. It can at least play LOTRO well.
So, did Nvidia screw up? Yes, big time, but ATI's had problems with partners since the AMD purchase. I always went "built by ATI" and had no problems, but the minute I went with partners, an X1600 died in one PC and a 3870x2 in another, both within a year. At least the first card didn't cost that much.
I've got to admit, I've bought 3 sets of nVidia graphics cards in my days as a gamer, because at the time they were the fastest cards I could afford. 2x8800GTS, 6800GT and Ti4200. The latter two, with massive overclocking, did 3 years with no problems, though the aftermarket fan on the Ti4200 started to go just before I replaced it (whirrs a bit)
I'm dubious to overclock my 8800s, i really don't trust there to be no issues with them currently.
If i bought a high end card today, even looking at the new GTX295 i'd buy a 4870 or X2.
Who knows if they really are a hardcore gamer or not, but they didn't claim to be gaming on the Macbook did they.
Why, I am wondering, do these many teardown houses have access to chips that are barely just fabbed, and all that (still very expensive) test equipment - but not a bunch of Macbooks to stress test and see how long it takes them to actually fail? OK, that isn't what they do - but if such tests were done it'd find out for sure.
Saying that, laptops overall are still in an awkward stage of having way too many models available, with not easy options of being able to replace parts. People expect them to run the same as larger computers though, which obviously can dissipate heat a lot better.
Then try finding out which of the many lappys your local electrical retailer or supermarket is selling would be compatible with a linux install - chances are most of the models show up: no information available on that one. Take a 300-odd quid gamble that the LAN and wireless and graphics etc will run.
But anyway, to get back to the article - how other than a standardised Macbook testing could it be determined for sure how bad the failure rate will be? What is considered a tolerable fail rate - and can that be compared to same-similar tests done on lappys with different graphics cards or chips in?
Macs are popular for DTP, video edits, photo-related 2D work; there must be some studios or freelancers using them that can chime in here how reliable they've been or not.
And if the same chips (same type of fab process I mean) are going in other machines? - are any of them gamer-targeted laptops? They'd use the videocard part the most - but, I always wonder when asking people their opinion on tech they bought - yeah but, do you have food bits on your hands when you type? do you leave it near radiators ever? have you dropped it or knocked it against anything ever? - and so forth.
eg - everyone at my local video rental store for example seems unable to watch a DVD without scratching it to bits and leaving fingerprints all over it. They've even started giving out brand new discs that look like badly used second hand discs. Just to stress the point that tests done outwith testing environments do not reveal much unless you know how the product's been treated other than that.
re: the title & username - if it wasn't for bad babylon-industry ways of doing things, then actual facts about the likes of industrial processes would be way easier to get a hold of. Real people don't like to do a bad job.
Or put another way - it's useless to have spokespeople talking about the ins and outs of something they weren't directly involved with. Even if they are not lying, they won't know if they are or not! Is also why politics is a bunch of crap.
A bit late, but thanks, excellent article. I assume Charlie might be wrong about the details, or even be biased.
But why does NVidia not refute it with the same level of technical detail? Is it not true that they are trying to hide the whole story? Why do they not sue for libel?
Based on that, I am confident there is a problem, and NVidia must tell which models are affected.
Until then I would not buy NVidia product.
It is true that ATI cards may fail - let somebody investigate for possible reasons and write an article - I would read it with the same sincere interest.
I think it is great that Charlie is willing to dig this deep. What is not as clear is if he is correct in his conclusion. Anyone with a failed Nvidia device that has the new fix, should send it to Charlie to take a look at.
Those of you who are overclocking chips need to make sure that you are not exceeding the temperature specs for the graphics chips, and if you are increasing the voltage to the part you should expect early failures.
There are of course many reasons for a part to fail.
Good work, Charlie. I enjoyed seeing some electron microscope images and reading about the details and dependencies of the fabrication process. I am wondering if you can do that more often? Thanks!