The old adage 'Fight fire with fire' does not apply to non-metaphorical fires
The problem, it seems, is that if a DIMM shuffles off, the DIMMs need to be replaced in pairs or things will go bad again, or at least throw up a flag. Some of our more intelligent readers have traced it to a latency problem in some DIMMs IBM shipped. If you replace just one, you will probably get an error again very quickly.
On the upside, it isn't an error error, more of a correctable problem. A firmware flash should fix the screaming P-Series box in the corner. On the downside, the service processor is smart enough to remember the bad DIMM, so you can't reinsert it. A firmware flash should fix this as well, possibly explaining the two upgrades in short order.
The problem is that IBM technicians don't seem to be aware of this, or more likely IBM is threatening their children if they tell customers what the problem is. We wonder why they are trying so hard to silence what is most likely a minor but annoying series of failures. Needless to say, customers are not amused. µ