Remember, son, many a good story has been ruined by over verification - James Gordon Bennett
SOME OF OUR READERS seem to love our rants, and the more technical the better. One of these covers the basic floating point computation capabilities of today's CPUs and GPUs.
If you remember the early eighties, Intel came out with the first floating
point coprocessor, 8087, for its 8086 chip. While that thing ran at only up to
8MHz at that time, and took dozens of cycles for a simple add - even 1 MFLOPs
was like a pipe dream then - it brought in some important news and improvements
to an otherwise disastrous X86 architecture.
More precise, please...
A choice of 32-bit single or 64-bit double FP precision according to the IEEE 754 standard, with all the rounding and normalising stuff, was a big plus by itself - the 15 digits of precision and exponent range above +- 300 meant a lot to the mathematicians. On top of it, Intel also defined an 80-bit extended FP precision, also a part of IEEE 754, with all 80x87 ALUs and registers supporting it natively till this very day.
Yes, that 80 bit mode would bring only a few extra digits of precision - and a truly huge exponent jump too. But, for a long while, only X86 and now long gone Motorola 680x0 series had it hardwired in their FPUs.
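For the curious, here is a rough sketch of where those digit counts and exponent ranges come from. The significand widths and maximum exponents in the table are the commonly quoted values for each format, and the helper names are made up for illustration, so treat it as back-of-envelope arithmetic rather than a reading of the standard.

```python
import math
import sys

# (name, significand precision p in bits including the leading bit,
#  maximum binary exponent emax). Values are the commonly quoted ones.
FORMATS = [
    ("binary32 (single)", 24, 127),
    ("binary64 (double)", 53, 1023),
    ("x87 extended (80-bit)", 64, 16383),
    ("binary128 (quad)", 113, 16383),
]

def decimal_digits(p):
    """Decimal digits that always survive a round trip through p bits."""
    return math.floor((p - 1) * math.log10(2))

def max_decimal_exponent(emax):
    """Decimal exponent of the largest finite value, roughly 2**(emax + 1)."""
    return math.floor((emax + 1) * math.log10(2))

def summary():
    """Map each format name to (guaranteed digits, max decimal exponent)."""
    return {name: (decimal_digits(p), max_decimal_exponent(emax))
            for name, p, emax in FORMATS}

if __name__ == "__main__":
    for name, (digits, exp10) in summary().items():
        print(f"{name:24s} ~{digits} digits, up to ~1e{exp10}")
    # Cross-check the double row against what this Python build reports.
    print(sys.float_info.dig, sys.float_info.max_10_exp)
```

On this arithmetic the 80-bit format adds about three guaranteed digits over double, while its 15-bit exponent field is what produces the huge jump in range.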
The uber-fast 64-bit RISCs like Alpha, MIPS or Power stuck with 64-bit hardware FP as the end of it. So did the SSE1-2-3-4, as well as its AMD predecessor, 3DNow!
Explosive FP problems...
Interestingly, FP precision, rounding, normalisation, exceptions and other nerdy-sounding things can have horrible real world consequences if they are not handled properly. Remember the June 2006 launch of the French Ariane 5 in French Guiana? The one big rocket that exploded soon after takeoff, firing up a loss of a billion dollars?
Prof William Kahan at University of California at Berkeley - the co-architect of the 8087, a 40,000-transistor chip that, in those early days, packed a pretty much complete math library in hardware - narrowed down the cause of the horrendous crash to the programming language mishandling a floating point exception in the rocket software. Here's what happened.
At the launch, sensors reported acceleration so strong that it caused a float-to-integer conversion overflow in software recalibrating the rocket's inertial guidance while on the launching pad. The software triggered a system diagnostic that dumped its debug data into an area of memory being used by the programs guiding the rocket's motors. At the same time, control was switched to a backup computer that unfortunately relied on that very same data.
This was misinterpreted as requiring major immediate correction, and the rocket's motors swivelled to the limits of their mountings. Next moment, the rocket turned into an unplanned fireball. Had overflow followed the IEEE 754 FP standard by default policy, the recalibration software would have delivered an invalid result to be ignored by the motor guidance programs, and the Ariane 5 would have continued as intended - or maybe, the French wanted to define their own "unique" FP standard?
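To make the difference in policies concrete, here is a minimal sketch of three ways a float-to-integer conversion can deal with a value that does not fit. The 16-bit target width is an assumption (the article does not give it), and the function names are hypothetical. None of this is the actual flight code, which was not written in Python.

```python
import math

# Assumed target range: a signed 16-bit integer.
INT16_MIN, INT16_MAX = -32768, 32767

def to_int16_trapping(x):
    """Unprotected conversion: an out-of-range operand raises an error."""
    n = int(x)  # truncates toward zero; raises on NaN or infinity
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{x!r} does not fit in 16 bits")
    return n

def to_int16_flagged(x):
    """Non-trapping conversion: return (value, ok) and let the caller decide."""
    if math.isnan(x) or math.isinf(x):
        return 0, False
    n = int(x)
    if not INT16_MIN <= n <= INT16_MAX:
        return 0, False
    return n, True

def to_int16_saturating(x):
    """Clamp out-of-range values to the nearest representable limit."""
    if math.isnan(x):
        return 0
    return int(max(INT16_MIN, min(INT16_MAX, x)))
```

The flagged variant is in the spirit of the non-trapping default described above: the caller receives a result marked as invalid and can simply ignore it, instead of the whole program being interrupted.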
But Anyroad
Anyway, the new IEEE 754R standard is now being finalised, after many years of 'discussions' - read infighting between opposing camps. Besides the decimal numbers and various fixes to the existing stuff, one very important addition is standard 128-bit quad precision (QP) floating point, a feature I like in particular.
While it won't help in spreadsheets or financial modeling, 128-bit FP with its extreme 33+ digit precision and very high exponents allows a humongous number range, literally approaching this universe's number of atoms. It could serve anything from very precise material simulations to ultrafine bomb explosion estimations - imagine future gaming physics where a projectile demolishes a building with every brick piece and concrete chunk falling apart precisely according to natural laws down to every millimetre!
How far are we from it? Well, the standard is approaching ratification. What about the hardware? Well, SSE3/4 units in the recent Intel and AMD CPUs are already 128 bits wide with 128-bit execution units, registers and data paths. So, not that much work may be required to get at least "hardware-assisted software" runs of standardised 128-bit FP on these PCs, even if with quite a performance penalty. IBM Power6+ is, of course, expected to have it too - they already can do the decimals natively.
The upcoming ATI R700 and the D9E top end of Nvidia 9000-series GPUs are expected to have both true 64-bit IEEE 754 DP FP, and, quite possibly, hardware assists for QP FP too. After all, their zillions of 64-bit execution units can be ganged up together to handle 128-bit values too.
Finally, both Nehalem next year and, if AMD survives in this shape till end 2009, Bulldozer CPU generation a year later are expected to include full IEEE 754R spec, including the 128-bit binary QP FP.
Or, at least, I hope so - what do you think, readers? Is it time for the 128-bit FP on the "mainstream" X86? µ
Jonathan Harris seems to know more about this than I do, but just from reading the article I could tell it was totally off the mark. If the issue was that they weren't checking that their floating point numbers were in the valid range of the integers they were converting to, then floating point precision is totally irrelevant. Indeed, if Jonathan is right, they should have documented any "unsafe" speed hacks they implemented for later reevaluation.
Ariane 501 crashed in 1996, not 2006. An IEEE 754 64 bit floating point number can already represent numbers up to 10^300, which is MUCH greater than the estimated number of electrons in the universe, 10^80. The SSE registers on Intel chips are 128 bits wide to support SIMD instructions which operate on several smaller numbers at once. It's not clear that two 64 bit floating point units can be "ganged up" to work on a 128 bit number.

This is a questionable article.
It is June 1996, not 2006.
Also see http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
Ariane 5 did not crash because 'the programming lingo mishandling the floating point exception' or because 'the French wanted to define their own "unique" FP standard' (you're such a humorist). IIRC the Ariane code was and is standard Ada running on off-the-shelf milspec hardware.

The misbehaving software routine was inherited from Ariane 4. On Ariane 4 the engineers determined that the float-to-integer conversion could never generate an overflow exception, so they removed the exception handling code in order to save CPU cycles (this is hard real time software - they needed every cycle).

No-one revisited this decision or re-tested the routine when it was re-used on Ariane 5, where this float-to-integer conversion could and did generate an exception. The ironic thing is that the routine in question should not have even been running at that phase of Ariane 5's flight.

So Ariane 5 crashed because of a litany of specification, documentation and test errors and omissions.

Please do more research next time. Or get someone who understands this stuff to write the article instead.
Although I'm sure the rocket was destroyed as stated above, you can't blame the FP precision, or lack thereof, for the crash. Any good mathematician calculates an error term to limit the degree of relative precision. I'm sure proper logic and avoidance of common FP pitfalls (i.e. no small numbers with big numbers) would have avoided this. Even more, better error handling would have prevented this.

PS dumping memory is not error handling in a mission critical system.
The number of people I've met who have said "so what, 32-bit FP is fine... why would anyone need more precision" just scares me. They don't seem to understand how math on computers works. If we some day hope to be building space elevators, or more relevantly, super fast planes and rockets (as this article so nicely illustrates), we had better be as accurate as possible, because when you are shooting a little rocket a couple billion miles away that 10e-32 error will start to add up when you need to hit some atmosphere at just the right angle.
> imagine future gaming physics where a
> projectile demolishes a building with every
> brick piece and concrete chunk falling apart
> precisely according to natural laws down to
> every millimetre!

We can do this with current 32-bit or 64-bit floating point. Why would we want to have 128-bit FP if there is not enough horsepower to do it?
In the time they spent infighting, they could probably have generalized and extrapolated their standard so that future 256, 512, 1024, AND SO ON..., would have been already pre-defined.

Guys,

OK, apologies - the Friday nite Singapore time booze hit too hard. The unintended fireworks year was 1996, not 2006.

Now, as for why that FP problem was perceived as one of the reasons, here is the link to Prof Kahan's statement:

http://www.intel.com/standards/floatingpoint.pdf

Have fun reading it!
Nova
The official report is found at:

http://esamultimedia.esa.int/docs/esa-x-1819eng.pdf

I am not sure that the Intel propaganda tells the real story

> An IEEE 754 64 bit floating point number can already represent numbers up to 10^300, which is MUCH greater than the estimated number of electrons in the universe, 10^80.

That's true. Also, to convert 10^300 to an integer without getting an exception, you'd need 997 bits of integer precision.
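Both figures are easy to check with exact integer arithmetic. This is just a quick sketch; the constant and function names are made up for illustration.

```python
import sys

# Estimate quoted in the comment above.
ELECTRON_ESTIMATE = 10**80

# Largest finite IEEE 754 double on this platform, about 1.8e308.
largest_double = sys.float_info.max

def bits_needed(n):
    """Bits required to hold the non-negative integer n exactly."""
    return n.bit_length()

if __name__ == "__main__":
    print(largest_double > ELECTRON_ESTIMATE)
    print(bits_needed(10**300))
```

That count is for the magnitude only; a signed integer type would need one more bit.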
While floating point standards are really important, Ariane 5's compliance (or not) did not cause the crash. EOF
Jeffy .. the new IEEE 754 draft does indeed define formats way beyond 128 bits. Check it out...
Funny how any technological advance always prompts some people to deny any possible usefulness (640KB anyone ?).
The great thing is that technology advances, unperturbed by these negationists.
As far as I am concerned, we need every bit of additional power we can get, right up to the day that the bottom range of computers is capable of doing a fluid full HD in 24000 x 18000 and that, whatever the amount of stuff on screen.
Give me 1024-bit FP, 1024 gbps bandwidth and 1024MB of DDR12 RAM. If that is what it takes to be able to play a game where shells make holes in buildings and in the ground, where tanks can ram into walls and punch holes in them, or run over trees and make them fall, well that is what I want.
I want a realistic game universe. Not real, realistic. And invulnerable walls and trees seriously harm that realism. Aircraft carriers that don't move an inch seriously impede the sensation. Trees that do not sway in the wind detract from the experience.
So don't tell me what we don't need until games that do the above are at $5 in the bargain bin. Until then, we very much need any advance we can get.
And it did not happen in June 2006, it was June 1996. So much for "news".
The accounts here (and elsewhere) seem to show that the role of f-p in the Ariane crash was incidental rather than central: a more central thing seems to have been the Ada language compiler's property of causing memory-dumping in response to an exception.
The deal with needing 128-bit FP for games is probably wrong unless you're going to be travelling light-years from the origin before moving small-scale objects around (a possibility for space games). Otherwise, while still on Earth, 64-bit is OK.

32-bit, on the other hand, is very much NOT OK in quite a few situations with large maps. I think it has precision roughly 1 part in 2^24 or 16 million. The problem comes with positions being incremented "per frame". If you're a few km from the origin, your granularity becomes maybe 1/4mm, probably just about accurate enough for most things, but if the game is animating at 100fps that means your velocity could end up quantised to 1/4mm/frame = 25mm/second. That actually becomes quite noticeable and gets worse as the fps goes up (provided it's not locked) and when you consider accelerations.
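A small sketch of that arithmetic, assuming positions are stored in metres as 32-bit floats and the object sits 4 km from the origin (both are assumptions for illustration, as are the function names). It only handles positive finite values.

```python
import struct

def float32_spacing(x):
    """Gap between x (rounded to float32) and the next float32 above it."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    here = struct.unpack("<f", struct.pack("<I", bits))[0]
    above = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return above - here

def velocity_step(position_m, fps):
    """Smallest non-zero speed in m/s when position moves one spacing per frame."""
    return float32_spacing(position_m) * fps

if __name__ == "__main__":
    print(float32_spacing(4000.0) * 1000, "mm per step")
    print(velocity_step(4000.0, 100) * 1000, "mm/s per step")
```

At 4000 m the spacing works out to 2^-12 m, a little under a quarter of a millimetre, which matches the estimate above.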

The 128bit standard appeals because of certain "dirty but nice" mathematical operations (multiple numerical differentiation for instance) that cut your precision down to 1/3 or 1/4 the number of digits you started with due to taking fairly small differences. Yes, they're not really recommended but in some situations they would be the fastest way if only the hardware supported large precisions.
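To give a feel for the digit loss, here is a rough sketch using a central-difference second derivative in ordinary 64-bit doubles. The test function (exp at x = 1) and the step sizes are arbitrary choices for illustration.

```python
import math

def second_derivative(f, x, h):
    """Central-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def relative_errors(steps=(1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7)):
    """Relative error of the estimate for exp at x = 1, where f'' equals e."""
    exact = math.e
    return {h: abs(second_derivative(math.exp, 1.0, h) - exact) / exact
            for h in steps}

if __name__ == "__main__":
    for h, err in relative_errors().items():
        print(f"h = {h:g}  relative error = {err:.2e}")
```

Shrinking the step first improves the estimate, then cancellation in the small difference takes over and the error grows again, so the result never gets close to the 15 or so digits a double starts with.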
The various comments on the Ariane 5 crash get the details right, but miss the real, underlying cause...
First, the computer (and software) used for flight control in the Ariane 5 was developed as part of an improvement program for the Ariane 4. No surprise, all the design decisions, including when to stop updating the inertial guidance portion, were made based on the Ariane 4 requirements. Then the decision was made to reuse the computer (and software) in Ariane 5.
However, when the developers, working at an English company, asked to see the Ariane 5 requirements, they were refused. The French were concerned that this would give the company an advantage in bidding on Ariane 5 related work.
The details of the cause of the failure in the report sort of try to mask this problem. The Ariane 4 flight control software was reused without change in the Ariane 5! In particular constants like the moment of inertia of the stack, the maximum permitted deflection of the engines and so on were incorrect, since they were never updated.
Yes, the actual trigger was a floating point to integer overflow that was unhandled. But any wind shear, like the one it was about to hit, could have caused the same end train of destruction: the engines deflected too far causing the stack (the combined stages) to fall apart. The Ariane 4 stack was more sturdy, but the engines were much less powerful, leading to not only a wider deflection range, but more deflection being needed to stay on course.
When we were developing Ada, this issue was usually referred to as: "An M-2 tank is not an M-1 tank." The point being that even if you could reuse lots of the software without change, you still had to review the software against the new requirements. This was the cost that Ariane tried to avoid on the Ariane 5 project, with predictable results.
If this had been done, not only would the offending module have been shut down at T=0, not T+40, but about two dozen major flight parameters would have been changed. The surprise is not that the first Ariane 5 exploded, but that it got as far as it did.