Comments
The Ariane 5 crash cause...

The various comments on the Ariane 5 crash get the details right, but miss the real, underlying cause...

First, the computer (and software) used for flight control in the Ariane 5 was developed as part of an improvement program for the Ariane 4. No surprise, all the design decisions, including when to stop updating the inertial guidance portion, were made based on the Ariane 4 requirements. Then the decision was made to reuse the computer (and software) in Ariane 5.

However, when the developers, working at an English company, asked to see the Ariane 5 requirements, they were refused. The French were concerned that this would give the company an advantage in bidding on Ariane 5 related work.

The details of the cause of the failure in the report sort of try to mask this problem. The Ariane 4 flight control software was reused without change in the Ariane 5! In particular, constants like the moment of inertia of the stack, the maximum permitted deflection of the engines and so on were incorrect, since they were never updated.

Yes, the actual trigger was a floating point to integer overflow that was unhandled. But any wind shear, like the one it was about to hit, could have caused the same chain of destruction: the engines deflected too far, causing the stack (the combined stages) to fall apart. The Ariane 4 stack was sturdier, but the engines were much less powerful, leading to not only a wider deflection range, but more deflection being needed to stay on course.

When we were developing Ada, this issue was usually referred to as: "An M-2 tank is not an M-1 tank." The point being that even if you could reuse lots of the software without change, you still had to review the software against the new requirements. This was the cost that Ariane tried to avoid on the Ariane 5 project, with predictable results.

If this had been done, not only would the offending module have been shut down at T=0, not T+40, but about two dozen major flight parameters would have been changed. The surprise is not that the first Ariane 5 exploded, but that it got as far as it did.

posted by : Robert I. Eachus, 15 April 2009
128-bit FP

The deal with needing 128-bit FP for games is probably wrong unless you're going to be travelling light-years from the origin before moving small-scale objects around (a possibility for space games). Otherwise, while still on Earth, 64-bit is OK.

32-bit, on the other hand, is very much NOT OK in quite a few situations with large maps. I think it has precision roughly 1 part in 2^24 or 16 million. The problem comes with positions being incremented "per frame". If you're a few km from the origin, your granularity becomes maybe 1/4mm, probably just about accurate enough for most things, but if the game is animating at 100fps that means your velocity could end up quantised to 1/4mm/frame = 25mm/second. That actually becomes quite noticeable and gets worse as the fps goes up (provided it's not locked) and when you consider accelerations.
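
The granularity estimate above can be checked with a short sketch. It assumes IEEE 754 single precision and a position of 4 km from the origin (one reading of "a few km"). `float32_spacing` is a hypothetical helper written for this illustration, not part of any game engine.

```python
import struct

def float32_spacing(x: float) -> float:
    """Gap between x (rounded to float32) and the next float32 above it.

    Valid for positive, finite x well below the float32 maximum.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]        # raw float32 bit pattern
    cur = struct.unpack("<f", struct.pack("<I", bits))[0]      # x as stored in float32
    nxt = struct.unpack("<f", struct.pack("<I", bits + 1))[0]  # next representable value
    return nxt - cur

position_m = 4000.0   # assumed distance from the origin, in metres
fps = 100             # frame rate used in the estimate above

step_m = float32_spacing(position_m)            # smallest position change per frame
velocity_quantum_mm_s = step_m * fps * 1000.0   # smallest non-zero speed at that frame rate

print(f"position step: {step_m * 1000.0:.3f} mm")
print(f"velocity step: {velocity_quantum_mm_s:.1f} mm/s")
```

At 4 km this gives a step of about 0.24 mm and a velocity step of about 24 mm/s at 100fps, in line with the rough figures above. The step doubles each time the distance from the origin crosses a power of two.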

The 128-bit standard appeals because of certain "dirty but nice" mathematical operations (multiple numerical differentiation for instance) that cut your precision down to 1/3 or 1/4 the number of digits you started with due to taking fairly small differences. Yes, they're not really recommended but in some situations they would be the fastest way if only the hardware supported large precisions.
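
A rough illustration of that precision loss, using central differences in ordinary 64-bit floats. The test function (`sin`), the point `x = 1` and the step `h = 1e-5` are choices made for this sketch only, and `d1` and `d3` are hypothetical helper names.

```python
import math

def d1(f, x, h):
    """Central-difference estimate of the first derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def d3(f, x, h):
    """Central-difference estimate of the third derivative."""
    return (f(x + 2*h) - 2.0*f(x + h) + 2.0*f(x - h) - f(x - 2*h)) / (2.0 * h**3)

x, h = 1.0, 1e-5
err1 = abs(d1(math.sin, x, h) - math.cos(x))      # exact first derivative is cos(x)
err3 = abs(d3(math.sin, x, h) - (-math.cos(x)))   # exact third derivative is -cos(x)

print(f"first-derivative error: {err1:.2e}")
print(f"third-derivative error: {err3:.2e}")
```

With the same step, the third derivative divides rounding noise of roughly 1e-16 by h^3 instead of h, so most of the digits are gone. A wider format would leave more digits to spare before the subtraction eats them.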

posted by : Stephen Brooks, 05 December 2007
Ada not f-p

The accounts here (and elsewhere) seem to show that the role of f-p in the Ariane crash was incidental rather than central: a more central thing seems to have been the Ada language compiler's property of causing memory-dumping in response to an exception.

posted by : NaN2, 03 December 2007
Right: crap journalism

And it did not happen in June 2006, it was June 1996. So much for "news".

posted by : Tjeerd, 03 December 2007
RE: We don't need 128FP for good games

Funny how any technological advance always prompts some people to deny any possible usefulness (640KB, anyone?).
The great thing is that technology advances, unperturbed by these negationists.
As far as I am concerned, we need every bit of additional power we can get, right up to the day that the bottom range of computers is capable of fluid full HD at 24000 x 18000, whatever the amount of stuff on screen.
Give me 1024-bit FP, 1024 gbps bandwidth and 1024MB of DDR12 RAM. If that is what it takes to be able to play a game where shells make holes in buildings and in the ground, where tanks can ram into walls and punch holes in them, or run over trees and make them fall, well that is what I want.
I want a realistic game universe. Not real, realistic. And invulnerable walls and trees seriously harm that realism. Aircraft carriers that don't move an inch seriously impede the sensation. Trees that do not sway in the wind detract from the experience.
So don't tell me what we don't need until games that do the above are at $5 in the bargain bin. Until then, we very much need any advance we can get.

posted by : Pascal Monett, 03 December 2007
Jeffy and research

Jeffy .. the new IEEE 754 draft does indeed define formats way beyond 128 bits. Check it out...

posted by : John Coll., 01 December 2007
Conversion

> An IEEE 754 64 bit floating point number can already represent numbers up to 10^300, which is MUCH greater than the estimated number of electrons in the universe, 10^80.

That's true. Also, to convert 10^300 to an integer without getting an exception, you'd need 997 bits of integer precision.
While floating point standards are really important, Ariane 5's compliance (or not) did not cause the crash. EOF
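
The 997-bit figure can be checked with Python's arbitrary-precision integers. This is only a sketch: `unsigned_bits_needed` is a hypothetical helper, and it counts magnitude bits only, so a signed two's-complement integer would need one bit more.

```python
import sys

def unsigned_bits_needed(n: int) -> int:
    """Number of bits needed to hold the non-negative integer n (no sign bit)."""
    return n.bit_length()

big = 10**300

print(unsigned_bits_needed(big))          # bits for the magnitude alone
print(float(big) <= sys.float_info.max)   # 10^300 still fits in a 64-bit double
```

The point stands either way: the range of a 64-bit float vastly exceeds that of any fixed-width integer it is likely to be converted to, so the conversion needs a range check regardless of FP precision.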

posted by : Tactics, 01 December 2007
Original ESA Report

The official report is found at:

http://esamultimedia.esa.int/docs/esa-x-1819eng.pdf

I am not sure that the Intel propaganda tells the real story.


posted by : INF, 01 December 2007
Yeah year is 1996 not 2006

Guys,

OK, apologies - the Friday nite Singapore time booze hit too hard. The unintended fireworks year was 1996, not 2006.

Now, as for why that FP problem was perceived as one of the reasons, here is the link with Prof. Kahan's statement:

http://www.intel.com/standards/floatingpoint.pdf

Have fun reading it!
Nova

posted by : NaN, 01 December 2007
this article is error ridden

Ariane 501 crashed in 1996, not 2006. An IEEE 754 64 bit floating point number can already represent numbers up to 10^300, which is MUCH greater than the estimated number of electrons in the universe, 10^80. The SSE registers on Intel chips are 128 bits wide to support SIMD instructions which operate on several smaller numbers at once. It's not clear that two 64 bit floating point units can be "ganged up" to work on a 128 bit number.

This is a questionable article.

posted by : RSB, 01 December 2007
a bit of google/wikipedia would never hurt :)

It is June 1996, not 2006.
Also see http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html

posted by : evince, 01 December 2007
float-to-integer

Jonathan Harris seems to know more about this than I do, but just from reading the article I could tell it was totally off the mark. If the issue was that they weren't checking that their floating point numbers were in the valid range of the integers they were converting to, then floating point precision is totally irrelevant. Indeed, if Jonathan is right, they should have documented any "unsafe" speed hacks they implemented for later reevaluation.
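
A minimal sketch of the kind of range check being described. The thread does not give the integer width; the 16-bit signed target used here is taken from the inquiry board's report, which describes a 64-bit floating point value being converted to a 16-bit signed integer. The flight code was Ada, so the Python below is purely illustrative and both function names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def to_int16_checked(x: float) -> int:
    """Convert to a 16-bit signed integer, raising if x is out of range."""
    if x != x or not (INT16_MIN <= x <= INT16_MAX):   # x != x catches NaN
        raise OverflowError(f"{x!r} does not fit in int16")
    return int(x)   # truncates toward zero

def to_int16_saturating(x: float) -> int:
    """Convert to int16, clamping out-of-range values instead of raising."""
    if x != x:
        raise ValueError("NaN has no integer value")
    return int(max(INT16_MIN, min(INT16_MAX, x)))

print(to_int16_checked(1234.7))       # in range, converts normally
print(to_int16_saturating(40000.0))   # out of range, clamped to INT16_MAX
```

Neither variant depends on how many bits of precision the float carries. The check is purely about range, which is why a wider FP format would not have helped.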

posted by : bazald, 30 November 2007
Short-sighted

In the time they spent infighting, they could probably have generalized and extrapolated their standard so that future 256-, 512-, 1024-bit formats, AND SO ON, would already have been pre-defined.


posted by : Jeffy, 30 November 2007
finally

The number of people I've met who have said "so what, 32bit fp is fine.... why would anyone need more precision" just scares me. They don't seem to understand how math on computers works. If we some day hope to be building space elevators, or more relevantly, super fast planes and rockets (as this article so nicely illustrates), we had better be as accurate as possible, because when you are shooting a little rocket a couple billion miles away that 10e-32 error will start to add up when you need to hit some atmosphere at just the right angle.

posted by : Hyperion2010, 30 November 2007
We don't need 128FP for good games

> imagine future gaming physics where a
> projectile demolishes a building with every
> brick piece and concrete chunk falling apart
> precisely according to natural laws down to
> every millimetre!

We can do this with current 32-bit or 64-bit floating point. Why would we want to have 128FP if there is not enough horsepower to do it?

posted by : Javier, 30 November 2007
Use a Hammer

Although I'm sure the rocket was destroyed as stated above, you can't blame the FP precision, or lack thereof, for the crash. Any good mathematician calculates an error term to limit the degree of relative precision. I'm sure proper logic and avoidance of common FP pitfalls (i.e. no mixing small numbers with big numbers) would have avoided this. Even more, better error handling would have prevented this.

PS dumping memory is not error handling in a mission critical system.

posted by : schmide, 30 November 2007
Crap journalism

Ariane 5 did not crash because 'the programming lingo mishandling the floating point exception' or because 'the French wanted to define their own "unique" FP standard' (you're such a humorist). IIRC the Ariane code was and is standard Ada running on off-the-shelf milspec hardware.

The misbehaving software routine was inherited from Ariane 4. On Ariane 4 the engineers determined that the float-to-integer conversion could never generate an overflow exception, so they removed the exception handling code in order to save CPU cycles (this is hard real time software - they needed every cycle).

No-one revisited this decision or re-tested the routine when it was re-used on Ariane 5, where this float-to-integer conversion could and did generate an exception. The ironic thing is that the routine in question should not have even been running at that phase of Ariane 5's flight.

So Ariane 5 crashed because of a litany of specification, documentation and test errors and omissions.

Please do more research next time. Or get someone who understands this stuff to write the article instead.

posted by : Jonathan Harris, 30 November 2007

Floating point bugs cause rockets to explode
