In the first few months of last year Microsoft spent about eight weeks in what was reportedly an intense effort to improve the security of its software. And what a joke that turned out to be, because within just a few months we were seeing security alerts about Microsoft products that had supposedly been thoroughly checked and corrected.
These statements of 2002 were not the first time that Microsoft has declared the problem solved and buffer overflow banished. Back in September 2001 Jim Allchin, a Microsoft vice president, declared that this problem had been stamped out in Windows XP. Supposedly Microsoft had made a complete code review of its operating system and removed all the buffers which could overflow.
Microsoft has had more than 15 years to get it right and it still cannot create a secure operating system. In fact in 2002 Windows had the dubious honour of accounting for 87% of all virus infections reported to the Australian office of the Sophos anti-virus group. This came on top of about 130 vulnerabilities that were reported for Windows during the year 2000, which is an average rate of more than one every three days.
Given this kind of track record from Microsoft I am quite surprised that in jurisdictions with strong consumer laws there has never been a class action against Microsoft for selling poor quality software. Other operating systems have achieved far better security and have done so since their very early releases, so why is Microsoft unable to?
As for secure operating systems, ask IBM users about the security of their operating systems prior to AIX which itself introduced the usual Unix problems. Or ask OpenVMS users about its security. Its bug list is still in the low double digits after about 30 major and minor versions in its 25 years, which is a sharp contrast to Microsoft's 130 problems in year 2000 alone!
OpenVMS is even more relevant to Microsoft because about 1989 it acquired about 20 software engineers from Digital's cancelled Prism project which was developing an operating system called Mica. These engineers were the designers for Microsoft's NT and borrowed a large number of concepts from OpenVMS, but unfortunately the security concepts were not included. Was it a matter of meeting release deadlines, potential breakage of other code or keeping third party software houses happy? We will probably never know.
Microsoft relies on the users to apply the stream of patches for Windows but many users are unaware of the patches or where to find them, and they are often reluctant to download large patches which can take hours over a dialup line. The frequency can be overwhelming and some users just ignore any problems that do not directly affect them. Microsoft's attitude seems to be so what if the virus mail bombs other users, so long as no damage happens to my system.
And wrapped around all this is the quite reasonable argument that if Microsoft cannot produce secure product releases then its ability to produce secure patches is just as suspect.
In recent years Microsoft has had the gall to receive an award for its security from the Department of Defense (perhaps the first award for "lowering the bar" in many years) and another award for the manner in which it created tools that allow users to automatically patch their software versions. It is simply beyond a joke.
In my opinion, the fundamental problem is that the basic architecture of Windows has two fatal flaws in its memory management and while these remain in the software the ad hoc patches will never be enough to make Windows a secure operating system.
Fundamental Problems with the Stack
The first problem is the same as that which has bugged the Unix world for many years, the notorious "buffer overflow problem".
This occurs when a program attempts to write data into a space that is not large enough for it. Within a routine there may be references to an array that actually point beyond the end-point of that array and point at some other data. Using this data at some point in the processing would be invalid, and writing new data into those memory locations would corrupt any data that exists there.
Character strings are particularly notorious for this, especially those where the length is not defined and is only discovered by finding a null character. If that last character was overwritten by other data, the end of the string would not be found and the processing would continue.
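A minimal sketch in C of how the missing-terminator case can be guarded against, assuming the caller knows the capacity of each buffer. The helper names bounded_length and copy_terminated are my own, purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the length of the string in buf without ever reading past
 * cap bytes. If no null character is found within cap bytes the result
 * is cap, which the caller can treat as "terminator missing". */
size_t bounded_length(const char *buf, size_t cap)
{
    assert(buf != NULL);
    const char *end = memchr(buf, '\0', cap);
    return end ? (size_t)(end - buf) : cap;
}

/* Copies src (at most src_cap bytes) into dst (capacity dst_cap) and
 * always terminates dst. Returns 0 on success, -1 if src was truncated. */
int copy_terminated(char *dst, size_t dst_cap,
                    const char *src, size_t src_cap)
{
    assert(dst != NULL && src != NULL && dst_cap > 0);
    size_t len = bounded_length(src, src_cap);
    int truncated = len >= dst_cap;
    if (truncated)
        len = dst_cap - 1;          /* leave room for the terminator */
    memcpy(dst, src, len);
    dst[len] = '\0';
    return truncated ? -1 : 0;
}
```

The point of the sketch is simply that the search for the null character is given an upper limit, so a string whose last character has been overwritten cannot send the processing off into neighbouring memory.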
These problems are bad enough when dealing with data in the one routine but when the data exists on the stack, it can cause very large problems.
The "stack" is a control structure which is used to manage the transfer of control to functions or other routines that are called by a program. (Some people confuse this with the "heap" which stores temporary variables, but the essential difference is that the stack contains control data as well as user data and is managed largely by the operating system, whereas the heap holds the user's temporary data, which is accessed by the program making calls to system routines.)
When a function is called, the arguments are written to the stack along with a pointer to the next instruction in the current code. On returning from that function the operating system uses the pointer to discover the next instruction to be executed.

The vulnerability in the stack exists because in C, C++ and a number of other languages, the value of the arguments is written onto the stack and the length of these data items is never checked. It is very simple for the code in the function to replace these argument values with values of greater length and in doing so corrupt other data on the stack, including the pointer to the next instruction.
It is as simple as passing an array or character string of a certain length to a function but within the function using it as a longer array or string by writing some additional bytes onto the end of it.
This problem is fundamental to any language and operating system where data of variable length is written to the stack and no check is ever made on the length of the data items being used.
Fortran 77 and a few other languages almost always write only the address of a data item to the stack and these addresses are of known length. I believe that there are also some operating systems that will monitor the length of the data items to ensure that no corruption can occur.
To take advantage of this vulnerability of stack corruption all that one has to do, by accident or design, is replace the execution pointer that was written onto the stack with a new value. On return from the function the operating system will access what it believes to be the stored program counter and proceed to execute the code at the new location.
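The mechanism can be illustrated with a toy model in C. This is only a sketch under my own assumptions: the "frame" is an ordinary byte array, the stored execution pointer is reduced to a single byte, and all names (toy_frame, toy_unchecked_write and so on) are hypothetical. It does not touch a real call stack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy layout: 8 bytes of argument buffer followed by one byte that
 * stands in for the stored execution pointer. */
enum { TOY_BUF = 8, TOY_RETURN_AT = TOY_BUF, TOY_FRAME = TOY_BUF + 1 };
enum { TOY_RETURN_TO_CALLER = 0, TOY_RETURN_TO_INJECTED = 1 };

typedef struct { unsigned char bytes[TOY_FRAME]; } toy_frame;

void toy_init(toy_frame *f)
{
    memset(f->bytes, 0, TOY_FRAME);
    f->bytes[TOY_RETURN_AT] = TOY_RETURN_TO_CALLER;
}

/* Models the careless routine: the length is never compared with the
 * size of the buffer. The assert only keeps the simulation inside its
 * own array; it is not a defence. */
void toy_unchecked_write(toy_frame *f, const unsigned char *data, size_t len)
{
    assert(len <= TOY_FRAME);
    memcpy(f->bytes, data, len);
}

/* The same write with the missing length check added. */
int toy_checked_write(toy_frame *f, const unsigned char *data, size_t len)
{
    if (len > TOY_BUF)
        return -1;                  /* would spill into control data */
    memcpy(f->bytes, data, len);
    return 0;
}

/* Where "control returns to" once the routine finishes. */
int toy_return_target(const toy_frame *f)
{
    return f->bytes[TOY_RETURN_AT];
}
```

Writing nine bytes through toy_unchecked_write replaces the return slot with whatever the ninth byte contains, whereas toy_checked_write refuses the same data and leaves the slot alone.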
One further point here is that even the simplest call of another function can cause the corruption of the stack. Email headers of great length have caused this corruption and even access strings for remote procedure calls.
Problem of Controlling Access to Memory
This problem with stack handling would be an irritation rather than a real danger if all it did was cause the software or the operating system to crash. Unfortunately the second problem turns this into a very nasty vulnerability, one that can permit viruses to execute and cause havoc.
This second problem is the crude manner in which Windows - and indeed some forms of Unix - fail to properly control access to memory. In both systems it is very easy to write data into memory and then execute it. In early versions of these operating systems there were chronic vulnerabilities that led to some very serious viruses and worms.
Windows XP has finally implemented some controls on the access to memory and according to Microsoft's web pages, it is the responsibility of programmers to specify one of the following values when allocating or protecting a page in memory.
PAGE_EXECUTE
PAGE_EXECUTE_READ
PAGE_EXECUTE_READWRITE
PAGE_EXECUTE_WRITECOPY
PAGE_NOACCESS
PAGE_READONLY
PAGE_READWRITE
PAGE_WRITECOPY
Despite all the known problems with allowing a user to write data to memory and then have the potential to execute it, Microsoft has explicitly allowed this as an option that is under the control of the software developer. I am not sure that I want to ask what the default state is because I am aware that compatibility would be seen as desirable.
It might appear that the solution to this problem on Windows is very simple and all that is required is to remove the PAGE_EXECUTE_READWRITE option and to enforce compilers and linkers to use the appropriate protection by default. The reality is that certain Microsoft Office products have been known to dynamically create code and then execute it, in a manner known as "trampolining". Rather than modify its Office software so that this did not take place, Microsoft seems to have chosen to put the IT world at risk.
A Volatile Combination
Put the problem of buffer over-run corrupting the instruction pointer together with the ability to execute instructions that have been written to memory as "data" and the extreme vulnerability of the situation becomes obvious.
All that it requires is for those new instructions to copy files across the network to this machine, or to delete files, or to modify system settings, or to use system routines to perform some function such as send mail to other systems and all hell breaks loose.
Some claim that it is rather more difficult to discover the memory location that needs to be written in place of the correct execution pointer but I believe that all it really requires is for the malicious code to contain a known string of characters, perhaps normal ASCII characters which are bypassed during execution by a "jump" statement, and to search through memory until that string is discovered.
Attitudes to the Problem
The lack of proper control of the stack used in Windows has spawned a number of third-party solutions, which attempt to enforce some kind of check. Most do not prevent the buffer overflow from occurring but enforce checks to take place as control is returned to the calling routine.
One common solution is to use "canary" data, a single data item that is written to the stack to separate the arguments from the execution pointer. As the control is returned a check is made as to whether the canary data contains the expected value and if it does not, the execution pointer cannot be guaranteed and the program terminates.
This seems all very well but, depending on the platform, if the called routine was malicious, it could examine the stack for the canary data and take steps to ensure that the same value is written to that location. Thus, in a single step, the protection mechanism is rendered useless and the user's confidence in it has been entirely misplaced.
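Both halves of this argument can be shown with another toy model in C. As before this is a sketch under my own assumptions: the frame is a plain byte array, the canary and the return slot are one byte each, and every name is hypothetical. Real canary schemes differ in detail, and whether the value can be learned depends on the platform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy layout: 8 bytes of buffer, then the canary, then the return slot. */
enum {
    GUARD_BUF = 8,
    GUARD_CANARY_AT = GUARD_BUF,
    GUARD_RETURN_AT = GUARD_BUF + 1,
    GUARD_FRAME = GUARD_BUF + 2
};

typedef struct {
    unsigned char bytes[GUARD_FRAME];
    unsigned char expected_canary;  /* the checker's own copy */
} guarded_frame;

void guarded_init(guarded_frame *f, unsigned char canary)
{
    memset(f->bytes, 0, GUARD_FRAME);
    f->bytes[GUARD_CANARY_AT] = canary;
    f->expected_canary = canary;
}

/* Models the careless routine. The assert only keeps the simulation
 * inside its own array. */
void guarded_unchecked_write(guarded_frame *f,
                             const unsigned char *data, size_t len)
{
    assert(len <= GUARD_FRAME);
    memcpy(f->bytes, data, len);
}

/* The check made as control is returned: 1 if the canary is intact,
 * 0 if it is not and the program should terminate. */
int guarded_canary_ok(const guarded_frame *f)
{
    return f->bytes[GUARD_CANARY_AT] == f->expected_canary;
}
```

An overflow that tramples blindly over the canary is caught by guarded_canary_ok. An overflow that writes the expected value back into the canary position passes the check even though the return slot after it has been altered.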
Unix users have no cause to smirk at these deficiencies of Windows because most Unix variants suffer from the same problem, including those that are regarded as the more secure.
Earlier this year the folk responsible for OpenBSD made a big statement about how they had made changes to their code which, according to a CNet report, would "make causing a buffer overflow extremely difficult, if not impossible." They were, in effect, admitting that OpenBSD had been suffering from this problem.
They went on to remark that this was "a software issue that has been plaguing security experts for more than three decades", but this comment is incorrect.
The OpenBSD solution involved a three-pronged approach. They were about to implement the use of a data "canary", they would randomise the memory location for the stack and, to use CNet's words, they "found a way to hack the BSD file system and divide main memory into a writable portion and an executable portion".
As I mentioned above, the use of a data canary may be no guarantee of security and the randomised location is of limited value if the malicious code has a method of searching memory to find a specific string of characters and from that deducing the location.
The division of memory into writable and executable portions is certainly nothing novel and I am very surprised that this is seen as an innovation in the general Unix world. As I mentioned above, Windows XP has this division and has introduced a limited kind of memory protection.
BSD might be introducing it now but OpenVMS had even stronger and more flexible protection on regions of memory from its very first release in 1979. If my memory serves, OpenVMS wasn't even the first operating system from Digital Equipment to have this feature but inherited it from the PDP machines running RSTS, RSX or RT-11 and ironically it was these PDP machines that were the first platform for Unix.
Some Solutions
The solution that Microsoft has been trying to apply involves the use of software packages to identify vulnerabilities. It also appears to be experimenting with different languages, perhaps in the hope of finding one which offers its programmers a better chance of fixing the problem or avoiding it altogether. Both seem to be rather a waste of time and effort when all it really requires is to use correct concepts in the operating system and compilers.
On the matter of memory regions and their protection it is absolutely clear that this technique needs to be applied and done so in a very strict fashion with none of the stupidity of EXECUTE_READWRITE. I can do no better than suggest that Windows and Unix take a good look at how OpenVMS handles these matters because it has the most effective system that I am aware of.
The method used by OpenVMS is one of separating the virtual memory into regions, each with their own protection. At execution time the various program sections (PSECTs) are loaded into one of these regions into orderly and defined areas, applying the protections specified for each PSECT as it does so. Thus data is separated from executable code.
It is similar to the protection offered by Windows XP, which is not surprising given how NT arrived on the scene, but the important difference is that PSECT protections are set by default and the programmer must explicitly modify them for special circumstances.
Now this introduction of proper memory access controls is all that is required to prevent the introduction and execution of malicious code but it does not solve the problem of an overflowed buffer corrupting the call stack.
A reliable solution to this second problem can probably only come about by altering the manner in which the stack is used.
One possible option is to change the order in which the data items are written to the stack so that the return address pointer and the pointer to the previous call frame are written before the parameters. This means that any buffer overflow on the function arguments would not corrupt these data items but may simply attempt to write beyond the end of the stack.
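A toy model in C of this first option, under my own assumptions: the stack is a plain byte array, the return slot is one byte placed ahead of the argument buffer, and writes always proceed away from it. The names are hypothetical and the layout is not a claim about any real calling convention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy layout: the return slot comes first, then 8 bytes of argument
 * buffer, then unused stack space beyond the frame. */
enum { RE_RETURN_AT = 0, RE_BUF_AT = 1, RE_BUF = 8, RE_STACK = 16 };

typedef struct { unsigned char bytes[RE_STACK]; } reordered_stack;

void reordered_init(reordered_stack *s, unsigned char return_target)
{
    memset(s->bytes, 0, RE_STACK);
    s->bytes[RE_RETURN_AT] = return_target;
}

/* Models the careless routine writing into the argument buffer. The
 * assert only keeps the simulation inside its own array. */
void reordered_unchecked_write(reordered_stack *s,
                               const unsigned char *data, size_t len)
{
    assert(len <= RE_STACK - RE_BUF_AT);
    memcpy(s->bytes + RE_BUF_AT, data, len);
}

unsigned char reordered_return_target(const reordered_stack *s)
{
    return s->bytes[RE_RETURN_AT];
}
```

In this model an overflow of the buffer spills into the unused space after the frame and the return slot keeps its value. The overflow itself still happens, so the data beyond the buffer is corrupted all the same.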
Another option is to write the parameters to the heap, not the stack, and on the stack simply write their data length and their address on the heap. This would enable them to be used and, if they were returned to the calling routine, it would be a simple matter to copy the specified number of bytes from the specified heap address into a known location for the calling routine. Buffer overflows on the heap would still be quite possible but these would be impossible to eradicate without fundamental changes to several programming languages.
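This second option amounts to passing a small descriptor of fixed size. A minimal sketch in C follows, with the struct heap_param and its helper functions being hypothetical names of my own. A real compiler would generate the equivalent automatically rather than leave it to the programmer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* The only things that would go on the stack: a length and an address,
 * both of known size. */
typedef struct {
    size_t length;          /* number of valid bytes at data */
    unsigned char *data;    /* storage on the heap */
} heap_param;

/* Copies len bytes of src to the heap and fills in the descriptor.
 * Returns 0 on success, -1 if the allocation fails. */
int heap_param_make(heap_param *p, const void *src, size_t len)
{
    assert(p != NULL && (src != NULL || len == 0));
    p->data = malloc(len ? len : 1);
    if (p->data == NULL) {
        p->length = 0;
        return -1;
    }
    if (len)
        memcpy(p->data, src, len);
    p->length = len;
    return 0;
}

/* Copies the parameter back to the caller, refusing if the recorded
 * length does not fit in the caller's buffer. */
int heap_param_copy_out(const heap_param *p, void *dst, size_t dst_cap)
{
    assert(p != NULL && dst != NULL);
    if (p->length > dst_cap)
        return -1;
    memcpy(dst, p->data, p->length);
    return 0;
}

void heap_param_free(heap_param *p)
{
    free(p->data);
    p->data = NULL;
    p->length = 0;
}
```

Because the length travels with the address, the copy back to the calling routine can be checked against the space available there, and nothing of variable length is ever written next to the return address.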
Both options would require changes to compilers so that the stack was used in a way that made it immune to the vexatious problem of buffer overflow corrupting the stack. One would imagine though that these changes and the very small amount of additional processing would be a small price to pay in order to avoid these problems.