You can't just locate a chip region and say "Look, all x86 overhead is isolated there".
About decoding. As small as the decoder area is, it still induces comparatively heavy overhead. Modern x86 implementations devote *several* pipeline stages to decoding. Non-x86 CPUs usually get it done with just one, and they tend to have much shorter pipelines. A shorter pipeline means a smaller penalty whenever one of the CPU's speculative mechanisms (branch prediction, for instance) guesses wrong. And this is just one example. You pay the penalty for x86 all the time.
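To put rough numbers on the pipeline-depth point, here is a back-of-envelope sketch in Python using the usual textbook approximation: effective CPI = base CPI + (branch fraction x misprediction rate x flush penalty). The function name and every number in it (branch fraction, misprediction rate, the 5-cycle and 15-cycle penalties) are made-up illustrative assumptions, not measurements of any real x86 or non-x86 core.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Textbook estimate of cycles per instruction with branch mispredictions.

    Each mispredicted branch costs roughly one pipeline flush, and the flush
    penalty grows with the number of stages in front of branch resolution.
    """
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


# Hypothetical workload: 20% of instructions are branches, 5% of those mispredict.
BRANCH_FRACTION = 0.20
MISPREDICT_RATE = 0.05

# Hypothetical cores: same ideal CPI, different flush penalties (in cycles).
short_pipeline = effective_cpi(1.0, BRANCH_FRACTION, MISPREDICT_RATE, 5)
long_pipeline = effective_cpi(1.0, BRANCH_FRACTION, MISPREDICT_RATE, 15)

slowdown = long_pipeline / short_pipeline - 1.0
print(f"short pipeline CPI: {short_pipeline:.3f}")
print(f"long pipeline CPI:  {long_pipeline:.3f}")
print(f"relative slowdown:  {slowdown:.1%}")
```

The model ignores everything else that differs between real cores (clock speed, issue width, predictor quality), so it only shows the direction of the effect: every extra stage ahead of branch resolution is paid again on each misprediction.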
Now, remember that x86 also relates to SOFTWARE, not just to HARDWARE. You not only have to deal with the inefficiency of DEcoding. There is also the inefficiency of ENcoding. A horribly asymmetric ISA with a tiny number of registers is a nightmare for an optimising compiler. It spits out a spaghetti of superfluous dependencies and memory traffic that wouldn't exist on any sane ISA. When that code is consumed by an x86 CPU, it engages ALL resources, not just the decoders.
Dave - see http://groups.google.com/group/comp.arch/msg/ecf4b72cb8b21754 , which is a post by one of the Opteron CPU architects on just how much overhead x86 actually causes. Hint: not 30%.
On the other hand, the ARM926EJ runs an older version of the ARM instruction set and is a fairly slow core. It's too bad they didn't use one of the newer ARM cores.
It is estimated that keeping x86 backwards compatible takes about 30% of the silicon die space. That's a lot of space in an embedded part.
It has been a long time since VIA brought anything competitive to market.
Realists at head office probably realized that their x86 gear still wasn't up to snuff and ARM's CPUs simply kick ass for energy efficiency.