Why undocumented instructions?

The recent news of an ESP32 vulnerability involving undocumented opcodes made me wonder about the fact that most classic 1970s and 1980s CPUs had undocumented instructions.

Since it was hard enough to meet the public product specifications within time and budget, especially back then when this technology was new, why did chip vendors go beyond that and allocate resources to implement undocumented instructions that most customers didn’t use or weren’t even aware of? Was it some sort of byproduct of chip design? A way of preparing for the product to evolve? Something else?

1 Like

I am no chip person, but my understanding is that it’s the “byproduct”.

Certain designs, and their implementations of some promised functionality, give extra functionality for “free”, accidentally; it was never a design goal or product requirement.

That is one reason why they were undocumented. Other reasons include that the resulting accidental functionality may not be orthogonal: some operation-operand combinations might not exist, might produce invalid results, or might even crash the CPU. And if the implementation changes slightly in a future revision, the accidental functionality changes too, in unpredictable ways.

1 Like

One other reason is that instructions may have been defective or omitted in the first revision of the hardware. The 6502 ROR instruction comes to mind.
Another reason is that new instructions were added in a later revision as an upgrade by a different CPU vendor. Set this flag and your new x86 behaves like an 8080, as with the NEC V20 CPU.


A few points about this…

  • These aren’t CPU opcodes. They’re hardware commands given to a device (Bluetooth controller) in a command block issued by a device driver.
  • It’s got nothing to do with the CPU (see more about this below).
  • The authors of the article wrote and installed their own custom device driver so they could issue these commands…
  • …so they required root access to the computer to pull off their nefarious “hack” (snort).
  • The commands are for exercising the hardware and are probably used by automated test software.
  • Lots of devices (e.g. mass storage, like SSD and spinning disks) have similar undocumented commands in addition to the documented ones, for the same reasons.
  • And these commands can only be issued by custom system software that already has complete control of the device (i.e. root access on the computer), for the same reasons.

In short, this “vulnerability” is completely bogus. If an attacker can install their own device drivers on your computer, this scenario is the least of your worries…by such a margin that it’s not even worth a second thought. This is a couple of researchers trying to make a name for themselves, whose work got picked up and overhyped by a publication that lacks technical expertise.

See Undocumented backdoor found in Bluetooth chip used by a billion devices | Hacker News

I found this because I started to write something about undocumented CPU opcodes. No modern CPU that I’m aware of has such undocumented codes. And then I thought to myself, wait a minute … the ESP32 is a modern CPU. So what the heck is going on? Well, the article wasn’t about CPU opcodes.

What do I mean by “no modern CPU has undocumented opcodes”? I mean that in modern CPUs, every instruction either (a) has a documented behavior, or (b) causes a documented fault (trap, exception, … the terminology varies). CPU vendors make sure this is true for security, in order to reserve the unused opcodes for use as new instructions in future versions of their CPUs, and probably for other reasons. It just makes sense.

What about older CPUs? Well, it turns out that making the CPU’s instruction decoder “complete” (i.e. meeting the spec above, everything is either documented or causes a fault) adds a lot of complexity to the decoder logic. And this takes circuitry. In the early days there was no circuitry to spare: it was all about cramming the required functionality on to the die.
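As a toy illustration of what a “complete” decoder means here (the class, opcode table, and names below are hypothetical, not any vendor’s actual decode logic): every opcode byte either dispatches to a documented behavior or raises a fault, with no third option.

```python
# Toy model (hypothetical): a "complete" instruction decoder either
# dispatches a documented opcode or raises an illegal-instruction fault,
# rather than executing whatever the partial decode logic happens to do.

class IllegalInstruction(Exception):
    """Models the CPU's illegal-instruction fault."""
    pass

# A tiny, illustrative subset of documented opcodes.
DOCUMENTED = {
    0xEA: "NOP",
    0xA9: "LDA #imm",
}

def decode_complete(opcode: int) -> str:
    """Modern-style decoding: every byte is either documented or faults."""
    try:
        return DOCUMENTED[opcode]
    except KeyError:
        # On an early micro there was no fault mechanism, so the only
        # alternative was to execute *something* undefined.
        raise IllegalInstruction(f"undefined opcode ${opcode:02X}")
```

The extra decode circuitry is exactly what recognizes the `KeyError` case, and that is what early dies had no room for.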

Also, many early microprocessors didn’t even have a concept of “faults”. So it wasn’t even obvious what they would do, even if they had a “complete” decoder that could recognize every possible opcode pattern and sort out the undocumented ones.

It wasn’t until 16-bit processors like the 8086 and the 68000 that microprocessors got a concept of “faults” (e.g. the INT mechanism of the 8086, one instance of which is used for the divide-by-zero fault), and only then did “illegal instruction” faults really become possible. But keep in mind that even the 8086 had undocumented opcodes and did not have an illegal-instruction fault, again because there just wasn’t room on the chip for all the additional decoding that would have been required.

There are a few exceptional cases. The Intel 8085 had a set of very useful instructions that went beyond those of its predecessor, the venerable 8080. But as the 8085 was being finished, Intel was already designing the 8086, and these new 8085 instructions would have been incompatible with it. So they just didn’t document them. Since the 8085 had no concept of traps, there was little else they could have done at that point. They simply left the instructions out of the documentation so they weren’t responsible for any compatibility issues caused by their use.

3 Likes

As far as I can see, there are three major classes of undocumented opcodes.

The first one is the classic 1970s case of incomplete decoding, where no effort is made to inhibit the execution of undefined instruction bit patterns.

E.g., on the 6502, instructions are generally laid out in an aaa.bbb.cc bit pattern, which pretty much defines the instruction grid. Generally, no instructions are defined where both bits of cc are set. What happens then is that the instructions for cc=01 and cc=10 execute at once.

E.g.:

inst aaa.bbb.cc  opc addr-mode   (comment)
------------------------------------------------------------
$99: 100.110.01 … STA absolute,Y  (store A)
$9A: 100.110.10 … TXS implied     (transfer X to stack pointer [SP])
------------------------------------------------------------
$9B: 100.110.11 … TAS absolute,Y  (A and X are transferred to the internal latch,
                                   at once, resulting in `A AND X`,
                                   which is then transferred to SP [as in TXS].
                                   This is also stored at the provided address,
                                   like STA, here an absolute address indexed by Y.
                                   In the course of these address calculations,
                                   the high-byte of the address [AH] + 1 is added to `A AND X`,
                                   resulting in `(A AND X) + (AH+1)` being stored.)

I guess we may see how unintentional this behavior is, but also how it corresponds to the official instructions that are executed. This is classic undefined behavior.
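The field layout behind this can be sketched in a few lines (the function name is mine, not from any official tool; the bit positions follow the aaa.bbb.cc pattern described above):

```python
# Hypothetical sketch: splitting a 6502 opcode byte into the aaa.bbb.cc
# fields of the instruction grid described above.

def split_6502_opcode(opcode: int) -> tuple[int, int, int]:
    """Return the (aaa, bbb, cc) bit fields of an opcode byte."""
    aaa = (opcode >> 5) & 0b111   # bits 7..5: operation group
    bbb = (opcode >> 2) & 0b111   # bits 4..2: addressing mode
    cc = opcode & 0b11            # bits 1..0: instruction "column"
    return aaa, bbb, cc

# $99 (STA abs,Y), $9A (TXS) and the undocumented $9B (TAS abs,Y)
# share the same aaa and bbb fields and differ only in cc:
for op in (0x99, 0x9A, 0x9B):
    aaa, bbb, cc = split_6502_opcode(op)
    print(f"${op:02X}: aaa={aaa:03b} bbb={bbb:03b} cc={cc:02b}")
```

With cc=11 selecting no column of its own, the decode lines for the cc=01 and cc=10 columns both fire, which is where the combined STA/TXS behavior of $9B comes from.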

There are also a few other cases, where there are “holes” in the populated parts of the instruction grid. There, what seems to be defined by the general decoding matrix may either be without external effect or fail entirely, with the CPU becoming stuck. (E.g., a code may indicate a STA instruction, but with immediate addressing mode, where a literal byte value is read as the operand. As no write address is ever asserted on the bus, this results in a NOP.)

The CMOS variants (65C02) add the circuitry required to inhibit any effects. (While all undocumented instruction codes are now NOPs, they are still of varying byte length and execute with varying cycle counts.)
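To illustrate the varying lengths and cycle counts, here is a small sketch; the values are taken from commonly published 65C02 references (e.g. the WDC datasheet’s opcode matrix) and should be treated as assumptions rather than an authoritative table:

```python
# Sketch: a few opcodes that are undocumented on the NMOS 6502 but
# defined as NOPs on the CMOS 65C02. Even as NOPs they differ in byte
# length and cycle count. Values follow commonly published 65C02
# references; treat them as assumptions.
C02_NOPS = {          # opcode: (bytes, cycles)
    0x03: (1, 1),     # 1-byte, 1-cycle NOP
    0x02: (2, 2),     # 2-byte, 2-cycle NOP
    0x44: (2, 3),     # 2-byte, 3-cycle NOP
    0xDC: (3, 4),     # 3-byte, 4-cycle NOP
    0x5C: (3, 8),     # 3-byte, 8-cycle NOP (the slowest of the lot)
}

for op, (length, cycles) in sorted(C02_NOPS.items()):
    print(f"${op:02X}: {length} byte(s), {cycles} cycle(s)")
```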

The second class is more modern and probably the most dangerous one: over time, what are official instructions on the outside (to the programmer) and what is executed internally have drifted apart. We’re now dealing with two instruction sets: the official, external one and a mostly undisclosed internal one. And there may be undocumented ways to inject an internal instruction directly from the outside, or to transfer firmware code… Notably, these CPUs typically incorporate a security model (rings) of one kind or another, and accessing the internal instruction set may allow breaking out of these guard rails.

And then there’s the amazing case of the 8085, a business decision, as described by @pdxjjb.

3 Likes

That’s interesting … do you have an example of the second case? I’ve always thought of the RISC ops of modern x86 CPUs as just an implementation detail, something you can’t directly observe, like the reorder buffer and the hundreds of renamable registers. And commonly, “rings” are an operating system abstraction.

I was actually thinking of a particular historic example, where the undisclosed microcode architecture and access methods leaked for what I think was a late 1990s/early 2000s CPU. But I can’t remember what this was, only that it was a CPU mostly deployed on “reasonably priced” consumer hardware.

There’s a recent case involving AMD CPUs, undocumented internal behavior on microcode updates, and a related exploit that allows for “jailbreaks”:

Are you thinking of back doors here?