Tagged memory on modern machines

Tagged memory systems “were a thing” in the 1960s and 1970s especially, but seem to have disappeared since. I assume this is simply because of the ubiquity of general-purpose processors that lack this feature. Yet much of modern software development relies on memory management systems that introduce varying degrees of overhead (sometimes a lot), or goes to extreme lengths to avoid it, as is the case in Rust.

I am curious why tagged memory is not used to address this in modern systems. Memory size for everyday tasks is simply not an issue these days (the base model of my system comes with 16 GB, which I’ve never come remotely close to using), so I can’t imagine that the extra memory needed would be a problem, and it would seem that hardware support for garbage collection would help most programs written in modern languages.

I realize this is a speculative question, but I suspect there are people here who are more familiar with memory management and might offer some insight.

The most successful current tagged memory project is CHERI, which initially added capabilities to MIPS and later adapted its scheme to x86, RISC-V and ARM. The ARM version has made it to actual chips and development boards.

There have been other efforts to add tags to RISC-V, and tagging is on the roadmap for the J Extension (for languages like Java, JavaScript, Lisp or Smalltalk).

Besides what I call “out of band” tags (extra bits beyond the data word), there are also “in band” tags, which steal bits from the data word. That is especially tempting when all addresses are 64 bits but machines have at most terabytes of memory. This is already part of the J Extension proposal. ARM has TBI (top byte ignore); Intel and AMD offer similar (but incompatible) schemes.
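
To make the “in band” idea concrete, here is a minimal sketch in Python that simulates stealing the top byte of a 64-bit pointer for a tag, in the spirit of top byte ignore. The function names and the choice of 8 tag bits are my own assumptions for illustration; real hardware ignores the tag bits by itself on each access, and kernel-half addresses would need extra care that this sketch leaves out.

```python
TAG_SHIFT = 56                       # assumption: the tag lives in the top 8 bits
ADDR_MASK = (1 << TAG_SHIFT) - 1     # the low 56 bits carry the actual address
WORD_MASK = (1 << 64) - 1            # keep everything within a 64-bit word


def set_tag(ptr: int, tag: int) -> int:
    """Return ptr with its top byte replaced by tag (hypothetical helper)."""
    if not 0 <= tag <= 0xFF:
        raise ValueError("tag must fit in 8 bits")
    return ((tag << TAG_SHIFT) | (ptr & ADDR_MASK)) & WORD_MASK


def get_tag(ptr: int) -> int:
    """Extract the 8-bit tag stored in the top byte."""
    return (ptr >> TAG_SHIFT) & 0xFF


def strip_tag(ptr: int) -> int:
    """Recover the plain address, as the hardware would before a memory access."""
    return ptr & ADDR_MASK


tagged = set_tag(0x00007FFF12345678, 0x2A)
print(hex(tagged), hex(get_tag(tagged)), hex(strip_tag(tagged)))
```

Nothing stops user code from calling `set_tag` with any value it likes, so a tag stored this way is only a convention between cooperating pieces of software.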

In band tags don’t help with security, and programmers will eventually complain about the missing bits. The original 68000 used 32-bit addresses but only brought 24 of them out to the pins, so the Macintosh used the remaining bits for interesting things. Once the 68020 came out and machines wanted more than 16 MB of memory, they regretted this design.

Out of band tags are easy enough if you build your memory in multiples of 1 bit. The Burroughs 5500, for example, had 48 data bits and 3 tag bits. The Intel 960 had models with the option of 32 data bits and 1 tag bit (replacing the positive/negative offset scheme in the Intel iAPX432). If you get your memory in the form of wide chips or DIMM boards then it is more complicated. You could use 72 bit wide ECC DIMMs but then tags would mean not having ECC protection. You could store the tags at different addresses than the data words and fetch both to combine them in the caches and registers.
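
As a rough illustration of the last option (tags kept at different addresses than the data), here is a Python sketch of the address arithmetic. Everything specific in it is an assumption of mine: one tag bit per 64-bit word, a tag region starting at the made-up address `TAG_BASE`, and a dictionary standing in for memory.

```python
WORD_BYTES = 8            # assumption: one tag bit per 64-bit data word
TAG_BASE = 0x8000_0000    # hypothetical start of the separate tag region


def tag_location(data_addr: int) -> tuple[int, int]:
    """Map a data address to (tag byte address, bit index within that byte)."""
    word_index = data_addr // WORD_BYTES
    return TAG_BASE + word_index // 8, word_index % 8


class TaggedMemory:
    """Toy model: only the tag region is stored, as a sparse dict of bytes."""

    def __init__(self) -> None:
        self.tag_bytes: dict[int, int] = {}

    def write_tag(self, data_addr: int, tag: bool) -> None:
        byte_addr, bit = tag_location(data_addr)
        old = self.tag_bytes.get(byte_addr, 0)
        if tag:
            new = old | (1 << bit)
        else:
            new = old & ~(1 << bit) & 0xFF
        self.tag_bytes[byte_addr] = new

    def read_tag(self, data_addr: int) -> bool:
        byte_addr, bit = tag_location(data_addr)
        return bool((self.tag_bytes.get(byte_addr, 0) >> bit) & 1)


mem = TaggedMemory()
mem.write_tag(0x48, True)
print(tag_location(0x48), mem.read_tag(0x48), mem.read_tag(0x40))
```

With these numbers one byte of tags covers 64 bytes of data, so a cache that fetches both can combine them before the word reaches a register.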

Here’s a very recent article on GCC and glibc support for CHERI:

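To clarify, the J-extension reference is to the RISC-V architecture.
It is possible to steal an ECC bit for a tag if the ECC is computed over a large enough word. A 64-bit word needs 7 check bits to detect a single-bit failure and its position, and an eighth to also detect a double-bit failure, which uses up all the extra bits of a 72-bit DIMM. Run the code over 128 bits instead and only 9 of the 16 available check bits are needed, so there are bits left over.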
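
A quick way to check the check-bit arithmetic is the standard Hamming bound: correcting a single-bit error in m data bits needs the smallest r with 2^r >= m + r + 1, and one further parity bit adds double-error detection. The Python sketch below just evaluates that bound; the function names are mine.

```python
def sec_check_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r


def secded_check_bits(data_bits: int) -> int:
    """One extra overall parity bit adds double-error detection."""
    return sec_check_bits(data_bits) + 1


def spare_bits(data_bits: int, stored_check_bits: int) -> int:
    """Check bits left over for other uses, such as a tag."""
    return stored_check_bits - secded_check_bits(data_bits)


for width, stored in ((64, 8), (128, 16)):
    print(width, sec_check_bits(width), secded_check_bits(width),
          spare_bits(width, stored))
```

By this count a 64-bit word takes 7 bits for correction alone and 8 with double-error detection, while a 128-bit word spread over two 72-bit DIMMs leaves 7 of its 16 check bits free.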

The ARM Morello chip - an actual chip that works - is the only architecture I know of that uses an out-of-band tag for security (as opposed to ECC-like use cases).

The main reason that the use of tags didn’t catch on was the fact that DIMMs came in 8 or 9 bit widths only, which isn’t enough for ECC on 32-bit architectures, so the ninth bit was only used for parity, and the 9-bit parts were much more expensive than the byte-wide ones. As mentioned, you could store the tags at a separate address, but that’s pretty complicated in many ways.

So I’ve been reading some of the links above and trying to understand as much as I can. One thing eludes me though, the J-extensions for Java… how is the tagging used in this case? Is it a security feature, or for garbage collection?

So who used ECC memory? PCs just had, maybe, a parity bit.
The 68000 or the x86 was the CPU for ‘home’ computers, and they never had any kind of tagged memory. What about other systems?
Ben.
PS: I assume tags here are like those for LISP, not for memory management and/or heap management.

The PC and AT class machines indeed used only parity (to my knowledge), but somewhere in the late 80s or early 90s ECC became a much more common thing on server PC hardware. Of course, various Unix workstations used ECC all along.

I think, in the Intel range, ECC memory was only supported by Xeon processors. The only consumer-ish machines I can think of that used Xeons, and in consequence ECC memory, were the tower Mac Pros. (And memory was pretty expensive, for the lack of scale and the “industrial” bonus price. The engineering of the Mac Pro memory was quite beautiful, though: LED lights would tell you which bank was concerned in the event of failure, and the entire memory would slide out on a tray with a gentle push.)

There were 386 and 486 class machines that also allowed ECC memory. I am not sure about Pentium-class, but the PPro and Pentium II certainly continued that trend.

So did the ECC memory find many errors?

The J-extension pointer masking proposal is a fairly general mechanism, and is intended to be useful for both security and garbage collection (and there are probably other uses), but it gets trickier if you want to do both at the same time.

Here is a 2017 presentation about tagged memory in the lowRISC project which implements tags in RISC-V stored in a different area of memory than the data.

About security: the key idea is that if the system gives an application a capability (fancy pointer) and later on the application gives it back to the system in order to get something done, the system should be 100% sure that the application did not mess around with the capability.

In the Intel 960/BiiN, for example, each 32-bit word had 1 extra bit that was 0 for normal data and 1 for capabilities. Normal user code can’t deal with the extra bit directly, but if it uses any normal instruction on that word the bit is cleared. So when the system receives a capability as an argument and the tag is still set, it can be sure that the application didn’t touch it.
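
The rule can be modelled in a few lines of Python. This is only a toy simulation of the idea described above, not of the actual 960 instruction set: the `Word` type and the `system_mint`, `system_accept` and `user_add` names are all hypothetical.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


@dataclass(frozen=True)
class Word:
    """A 32-bit data word plus one out-of-band tag bit."""
    value: int
    tag: bool = False


def system_mint(object_id: int) -> Word:
    """Privileged operation: only the system can create a tagged word."""
    return Word(object_id & MASK32, tag=True)


def user_add(word: Word, n: int) -> Word:
    """Stand-in for any ordinary user instruction: the result loses its tag."""
    return Word((word.value + n) & MASK32, tag=False)


def system_accept(word: Word) -> bool:
    """The system trusts a word as a capability only if its tag is still set."""
    return word.tag


cap = system_mint(0x1234)
tampered = user_add(cap, 0)   # same bits, but it went through a normal instruction
print(system_accept(cap), system_accept(tampered))
```

Even an addition of zero, which leaves the 32 data bits unchanged, produces a word the system will refuse, and that is the whole point of keeping the tag out of reach of user code.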

I am sure it did. I didn’t have any Intel machines that had ECC back then (probably some of my SPARC or UltraSPARC machines do). I know that I have seen ECC catch errors on my current Xeon. With 32 GB or more of memory in a current machine, with typical feature sizes and chip areas, the expectation of a single-event upset in main RAM is rather nontrivial over a moderate period of time (say, a year or two).

I had quite a number of (repaired) memory incidents in my logs. Bit-flips are a thing.
(To the point of me being actually concerned about not having ECC memory any more.)

ECC may be built into the RAM, I suspect. With Covid, getting any solid-state logic parts is iffy. Tubes, I suspect, are still easy to come by, but few people have valve machines.

Actually, the Burroughs B5500 had only one sorta-tag bit, termed the flag bit, which was the high-order bit in the 48-bit word. It distinguished data words from control words. That played merry hell with character processing using full words, so the machine had two modes: Word Mode, where the flag bit was meaningful, and Character Mode, where it was not. Programming in Character Mode (“Stream Procedures” in the ALGOL language) had to be done carefully, since there were no bounds protections and you got to work with absolute memory addresses. As you can imagine, that could lead to real problems. Despite that, the machine validated the concept of protected control words and fairly granular memory bounds protection.

Starting with the B6500 (late 1960s), the flag bit was expanded to three bits and moved outside the 48-bit data word. Character Mode was dispensed with and new string ops were implemented that respected control words and bounds protection. That has worked very well now for over 50 years. The tag field was later expanded to four bits, mostly to optimize the handling of control words, some of which have multiple forms.

That architecture continues to be used today in the Unisys ClearPath MCP systems. Unisys (formed from the merger of Burroughs and Sperry Univac) stopped making custom ASIC processors of their own about a decade ago, when the cost could no longer be justified by the size of their market, and now the architecture is emulated on Intel x64 processors.

Since those processors use 64-bit words, there’s room for the four-bit tag and the 48-bit data word in the 64-bit physical word. Some of the control words have actually expanded beyond 48 bits to allow for greater addressability, which works because user-level code can’t access the internals of control words.

Thanks for the correction about Burroughs large systems history. I had thought that the original B5000 had 1 tag bit and 12 bit instructions and the B5500 changed that to 3 tag bits and 8 bit instructions which could handle multiple data types as indicated by the tags, but the transition was in the B6500 as you said.

The first computer I had access to was my university’s B6700, but at the time I didn’t get to mess with it at a very detailed level. Later on I did read the Organick book, but that was a very long time ago.
