Bootstrapping - C, tiny languages, etc

Some findings from around the web, prompted to some degree by @drogon posting elsewhere about wanting a 65816-native C compiler for his text editor which is written in C.

Radek Pietruszewski posted a query on the fediverse seeking a particularly small C compiler, and got some interesting responses:

Jeremiah Orians put together a multistage bootstrap to start from an absolutely minimal handwritten binary seed, and his stage2 seems to be a C compiler for various CPUs - not just x86 - but written in a kind of macro language which looks eminently retargetable:

The stage2 is dependent on the availability of text source files and at least a functional macro assembler and can be used to build operating systems or other ā€œBootstrapā€ functionality that might be required to enable functional binaries; such as programs that set execute bits or generate dwarf stubs.
… [forth, lisp]
After being told for months there is no way to write a proper C compiler in assembly and months of research without any human written C compilers in assembly found. To prove the point Jeremiah decided the First C compiler on the bootstrap would actually be a cross-compiler for x86, such that everyone would be able to verify it did exactly what it was supposed to and see it self-host its C version.

By the same author, a related project, a transpiler

The late @monsonite took a survey (discussed here on HN):

Over on stardot, @hoglet found that Atom Pascal is a derivate (perhaps with serial numbers filed off) of the rather interesting Instant Pascal written for the AIM-65. That’s a Pascal compiler/intepreter and text editor for a 4k RAM machine with a one-line display.

The author, Mel Conway, has written some very interesting history about this and about his career and philosophy.

See also previous threads:
SectorC: A C compiler that fits in the boot sector
Bootstrapping the GNU Compiler Collection (where we see a mention of Pastel, a tiny Pascal)

7 Likes

Hoping this is seen as interesting here, and on-topic… Brian Swetland is, I think, bootstrapping a compiler, with some derivation from a RISC-V version of Oberon. Two repos of note:

  • swetland/compiler ā€œThe general plan is a small compiled systems language (in complexity, source size, and binary size) borrowing syntax from some of my favorite ā€œbraces languagesā€, C, Go, and Rust, aiming to be a bit safer than C, and suitable for small, embedded, self-hosted systems.ā€
  • swetland/spl ā€œA small compiled language borrowing ideas from C, Go, Rust, Zig. Very much a work in progress.ā€

Updates on the Fediverse:

1 Like

I think a small needs well defined first. How small is small with a 64 bit cpu?
Small Computers in my case is from the 16 bit computer era like the IBM 1130, with 4K,8K,16K or 32K
words of memory. For a while I had a z80 computer with 16K memory, until the power supply
Zapped it. A many early mainframes like the IBM 7090 had only 15 bit addressing.For years I wanted
a computer with 64K and dual floppies.
This is my version of small.
The second issue is, do you have a disc file system? Only after the mid 1970’s did disks become more
popular than magnetic tape.
It good to see projects like this, as modern machines are too complex for me to under stand at the bare
hardware level in general, and this gives me good insight on the details.
I also need to bootstrap a compiler, but I just have a single accumulator architecture machine like IBM 704

Worth mentioning BCPL here. To bootstrap BCPL you write a few functions in the ocode (or bytecode in later versions) and then concatenate them with the existing compiler ocode. The only other bit you write to bootstrap your machine is the simple ocode/bytecode interpreter.

Once you’ve done that you’ve got a complete self hosting compiler and you then use that to compile yourself a native backend.

5 Likes

I’ve done this three times now. (maybe 3.5, given 2 variants of one CPU). It does get easier but I’m not sure I want to do it again.

Namely 65c816, RISC-V (RV32IM), ARM32, RISC-V (32IMACZ…)

Simple… for some meaning of Simple…

-Gordon

3 Likes

The BCPL2 Ocode one was pretty trivial, the later bytecode is more 8bit friendly and compact but I agree a lot more work.

PASCAL-S was another similar one where you could implement a minimal interpreter for the virtual machine in whatever you had then run the existing output in your vm.

3 Likes

I went for the cintcode/bytecode more as I had a passing familiarity to it than anything else - having used BCPL in a big (for the time, I guess) factory automation project in the early 80s at uni. running over a raft of BBC Micros with Econet…

PASCAL-S was another similar one where you could implement a minimal interpreter for the virtual machine in whatever you had then run the existing output in your vm.

I’ve always had a bit of a soft-spot for Pascal and one day I’d like to think I’ll have time to write something that resembled a Pascal to cintcode compiler. It would be nice to get something other than BCPL (and Basic) to run under my little OS…

-Gordon

1 Like

I want that OS, well any OS for the matter.:slight_smile:

Now that I have a 36 bit CPU designed, with byte addressing, I will look again at porting BCPL,
as it started on large machines.
It is unlikely I will port C since my short and character data is unsigned.
W(onder) M(achine) (19) 69
Fast Core memory (1 us), general register CISC architecture with 9,18,36 bit data and carry bit.18 bit addressing with a Stack Frame pointer. 8 digit binary software floating point with hidden bit.

Boot strapping would be trivial if one had a full featured instruction set
for character, short and long and indexed addresses larger than 64K like the NS16032.
Parsing would be easy if white space separated tokens,and ANSI 69 was standard,
with a option for left and up arrow graphics,
ASR 33’s and punch cards lived until the PC era.

High level languages need to have known limits or limited features like with boot strapping,
or porting to new hardware. VM’s are needed only because the hardware is too poor (pdp-8) or the
language is too complex (PASCAL) other than in initial bootstrapping.

my thoughts on bootstrapping.
Ben.

1 Like

If you are thinking of calling your design ā€œWMā€, note that this could cause confusion with the 1990s WM architecture.

2 Likes

C specifically allows for the default char type to be unsigned, and in fact in an awful lot of processors it is. There’s also very little in C that cares about signs - right shift, remainder, compare and division mostly assuming you are working 2’s complement. A signed compare is an unsigned compare by toggling the top bit of the value.

Other than bootstrapping the reason for a VM is simply that you need a higher code density. All the things like limited processor capability are a variant of code density in reality. There are some other cases like needing to do virtual memory or protection when the hardware cannot.

BCPL is very much adapted to a word addressed machine. It’s possible to handle word addressed machines correctly in C but it’s a bit more ā€œinterestingā€. I’ve done it for the DG Nova series and have some prototype code for the DDP. BCPL because it doesn’t really treat characters as a native type avoids the problem entirely as does B.

Some of what you can do also depends on CPU properties. If you don’t have sensible ways to access data relative to stack frames for example then older FORTRAN is fine, COBOL is fine but many later languages are deeply painful

2 Likes

Is that real hardware, all the WM links seem to be broken?
The only source code for C I have is Small C, and it is signed bytes
and no longs.
I like 1’s compliment better, as arithmetic is symmetrical.
Ample griping about bootstrapping, time to find my ā€˜round to’its’ and start coding something.
Ben.

1 Like

WM was a university project. I have the impression that they made a TTL prototype but never got around to making a chip. And yes, the papers are very hard to find online.

Chuck Thacker cited it as an inspiration for version 6 of his TinyComputer FPGA cpu design. Specifically the idea of splitting memory reads into two separate FIFO accesses. You push the address to one FIFO and later on pull the data from another FIFO (seen as registers in TinyComputer 6), stalling if the memory system hasn’t produced that data yet.

Another WM feature is that each integer or floating point instruction actually encoded a pair of operations and three operand registers plus one destination register. They were easily able to patch a C compiler to use this more than 30% of the time (if I remember correctly). Many RISC designs just waste the bits (Wirth’s RISC5, for example) while ARM uses them to combine a shift with every operation.

1 Like

The 9 page Evaluation of the WM architecture is online, at least. (This is an early 90s idea, mentions FIFOs and ā€˜faintly resembles’ a dataflow machine.)

This report describes the results of studies of the WM architecture- its performance, the values of some of its key architectural parameters, the difficulty of compiling for it, and hardware implementation complexity. The studies confirm that, with comparable chip area and without heroic compiler technology, WM is capable of outperforming traditional scalar architectures by factors of 2-9. They also underscore the need to devise higher bandwidth memory systems.

1 Like

Some links to related bootstrapping projects in this HN comment in a discussion on ā€œSectorC: A C Compiler in 512 bytes (2023)ā€.

1 Like

Over the past couple of years I’ve envisioned a bootstrapping process for a homebrew RISC design. The hardware is small but not tiny: PDP-11 kind of small. 16-bit Harvard arch, 128K address space. I hadn’t done much except write an architectural document. Then last Halloween I had a stroke. Now I cannot really type.

So I have been experimenting with AI and I have had Claude code write me a compiler for a small language of my own design (YAPL, ā€œyet another programming languageā€). This compiler is written in Go, but I envision it could be rewritten in itself and made to run native on the RISC (which I call the WUT-4) eventually. I’ve had Claude code write me a machine emulator in addition to the compiler. The AI is remarkably good at learning anything it can hold in its 200 K context. So I’m going to try asking it to translate the lexical analyzer into YAPL sometime soon.

wut4/lang at main Ā· gmofishsauce/wut4 Ā· GitHub for anyone who may be curious. It’s a fairly large experiment for an AI to write a complete compiler. But after having it write several tests it seems to be working pretty well. Claude wrote every line. The README was written by the AI for its own use so it has kind of an odd perspective: it refers to ā€œthe userā€ meaning me.

3 Likes

Sorry to hear about your challenge. But this is a very modern and interesting kind of constraint, to pack enough guidance into 200k of context to get the machine to implement your intentions. Self-hosting in 64k is a good goal! It’s also a substantial test case…

That then leaves some memory for a file system and a shell. One never thinks of the operating system space since a mmu is taken for granted noways,

Self hosting in 64K with a typical compiler for most older languages (Fortran, COBOL, C89 etc) is easy enough but I doubt that glorified autocomplete can actually do much of this because it’s not commonly done and it requires thinking outside of the classic compiler innards.

SmallC really doesn’t care if the code is signed/unsigned for char types. It’s fairly oblivious to it providing your sign extend routine does the right thing.
On a 36bit machine you can also just make int = long for a simple life

The default size is 18 bits unsigned.I don’t have the microcode space for a real 36 bit machine.
The real reason I am avoiding Small C is the fact the code generation is a real mess.RatC is
better but I can’t seem to find the C text files for that.
Since we now have long long a simple language for 36 bits possible in about 1250 lines of C code.
Code Generation is easy since all the basic operations are supported as 32 bits. Subroutine calls
for things like multiply and divide, use fixed registers. Indexing uses a fixed index register.
Function calls are limited to the first term in a expression. foobar() == 6 is valid but
6 == foobar() is not.

.