A tiny self-hosted Forth (and an interesting Basic too)

A particular (interesting) tiny self-hosted Forth, by Kragen Javier Sitaker:

There’s an interesting pointer in there, to Andre Adrian’s 2008 BASICO programming language which in turn includes pointers to some half-dozen examples of bootstrapping:

  • Niklaus Wirth published the source code of many small languages like PL/0 in 1976, PASCAL-S in 1975 and Oberon-0 (sources) in 1996. These languages can not bootstrap themselves. BASICO follows the PASCAL syntax.

  • P4-Pascal and UCSD-Pascal can bootstrap themselves.

  • Dennis Ritchie used compiler bootstrapping for the UNIX C compiler. Today the sources of the “last1120c” compiler are on his home page. This compiler has no struct and creates PDP11/20 object code. Together with the UNIX version 1 binaries and the PDP11 emulator you can go back to 1973 and program like the old heros. Even the UNIX version 1 man pages are available again. BASICO follows the C semantics.

  • Jack W. Crenshaw published between 1988 and 1989 the TINY language. He used characters instead of constants for keywords in the parser. This idea is used in BASICO, too. TINY can not bootstrap itself. BASICO follows TINY’s code generation ideas.

  • Euphoria is a BASIC like programming language. Programs can run interpreted or compiled with help of an Euphoria to C translator. There is a free Euphoria interpreter in Euphoria.

  • Edmund Grimley Evans wrote in 2001 “Bootstrapping a simple compiler from nothing”. He started on the pure binary machine code level. The language is Forth like.

  • Fabrice Bellard wrote in 2002 the Obfuscated Tiny C Compiler (OTCC). Macro extended with gcc -E and pretty printed with indent the source is 463 lines and can compile itself on Linux into 80386 machine code. See this masterpiece yourself (readable otcc.c).

There’s an interesting page on the BASICO source (less than 1000 lines.)

Jack W. Crenshaw wrote the “LET’S BUILD A COMPILER!” installments. His practical look at predictive parser and detailed explanations about code generation for a real CPU were a big inspiration for BASICO

Back to the StoneKnifeForth, there’s an HN discussion here.

(What I was actually looking for is a Forth which was used to implement a Basic compiler for ARM, which I heard a mention of. From the days of the Archimedes, probably.)


(Note that OTCC is now found here (the original link is dead), and remains very fun. It eventually turned into TinyCC, which is C99, now maintained on Savannah.)


Oh, PL/0! – I once made a variation on this in JavaScript, back in 1999… :slight_smile:
(This really gave me the confidence to have a deeper look into emualtors and language implementations. The original implementation by Wirth was in Modula 2 and is found in his book “Compilerbau”.)


PS: One ingenious and also somewhat convoluted idea introcuded by his PL/0 example was to have error levels encoded in the keyword/symbol order. So the compiler could decide on this, whether to go on trying, even if there was a (recoverable) error, or not, just as a matter of precedence. (This becomes somewhat of a problem, as soon as you consider extending the language. And I don’t think that I had implemented this faithfully in that JS version.)

1 Like