Post by Raymond C on the history of The Applesoft Compiler, including the tricks to make it fit into memory. Like the writer of the compiler, I once wrote a a script in Perl, left it alone and when I came back to it one day, I couldn’t understand what I was doing in the script.
Back in the early 1980’s, the Apple ][ computer was taking the personal computing world by storm, and Microsoft released a compiler for Applesoft BASIC. That compiler went by the name TASC: The Applesoft Compiler.
TASC was written by one person in his dorm room while studying at MIT. Microsoft licensed the product and hired the author, who spent the summer at the Northup building¹ polishing the code.
As the author added features, he kept hitting the Apple ][‘s 48KB RAM limit and was forced to delete all the comments from the code, and when that wasn’t enough, he resorted to shortening all the important variable names to one character.
Such is the desperation of developing on a system with very tight memory constraints.
Everything was working smoothly, until the author returned to school for a semester. Upon returning to Microsoft, he found that he no longer understood the code. He had a sprawling compiler, with no comments, and unhelpful variable names.
Yet somehow, he finished TASC, and it shipped.
Huh, didn’t know about this at the time. I used an Applesoft Basic compiler once, because I wanted to speed up a large program I’d written. It was around 800 LOC. The compiler was called “Einstein.” I was surprised at how long it took, something like an hour (so, about 13 LOC per minute). The main difference I remember about it from what’s described of TASC was I don’t remember it doing multiple passes. Or if it did, I didn’t notice.
I needed to modify my program a bit to make it compile. It was possible to base a DIM statement on a variable, so I could vary the amount of memory I allocated at run-time. The compiler didn’t like this. So, I made a separate version where I picked reasonable values, and substituted literals in these DIM statements.
Interesting that a Basic compiler was written in Basic, and that it was used to compile itself. Sounds like quite a feat, given the limitations.
The multiple-pass strategy reminds of what I read about the first Fortran compilers. Even though they ran on mainframes, they had even smaller memory constraints to deal with (maybe 16K). I forget how many passes they had, but it seemed like more than this. It helps explain a story I heard about from my high school computer teacher, who had taken computer science in college in (as I recall) the late '60s. She told us that you’d submit your program on punch cards to the computer administrator, who would schedule it to be run. You might submit it in the afternoon, and not get your results back until 3am, or a few days later, depending how busy the computer was.
The 704 had just 4K of 36-bit memory (in standard configuration)!
FORTRAN I did some nifty things, e.g., from the Wikipedia page:
The FREQUENCY statement was used originally (and optionally) to give branch probabilities for the three branch cases of the arithmetic IF statement. The first FORTRAN compiler used this weighting to perform at compile time a Monte Carlo simulation of the generated code, the results of which were used to optimize the placement of basic blocks in memory—a very sophisticated optimization for its time. The Monte Carlo technique is documented in Backus et al.'s paper on this original implementation, The FORTRAN Automatic Coding System:
The fundamental unit of program is the basic block; a basic block is a stretch of program which has one entry point and one exit point. The purpose of section 4 is to prepare for section 5 a table of predecessors (PRED table) which enumerates the basic blocks and lists for every basic block each of the basic blocks which can be its immediate predecessor in flow, together with the absolute frequency of each such basic block link. This table is obtained by running the program once in Monte-Carlo fashion, in which the outcome of conditional transfers arising out of IF-type statements and computed GO TO’s is determined by a random number generator suitably weighted according to whatever FREQUENCY statements have been provided.
(https://en.wikipedia.org/wiki/Fortran, quoting The FORTRAN Automatic Coding System (1957, PDF-Copy))
Well, I guess, you need a few extra passes for this…