Bootstrapping the GNU Compiler Collection

I wonder what C compiler GCC was originally written in?
CP/M machines often had 48 KB of memory, just ample for a compiler.
I have felt GCC was like Mozilla: rewrite something every few months, add new marketing features, and drop older features. The same goes for my free FPGA software.
Back then, I suspect, most things in the early 60s were being done for the first time, on machines that had so little memory that everything was swapped to disc. TSS/8 for the PDP-8 was a good example. Writing code back then was different from today, where you use memory without care, ignoring leaks and virtual-memory thrashing.
I guess having a GUI of any kind doubles your program size just for I/O, and triples the debugging time.
Ben.

We’re drifting off topic to be sure.

GCC was written to be compiled with pcc, the “portable C compiler” that was readily available on Unix systems. It was meant to be compiled once, into a new binary, then compiled again with itself. I’ve done this myself back in the day.

Nowadays, GCC can host itself, naturally, but it’s still able to be compiled with very old versions of the compiler.

Versions of GCC prior to 11 also allow bootstrapping with an ISO C++98 compiler, versions of GCC prior to 4.8 also allow bootstrapping with an ISO C89 compiler, and versions of GCC prior to 3.4 also allow bootstrapping with a traditional (K&R) C compiler.

So, any GCC prior to 3.4 can be compiled with a K&R compiler.

So, if you have pcc (a K&R compiler), you can build the latest version of GCC by bootstrapping the earlier versions of GCC until you can build the current compiler.
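To make concrete what a “traditional (K&R) compiler” will and won’t accept, here’s a rough sketch contrasting an old-style function definition with its ANSI/ISO equivalent. The function itself is just an illustration, not anything from GCC’s sources; the point is that pre-ANSI compilers like pcc only understand the first form.

```c
/* Old-style (K&R) definition: no prototype, parameter types declared
 * separately. This is the form a traditional compiler like pcc accepts. */
int add(a, b)
int a;
int b;
{
    return a + b;
}

/* The same function as an ANSI/ISO C89 definition with a prototype.
 * A strict K&R compiler would reject this syntax. */
int add_ansi(int a, int b)
{
    return a + b;
}

int main(void)
{
    /* Both forms behave identically once compiled. */
    return add(2, 3) == add_ansi(2, 3) ? 0 : 1;
}
```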

There is an effort to create a very simple, auditable tool stack for building tools like GCC. They start with the simplest of languages (such as very crude Lisps), in which something like an early K&R C compiler can be written. These tools don’t have to be fast or generate good code; they just have to be functional. They don’t even have to produce machine code if they can run on an audited VM. A crude Lisp builds a crude C compiler, which builds an early GCC, and then you’re off to the races.

Why do they want to do this? Because of Reflections on Trusting Trust.
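For anyone who hasn’t read the paper, the worry can be shown with a toy sketch. This is purely a conceptual illustration (all names and strings are made up, it is nobody’s real compiler): a compromised compiler recognizes certain source code and quietly rewrites it before compiling, and in the full attack it also recognizes its own source and re-inserts that logic, so inspecting the source of either program turns up nothing.

```c
/* Toy illustration of the "Trusting Trust" problem: a compromised
 * "compiler" front end inspects the source it is given and silently
 * patches anything that looks like a password check. */
#include <stdio.h>
#include <string.h>

/* Stand-in for the real compilation step. */
static void compile(const char *source)
{
    printf("compiling: %s\n", source);
}

static void compromised_compile(const char *source)
{
    if (strstr(source, "check_password")) {
        /* Quietly inject a backdoor; the on-disk source stays clean. */
        char patched[512];
        snprintf(patched, sizeof patched,
                 "%s /* injected: also accept \"letmein\" */", source);
        compile(patched);
        return;
    }
    /* The full attack would also recognize the compiler's own source
     * here and re-insert this injection logic into the new binary. */
    compile(source);
}

int main(void)
{
    compromised_compile("int check_password(const char *pw) { /* ... */ }");
    return 0;
}
```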

You could write your own compiler, but home-grown ones tend not to be very good. There has been plenty of discussion of how awful Z80 C compilers are (partly due to the Z80 being a Z80, and partly due to the difficulty of writing a good compiler on a system as memory-constrained as a stock CP/M machine).

I don’t know of any real features GCC has dropped. At most they may have dropped support for a CPU whose port hadn’t been maintained for 10 years. There’s a vast array of legacy code written for GCC that nobody wants to see broken.


Maintained for 10 years? Ha ha.
PDP-8s are still going strong from the late 1960s. The problem with the 8-bitters is that they were designed as microcontrollers, not general-purpose computers like a PDP-11 or an IBM 360 (not counting decimal math here), and they don’t have the opcode space for multiple data-size operations.

  1. Correct code is what is wanted.
  2. Faster code means caching of data, and that may make one architecture perform better than another for some problems, but not all of them.

I am a fan of 36 bits, but that is way off topic.
Ben

I don’t know anybody who ignores leaks or virtual memory thrashing; both are pretty painful. These days a lot of people go so far as to choose languages that can’t leak, even at a fairly substantial performance penalty.

As for using memory without care, well, everybody does that when the amounts of memory they’re using are small relative to what’s available. You could accuse the first-generation MSX machines of 1983 (32K RAM, 16K ROM) of using memory without care, because their code wouldn’t even have fit in a typical machine from 1977, and their data would have used up most of its RAM.

But that’s the smart way to do it. Sure, for a parser I was working on the other day I could have built a system that would let me read in just a few characters at a time, instead of slurping the entire file (dozens of kilobytes!) into memory. But spending extra time to put extra code and, no doubt, extra bugs into the system doesn’t seem to help anyone.
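For what it’s worth, the “slurp the whole file” approach is only a handful of lines in C. This is just a rough sketch of the idea (the filename is a placeholder and error handling is kept minimal), not the actual parser mentioned above:

```c
/* Minimal sketch: read an entire file into one malloc'd buffer rather
 * than streaming it a few characters at a time. */
#include <stdio.h>
#include <stdlib.h>

char *slurp(const char *path, long *len_out)
{
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;

    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    rewind(f);

    char *buf = malloc(len + 1);
    if (buf && fread(buf, 1, len, f) == (size_t)len) {
        buf[len] = '\0';              /* convenient for text parsers */
        if (len_out) *len_out = len;
    } else {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    return buf;
}

int main(void)
{
    long len;
    char *text = slurp("input.txt", &len);   /* example filename */
    if (text) {
        printf("read %ld bytes\n", len);
        free(text);
    }
    return 0;
}
```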

But this is a pretty interesting topic in its own right, so I’ve split it into another thread! :slight_smile:


The first GNU compiler was based on Pastel, a dialect of Pascal.

Interesting, and new to me:

Indeed, if you have memory, use it. There’s no sense artificially forcing yourself to use less memory if you don’t have to. Unless you are doing it as a challenge, that is.

Memory leaks are definitely still an issue, mainly because, being leaks, they keep growing and growing. At some point performance starts to suffer and then people start complaining. The more memory you have, the longer you can go before a problem occurs, but you will eventually hit that wall.
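A leak is just an allocation with no matching free on some code path, which is why it only ever grows with uptime. A trivial sketch (the request handler and sizes are made up for illustration):

```c
/* Toy leak: every "request" allocates a record and forgets to free it,
 * so resident memory grows for as long as the program runs. */
#include <stdlib.h>

struct record { char payload[4096]; };

static void handle_request(void)
{
    struct record *r = malloc(sizeof *r);
    /* ... use r ... */
    (void)r;                     /* bug: never freed */
}

int main(void)
{
    for (int i = 0; i < 10000; i++)
        handle_request();        /* ~40 MB leaked; a long-running
                                    server doing this never stops growing */
    return 0;
}
```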

In some ways that’s probably become more of an issue. If you are writing a program for a non-multitasking disk operating system or a cartridge-based system, you can kind of go crazy as long as you don’t blow yourself up. These days there are so many other programs competing for resources that you have to be a bit more careful, especially if you are working on a web application, where you have to play in the same sandbox as a JavaScript framework, a JavaScript engine, a browser, and an operating system.

But that is not always the best case. Look at reading a big EPUB, say on a tablet: why must I wait for it to read the whole thing in just to get to chapter 1 of my book, or have it crash with no memory? My other gripe: why do they keep revising the standard for C? Name it something new and give that its own front end. RATFOR, for example, has been dropped from GCC so as to be compatible with newer standards.

C has changed very little, except twice: the initial standardization of C in 1989, and the 1999 ISO C revision. In both cases most of the changes to the language itself were about tightening up the vast array of possibilities for incompatibility in things that had been left to implementer or platform choice. ANSI C additionally standardized the I/O library (which I think we can all agree was a meaningful step forward), and C99 did introduce some additional standard library (much of it, by code volume, revolving around multibyte strings).
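As a rough illustration of the kind of thing C99 added on top of C89, here is a small sketch showing a few of the well-known additions (it is by no means exhaustive; none of this is valid C89, but the C89 equivalent is a mechanical rewrite):

```c
/* A few C99 additions that C89 lacks. */
#include <stdint.h>   /* fixed-width integer types (C99) */
#include <stdio.h>

struct point { int x, y; };

int main(void)
{
    struct point p = { .x = 3, .y = 4 };   /* designated initializer (C99) */
    int32_t sum = 0;                       /* fixed-width type (C99) */

    for (int i = 0; i < 10; i++)           /* loop-scoped declaration (C99) */
        sum += i;

    printf("%d %d %d\n", p.x, p.y, (int)sum);   // line comment (C99)
    return 0;
}
```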

C11 and C17 were both very minor in comparison, although they did introduce some additional thread-aware and threading constructs (which, the last time I checked, had literally zero implementations, but it’s been a couple of years).

All in all, the changes to the C standard are smaller than the differences between implementations of the C standard (and certainly between implementations of C prior to ANSI C).

What has changed more is POSIX, if you think of C as an implementation language for Unix-like systems; POSIX has expanded greatly in scope, as well as tightened up a lot of ambiguities, since it was first standardized.

Very interesting about Pastel, I hadn’t heard about that.

A quote from one of the references listed in the Wikipedia article Ed pointed to:

" (1998 historical note: at one point in the project Richard Stallman visited, and had the Pastel compiler explained to him. He left with a copy of the source, and used it to produce the Gnu C compiler. Most of the techniques that gave the Gnu C compiler its reputation for good code generation came from the Amber Pastel compiler.)"

(And searching for “pastel compiler” with Google turns up a lot of interesting historical links.)

Edit: And something more accurate here, in a quote from RMS: History - GCC Wiki - see “The Early Days”


The problem isn’t using too much memory or too little memory, but using it improperly. If the user isn’t going to see the next page for a minute or so, don’t waste time loading it into memory. On the other hand, if you are building a parser like cjs talked about, and you are going to have the whole file in memory at some point anyway, why waste time loading it in little chunks? Having a whole bunch of memory does give developers a chance to be lazy, but that laziness is only a problem if it interferes with performance.

Performance is a feature.

(It would be good if we could return to sharing interesting findings and experiences! No argument here is worth winning. Try to post the sorts of things that others here might enjoy reading. Posting a gripe is not likely to bring joy.)