Anyone here back porting C compilers to old machines?

I finally lost patience with the terrible C compiler on the Tek4404 and am now porting an (old) version of gcc. It’s proving challenging because the native compiler just explodes with anything but the most trivial expressions - and compiling compilers generates non-trivial expressions :wink:

I spent last week “pre-masticating” files on a Mac so all includes were resolved into a single file and replaced the Tek4404 cpp with a passthru since it frequently core dumped just reading thru the file…

But after rewriting / strength reducing lots of the large expressions and finally getting it to compile, the code the native compiler produces is wrong and it crashes anyway…
I even tried pasting an enormous expression into ChatGPT and asking for simple K&R, and it came out with something that “looked” sane… But it failed to compile. :frowning:

My latest attack is to use gcc-cross-m68k on my Mac to emit assembler that I can hopefully get translated on the Tek4404 without the assembler losing the will to live also…

So, if there are others walking this road, be great to chat!

I’d forgotten how big the expressions generated by compiling compilers do get… see below.

if (((((y)->code) == LABEL_REF || ((y)->code) == SYMBOL_REF || ((y)->code) == CONST_INT || ((y)->code) == CONST) || (((y)->code) == REG && ((((y)->fld[0].rtint) & ~027) != 0)) || ((((y)->code) == PRE_DEC || ((y)->code) == POST_INC) && (((((y)->fld[0].rtx))->code) == REG) && ((((((y)->fld[0].rtx))->fld[0].rtint) & ~027) != 0)) || (((y)->code) == PLUS && (((((y)->fld[0].rtx))->code) == REG) && ((((((y)->fld[0].rtx))->fld[0].rtint) & ~027) != 0) && ((((y)->fld[1].rtx))->code) == CONST_INT && ((unsigned) ((((y)->fld[1].rtx))->fld[0].rtint) + 0x8000) < 0x10000))) goto win; }; { { if (((y)->code) == PLUS && (((((((y)->fld[0].rtx))->code) == REG && ((((((y)->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8)) || (((((y)->fld[0].rtx))->code) == SIGN_EXTEND && ((((((y)->fld[0].rtx))->fld[0].rtx))->code) == REG && ((((((y)->fld[0].rtx))->fld[0].rtx))->mode) == HImode && ((((((((y)->fld[0].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8))) || ((target_flags & 1) && ((((y)->fld[0].rtx))->code) == MULT && ((((((((y)->fld[0].rtx))->fld[0].rtx))->code) == REG && ((((((((y)->fld[0].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8)) || (((((((y)->fld[0].rtx))->fld[0].rtx))->code) == SIGN_EXTEND && ((((((((y)->fld[0].rtx))->fld[0].rtx))->fld[0].rtx))->code) == REG && ((((((((y)->fld[0].rtx))->fld[0].rtx))->fld[0].rtx))->mode) == HImode && ((((((((((y)->fld[0].rtx))->fld[0].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8))) && ((((((y)->fld[0].rtx))->fld[1].rtx))->code) == CONST_INT && (((((((y)->fld[0].rtx))->fld[1].rtx))->fld[0].rtint) == 2 || ((((((y)->fld[0].rtx))->fld[1].rtx))->fld[0].rtint) == 4 || ((((((y)->fld[0].rtx))->fld[1].rtx))->fld[0].rtint) == 8)))) { { if (((((y)->fld[1].rtx))->code) == LABEL_REF) goto win; if (((((y)->fld[1].rtx))->code) == REG && ((((((y)->fld[1].rtx))->fld[0].rtint) & ~027) != 0)) goto win; }; } if (((y)->code) == PLUS && (((((((y)->fld[1].rtx))->code) == REG && ((((((y)->fld[1].rtx))->fld[0].rtint) ^ 020) >= 8)) || (((((y)->fld[1].rtx))->code) == SIGN_EXTEND && 
((((((y)->fld[1].rtx))->fld[0].rtx))->code) == REG && ((((((y)->fld[1].rtx))->fld[0].rtx))->mode) == HImode && ((((((((y)->fld[1].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8))) || ((target_flags & 1) && ((((y)->fld[1].rtx))->code) == MULT && ((((((((y)->fld[1].rtx))->fld[0].rtx))->code) == REG && ((((((((y)->fld[1].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8)) || (((((((y)->fld[1].rtx))->fld[0].rtx))->code) == SIGN_EXTEND && ((((((((y)->fld[1].rtx))->fld[0].rtx))->fld[0].rtx))->code) == REG && ((((((((y)->fld[1].rtx))->fld[0].rtx))->fld[0].rtx))->mode) == HImode && ((((((((((y)->fld[1].rtx))->fld[0].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8))) && ((((((y)->fld[1].rtx))->fld[1].rtx))->code) == CONST_INT && (((((((y)->fld[1].rtx))->fld[1].rtx))->fld[0].rtint) == 2 || ((((((y)->fld[1].rtx))->fld[1].rtx))->fld[0].rtint) == 4 || ((((((y)->fld[1].rtx))->fld[1].rtx))->fld[0].rtint) == 8)))) { { if (((((y)->fld[0].rtx))->code) == LABEL_REF) goto win; if (((((y)->fld[0].rtx))->code) == REG && ((((((y)->fld[0].rtx))->fld[0].rtint) & ~027) != 0)) goto win; }; } }; if (((y)->code) == PLUS) { if (((((y)->fld[1].rtx))->code) == CONST_INT && (unsigned) ((((y)->fld[1].rtx))->fld[0].rtint) + 0x80 < 0x100) { rtx go_temp = ((y)->fld[0].rtx); { if (((go_temp)->code) == PLUS && (((((((go_temp)->fld[0].rtx))->code) == REG && ((((((go_temp)->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8)) || (((((go_temp)->fld[0].rtx))->code) == SIGN_EXTEND && ((((((go_temp)->fld[0].rtx))->fld[0].rtx))->code) == REG && ((((((go_temp)->fld[0].rtx))->fld[0].rtx))->mode) == HImode && ((((((((go_temp)->fld[0].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8))) || ((target_flags & 1) && ((((go_temp)->fld[0].rtx))->code) == MULT && ((((((((go_temp)->fld[0].rtx))->fld[0].rtx))->code) == REG && ((((((((go_temp)->fld[0].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8)) || (((((((go_temp)->fld[0].rtx))->fld[0].rtx))->code) == SIGN_EXTEND && ((((((((go_temp)->fld[0].rtx))->fld[0].rtx))->fld[0].rtx))->code) == REG 
&& ((((((((go_temp)->fld[0].rtx))->fld[0].rtx))->fld[0].rtx))->mode) == HImode && ((((((((((go_temp)->fld[0].rtx))->fld[0].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8))) && ((((((go_temp)->fld[0].rtx))->fld[1].rtx))->code) == CONST_INT && (((((((go_temp)->fld[0].rtx))->fld[1].rtx))->fld[0].rtint) == 2 || ((((((go_temp)->fld[0].rtx))->fld[1].rtx))->fld[0].rtint) == 4 || ((((((go_temp)->fld[0].rtx))->fld[1].rtx))->fld[0].rtint) == 8)))) { { if (((((go_temp)->fld[1].rtx))->code) == LABEL_REF) goto win; if (((((go_temp)->fld[1].rtx))->code) == REG && ((((((go_temp)->fld[1].rtx))->fld[0].rtint) & ~027) != 0)) goto win; }; } if (((go_temp)->code) == PLUS && (((((((go_temp)->fld[1].rtx))->code) == REG && ((((((go_temp)->fld[1].rtx))->fld[0].rtint) ^ 020) >= 8)) || (((((go_temp)->fld[1].rtx))->code) == SIGN_EXTEND && ((((((go_temp)->fld[1].rtx))->fld[0].rtx))->code) == REG && ((((((go_temp)->fld[1].rtx))->fld[0].rtx))->mode) == HImode && ((((((((go_temp)->fld[1].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8))) || ((target_flags & 1) && ((((go_temp)->fld[1].rtx))->code) == MULT && ((((((((go_temp)->fld[1].rtx))->fld[0].rtx))->code) == REG && ((((((((go_temp)->fld[1].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8)) || (((((((go_temp)->fld[1].rtx))->fld[0].rtx))->code) == SIGN_EXTEND && ((((((((go_temp)->fld[1].rtx))->fld[0].rtx))->fld[0].rtx))->code) == REG && ((((((((go_temp)->fld[1].rtx))->fld[0].rtx))->fld[0].rtx))->mode) == HImode && ((((((((((go_temp)->fld[1].rtx))->fld[0].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8))) && ((((((go_temp)->fld[1].rtx))->fld[1].rtx))->code) == CONST_INT && (((((((go_temp)->fld[1].rtx))->fld[1].rtx))->fld[0].rtint) == 2 || ((((((go_temp)->fld[1].rtx))->fld[1].rtx))->fld[0].rtint) == 4 || ((((((go_temp)->fld[1].rtx))->fld[1].rtx))->fld[0].rtint) == 8)))) { { if (((((go_temp)->fld[0].rtx))->code) == LABEL_REF) goto win; if (((((go_temp)->fld[0].rtx))->code) == REG && ((((((go_temp)->fld[0].rtx))->fld[0].rtint) & ~027) != 0)) goto win; 
}; } }; } if (((((y)->fld[0].rtx))->code) == CONST_INT && (unsigned) ((((y)->fld[0].rtx))->fld[0].rtint) + 0x80 < 0x100) { rtx go_temp = ((y)->fld[1].rtx); { if (((go_temp)->code) == PLUS && (((((((go_temp)->fld[0].rtx))->code) == REG && ((((((go_temp)->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8)) || (((((go_temp)->fld[0].rtx))->code) == SIGN_EXTEND && ((((((go_temp)->fld[0].rtx))->fld[0].rtx))->code) == REG && ((((((go_temp)->fld[0].rtx))->fld[0].rtx))->mode) == HImode && ((((((((go_temp)->fld[0].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8))) || ((target_flags & 1) && ((((go_temp)->fld[0].rtx))->code) == MULT && ((((((((go_temp)->fld[0].rtx))->fld[0].rtx))->code) == REG && ((((((((go_temp)->fld[0].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8)) || (((((((go_temp)->fld[0].rtx))->fld[0].rtx))->code) == SIGN_EXTEND && ((((((((go_temp)->fld[0].rtx))->fld[0].rtx))->fld[0].rtx))->code) == REG && ((((((((go_temp)->fld[0].rtx))->fld[0].rtx))->fld[0].rtx))->mode) == HImode && ((((((((((go_temp)->fld[0].rtx))->fld[0].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8))) && ((((((go_temp)->fld[0].rtx))->fld[1].rtx))->code) == CONST_INT && (((((((go_temp)->fld[0].rtx))->fld[1].rtx))->fld[0].rtint) == 2 || ((((((go_temp)->fld[0].rtx))->fld[1].rtx))->fld[0].rtint) == 4 || ((((((go_temp)->fld[0].rtx))->fld[1].rtx))->fld[0].rtint) == 8)))) { { if (((((go_temp)->fld[1].rtx))->code) == LABEL_REF) goto win; if (((((go_temp)->fld[1].rtx))->code) == REG && ((((((go_temp)->fld[1].rtx))->fld[0].rtint) & ~027) != 0)) goto win; }; } if (((go_temp)->code) == PLUS && (((((((go_temp)->fld[1].rtx))->code) == REG && ((((((go_temp)->fld[1].rtx))->fld[0].rtint) ^ 020) >= 8)) || (((((go_temp)->fld[1].rtx))->code) == SIGN_EXTEND && ((((((go_temp)->fld[1].rtx))->fld[0].rtx))->code) == REG && ((((((go_temp)->fld[1].rtx))->fld[0].rtx))->mode) == HImode && ((((((((go_temp)->fld[1].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8))) || ((target_flags & 1) && ((((go_temp)->fld[1].rtx))->code) == MULT && 
((((((((go_temp)->fld[1].rtx))->fld[0].rtx))->code) == REG && ((((((((go_temp)->fld[1].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8)) || (((((((go_temp)->fld[1].rtx))->fld[0].rtx))->code) == SIGN_EXTEND && ((((((((go_temp)->fld[1].rtx))->fld[0].rtx))->fld[0].rtx))->code) == REG && ((((((((go_temp)->fld[1].rtx))->fld[0].rtx))->fld[0].rtx))->mode) == HImode && ((((((((((go_temp)->fld[1].rtx))->fld[0].rtx))->fld[0].rtx))->fld[0].rtint) ^ 020) >= 8))) && ((((((go_temp)->fld[1].rtx))->fld[1].rtx))->code) == CONST_INT && (((((((go_temp)->fld[1].rtx))->fld[1].rtx))->fld[0].rtint) == 2 || ((((((go_temp)->fld[1].rtx))->fld[1].rtx))->fld[0].rtint) == 4 || ((((((go_temp)->fld[1].rtx))->fld[1].rtx))->fld[0].rtint) == 8)))) { { if (((((go_temp)->fld[0].rtx))->code) == LABEL_REF) goto win; if (((((go_temp)->fld[0].rtx))->code) == REG && ((((((go_temp)->fld[0].rtx))->fld[0].rtint) & ~027) != 0)) goto win; }; } }; } } }; };
** FATAL ERROR: Internal table overflow - “switch” item count.
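For anyone wondering what “rewriting / strength reducing” an expansion like the one above might look like, here is a minimal sketch. It pulls three of the repeated sub-tests into small functions so that no single expression gets deep. The struct is a hypothetical, cut-down stand-in for gcc’s rtx (the real definitions are in gcc’s rtl.h), the function names are made up for illustration, and the code is written in ANSI C for readability; a K&R-era compiler would need old-style parameter lists and may not have enum or assert.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for gcc's rtx types. */
enum rtx_code { REG, CONST_INT, PLUS, LABEL_REF };

typedef struct rtx_def *rtx;
union rtunion { int rtint; struct rtx_def *rtx; };
struct rtx_def { enum rtx_code code; union rtunion fld[2]; };

/* Mirrors: code == REG && (fld[0].rtint & ~027) != 0 */
int is_base_reg(rtx x)
{
    assert(x != NULL);
    if (x->code != REG)
        return 0;
    return (x->fld[0].rtint & ~027) != 0;
}

/* Mirrors: code == CONST_INT && (unsigned) value + 0x8000 < 0x10000,
   i.e. the value fits in a signed 16-bit displacement. */
int is_disp16(rtx x)
{
    assert(x != NULL);
    if (x->code != CONST_INT)
        return 0;
    return (unsigned) x->fld[0].rtint + 0x8000 < 0x10000;
}

/* Mirrors: code == PLUS with a base register and a 16-bit displacement.
   One test per statement keeps every expression trivially small. */
int is_reg_plus_disp16(rtx y)
{
    assert(y != NULL);
    if (y->code != PLUS)
        return 0;
    if (!is_base_reg(y->fld[0].rtx))
        return 0;
    return is_disp16(y->fld[1].rtx);
}
```

The first big `if` in the dump would then reduce to a handful of calls such as `if (is_reg_plus_disp16(y)) goto win;`, at the cost of some function-call overhead.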


oh that’s truly horrible! I am wondering if gcc is a good choice though - wasn’t it always a bit of a monster? Is there a simpler compiler to choose, to try to bootstrap??


gcc1.27. It’s from when it was lovely and simple.

Matt Dillon’s DICE C, originally for Amiga, can target generic 68K. BSD licence, source here: APOLLO - FREESRC


The Portable C Compiler (PCC) might be a good choice for compiling for older machines if you have a u**x platform. Targets include PDP 7, 10, 11, VAX, NOVA and ’80s CPUs.


Sozobon C was written for the Atari ST, which had a 68k architecture. It was distributed as freeware, and you can find the source easily online; I don’t know the license. It compiles K&R C, but there’s a version that compiles ANSI C on GitHub under the GPL.

The other two suggestions, DICE C and PCC, seem like better options, but if for some reason those don’t work, you could try this one.


In addition to the compilers already mentioned, there is of course tcc: TCC : Tiny C Compiler

When Bellard was working on it (it is now maintained on a mailing list), it emitted only x86, but it is tiny (duh), so retargeting might be easier.


Also, for presumably easy retargetability: https://sdcc.sourceforge.net

Friend who maintains a C compiler (Tendra C) suggests pcc might be the best option in this case.


Good luck with finding a good 32 bit 68000 compiler as most stuff is all 64 bit addressing.

Current gcc for 68000 is extremely good and seems well maintained. For Fuzix I am using gcc 12 with great results. In fact the only bugs I’ve had in 12 have been it being too clever and optimizing stuff in legal ways I hadn’t allowed for. There are some “complications” in building it where certain patterns of build options get you a 68000 compiler with a 68020 support library (see the notes in the Fuzix tree).

ANSI pcc isn’t a bad compiler option for some things, but it generates much poorer code and is far less tested.

For smaller machines I got fed up with the existing options, having retargeted cc65 to the 6800 series, so I wrote my own, which at this point is doing 8080/8085 (using all the undocumented stuff)/Z80, with the 65xx and 68xx ports slowly being progressed.

It also has the advantage that it can build itself on an 8-bit machine under Fuzix as it fits in 64K - although it’s not terribly fast (a lot of hashing and other stuff needs adding to speed bits up).


SDCC is not fun to work with in my experience. It’s continually changing so quite unstable and difficult to get patches accepted. It is the best Z80 compiler on the planet by miles (by lunar orbits more like). It also dumps a lot on the backend because internally the core of the compiler still thinks all the world is a PIC.

TCC has been retargeted (e.g. to the 65C816).


Another option is LCC, which seems easy to retarget to new processors. It also had a book, but the current code has changed quite a bit from what is described there, if I understood correctly. It is also not open source, since the license puts restrictions on selling it.

In terms of complexity it is much larger than TCC but much smaller than GCC.

All much appreciated advice. Thanks all.

I would say gcc1.27 still retained the simplicity that was subsequently lost, i.e. none of the byzantine binutils stuff.

The “gcc” executable is just 4 files, cpp is 5 files, cc1 is 20-odd files.

You can “get” what the whole codebase is trying to do.

Just to be clear, your problem is not finding a C compiler that generates good code, but finding a C compiler that generates good code which you can port to and run on the Tek4404 itself, when bootstrapped using the Tek4404’s native compiler? (Correct me if I’ve misunderstood your comments.) Why not bootstrap from a modern system and ignore the Tek4404’s native compiler completely? gcc3 produced reasonable 68K code (e.g. https://www.doc.ic.ac.uk/~phjk/CompilersCourse/SampleCode/68000/BubbleSort-68000-assembler.s )
Alternatively, can you get your hands on some 1980s/1990s native compiler and use that instead? I remember Aztec C was reasonable when I used to use it on an early Mac, for example.
I’ve no doubt you’ve done your own searches but from seeing how many different 68K compilers are out there from a quick Google search, I have to ask what your criteria are for a compiler that meant that none of those were appropriate for your application?
That said, since you’re looking for a C compiler that has a chance of being ported, here’s one: https://gtoal.com/compilers101/c68k/ (ZIP here) (Original source here)

Tek4404 has a native C compiler that is based on a 1979 K&R version. It can compile simple programs (I’ve written a window system for it) but, for example, has a cpp that predates “defined” and barfs on #ifdefs for undefined symbols. And the code it generates is generally poor (though it does seem to do something with the register keyword) …and it’s not always correct.
So when I ported uemacs to the Tek4404 it involved lots of manual fiddling to get it building.

The object and executable format is proprietary so I need the last mile (ie asm) to be processed with the native tools - or reverse engineer those formats too - prefer not to.

Currently trying to compile with gcc-cross-m68k -S on my Mac, munging the assembler syntax into what the Tek4404 requires and assembling object files on the Tek4404. So far it’s working, but I have not got to the really big expression tree files.
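To give a flavour of the “munging” step, here is a rough sketch of one such rewrite as a line filter. It assumes the cross compiler emits %-prefixed register names (as GNU m68k ELF toolchains commonly do) and that the target assembler wants bare names; both are assumptions, the function name is made up, and the real Tek4404 syntax may need quite different rewrites.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Copy one line of assembler from src to dst, dropping a '%' that is
   immediately followed by a letter (e.g. "%d0" becomes "d0").
   ASSUMPTION: the target assembler wants bare register names.
   Returns the length of the result; output is truncated to fit dst. */
size_t strip_reg_prefix(const char *src, char *dst, size_t dstsize)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL && dstsize > 0);
    while (*src != '\0' && n + 1 < dstsize) {
        /* A '%' not followed by a letter is copied through unchanged. */
        if (*src == '%' && isalpha((unsigned char) src[1])) {
            src++;
            continue;
        }
        dst[n++] = *src++;
    }
    dst[n] = '\0';
    return n;
}
```

This is deliberately naive: it would also mangle a `%s` inside a string literal in the assembler output, so a real filter would have to skip quoted strings.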

[Added bonus is Tek4404 has a NatSemi FPU that is accessed by doing a kernel call for each operation. My eventual plan is to find a way of mapping the FPU into user space and pinging it directly. Needless to say calling the OS for each FP op is horribly slow. So at some point I would do a second pass to get the gnulib to use that FPU.]

So anything that requires a native C compiler will probably be a problem. Not just the complexity of the expressions but also the nagging feeling the compiler is generating the wrong code. It’s true I could have another run with a different compiler - and I may have to - but I’m going to see if I can make the ‘just use the assembler’ workflow work.

I guess I also have a bit of soft spot for gcc too. I had programmed solely in (1802,z80,6502,6800,68k) assembler for many years before using C compilers but in the late 80s when I saw gcc use movem.l and use spare address registers to cache data values rather than spill to memory, I was sort of in awe - and saw the writing on the wall regarding writing in assembly.

Reversing binary and sometimes object formats is normally fairly easy, as most platform binary formats were not terribly imaginative and back then didn’t worry about the joy of ELF and shared library dynamic bindings and other crap.

I’ve reversed bits simply by doing things like assembling short instruction sequences to get all the relocations in the binary, then creating data and bss objects of different sizes to see how the headers work.
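A minimal sketch of the comparison step, assuming you have already read two output files of equal length into memory (for example, the same short sequence assembled with one symbol moved). The function name and interface are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Record up to max offsets at which two equal-length images differ.
   Returns the total number of differing bytes, which may exceed max. */
size_t diff_offsets(const unsigned char *a, const unsigned char *b,
                    size_t len, size_t *out, size_t max)
{
    size_t i, count = 0;

    assert(a != NULL && b != NULL);
    for (i = 0; i < len; i++) {
        if (a[i] != b[i]) {
            if (out != NULL && count < max)
                out[count] = i;   /* remember where the images diverge */
            count++;
        }
    }
    return count;
}
```

Offsets that change when only a symbol’s address changes are good candidates for relocated fields; offsets that change when only the data or bss size changes point at the header’s size fields.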


There’s a talk about SDCC at FOSDEM today:

Looking at the slides, it seems that it will give an overview of the state of the compiler on various systems.


I had to write a Tek4404 exe-format-to-ELF converter to be able to use Ghidra, so perhaps you’re right that the time spent doing this would be easily saved, with cross-compile times of seconds on a Mac vs many minutes on the Tek4404.

I’ve now implemented an ELF32 to Uniflex relocatable file converter so I can cross compile using gcc on my Mac and generate .r files that I just have to link on the Tektronix 4404.
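For anyone attempting something similar, the input side of such a converter might start like the sketch below. It is not the actual converter: it only validates the ELF32 header and pulls out where the section header table lives, using the standard ELF field offsets and constants. The function names are made up, and nothing here touches the Uniflex output format.

```c
#include <assert.h>
#include <stddef.h>

/* Read big-endian fields byte by byte, independent of host byte order. */
unsigned be16(const unsigned char *p)
{
    return ((unsigned) p[0] << 8) | p[1];
}

unsigned long be32(const unsigned char *p)
{
    return ((unsigned long) p[0] << 24) | ((unsigned long) p[1] << 16)
         | ((unsigned long) p[2] << 8) | (unsigned long) p[3];
}

/* Return 1 if buf starts with an ELF32, big-endian, relocatable
   m68k header, else 0. */
int is_m68k_elf32_rel(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 52)                       /* size of an ELF32 header */
        return 0;
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return 0;
    if (buf[4] != 1)                    /* EI_CLASS: ELFCLASS32 */
        return 0;
    if (buf[5] != 2)                    /* EI_DATA: ELFDATA2MSB */
        return 0;
    if (be16(buf + 16) != 1)            /* e_type: ET_REL */
        return 0;
    return be16(buf + 18) == 4;         /* e_machine: EM_68K */
}

/* e_shoff: file offset of the section header table. */
unsigned long elf32_shoff(const unsigned char *buf) { return be32(buf + 32); }

/* e_shnum: number of section headers. */
unsigned elf32_shnum(const unsigned char *buf) { return be16(buf + 48); }
```

From there a converter would walk the section headers to find the code, data, symbol and relocation sections and re-emit them in the target format.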

And I am getting very close to having gcc up and running on it! Woot. It’s been a brutal experience but it’s going to be awesome to have a decent compiler toolchain.

firstcompile
