8-bit BASIC virtual machine?

Going back to your original post - for many reasons…

I am a big fan of adventure games - being brought up on the original Colossal Cave, then Zork, Infocom, some of the Scott Adams ones then MUDs. I even wrote my own MUD which ran online for the best part of 20 years. I could write several books on that one…

Some thoughts out-loud - maybe critical but not intending to criticise… Writing adventures and creator systems is something close to my heart too - sadly now I lack the time to take it much further - for now, anyway…

So the aim is to write some sort of adventure game creator and execution system - but you wanted tape?

I’m going to suggest that that’s not feasible or practical today. The Infocom ones needed a back-end disk paging system just to deal with the rich textual environment, so without that you end up with tiny adventures with a minimal wordset, or you start to use compression techniques, which add to the code complexity.

People do still use tapes on old systems though - but almost all I’ve seen are really PCs with sound card outputs and the tapes being digitised and stored as files on the PC. Old computers have a multitude of disk emulators now too - from the “flux” level to complete re-implementations using modern parts. I suspect very few people use actual floppy disks now.

Could you bootstrap it off tape by loading the core interpreter, then the word-list, then the map/database and room descriptions? What about custom code triggered by actions…

Then… Basic or a basic-like language. Because you want others to write their own adventures or because you find it easy to write in Basic yourself… Maybe that’s a limitation?

Writing/developing on the target or an external system? Today’s programmers are often brought up on using an IDE running on their desktop to produce code for different targets, but back in the day you mostly developed on the target machine itself. An exception might be Infocom though - they used a PDP-10 (at least initially) to do the development, then they had a target execution environment for every system. Not sure about the Scott Adams ones, but it looks like he did something similar - a common game file, then custom machine run-times.

Wind on a year or 2 and people were writing in Basic on their home micros - not always fast, but it worked.

Round about the early 90s MUDs started to gain popularity due to this new-fangled thing called “The Internet”. Mostly still text based when the PC ones were moving to graphics. These were mostly written in C, running on “borrowed” work or university systems like Sun workstations and other (typically) Unix systems. Mine first saw the light of the 'net in late 1992…

Where does that leave us today?

I’d love to port my MUD engine to a retro system - but it’s written in C (the engine) and the actual “world definition” is written in an intermediate language - a C-like thing which handles lists and database objects natively. The world is a database which holds text, code or lists of database objects. The code is compiled into a bytecode.

Then there is the parser - if verbs and nouns were good enough for Colossal Cave then …

Kill Dragon
What with, your bare hands?
Yes.
Amazing.

How deep do you want to go? That can become very complex very quickly. For the most part, I stuck to the old verb/noun thing, but it could parse stuff like

unlock door with skeleton key

but it was fairly brute-forced - the parser looked for the keyword “with” then returned a left side and a right side. The left side was searched for in the room contents and the right side in the player’s inventory… Not the best, but very workable, and if it all matched then a function bound to the left side was called. It was good enough.
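
For anyone curious what that brute-force split looks like, here’s a rough C sketch (the object lists and wording are made up, but the “find the word ‘with’, search the left side in the room, the right side in the inventory” shape is the same):

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical contents, just for illustration. */
static const char *room_objects[] = { "door", "unlock door", NULL };
static const char *inventory[]    = { "skeleton key", "lamp", NULL };

/* Return 1 if any entry in the list appears in the given words. */
static int found(const char **list, const char *words)
{
    for (int i = 0; list[i] != NULL; i++)
        if (strstr(words, list[i]) != NULL)
            return 1;
    return 0;
}

/* Brute-force: find " with ", split, check the left side against the
   room contents and the right side against the player's inventory. */
static int parse_with(char *input)
{
    char *with = strstr(input, " with ");
    if (with == NULL)
        return 0;                  /* no "with" - fall back to verb/noun */

    *with = '\0';
    const char *left  = input;     /* e.g. "unlock door"  */
    const char *right = with + 6;  /* e.g. "skeleton key" */

    if (found(room_objects, left) && found(inventory, right)) {
        /* here the function bound to the left-side object would be called */
        printf("You unlock it with the %s.\n", right);
    } else {
        printf("You can't do that.\n");
    }
    return 1;
}

int main(void)
{
    char cmd[] = "unlock door with skeleton key";
    parse_with(cmd);
    return 0;
}
```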

Returning from that rabbit-hole…

Could this be done on an 8-bit micro? Sure. Infocom did it - separate dev. environment, per-system run-time.

The hard part is writing the per-system run-time.

But let’s say we target systems that already have Basic (maybe this is the intention?). Back in the day the magazines published big (and they got very big) posters detailing the differences between Basic dialects on different systems - most systems that ran a Microsoft Basic were very similar: Pet, Apple II, TRS-80, various CP/M systems, OSI/UK101 and so on. The changes were minimal. Other systems with their own Basics were more troublesome - ZX Spectrum, BBC Micro and so on. String handling was often an issue too.

Memory sizes too. The Beeb was criticised for only having 32K of RAM at a time when the Apple II had 64K, the Spectrum 48K and so on; however, you can run Zork on a Beeb…

So - you create a common language/descriptor system, then write the world generator and a set of run-time subroutines for every different Basic platform, then have the generator crank out optimised code for each platform and “link” in (i.e. concatenate) the “library” for each platform, and off you go.
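
As a back-of-envelope illustration of the “crank out code and concatenate a library” step, something like this - the file names and the DATA-statement output format are invented for the example:

```c
#include <stdio.h>

/* Hypothetical world data: three rooms, one description each. */
static const char *rooms[] = {
    "You are in a small brick building.",
    "You are at the end of a road.",
    "You are in a forest.",
};

/* Write the generated BASIC, then append the per-platform runtime
   "library" (e.g. lib-c64.bas - the names are made up). */
static int generate(const char *outfile, const char *libfile)
{
    FILE *out = fopen(outfile, "w");
    FILE *lib = fopen(libfile, "r");
    if (out == NULL || lib == NULL)
        return 1;

    int line = 1000;
    for (int i = 0; i < 3; i++, line += 10)
        fprintf(out, "%d DATA \"%s\"\n", line, rooms[i]);

    int c;
    while ((c = fgetc(lib)) != EOF)   /* "link" = concatenate */
        fputc(c, out);

    fclose(lib);
    fclose(out);
    return 0;
}

int main(void)
{
    return generate("game-c64.bas", "lib-c64.bas");
}
```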

If only life were that simple…

I’d like to do it again, but the target - it would be easy to target modern MS Windows or Mac, or Linux - but the old/retro systems running 8-bit Basics? I’d probably write a run-time to interpret the game world/database, etc. but how many different systems, how long have I got?

Plan B: Make your own little 8-bit SBC with the sole intention of running a text-based adventure game - VGA graphics, USB keyboard, enough RAM - open source or sell it… Uptake… Single digits.

Have you seen Pico-8?

I’ve waffled too long…

-Gordon

2 Likes

Bear in mind that Zork started off on a mainframe that had WAY more memory than home computers of the time, so the whole running off a floppy thing was more or less forced.

In contrast, other people were programming text games that simply fit in memory from the start, because … well … that’s all they had.

That said, here in the USA we had floppies everywhere.

1 Like

I guess, if it’s dedicated to text adventures, it may be worth going down the assembler route for the parser. How many (important) platforms are there? 8080 (including 8085, Z-80), 6502, 6800. These may be compiled from a common C source; the machine-specific part is probably just the memory location of the runtime code and any data sections. Notably, this may be built from the same code as used for any more modern platforms.

With this out of the way, the remaining engine (in BASIC) is really about the UI, room definitions, and high-level stuff. Some of the room/inventory logic may be pushed as byte-code to the same assembler runtime as the parser.

Regarding parsing, you may want to have a look at the “Hobbit” and its “Inglish” parser, which was quite esteemed at the time. There are probably articles on this to be found…

Edit:
The logic behind this: all the internal heavy lifting can be done in machine language; there’s just a handful of platforms to support, since all systems of a type are basically the same as long as the code is running headless. This code is arguably easier to write and share in a language like C. What is machine specific, on the other hand, is really the user interface and any machine specific customizations (e.g., character encoding), which are arguably easiest done in BASIC. Here, BASIC provides already a direct implementation of any machine specific handling and limitations should be less felt.

This really leaves the question of where best to implement anything that is specific to a particular adventure but the same for every platform. Should this be pushed to a common back-engine, or is this better left static and thus implemented in BASIC? As this is probably more of a philosophical question, there is no general advice. But, again, there should be common tasks which run entirely internally and headless, where byte-code may become of interest.
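
To make the byte-code idea concrete, a headless back-end can be little more than a fetch/dispatch loop. Here’s a toy sketch in C with a made-up instruction set - a real one would cover the room/inventory logic mentioned above:

```c
#include <stdio.h>

/* A made-up instruction set, just to show the shape of the headless
   back-end: the BASIC front end handles the UI and hands these
   opcodes to the machine-code runtime. */
enum { OP_HALT, OP_PUSH, OP_ADD, OP_HAS_ITEM, OP_PRINTNUM };

static int inventory_has(int item) { return item == 3; }  /* stub */

static void run(const unsigned char *code)
{
    int stack[32], sp = 0, pc = 0;

    for (;;) {
        switch (code[pc++]) {
        case OP_HALT:     return;
        case OP_PUSH:     stack[sp++] = code[pc++];                     break;
        case OP_ADD:      sp--; stack[sp - 1] += stack[sp];             break;
        case OP_HAS_ITEM: stack[sp - 1] = inventory_has(stack[sp - 1]); break;
        case OP_PRINTNUM: printf("%d\n", stack[--sp]);                  break;
        }
    }
}

int main(void)
{
    /* push item 3, check the inventory, print the result (1), halt */
    const unsigned char code[] = { OP_PUSH, 3, OP_HAS_ITEM, OP_PRINTNUM, OP_HALT };
    run(code);
    return 0;
}
```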

1 Like

Oh sure. Zork itself was/is huge. It was effectively split into 3 for the microcomputer release - using floppy disks.

We had floppies in the UK, but due to the cost (dollar to GBP plus hefty import duties, tax and postage) they were often almost out of the hobbyist’s reach. The popularity of the BBC Micro (late 1981) started to change that. Very few could afford one of the “holy trinity” of Apple/PET/TRS-80 at that time.

As for micro adventures - I do have a Sharp PC1211 (aka TRS80 Pocket Computer) and there were adventure games for it - a challenge in both RAM (1.5KB) and display…

Gordon

1 Like

My first aim is to create a text adventure system that can be run on modern systems and produce games to run on 8-bit tape-based computers and modern platforms.

If I can manage that (which will depend on how quickly I can learn), my next aim would be to do something similar with a general programming language.

The first goal has been achieved before by others. See the ngPAWS site for some active projects.

DAAD Ready is the closest to what I have in mind, as there is a Windows front end and compiler. For each of the tape-based 8-bit computers it supports, there is an interpreter/VM written in assembly code.

Most of the source code is available on GitHub, except for the interpreters. It is possible to request the interpreter source code, but it wouldn’t mean much to me if I did.

There’s another online system called Adventuron that can produce HTML games and export files suitable for compiling with DAAD.

Level 9 Computing had a system that allowed games to be written in A-code then compiled to byte code that could then be run on any tape-based 8-bit computer that they had written an A-machine VM for.

There are plans to release at least some of the source code and documentation for that system, but nobody can say when that might be or how much of the original code survived.

As I understand it, both DAAD and the A-code system merge the byte code and VM code into a single file so that it will load in one go from tape or disk. It is possible to add a loading screen too, I believe.

As both systems implement text compression routines, the VM programs were written in assembly code. It might be possible to write something similar in a compiled language with only the speed critical parts in assembly code, but that’s not something I’ve ever tried.

I’ve heard that it’s much easier to create a VM once you’ve already written one for a computer with the same processor though.

I am a bit confused. Since you want the games to run on modern systems (which is understandable), why would you also want them to run on “tape-based computers”? Is it about the games, the game-producing framework or the “hardware”?
And who’s to gain most in each scenario?

Diomidis Spinellis submitted a minimalist BASIC interpreter to the 1990 Obfuscated C Contest. (I think he won too.) A remarkable aspect of his entry was that it fit in just 1,536 bytes of C source code (with macro-based data compression). After I macro-expanded the obfuscated code, the structure of the interpreter became clear. I gather this is pretty much how all early BASICs worked: a parser intimately intertwined with an interpreter, except this one optimistically proceeds to interpretation the moment the parser disambiguates a command token. Probably many “tiny” BASICs did this too.

In 1997 Michael Somos went one step further. He reverse-engineered and commented the re-expanded C code. Since I can’t upload files to this forum (or at least I don’t know how), I have posted these two files on my web site for your viewing pleasure:
https://4004.com/dds-basic-expanded.c
https://4004.com/lander.bas
Note that there is absolutely no syntax checking and there are no helpful error messages. But the beauty of this code is that it is about as minimal as you can get and still be a real BASIC interpreter. I really like minimal, especially for teaching purposes. Error handling really clutters things up.
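
This isn’t the actual dds-basic code, but the “dispatch as soon as a keyword is recognised” structure boils down to something like this toy C sketch:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Not the real interpreter - just the idea that the "parser" commits
   as soon as it recognises a keyword and interprets the rest of the
   line on the spot. */
static void interpret(char *line)
{
    if (strncmp(line, "PRINT", 5) == 0) {
        /* everything after PRINT is treated as a literal to print */
        printf("%s\n", line + 5 + (line[5] == ' '));
    } else if (strncmp(line, "LET", 3) == 0) {
        /* a real interpreter would evaluate the expression here */
        printf("(would assign:%s)\n", line + 3);
    } else if (strncmp(line, "BYE", 3) == 0) {
        exit(0);
    } else {
        printf("?\n");   /* no helpful error messages, as noted above */
    }
}

int main(void)
{
    char line[128];
    while (fgets(line, sizeof line, stdin) != NULL) {
        line[strcspn(line, "\n")] = '\0';
        interpret(line);
    }
    return 0;
}
```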

P.S. While I wasn’t looking, modern C compilers adopted more error checking than before.
dds-basic-expanded.c needs two small changes to meet modern compiler standards:

  1. Add #include <string.h>
  2. Rename strstr to strstr2

Happy (lunar) landing!
1 Like

@EnthusiastGuy My main objective is to be able to create games for some of the less popular 8-bit computers.

There are four machines that I owned in the past and would like to create text adventures for: the Sharp MZ-700, Dragon 32/64, Enterprise 64/128 and the Amstrad CPC series.

Currently, the only way to do this using the same code is with C, which I dislike the syntax of and which (from what I’ve heard) does not produce very efficient 8-bit executables.

A text adventure creation system that could produce games for more platforms would benefit creators by increasing their potential audience, retro computer enthusiasts by making more software available and text adventure players in general by having more players writing hints and walkthroughs.

A general programming language that could produce software for many platforms would have an even wider impact.

@mc4004 Thanks. That might come in handy.

Some of the BASIC interpreters were tiny by today’s standards, as some machines had 16KB of RAM (or less) to work with. The Sinclair ZX80 had only 1KB of RAM and a 4KB ROM, which they still managed to squeeze a BASIC interpreter into.

2 Likes

Perhaps your focus could be on more classic style BASIC games that don’t require as much RAM as classic text adventures.

Stuff like Mugwump, Hunt the Wumpus … even good old Star Trek don’t rely on lots of text overall.

Or consider some sort of text adventure demake of a demake … a text-like adventure based on Atari 2600 Adventure. Each “screen” could be an 8x8 map, like Star Trek, which is rendered with simple ASCII or something. My point is, it doesn’t require text descriptions … compare 64 bytes for an 8x8 room to 80 bytes for just 2 lines of minimal descriptive text.
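
A quick C sketch of the idea - one room as 64 bytes of tile codes (the values and glyphs are made up), rendered as eight lines of ASCII:

```c
#include <stdio.h>

/* One "screen" as a packed 8x8 map - 64 bytes, no prose needed.
   Tile values here are invented: 0 = floor, 1 = wall, 2 = player. */
static const unsigned char room[64] = {
    1,1,1,1,1,1,1,1,
    1,0,0,0,0,0,0,1,
    1,0,0,0,0,0,0,1,
    1,0,0,2,0,0,0,1,
    1,0,0,0,0,0,0,1,
    1,0,0,0,0,0,0,1,
    1,0,0,0,0,0,0,1,
    1,1,1,1,0,1,1,1,   /* gap in the south wall = exit */
};

static void draw_room(const unsigned char *map)
{
    static const char tiles[] = { '.', '#', '@' };
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++)
            putchar(tiles[map[y * 8 + x]]);
        putchar('\n');
    }
}

int main(void)
{
    draw_room(room);
    return 0;
}
```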

A few years back I wrote a bytecode/VM for the 65816 CPU in assembler - based on the documentation for it and some existing C source code… That took a few months. The bytecode was that produced by the BCPL compiler.

Latterly I re-wrote it in RISC-V assembler - that took a few days, but the time to learn RISC-V was a few weeks while I wrote an emulator for it…

So different processors here, but doing it the 2nd time was very easy. (Partly due to the '816 being a right pig to write efficient code for)

So pick what you might think is the hardest CPU to write the bytecode/vm for and take it from there.

-Gordon

As a personal/vanity project, then I’m all for it. But taking something you’ve spent a lot of time on, and are rightly proud of, to something usable by others is very challenging. Firstly you need to establish a “presence” on the most popular forums for each of these systems, then take them on one at a time and hope to gain enough followers to make it worthwhile - while at the same time not going totally mad over it…

And on the last point, I ended up in a pit of depression trying to maintain what was at the time a very popular piece of open source software - I ended up with people doing the wrong thing and expecting me to support them - I asked for help, got none, closed the project and walked away. Please do not let yourself get overburdened like that.

The CP/M crowd has a good number of followers and many different physical implementations, but it’s the one common denominator in the 8-bit world. In the 6502 world there are dozens - you are in a maze of twisty passages, all different. Then there is the 6809 and, behind them, lurking in the corner are a couple of old beardies supping their real ale over their 6800…

-Gordon

3 Likes

Last year I wrote a Tiny Basic for the 6502. It’s under 4K - how much under depends on the OS/monitor support you can give it. With a monitor providing all the IO it can be as small as 3.4KB. On a deliberately minimal SBC I designed for it, requiring it to provide all the IO (bit-banged serial) and the program save/load code, I have one byte free. One.

That platform has 4K of RAM, 4K of ROM and some GPIO. The CPU is the 6507 - hence the 8KB address space limitation (although I do ‘page’ in the 32KB EEPROM in 4K sections, giving me 7 ‘save’ slots and one for the Basic itself).

It supports some very crude string handling - you can peek/poke bytes and copy strings, but memory management is left up to you.

How’s that for a challenge…

-Gordon

Does anyone have experience with the Amsterdam Compiler Kit (ACK)?

Version 5.5 seemed to support multiple languages (BASIC, Pascal, C, Ada etc.) at the front end and included an interpreter and virtual machine, which created compiled files for multiple CPUs (Z80, 8080, 8086, 6809 etc.). It didn’t work well with modern computers though.

Version 6 seems to be in the process of being rewritten for modern computers. Some of the front end languages and the final compilers are missing as is more current documentation though.

I’m wondering if it might be worth trying to repurpose some of the code from ACK or whether that would require more effort than starting from scratch.

For text adventures in a limited memory space, some form of compression for the text is likely a good idea. Of course, the decompression code itself takes up space, so there’s a trade-off.

For machines with very small RAM, the decompressor is likely to take more space than the memory the compression saves - but those machines are likely too small to run a reasonable text adventure anyway, without external storage being used.

For modern machines with lots of RAM, there’s likely no need to compress the text anyway.

The decompression code, with its memory buffer working space, is likely to take about 1K of RAM - so if you get a compression ratio of 2:1 or better, then it’s worth using if your text adventure contains more than about 3K of text.

The decompression code also slows down the operation of the program - though that may not be a problem with most text adventures.

The interesting thing is how to best implement the compression: many of the standard algorithms are designed to minimise file size, and expect to decompress the whole thing into one large memory block - that’s not what you need for a text adventure: you want to decompress individual strings, or maybe blocks of strings, so you need a mechanism to selectively extract a string at a given ‘address’.
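
One common answer is a string table: the compiler on the modern machine emits an offset for every message, and the runtime seeks to that offset and decompresses just that one string. A toy C sketch - the strings and offsets are invented, and the “decompressor” here is only a stand-in copy:

```c
#include <stdio.h>
#include <string.h>

/* Sketch of "extract string N" without unpacking the whole text block.
   The table holds the offset of each (compressed) string; a real
   decompressor would walk a bit/token stream from that offset. */
static const char text_block[] =
    "You are in a maze of twisty passages, all different.\0"
    "It is pitch black. You are likely to be eaten by a grue.\0"
    "You can't go that way.";

static const unsigned short offsets[] = { 0, 53, 110 };

static void get_string(int n, char *out, size_t outlen)
{
    /* stand-in "decompression": just copy the raw bytes */
    strncpy(out, text_block + offsets[n], outlen - 1);
    out[outlen - 1] = '\0';
}

int main(void)
{
    char buf[128];
    get_string(1, buf, sizeof buf);   /* fetch message 1 on its own */
    printf("%s\n", buf);
    return 0;
}
```

The offset table itself would be produced by the compression step on the modern machine, so the 8-bit side only ever needs the lookup and the decode loop.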

I looked at it last year with a view to writing a back-end to support the existing bytecode engine I have for BCPL - the intention being to give me more languages in my little BCPL operating system without having to write them myself.

I gave up in the end and decided on other routes instead.

I did use the P-System for Pascal in the early 80s on the Apple II though. It worked well, but was very constrained to its own environment/IDE, and it did need a fully-loaded Apple II with 64K of RAM.

-Gordon

While we are discussing the “small is beautiful” aspects of vintage software, I am reminded of how truly minimal the earliest versions of UNIX were. I recommend perusing the source code (e.g. familiar command line utilities), all of which was released by the Computer History Museum in 2019:

@ceptimus Yes, text compression has a whole rabbit hole of its own.

Fortunately, there are at least three systems used in commercial text adventures that have been documented to some extent.

With the system I envision, text compression would be done on a modern machine and the retro systems would only need to be able to decompress it.

I’ve heard that Infocom’s system could compress text to about 70% of its original size and that Level 9’s system was closer to 50%.

Both systems used some form of lookup table or dictionary for common words or phrases.
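
The decode side of a dictionary scheme can be tiny. Here’s a toy C sketch - byte values of 0x80 and above index a table of common words (the table contents are made up, and the real Infocom/Level 9 schemes are more elaborate):

```c
#include <stdio.h>

/* Toy dictionary scheme: bytes >= 0x80 index a table of common words. */
static const char *dict[] = { "You are ", "the", " door", "north" };

static void expand(const unsigned char *src)
{
    for (; *src != '\0'; src++) {
        if (*src >= 0x80)
            fputs(dict[*src - 0x80], stdout);  /* expand a token */
        else
            putchar(*src);                     /* literal character */
    }
}

int main(void)
{
    /* "You are facing the door." packed into 11 bytes instead of 24 */
    static const unsigned char msg[] =
        { 0x80, 'f', 'a', 'c', 'i', 'n', 'g', ' ', 0x81, 0x82, '.', 0 };
    expand(msg);
    putchar('\n');
    return 0;
}
```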

@drogon Did you give up because ACK was inherently unusable, or because it wouldn’t work for your use case?

@mc4004 You might find SymbOS (a Z80 OS) interesting.

I feel that one of the challenges is to match up different systems’ available memory. A game that would fit its story in 128K would definitely have trouble on a 48K system and would either need to rely on fragmentation or be scaled down. I mean, if the purpose is to also make games work universally across a certain number of machines, the lowest-memory one will always drag things down. A high compression ratio would work fine for existing games, but what if someone decides to use the compression to write Robinson Crusoe on a specific machine? It would be very hard to port that to a machine with less RAM without some tricks that would break the genericness of the engine a bit. Unless, maybe, the engine would be prepared for that?
Just shooting off some ideas.

Yes, there would need to be restrictions on how large a game could be for specific machines.

I would hope to be able to include checks during the writing, compilation and execution stages to discourage writers from trying to squeeze a 128KB game into a machine with only 32KB of memory.

A custom IDE would likely be the best way to do that.

Game files will likely need to include some form of header information that the virtual machine can read to determine whether they can be executed on the platform it’s running on.
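
Something along these lines, perhaps - the field names, magic value and checks here are purely hypothetical, just to show the sort of thing the VM could verify before running a game file:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical game-file header - not from any existing system. */
struct game_header {
    char     magic[4];       /* e.g. "ADVG"                        */
    uint8_t  format_version; /* bump when the byte code changes    */
    uint8_t  target_id;      /* which platform build this is for   */
    uint16_t min_ram_kb;     /* smallest machine it will run on    */
    uint16_t story_size;     /* size of the byte code that follows */
};

static int header_ok(const struct game_header *h, uint16_t ram_kb)
{
    if (memcmp(h->magic, "ADVG", 4) != 0) return 0;  /* not one of ours    */
    if (h->format_version > 1)            return 0;  /* too new for this VM */
    if (h->min_ram_kb > ram_kb)           return 0;  /* machine too small   */
    return 1;
}

int main(void)
{
    struct game_header h = { { 'A', 'D', 'V', 'G' }, 1, 0, 48, 0 };
    printf("runs on a 32KB machine? %s\n", header_ok(&h, 32) ? "yes" : "no");
    printf("runs on a 64KB machine? %s\n", header_ok(&h, 64) ? "yes" : "no");
    return 0;
}
```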

1 Like

It was more a time thing. I think it would still be nice to have other languages than BCPL on my little retro system, but the prospect of getting up to speed with the compilers and the back-end bytecode generation was somewhat daunting.

-Gordon

2 Likes