8-bit BASIC virtual machine?

Thanks. It does look a little daunting, but seems to be the most complete system of its type.

The maintainer has just enabled the discussion feature on the ACK GitHub repo, so I’ll likely be asking lots of questions there.

Switching gears to the original idea … instead of trying to come up with a compact interpreter for each target system, how about a cross-compiling system that outputs native BASIC code for each target system?

The source code could have long variable names and labels rather than line numbers. The translator would then convert the long names into one- or two-letter variable names in the output code, and turn the labels into line numbers.

The point is - you take advantage of the target system’s own BASIC ROM code, rather than trying to squeeze in your own BASIC interpreter. Since a lot of the target systems have limited RAM, this could be a good thing?
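To make the idea concrete, here's a rough sketch in Python of what the renaming pass might look like (the convention that long source names are lowercase, and the reserved-word list, are my own assumptions, not part of any real tool):

```python
import re

def short_names():
    """Generate valid 1-2 character BASIC names: A..Z, then A0..AZ, B0..."""
    letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    yield from letters
    for a in letters:
        for b in letters + "0123456789":
            yield a + b

def rename_variables(source_lines, reserved=("IF", "TO", "OR", "ON", "FN")):
    """Map each long (lowercase) variable name to the next free short name."""
    gen = (n for n in short_names() if n not in reserved)
    mapping = {}
    out = []
    for line in source_lines:
        def repl(m):
            name = m.group(0)
            if name not in mapping:
                mapping[name] = next(gen)
            return mapping[name]
        # Assumed convention: long names are 3+ chars, lowercase in the source
        out.append(re.sub(r"\b[a-z_][a-z0-9_]{2,}\b", repl, line))
    return out, mapping
```

So `player_score = player_score + bonus` would come out as `A = A + B`, ready for the target machine's own BASIC.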


Maybe I misunderstood, but isn’t Level 9 producing assembly code?

@IsaacKuo Unfortunately, there were almost as many different dialects of BASIC as there were machines.

ANSI standards were eventually proposed, but there’s only one 8-bit computer I know of that followed them (the Enterprise 64).

A virtual machine is usually faster and smaller than an interpreter.

An interpreter has to read code that a human has written, then convert it (line by line) to the machine code it understands.

A virtual machine only has to read byte code, which has already been compiled to a form nearer to machine code.

I don’t know of any speed tests that have been done, but most sources I’ve read say that a virtual machine is usually at least three times faster than an interpreter.
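The difference is easy to see in miniature. Here's a toy stack-based VM in Python (the opcodes are invented for illustration): each dispatch is a compare on a small integer, where a source interpreter would be re-scanning and re-parsing text on every pass through a loop.

```python
# Hypothetical 3-opcode stack machine, purely for illustration.
PUSH, ADD, MUL = 0, 1, 2

def run(bytecode):
    """Execute byte code: each opcode is one small integer, no parsing needed."""
    stack, pc = [], 0
    while pc < len(bytecode):
        op = bytecode[pc]; pc += 1
        if op == PUSH:
            stack.append(bytecode[pc]); pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop(); stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop(); stack.append(a * b)
    return stack.pop()

# (2 + 3) * 4, already "compiled" - the VM never sees the source text
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL]
```

All the expensive work (tokenising, parsing, operator precedence) happened once at compile time instead of on every execution.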

@EnthusiastGuy Like Infocom, Level 9 wrote a new virtual machine (called the A-machine) for each new platform. This was written in assembly code and assembled into machine code.

They also had a human readable language called A-code that they compiled to byte code using their own tools. This byte code could run without changes on any suitable A-machine.

There were 4 versions of the A-machine created over the years, which slightly complicated matters, but they had checks in place to prevent the wrong type being used with a particular byte code file.

What might be causing your confusion is that the byte code and the A-machine were often merged into one file so that they could both be loaded from tape together.

There’s a tool called L9Cut that can separate the byte code data from the A-machine code, in the Level 9 tools section of the IF Archive.


If RAM is no object, then sure - a P-code interpreter can be faster than a typical BASIC interpreter. But you’ve got to deal with a general trade-off between interpreter compactness vs speed.

If you want the best speed, though, the best thing will be a compiler that creates actual machine code. The target code will probably be pretty compact also, since it wastes no space on an interpreter.

Hmm … for the best combo of speed and compactness, you actually want a pretty complex variant of BASIC - something with 8-bit signed byte integer type in addition to 16-bit and possibly 32-bit. This makes the compilers bigger, of course, but they’ll only be running on a modern computer with gigabytes of RAM.

Usually faster…

And there is an intermediate - the tokenised code - which is what most BASICs actually are/were on the old 8-bit systems. Keywords would be tokenised at line entry time, with everything else left as text - mostly to save space, and a nice bonus was that it was faster, as the tokens were typically single bytes, so you could use them like a sort of very long instruction word type thing.
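A minimal sketch of that line-entry crunching, in Python (the token values and keyword table here are made up, but the scheme matches what most 8-bit BASICs did):

```python
# Hypothetical token table: keywords become single bytes >= 0x80,
# everything else is stored as plain ASCII.
KEYWORDS = {"PRINT": 0x80, "FOR": 0x81, "NEXT": 0x82, "GOTO": 0x83, "IF": 0x84}

def tokenise(line):
    """Crunch a source line at entry time: keywords -> one-byte tokens."""
    out = bytearray()
    i = 0
    while i < len(line):
        for word, tok in KEYWORDS.items():
            if line.startswith(word, i):
                out.append(tok)
                i += len(word)
                break
        else:
            out.append(ord(line[i]))
            i += 1
    return bytes(out)
```

`PRINT A` (7 bytes of text) crunches to 3 bytes, and at run time the interpreter dispatches on the 0x80 token instead of string-matching "PRINT" again.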

My own BASIC took it a step further and tokenised everything. Strings, numbers, comments, etc. all got a token with the resulting data looking like a 16-bit virtual machine. What I didn’t do was any expression evaluation which is what a real compiler would do along with all the many optimisations that could be done at that point - still it’s quite fast enough - but Linux and baremetal Pi only - for now.

I did some benchmarks of the fastest 8-bit BASIC I had (BBC Basic 4) vs. my 32-bit BCPL VM on the same hardware - and yes, compiled BCPL bytecode was faster - but not as fast as I feel it could be. There are issues though - the Basics were operating in a pure 8-bit environment and the BCPL bytecode VM is operating in a kludged 32-bit space on an almost 16-bit CPU with an 8-bit memory interface: the 65C816.

https://projects.drogon.net/retro-basic-and-bcpl-benchmarks/

Way back there were 2 other systems I used where compiling into a bytecode VM was very much 2-5 times faster - the UCSD P-System on the Apple II (I wrote a whole load of “computer aided learning” stuff on that system), and BCPL (a 16-bit version) on the BBC Micro. The advantages of being able to code in something other than the “native” Basics on those systems were manifold.

-Gordon

I know I keep banging the BCPL drum - but the BCPL compiler can actually output a few different types of bytecode - one is designed to be easily subsequently translated to a native architecture - you lose the benefit of small & compact, but gain speed…

I’ve seen Basics like that - they get ugly (IMO) and you have to start using additional ways to specify the variable type, which just isn’t Basic (again, IMO).

-Gordon


Well, I take inspiration from BASIC V2 (C64), since I liked using “Blitz”. Blitz was a P-code interpreter rather than a compiler, but here I’m just talking variables.

Even though Blitz was compatible with normal BASIC V2 code, it did integer math more quickly because it translated into integer math P-codes when possible. So, you wanted to use integer variables when possible (using “%”), which is the opposite of what you normally want to do. Normal BASIC V2 always did float math, so using integer variables always slowed down your code. You only used integer variables if you wanted to store a big array - 2 bytes per int vs 5 bytes per float.

Anyway, the BASIC V2 way of doing things was:

A$ - string
A - float
A% - (16 bit) int

I’d maybe extend this to 8-bit and 32-bit with:

A# - byte integer
A& - 32 bit long integer

So, it’s still like BASIC V2 in spirit … just a bit extra stuff.
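A sketch of how a compiler might classify variables under that extended suffix scheme (Python, purely illustrative; the sizes assume C64-style 5-byte floats):

```python
# Hypothetical extension of the BASIC V2 suffix scheme:
# $ string, (none) float, % 16-bit int, # 8-bit int, & 32-bit int.
SUFFIX_TYPES = {
    "$": ("string", None),   # variable size
    "%": ("int16", 2),
    "#": ("int8", 1),
    "&": ("int32", 4),
    "":  ("float", 5),       # C64-style 5-byte floats
}

def variable_type(name):
    """Classify a variable by its suffix character, BASIC V2 style."""
    suffix = name[-1] if name[-1] in "$%#&" else ""
    return SUFFIX_TYPES[suffix]
```

An array of 1000 byte integers (`A#`) would then cost 1K instead of the 5K a float array takes, which matters a lot on a 64K machine.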

I’m relying on a fading memory, but I think that TRS-80 Level 2 BASIC used the suffixes # for double, ! for single, % for integer and $ for string. You could even use DEF to alter the default single behavior for no suffix, e.g. DEFSTR A:A="HELLO"

@IsaacKuo A virtual machine can be both smaller and faster than an interpreter because it does less work (as there is less translating to do).

… unless the virtual machine is the interpreter, which is what I believe was going on with Tiny BASIC way back in the day, i.e. the interpreter was implemented in virtual machine language, rather than having a compiler translate the user’s program into virtual machine language. :upside_down_face: The former is for ease of porting, and the latter is for improved performance.


Indeed. The “Intermediate Language”.

I wrote a “classic” TinyBasic recently too - based on existing works, but it’s line after line of string compare, in what’s essentially a large if/then sequence. The “interpreter” can call itself recursively so that it can handle operator precedences.

Example:

; ilStatement:
;       Program statements - interactive or command

ilStatement:    tStr    tryLet,         "LET"           ; Optional LET
tryLet:         tVar    tryPokeByte                     ; If not a variable then carry on
                tChar   synErr,         '='             ; ... else variable assignment.
                call    relExp
                do      storeV,stmtDone,stmtNext

It’s ‘test’, go to here if fail, else fall to next statement…

These are macros which are expanded at assembly time, then interpreted by the simplest of execution engines. Here, there are only 4 instructions - tStr (test for a string), tVar (test for a variable), tChar is a special tStr and call. Otherwise it defaults to ‘do’ which means call this assembly language code… In theory you can use the same IL code on different architectures. Just write the run-time…
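For illustration, here's my reading of that test-and-branch scheme as a Python analogue (simplified, and not the actual implementation - in this sketch a failed variable test goes straight to a syntax error rather than on to the next statement type):

```python
def t_str(word):
    """IL test: match a literal keyword, consuming it on success."""
    def test(state):
        if state["src"].startswith(word, state["pos"]):
            state["pos"] += len(word)
            return True
        return False
    return test

def t_var(state):
    """IL test: match a single-letter variable."""
    src, pos = state["src"], state["pos"]
    if pos < len(src) and src[pos].isalpha():
        state["var"] = src[pos]
        state["pos"] += 1
        return True
    return False

def run_il(program, labels, state):
    """Tiny engine: run each test; fall through on success, branch on failure."""
    pc = 0
    while pc < len(program):
        test, fail = program[pc]
        if test(state):
            pc += 1                  # fall through to next IL step
        elif fail is None:
            raise SyntaxError(state["src"])
        else:
            pc = labels[fail]        # branch to fail label
    return state

# The ilStatement fragment, transcribed: optional LET, then variable, then '='
program = [
    (t_str("LET"), "tryLet"),   # tStr  tryLet, "LET"  - optional LET
    (t_var,        None),       # tryLet: tVar         - simplified fail path
    (t_str("="),   None),       # tChar synErr, '='
]
labels = {"tryLet": 1}
```

Both `LETA=` and `A=` parse the same way: the LET test either consumes the keyword or harmlessly branches to the very next step.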

-Gordon

Another advantage to a virtual machine is that you can use different programming languages to create the necessary byte code.

For example, there are several different interactive fiction languages that can be compiled to Z-code for the Z-machine.

AFAIK, nobody has written an interpreter that can understand more than one programming language.

That was my theory when I looked at TACK with a view to making it output the same bytecode I’m using for BCPL…

But at the same time, there are many 100s of compiled languages that output e.g. x86 code, or ARM … You just need to be good at writing a back-end to these compilers (if/where possible) to generate your “ideal” bytecode.

The BCPL CINTCODE (Compact Intermediate Code - its primary bytecode output format) is fairly ideal if you’re writing BCPL - I could probably get Pascal, C and FORTRAN to run under it - if I had the time, but my skill-set isn’t really compiler writing.

-Gordon

Scott Adams tackled it a different way initially. The game is a sort of bytecode which is expressed in game terms (things like “PRESENT AXE”) and the interpreter for it is written in BASIC and that can then be tweaked by platform. It’s very compact providing you want to do what it supports.

Fairly early on Scott switched to asm but it kept some of the oddities of the BASIC version and the way it organized data.

Of the small adventure bytecode engines the Level 9 is by far the neatest and supports arrays and stuff as a generic low level code. It’s also squashable into about 4K. The Infocom one is a lot heavier. The Quill/Paws style ones are adventure game level code derived from the Scott Adams concept so much like Scott’s stuff.

A different problem for games though is this. The virtual machine engine can be smaller than a BASIC interpreter. However the BASIC interpreter is free in ROM and already there.

Not sure BCPL is terribly useful unless you want another layer of retro. Plus it’s non-commercial-only so you’d be in a funny spot if you wanted to sell a few copies of the game for fun.

I do have a C bytecode interpreter, as I’ve been trying to get an 1802 running ANSI-ish C but even then you have the same memory cost of interpreter.

In the old days people did write BASIC that was ‘portable’ often with macro substitutions for those with a strong dialect (Spectrum strings for example).


I was thinking of approaching the problem from a different direction.

TACK already has several languages on the front end with an interpreter and virtual machine which it uses to compile and assemble to machine code.

I was thinking of using its existing byte code tools and building standalone virtual machines that could read and run TACK byte code.

I would then have the option of compiling to machine code or byte code, depending on which was the best compromise between speed and the memory needed.

@EtchedPixels If Level 9’s tools were available, I would definitely try them out.

The Austin brothers are hoping to release them at some point, but cannot say when that might be. It’s also not clear how much of the code has survived.

Are you sure about the non-commercial bit? Recent versions of the compiler do have a copyright notice on them, but there is nothing else in any of the source distributions that I’ve found to say no commercial use allowed.

MR’s Raspberry Pi version also appears to be free of restrictions.

Many moons back I do remember buying the BCPL Stand Alone Generator for the BBC Micro (ended up not using it though, as it was easier to have the ROM in all the Beebs we had and boot the code into them via Econet)

I really can’t find anything to say otherwise and the closest I’ve found is in a document dated 2013:

If you like BCPL, you may also be interested in Cintpos, MCPL, VSPL, Bench
Cobench and Tcobench that are also freely available via my home page.
I particularly recommend Cintpos under Linux.

And I know that that doesn’t mean it’s free as in beer, or even open source.

Maybe I ought to ask MR about it while I still can… He’s almost 84 now.

-Gordon


Older versions have a no commercial use, newer ones don’t appear to have any permission to use them at all.

Might be good to get that clarified and see if it and tripos can be clearly freed


I’m scratching my head a bit from your description, but I think I understand what you’re getting at. You want to enable people to write text adventures in Basic on modern platforms, to run those adventures both on modern platforms and on retro systems, and to keep them compatible with loading from tape.

You’re thinking of creating some standard versions of Basic, with some limited capabilities, to maintain cross-platform compatibility.

Do I have that right?

The Basic/VM idea is certainly doable.

One example I’ve seen, being from the Atari world, is basicParser (at Serious Computerist), which allows you to write Turbo Basic code (a third-party Basic) in the editor of your choice on a modern system, and compile it to bytecode that runs in Turbo Basic on 64K Atari XL/XE computers.

Re. loading from tape

As @drogon said, that’s doable, too, but you’re going to be limited in what you can do in the adventure.

You could make it so it would load a data file off tape, but you’d really only want to do that if you wanted to make it possible for others to write their own adventures. Even so, the memory limitation would still apply, because there is no random access with tapes. There is only “Moving forward. Forgetting the past (for the most part).”

As drogon said, what I’ve seen with retro computerists is they mod their systems to use modern storage devices, like SD cards, where they access disk images. (I’m not sure how they would load virtual tapes, other than to transcribe them to actual tapes, and use an old cassette drive/tape recorder.)

It seems to me the hardest part would be creating the virtual tape files, since this means creating sound files compatible with the signal characteristics that the target systems use.

Just a suggestion, but an approach I would take, since it seems your real goal is to make a text adventure authoring platform, would be to skip the idea of using Basic, since that’s a much more general platform, and use the Z-Machine as a model for what you want to do. I’m not saying “just use Z-Machine.” What I mean is look it over, and mine it for ideas for the VM you want, and leave aside features you don’t want. You could then create your own programming language modeled on the VM you’ve created. I think this would make more efficient use of space, would be less work to create, and might perform better than Basic for this project.

Re. 8-bit systems that ran P-code (Pascal P-system)

The P-system was ported to the Apple II+/IIe (as Apple Pascal), and the TI-99/4A (which I know technically was a 16-bit system, but it was a screwy architecture that tended to run slower, since it had an 8-bit bus).

It only ran on disk.

TI had a P-System card you could add to the Peripheral Expansion Box.

The Apple version required the Language card expansion (which I think added an extra 16K of memory(?)).

The main thing that was unusual about the P-system is since it was actually an OS, it had its own disk format.

It was intended to be portable across platforms (including the disk format), but the “dream” didn’t work out that well, since the only portable part of the standard had to do with text and I/O, not graphics, sound, or anything else one might want to implement.

From that perspective, it would work for writing portable text adventures :slight_smile: , but I understand you want something that works with tape systems.


I think that all of the intended target systems are popular enough to already have utilities to translate binary files into audio. This audio can then be recorded to a tape, via a normal tape recorder, or even directly played back to the tape input of the target computer. So usually, no extra hardware is required - just an audio patch cable between your modern device (such as a laptop or phone) and the target classic computer.

An exception would be Commodore computers, which used purpose built tape drives rather than letting the end user use any tape player. In that case, a direct audio patch cable connection isn’t possible; the easiest thing is to record onto an actual tape using a normal tape recorder.

So, multi-load is possible, and it’s even possible to do a sort of “random access” multi-load. Random access is possible in a couple ways:

  1. The user can be instructed to rewind the tape to the start, and press play from the start of the tape. The computer program can then wait until the tape is playing back the desired file to load it. Obviously, this involves wait times.

  2. The user can be instructed to play a specific tape or CD track. With a direct audio connection, the user can play a specific audio file. This eliminates wait times, but of course there’s some manual user effort involved.

How bad would the wait times be? Well, if we assume a fast loader of, say, 3600bps, that’s 450 bytes per second. A 128K file would play in under five minutes. That’s not necessarily a bad thing, but I think a game player would generally rather spend a little effort rather than wait that long.
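The arithmetic: 3600 bits per second is 450 bytes per second, so a quick helper (Python, assuming 8 data bits per byte and ignoring inter-block gaps and leader tones) gives:

```python
def load_seconds(size_bytes, bits_per_second=3600):
    """Seconds to stream size_bytes at the given tape data rate."""
    return size_bytes * 8 / bits_per_second
```

`load_seconds(128 * 1024)` comes to roughly 291 seconds, just under five minutes, while an 8K file takes about 18 seconds.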

Using the second approach, the player spends a little effort touching the specified playback file on their phone or laptop, and then waits perhaps 20 seconds for that 8K data file to load.