BBC BASIC for loop optimization?

MauryMarkowitz · January 18, 2022, 9:27pm

I seem to recall reading somewhere that BBC BASIC had an optimization for FOR loops, but I cannot find anything about it now.

IIRC, there was a second type of FOR loop (invisible to the user?) that was used if the index and all parameters were integers - or something to that effect?

If my memory is correct, does anyone know when this decision was made? Was it during the tokenizing or at runtime?

EdS · January 18, 2022, 9:40pm

One optimisation springs to mind: nested FOR loops maintain a stack of their iterator machinery. If the NEXT names a variable, the stack is searched to find the matching FOR. But if the NEXT does not name a variable, the innermost FOR is used from the top of the stack, and that’s very slightly quicker.

Indeed, instead of NEXT J,I you can just use NEXT, (with the comma) which looks a little odd.

Play with it here.

MauryMarkowitz · January 18, 2022, 9:51pm

That one I am familiar with - MS introduced this around 1979 in the second version of the 6502 port so it’s in later versions of Commodore and Applesoft as well.

No, the thing I’m thinking of, if it exists, is somewhat more complex.

EdS · January 18, 2022, 10:37pm

Hmm, I see no special handling of FOR or NEXT in this disassembly of BBC Basic 2.

MauryMarkowitz · January 19, 2022, 12:53am

That is a useful book!

The line in question is $B6E4. If the variable is not “real”, it does a direct addition of the integer value inline rather than calling FADD. If it is real it jumps to $B766, which I assume is calling FADD at $A500.

In the following lines, especially $B7F5 I see that it is reading the variable type and forcing the values to integer though &92DD.

So somewhat along the lines that I was thinking of, but not exactly as complex as I recalled. This is all based on the variable type, not reading the entire line to see if the index is “really” an integer as it is in FOR I=1 TO 10, which would call the floating point code.

EdS · January 19, 2022, 8:16am

Oh, well spotted. So, if I understand correctly, the integer path is favoured and inlined. And, I think it’s the case that MS Basic uses floating arithmetic in this case, even for integer variables, in which case Acorn’s approach looks even better.

It’s surely worth noting that Acorn’s BBC Basic was a second implementation after their Atom Basic, and also that it has the luxury of a 16k ROM where earlier Basics had only 8k (or even less.)

Although my first machine (a Compukit UK101) did run an (unlicensed) MS Basic, my greatest familiarity is with Acorn’s BBC Basic, so it’s always seemed natural to me that integer variables have both a storage and a performance advantage. But, in a sense, it is a luxury.

rwiker · January 19, 2022, 12:13pm

It’s probably also relevant that Atom Basic was an Integer Basic, so FOR loops would be integer-based from the start.

I find it really odd that FOR loops in Basic are even allowed to use floating-point variables and values - forcing the use of integers would make most cases faster, with realtively little loss of functionality.

MauryMarkowitz · January 19, 2022, 2:08pm

Well, that’s largely what this does; if the variable is an integer it truncs all the inputs to integers while building the stack frame and then it’s int from then. I do find it interesting that it does the increment inline though instead of calling a routine for that - I haven’t looked but I assume there is a 32-bit int add in there somewhere it could JSR.

I will point out that using a 32-bit int is overkill. I’ve run statistics on many BASIC programs and never found a single int that was larger than 16-bit. This makes sense if you think about it, especially given that machines of the era did lots of peeks and pokes on their 16-bit address spaces. There will be a few examples of larger ints for sure, but they are so rare (like 5% or less of all ints) that using FP for those few is quite reasonable as it will double the performance of every other case.

You can collect these sorts of stats using RetroBASIC: GitHub - maurymarkowitz/RetroBASIC: BASIC interpreter in lex/yacc/c

There are many real-world examples of floats in loops, there are several in the book EdS posted for instance!

I also find it interesting that all the books and materials on the BBC refer to “floating point” but inside the code, the comments are invariably about it being “real”. Same thing of course, but still a little odd to see different terminology.

NoLand · January 19, 2022, 6:42pm

I do not know, if this even applies to BBC BASIC, but there’s a common (but tiny) optimization for zero-based loops, where you use just the decimal point for zero (as in "FORI=.TO"). While it’s parsed as a float, parsing just the decimal point (as the decimal point has to be checked anyway and we thus fall back to the default value as there’s nothing before or beyond) is usually faster than converting characters to numbers, and converting zero from floating point to integer is not much of a task either.

Meaning, this is probably the fasted way to write the usual "10 PRINT "*";:GOTO 10"-loop:
0 FORI=.TO1STEP.:PRINT"*";:NEXT

EdS · January 19, 2022, 7:38pm

Thanks, I learnt something there - it is a valid optimisation in BBC Basic, and it’s a very very small win!

In BBC Basic you can also use
0 REPEAT PRINT"*";:UNTIL FALSE
which can be typed as
0 REP.P."*";:U.0

MauryMarkowitz · January 19, 2022, 8:21pm

Very clever! Of course this only works on BASICs that parse numbers at run time as opposed to edit time.

scruss · January 20, 2022, 9:34pm

You might be thinking of Richard Russell’s Z80 BBC BASIC interpreter, which he developed in parallel with Acorn’s while he was working at the BBC. Richard’s interpreter uses an aggregate type for numeric variables: integers benefit from faster routines, while floats use their own slower code.

NoLand · January 21, 2022, 9:06am

We can take this even a step further, as the “.” trick also works for line numbers targets:

0 REP.P."*";:U..

P.S.:
I’m not so sure, if “REPEAT…UNTIL” is faster than “FOR…NEXT”. Since “FOR” may occur anywhere in a line and is not necessarily the first statement, the interpreter has to memorize the actual RAM location of the loop, thus providing immediate access for the next iteration, whereas “UNTIL” has a line number target, meaning, the interpreter has to engage in a line number search. (I guess, it depends on the implementation and where this appears in the entirety of the BASIC script.)
This should be especially true for a bare “NEXT” without an identifier, where the interpreter just picks the current loop from the stack.

MauryMarkowitz · January 21, 2022, 1:36pm

Interesting! This was the main reason Atari BASIC was so slow, it did this for FOR loops so it was searching through the program on every NEXT. Bad programmer, no donut!

I wonder why they did it this way for UNTIL?

Note that UNTIL will also naturally be slower because it’s not simply calling some internal ADD function on the index variable, it has to perform the formula provided by the user using the evaluator because it might be something like UNTIL (X-Y)/2 > 5

NoLand · January 21, 2022, 2:20pm

My guess: REPEAT (unlike NEXT) has no associated identifier. How would you identify the corresponding start of the loop?

(Well, you could pick it just like with the bare NEXT, but isn’t the general way of BASIC. Also, you probably would have to store the condition separately in order to have it work with a stack, like FOR…NEXT. Personally, I think, it could be done just by a simple stack with pointers to REPEAT statements in RAM and UNTIL evaluated in-line, but the FOR…NEXT-stack was already kind of an extreme outlier for BASIC. Since GOTO and GOSUB are already based on line numbers, it’s somewhat logical to do the same with UNTIL.)

MauryMarkowitz · January 21, 2022, 2:43pm

Well that piqued my interest so I had a look in that great reference you posted. Looking at address $BBEA it appears it does indeed push the address on the stack, not the line number. And the UNTIL handler at $BBD3 grabs that address and then jumps to the end of the GOTO handler at the point after it’s done a line lookup. So it looks like it does not use the line number and thus works almost exactly like a NEXT that doesn’t have a variable, in that it simply takes the first REPEAT it finds.

NoLand · January 21, 2022, 4:28pm

That respect is duly deserved by @EdS – which also explains why I’m not at the exact apex of Beeb-related knowledge. (Personally, I’m more familiar with Commodore BASIC, which lacks such commodities like REPEAT…UNTIL.)
Thanks for having a closer look into this!

EdS · January 21, 2022, 4:31pm

Oh no, it doesn’t! It takes a boolean expression as argument, and as noted, takes execution back to the REPEAT. In my example, 0 was standing in for FALSE, which can be typed as FA. As you note, a bare . can stand in for 0. But I did a speed test, and I think I found that FALSE is fastest to execute. 0 is faster than a bare point. I think.

NoLand · January 21, 2022, 4:36pm

Well, it seems that I’ve exposed sufficient incompetence on the matter. Sorry for the irritation.
(Thankfully, the Internet was purposefully invented so that we may show off our very stupidity. )

NoLand · January 25, 2022, 3:18pm

Here’s another interesting detail regarding the performance of FOR…NEXT loops, but this won’t be an issue with the BBC Micro, as it affects only systems which use floating point math for the loop, like Commodore BASIC. There’s a non-linear time factor involved, as there’s a difference in run time for any iterations, where the mantissa isn’t simply incremented, but has to be shifted, as well, when the value approaches another power of two.

This turned up in a recent video by “8-Bit Show and Tell”, where he explored these runtime issues.
The effect can bee seen at about 9:08 into this video:

(Well, it’s a Commodore 64, I apologize to any BBC Micro fans in advance.)