One optimisation springs to mind: nested FOR loops maintain a stack of their iterator machinery. If the NEXT names a variable, the stack is searched to find the matching FOR. But if the NEXT does not name a variable, the innermost FOR is used from the top of the stack, and that’s very slightly quicker.
Indeed, instead of NEXT J,I you can just use NEXT, (with the comma) which looks a little odd.
The line in question is $B6E4. If the variable is not “real”, it does a direct addition of the integer value inline rather than calling FADD. If it is real it jumps to $B766, which I assume is calling FADD at $A500.
In the following lines, especially $B7F5 I see that it is reading the variable type and forcing the values to integer though &92DD.
So somewhat along the lines that I was thinking of, but not exactly as complex as I recalled. This is all based on the variable type, not reading the entire line to see if the index is “really” an integer as it is in FOR I=1 TO 10, which would call the floating point code.
Oh, well spotted. So, if I understand correctly, the integer path is favoured and inlined. And, I think it’s the case that MS Basic uses floating arithmetic in this case, even for integer variables, in which case Acorn’s approach looks even better.
It’s surely worth noting that Acorn’s BBC Basic was a second implementation after their Atom Basic, and also that it has the luxury of a 16k ROM where earlier Basics had only 8k (or even less.)
Although my first machine (a Compukit UK101) did run an (unlicensed) MS Basic, my greatest familiarity is with Acorn’s BBC Basic, so it’s always seemed natural to me that integer variables have both a storage and a performance advantage. But, in a sense, it is a luxury.
It’s probably also relevant that Atom Basic was an Integer Basic, so FOR loops would be integer-based from the start.
I find it really odd that FOR loops in Basic are even allowed to use floating-point variables and values - forcing the use of integers would make most cases faster, with realtively little loss of functionality.
Well, that’s largely what this does; if the variable is an integer it truncs all the inputs to integers while building the stack frame and then it’s int from then. I do find it interesting that it does the increment inline though instead of calling a routine for that - I haven’t looked but I assume there is a 32-bit int add in there somewhere it could JSR.
I will point out that using a 32-bit int is overkill. I’ve run statistics on many BASIC programs and never found a single int that was larger than 16-bit. This makes sense if you think about it, especially given that machines of the era did lots of peeks and pokes on their 16-bit address spaces. There will be a few examples of larger ints for sure, but they are so rare (like 5% or less of all ints) that using FP for those few is quite reasonable as it will double the performance of every other case.
There are many real-world examples of floats in loops, there are several in the book EdS posted for instance!
I also find it interesting that all the books and materials on the BBC refer to “floating point” but inside the code, the comments are invariably about it being “real”. Same thing of course, but still a little odd to see different terminology.
I do not know, if this even applies to BBC BASIC, but there’s a common (but tiny) optimization for zero-based loops, where you use just the decimal point for zero (as in "FORI=.TO"). While it’s parsed as a float, parsing just the decimal point (as the decimal point has to be checked anyway and we thus fall back to the default value as there’s nothing before or beyond) is usually faster than converting characters to numbers, and converting zero from floating point to integer is not much of a task either.
Meaning, this is probably the fasted way to write the usual "10 PRINT "*";:GOTO 10"-loop:
You might be thinking of Richard Russell’s Z80 BBC BASIC interpreter, which he developed in parallel with Acorn’s while he was working at the BBC. Richard’s interpreter uses an aggregate type for numeric variables: integers benefit from faster routines, while floats use their own slower code.
We can take this even a step further, as the “.” trick also works for line numbers targets:
I’m not so sure, if “REPEAT…UNTIL” is faster than “FOR…NEXT”. Since “FOR” may occur anywhere in a line and is not necessarily the first statement, the interpreter has to memorize the actual RAM location of the loop, thus providing immediate access for the next iteration, whereas “UNTIL” has a line number target, meaning, the interpreter has to engage in a line number search. (I guess, it depends on the implementation and where this appears in the entirety of the BASIC script.)
This should be especially true for a bare “NEXT” without an identifier, where the interpreter just picks the current loop from the stack.
Interesting! This was the main reason Atari BASIC was so slow, it did this for FOR loops so it was searching through the program on every NEXT. Bad programmer, no donut!
I wonder why they did it this way for UNTIL?
Note that UNTIL will also naturally be slower because it’s not simply calling some internal ADD function on the index variable, it has to perform the formula provided by the user using the evaluator because it might be something like UNTIL (X-Y)/2 > 5
My guess: REPEAT (unlike NEXT) has no associated identifier. How would you identify the corresponding start of the loop?
(Well, you could pick it just like with the bare NEXT, but isn’t the general way of BASIC. Also, you probably would have to store the condition separately in order to have it work with a stack, like FOR…NEXT. Personally, I think, it could be done just by a simple stack with pointers to REPEAT statements in RAM and UNTIL evaluated in-line, but the FOR…NEXT-stack was already kind of an extreme outlier for BASIC. Since GOTO and GOSUB are already based on line numbers, it’s somewhat logical to do the same with UNTIL.)
Well that piqued my interest so I had a look in that great reference you posted. Looking at address $BBEA it appears it does indeed push the address on the stack, not the line number. And the UNTIL handler at $BBD3 grabs that address and then jumps to the end of the GOTO handler at the point after it’s done a line lookup. So it looks like it does not use the line number and thus works almost exactly like a NEXT that doesn’t have a variable, in that it simply takes the first REPEAT it finds.
That respect is duly deserved by @EdS – which also explains why I’m not at the exact apex of Beeb-related knowledge. (Personally, I’m more familiar with Commodore BASIC, which lacks such commodities like REPEAT…UNTIL.)
Thanks for having a closer look into this!
Oh no, it doesn’t! It takes a boolean expression as argument, and as noted, takes execution back to the REPEAT. In my example, 0 was standing in for FALSE, which can be typed as FA. As you note, a bare . can stand in for 0. But I did a speed test, and I think I found that FALSE is fastest to execute. 0 is faster than a bare point. I think.
Here’s another interesting detail regarding the performance of FOR…NEXT loops, but this won’t be an issue with the BBC Micro, as it affects only systems which use floating point math for the loop, like Commodore BASIC. There’s a non-linear time factor involved, as there’s a difference in run time for any iterations, where the mantissa isn’t simply incremented, but has to be shifted, as well, when the value approaches another power of two.
This turned up in a recent video by “8-Bit Show and Tell”, where he explored these runtime issues.
The effect can bee seen at about 9:08 into this video:
(Well, it’s a Commodore 64, I apologize to any BBC Micro fans in advance.)