Regarding contending memory access for screen and programs, this is found on other systems, too.
E.g., on the PET (yes, I know, sorry for coming back to this again, it’s just something I’m familiar with), the first models without a CRTC chip printed only during V-BLANK, in order to avoid “snow” on the screen resulting from clashing memory accesses. But this doesn’t have the same impact on overall performance.
Is it just that Sinclair didn’t limit this to any contended areas of the address space?
*) The resulting slowdown is actually remarkable. Compare this ML implementation of the famous “10 PRINT” maze, once using the print routine of BASIC 2, and once that of BASIC 4, which eventually dropped that throttle, since the new PETs came with the CRTC chip:
BASIC 2 (not much faster than BASIC; all the speed advantage is consumed by having to wait for V-BLANK before putting anything on the screen): https://masswerk.at/pet/?run=asm-maze-bin&rom=2
BASIC 4 (several times faster): https://masswerk.at/pet/?run=asm-maze-bin&rom=4
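To get a rough feel for why gating screen writes to V-BLANK eats up the speed advantage, here is a toy model in Python. It is only a sketch: the function name `total_cycles`, the frame length, the length of the blanking window and the per-character work are all made-up illustrative values, not measured PET timings, and the real ROM routine may well behave differently in detail.

```python
def total_cycles(n_chars, work_cycles, gated,
                 frame_cycles=16_667, blank_cycles=3_333):
    """Toy model: cycles needed to put n_chars on the screen.

    All numbers are illustrative assumptions, not measured PET timings.
    The blanking window is assumed to sit at the start of each frame
    (phase 0 .. blank_cycles - 1).
    """
    t = 0
    for _ in range(n_chars):
        t += work_cycles                   # compute the next character
        if gated:
            phase = t % frame_cycles       # position within the current frame
            if phase >= blank_cycles:      # outside the blanking window
                t += frame_cycles - phase  # busy-wait for the next V-BLANK
        t += 1                             # the screen write itself
    return t


if __name__ == "__main__":
    free = total_cycles(1000, 100, gated=False)
    throttled = total_cycles(1000, 100, gated=True)
    print(f"ungated: {free} cycles, gated: {throttled} cycles, "
          f"ratio: {throttled / free:.1f}x")
```

In this model the code can only make progress while the beam is in the blanking window, so however fast the character generation is, the throughput is capped by the fraction of each frame in which writes are allowed.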
Hum. If you have a look at MS BASIC on the 6502, PHA and PLA are all over the place, and, where not, crucial data is temporarily stored in zero page locations. I still think a Z80 implementation could find some speed advantage by strategically assigning registers.
Edit: As far as I know, the Acorn Electron had the same kind of ULA / CPU clash when it came to accessing memory, and Acorn missed out on limiting this to the address ranges that were actually used for screen memory. So this would be another example that is actually due to the hardware implementation and not the software. (There’s a mod that limits this to where it’s actually needed, and apparently effects a considerable speed-up.)
But if this were the same on the Sinclairs, it should affect any kind of code and not just BASIC.