Going Back to EDASM, the 1980 Apple II Editor/Assembler

That was a great read - thanks!

Do not underestimate how slow these machines were. I/O crushes these things.

Even screen I/O. A 19200 baud “printer” is particularly fast.

Now, mind, back in the day, while a printer might support 19200 baud, fat chance of it actually printing at that speed. But your computer can drain a 19200 baud port and wait for more: 1920 bytes/sec. That's probably faster than the local disk drives were. (I don't know about the Apple, but the Atari disk was very slow – 600 bytes/sec – and I know the C64 was no speed demon.) The redeeming qualities of old-school floppies were a) random access, b) faster than tape. "Faster than tape" is a low bar to clear.
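A quick sketch of the arithmetic, assuming standard 8-N-1 framing (10 bits on the wire per byte); the 600 bytes/sec Atari figure is the one quoted above:

```python
# Rough serial-vs-disk throughput comparison from the figures above.
# Assumes 8-N-1 framing: 10 bits on the wire per byte of data.
def baud_to_bytes_per_sec(baud, bits_per_byte=10):
    return baud // bits_per_byte

printer = baud_to_bytes_per_sec(19200)   # bytes/sec the port can deliver
atari_disk = 600                         # quoted bytes/sec for the Atari drive

print(printer, atari_disk, printer / atari_disk)
```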

Now, you’d think that a memory mapped screen – how slow can that be? Sure, when you scroll 40x24, roughly 1K, that’s a simple block move, shifting the screen buffer by 40 bytes.

But if you only have, say, 20-character lines, that's a lot of scrolling. Every 20 bytes printed, you eat a 1K buffer move, and now you're really starting to eat bandwidth. A 100-line listing, nominally 2000 bytes of data, turns into something like 100K of screen moves.
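A back-of-the-envelope sketch of that cost (assuming the screen shifts the visible buffer up one row for every line printed past the first screenful; the exact constants are illustrative):

```python
# Scrolling cost on a 40x24 memory-mapped text screen.
COLS, ROWS = 40, 24

listing_lines = 100
avg_line_len = 20                   # say, 20-character lines
data_bytes = listing_lines * avg_line_len

# Each line printed past the first screenful shifts the buffer
# up one row, moving (ROWS - 1) * COLS bytes.
scrolls = max(0, listing_lines - ROWS)
moved_bytes = scrolls * (ROWS - 1) * COLS

print(f"{data_bytes} bytes of text -> ~{moved_bytes} bytes of screen moves")
```

Same ballpark as the figure above: tens of kilobytes shuffled around to display two thousand bytes of listing.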

Vs the printer, where it’s simply pouring data out the port. It’s a lot less copying.

But it’s nice to see folks working on real hardware.

Nowhere near. The Atari 8-bitters and the Commodore VIC-20/C64/etc. were unusual amongst late-70s/early-80s microcomputers in having a slow serial interface between the drives and the computer, which is what made them so slow. The Apple's drive controller is connected directly to the system data bus and thus doesn't suffer from such limitations. It has no problem reading a full track (4 KB) in a couple of revolutions: at 300 RPM, that's 4 KB in about 2/5 of a second, or roughly 10 KB/sec. (Actually, it could even read a track in one revolution if programmed correctly, though DOS and ProDOS didn't do this, for reasons unrelated to the disk hardware itself. In fact, any drive interface must be able to write a track in a single revolution, or you wouldn't be able to format disks!) The main limit on larger reads was the head seek time.
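The track-read figure checks out, assuming the standard 300 RPM spindle speed and DOS 3.3's 16 sectors of 256 bytes per track:

```python
# Sanity check on the Apple full-track read rate.
# Assumes a 5.25" drive at 300 RPM and 4 KB of data per track
# (DOS 3.3: 16 sectors x 256 bytes).
RPM = 300
rev_sec = 60 / RPM                  # 0.2 s per revolution

track_bytes = 16 * 256              # 4096 bytes
revs_to_read = 2                    # "a couple of revolutions"

kb_per_sec = track_bytes / (revs_to_read * rev_sec) / 1024
print(f"{kb_per_sec:.0f} KB/sec")
```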

To get some sense of the overall speed, I just formatted a diskette on my Apple IIc. The portion where it writes all 35 empty tracks took about 20 seconds, so the overall speed was around 7 KB/sec, or the equivalent of somewhere around 70,000 bps over a serial line.

EDIT: I’ve just timed the ADTPro format routine, which is more efficient than DOS 3.3, and it does 35 tracks in about 9 seconds, after the head reset, so call it 16 KB/sec for sequential read/write across multiple adjacent tracks, if you’re willing to do full-track reads or writes. (Or probably even if not; again, head seek time dominates.)
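The arithmetic behind those figures, as a quick sketch (10 bits per byte for the serial-line equivalent, as above):

```python
# Overall format throughput from the ADTPro timing above, and the
# serial line speed it corresponds to (8-N-1: 10 bits per byte).
tracks, track_kb = 35, 4
seconds = 9                          # ADTPro format, after the head reset

kb_per_sec = tracks * track_kb / seconds             # ~15.6 KB/sec
equiv_bps = tracks * track_kb * 1024 * 10 // seconds

print(f"{kb_per_sec:.1f} KB/sec ~= {equiv_bps} bps")
```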

(That said, while the maximum speed the code in ROM will set, 19,200 bps, is only about a quarter of the disk's speed, the UARTs themselves will very reliably do 115,200 bps with no need for flow control, so the serial port can indeed be faster than the disk.)

To be frank, I have to question methodology here.

Looking at some legacy Dr. Dobb's magazines, there was a benchmark comparing CP/M Plus (CP/M 3) to CP/M 2.2, and they were getting about 2,600 bytes per second on a 4 MHz Z80 with 8" Shugarts during writes. (I didn't catch the read speeds for the CP/M test.)

Here is a benchmark for an IBM AT: PC Mag - Google Books (hopefully that works for others).

For their sequential writes they were getting about 2,850 bytes per second, and 5 KB/sec during reads. (They were getting about 25 KB/sec for writes and 50 KB/sec for reads against a hard drive. Hard to be precise with the bar graphs.)

While the Apple should certainly be faster than the Atari and Commodore, I’d be surprised if it’s faster than an IBM AT.

For sure, random writes will be slower than a format. A format is the best possible case for write speed.

But truly random writes might well be dominated by seek time, which says little about the disk controller or the filesystem.

Nonetheless, I completely believe that Apple's floppy interface, which is direct and designed by Woz, will outperform any retro system with serial-attached floppies in any suitable benchmark. (One exception I can think of: operations on Commodore systems that can be contained entirely at the drive end of the serial connection.)

Acorn’s machines, which I’m most familiar with, used a conventional approach with a floppy disk controller chip - either the older 8271 or the newer 1770 family.


I’ve just tweaked the head post to add min 1 to the stty parameters. This is needed to ensure that the subsequent cat won’t exit immediately if there are no data to be read. This is actually the default setting, but ADTPro changes it to min 0 and does not set it back, so previously, if you’d used ADTPro before setting up to view listings, cat would immediately (and perhaps mysteriously) exit.
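For anyone reproducing this, the setting looks something like the following; the device path and baud rate are just examples for your host machine, not anything from the post above:

```shell
# Example only: adjust the device path and speed for your setup.
# (GNU stty uses -F; BSD/macOS stty uses -f.)
# 'min 1' makes a read block until at least one byte is available,
# so the following 'cat' waits for data instead of exiting at once.
stty -F /dev/ttyUSB0 19200 raw min 1
cat /dev/ttyUSB0
```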


First, this post is quite a deed!

You got me there… (What kind of address is this?) :slight_smile:

More seriously, regarding the differences when starting a program from the monitor or from the prompt (Do we need a newline? Should we exit via RTS or via a jump to BASIC?): you may want to inspect the top two bytes on the processor stack (PLA) to see whether there is a return address to the monitor there, then push them back onto the stack (PHA) so as not to disturb anything. This way you can set some sort of flag and support both modes.

Something like this:

; MFLG: some memory location used as a flag (0/1)
; RETL, RETH: low and high bytes of the monitor return
; address to compare with (as pushed by JSR, i.e. the
; return address minus one)

FROMMON:
   LDA #0
   STA MFLG    ;initialize flag
   PLA         ;pull low byte first (JSR pushes high, then low)
   CMP #RETL   ;compare with monitor return address, low byte
   BNE EXIT2   ;not equal - restore it and leave
   TAX         ;save the low byte in X
   PLA
   CMP #RETH   ;pull the high byte and compare it
   BNE EXIT1   ;not equal - restore both bytes and leave
   INC MFLG    ;set flag (we were called from the monitor)
EXIT1:
   PHA         ;restore high byte
   TXA
EXIT2:
   PHA         ;restore low byte

; this is not supposed to be a subroutine,
; but to be executed right at the beginning.
; (as a subroutine, we would first have to pull our
; own return address off the stack, save it, and
; push it back onto the stack before we leave.)

That’s a good thought! But, as it turns out in the end, not really necessary. It seems that both BASICs and even EDASM set something appropriately such that the DOS entry hook at $3D0 Just Works, taking you back to BASIC or the EDASM prompt, if you did your .BRUN MY PROGRAM.OBJ0 from there. So that seems the best way to exit the program.

And the other issue is solved by simply printing a newline at the start. :-P

But it was such a pain to type this on a tablet – you must use it! :slight_smile:

Oh my! It seems that forty years of “improving” our computers has made them worse!

I had in fact just come here to mention that after about fifty assemblies (the assembler helpfully increments the #000000 in the ASMIDSTAMP file every time you assemble something) of five different (fairly small) programs, I’m about ready to put an end to this experiment.

The largest pain is probably just working on a 40×24 screen, with the width being a bigger problem than the length. That could probably be improved somewhat if EDASM were better about printing things in a nicely formatted way, though. What with the way it lays things out, and particularly the broken auto-tab-stops for full-line comments, it’s not making particularly good use of what little screen real estate there is. (And yes, my Apple IIc has a built-in 80-column card, but EDASM doesn’t know how to use it.) Well, at least having the (virtual) printouts makes things bearable; otherwise it would be a nightmare.

But the speed is also an issue. Not really the speed of the assembler itself, but the speed of the disk I/O necessary to use it. For a 63-line (1 KB) program, it goes like this:

  1. 10 s: Save file.
  2. 16 s: Wait for assembler to load. Press a key.
  3. 17 s: Wait for assembly to complete. Press a key.
  4. 5 s: Wait for editor to reload.
  5. 5 s: Load and run the assembled program.

So really it’s close to a minute to turn things around, whereas in my cross-development system I’m used to waiting no more than 3-4 seconds after pressing s to save my code before I see the output of the unit tests. Not to mention that those “PRESS ANY KEY” prompts that I need to respond to twice in the middle of all of that (to give me an opportunity to switch diskettes) really add to the annoyance. I am almost tempted to reverse engineer the binary far enough to remove those.

Still, it’s nice to have concrete, recent experience to make clear to me how good my life is now (so long as I don’t use a tablet!).


In that benchmark they were reading a “200K data file.” The PC stores 9 KB per cylinder (512 bytes × 9 sectors × 2 sides) as opposed to the Apple’s 4 KB, but it’s still got to do a fair number of head seeks for a file that size, especially when you take into account DOS overhead to read the FAT (potentially multiple times).

I would be very surprised if head seeks were significantly faster on a PC 5.25" drive than on Apple 5.25" drives; the mechanisms are very similar. (The big difference is in the electronics on the drive.) So no, I’m not surprised that the Apple, when using efficient code, is not entirely dissimilar in speed to the IBM AT when reading or writing files spread across multiple tracks. In optimum conditions (raw reads with no filesystem), the PC shouldn’t be more than about twice as fast, when head seeks dominate.
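A rough seek-count comparison for that 200 KB benchmark file, using the per-cylinder and per-track capacities quoted above (and ignoring FAT/catalog overhead):

```python
# How many head movements does a 200 KB sequential file imply?
file_kb = 200
pc_cyl_kb = 9        # PC: 512 bytes x 9 sectors x 2 sides per cylinder
apple_trk_kb = 4     # Apple: 16 sectors x 256 bytes per track

pc_seeks = -(-file_kb // pc_cyl_kb)        # ceiling division
apple_seeks = -(-file_kb // apple_trk_kb)

print(pc_seeks, apple_seeks, apple_seeks / pc_seeks)
```

The Apple needs a bit over twice as many track changes, which lines up with the "shouldn't be more than about twice as fast" estimate when seeks dominate.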


Well, that’s kind of the point of the benchmark: something that measures the entirety of the “disk reading” experience. Most of the time, they write these files onto empty diskettes to avoid fragmentation.

Efficient code has little to do with these benchmarks. The vast majority of the time is spent in internal I/O routines (MS-DOS in this case), so even though the benchmarks are written in BASIC, the CPU cost of BASIC is marginal here. I doubt an equivalent assembly-language program would clock much faster than a BASIC one. Perhaps raw sector I/O is faster, but that’s not a reasonable test, as most users would never do that; they’d rely on the BIOS/BDOS/OS to do it.

What’s the capacity of the Apple floppy disk? 80K? 160K? On the AT they were 1.2 MB (the PC’s were 360K). That suggests that 200K is ~16% of the floppy that has to be written, in contrast to an Apple floppy, where it’s a larger percentage and thus more dominated by head seek and rotational delay.

But then, we don’t have to actually theorize. You have an authentic Apple II and floppy drives. Care to throw together a simple BASIC program that reads and writes some reasonable number of sequential bytes (50K?) on a fresh diskette and tell us what you get?


@whartung You seem to be agreeing with me that we should be looking at reads of many tracks, not just one. That’s great!

My floppy format test above wrote 35 tracks in 9 sec, or about 260 ms/track. Now, it has to write each track in a single revolution (that’s the nature of formatting floppies), which takes 200 ms at 300 RPM, leaving roughly 60 ms per track for stepping the head and letting it settle. The Apple II and original IBM PC use essentially the same physical disk mechanism, so you’d expect both to spend their time in much the same way: on rotation and seek settling, rather than running DOS code.

And the numbers bear that out; my maximum-possible-write-speed format test above writes 140 K in 9 s, about 15 KB/sec; decent copy programs are not much slower. The fastest result from the PC Magazine benchmark (sequential read, with DD and QD being within 10% or so of each other) reads 200 KB in around 40 sec, 5 KB/sec. Give the PC plenty of leeway for any kind of overhead and you’ve probably got about the same speed. (Or, if you want a more accurate comparison, format a floppy on a PC 360 KB drive and tell us how long it takes.) You can pretty much ignore interleave and the rest; a poor interleave costing an extra revolution or two per track makes only a modest difference when every track change already costs you a quarter of a second or so between the rotation and the seek.
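Assuming the standard 300 RPM spindle speed, the per-track budget implied by the format test works out like this (a sketch, not a measurement):

```python
# Per-track time budget implied by the 35-tracks-in-9-seconds format.
# Assumes 300 RPM (200 ms per revolution) and one revolution of
# writing per track, as formatting requires.
tracks, seconds = 35, 9
per_track_ms = seconds * 1000 / tracks     # total time per track
write_ms = 200                             # one revolution at 300 RPM
seek_settle_ms = per_track_ms - write_ms   # what's left for the head

print(f"{per_track_ms:.0f} ms/track: {write_ms} ms writing, "
      f"{seek_settle_ms:.0f} ms seeking/settling")
```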

I don’t think that the PC Magazine benchmarks were written in BASIC, but regardless, “efficient code” does make a huge difference for the Apple. With Apple BASIC, the DOS hijacks the keyboard character-input routine and feeds data from the disk character by character to an INPUT statement. This is many, many times slower than using assembly to call the DOS code directly to read or write 256-byte chunks.

(This seems to be turning into a different topic; perhaps a moderator can move all of the posts related to floppy speeds of non-Apple-II machines to a different thread.)

For reference, there’s a disassembled source for the EDASM editor/assembler here: markpmlim · GitHub

Cheers,
Andy


Note that that’s a different EDASM: the ProDOS version.

If anybody’s got a link to the source for or a disassembly of the DOS 3.3 version that would be quite useful to me. I recently thought I might speed things up by using a RAM disk (in slot 3, as DOS sees it), but it turns out that EDASM always loads the editor and assembler from the disk in ,S6,D1, even when EDASM.OBJ is started from the RAM disk, and the SLOT 3 command doesn’t change this. Looks like fixing that would need a patch.

Thanks for a great post. Very informative – I am interested in self-hosted editors and assemblers for a retro project I am working on.

What is the model of that Apple II ?

Is it an Apple IIc? It’s just that the keyboard looks a bit odd.

It is indeed an Apple IIc, as you can see from the logo on the upper right-hand corner of the case. There’s nothing odd about the keyboard; it’s a standard US version. The European keyboard was slightly different, though not hugely so.

A post was split to a new topic: Retro-BASIC programming Editors and IDEs