Random 6502 3D graphics engine thoughts

So, I want a Descent-like game using 40x25 character graphics. My goal is smooth frame rate rather than high fidelity graphics. Classic 6502 3D games had poor frame rates.

Well, how can this be done?

I’ve already figured out part of it. The end of the pipeline will be cropped to a 64x64x64 cube of camera coordinates in front of the player. These X-Y-Z coordinates get transformed into X/Z, Y/Z screen coordinates via 64x64 lookup table (X or Y coordinate is -32 … +31; Z coordinate is 0 … 63).

Basically, there’s a collection of “dots” in the corners and midpoints of the room and neighboring rooms to show terrain, and 2D character “sprites” for the ghosts and other objects. Each has object 3D coordinates in a 256x256x256 map … the basic pipeline is:

  1. Move objects in 256x256x256 block based space. It’s like a Roguelike in 3 dimensions. Typically, only one monster will move per frame, so this is very fast.
  2. Transform/cull 256x256x256 world coordinates to 64x64x64 camera coordinates … this doesn’t need to be super accurate, but it does need to be fast
  3. Erase characters drawn last frame
  4. Draw visible objects in semi-random order

The monsters are mostly ghosts, so I avoid bothering with hidden surface algorithms or even sorting back-to-front. I care about smooth frame rate, not high fidelity. Heck, it’s a 40x25 character display, who cares?

But it’s the transform step which I’m puzzling over. How can this be done quickly?

Here’s what I’m thinking … okay, so the camera has a position and rotation matrix. Position translation is easy and pretty fast … just byte-byte addition in all three coordinates. Rotation, though … well, the rotation matrix is three unit vectors, and I can sacrifice precision on each unit vector coordinate to 1/32, maybe? That way, the multiplication table is 64x128 = 8K in size (I’m assuming both positive and negative for both the unit vector component and the map component).

Does this strategy seem reasonable? I know there are clever ways to do fast 6502 multiplication other than a lookup table, but in this specialized application I don’t need precision results.

If you’re doing just 90 degree turns, you’ll find that not the entire screen has to be erased, nor has the entire screen to be redrawn, but just a few portions. Moreover, most is probably symmetric.
From there, you could aggregate the distinctive elements, the view consists of, and even build a tiny runtime (interpreting byte codes as array offsets) to draw a view.

Compare the graphic near the top of Retrochallenge 2016/01: M10 – PC-8201A Maze War (E-12)
(This works in MS BASIC, so it should perform fine in ML. See Retrochallenge 2016 for context.)

1 Like

I’m going for smooth rotations - not just 90 degree turns. If it were just 90 degree turns, then the math would be very simple and fast.

My original plan was for fractal “lines” inspired by the Apple ][ version of Koronis Rift. It used a horizontal resolution of 40, which would be similar to using PETSCII graphics and the 8 thin horizontal line characters (for a vertical resolution of 200, similar to the vertical resolution of 192 used by Apple ][ Koronis Rift).

However, I’ve given up on that idea because I couldn’t figure out an elegant way to deal with Z clipping. Instead, I’m just going to have “character dots”. Each terrain “dot” or object is simply a dot in 3D space, and drawn on the screen as a character depending on Z distance. Objects might be drawn with multiple characters if close enough.

Note that the frame rates on any version of Koronis Rift is not as high as I want. I want something silky smooth, for a fast paced FPS feel. Here’s a bit of Descent to give an idea of the pace I’m going for:

Now, I’m going for more of a Roguelike feel, so we’re talking simpler maps with box shaped rooms connected with pipe-like corridors. However, they will generally be more disorienting than Descent maps because there’s nothing providing any visual cues for “up” and “down”. It’s all just free-form 3D because the maps are derelict alien spacecraft rather than something designed by humans.

Ah, I see. – Just a single idea: Maybe you can optimize your algorithm for transforming 0…256 to 0…64 coordinates so that it works with masking instead of shifting, at least for a few dimensions? This may have some impact, as there are three dimensions for every dot. May be worth it, even if you would end up with intermingled data/lookup tables.
The real problem is, even at 64 x 64 it’s hard for a 6502 @ 1 MHz to keep up with the cathode beam of the display. (This is much like a double line kernel for the Atari 2600, which is running at 1.2 MHz.) So some drop in frame rates may be inevitable. I’d probably try to run any time critical code in the zero page with self modifying code, rather than using zeropage indirect addressing for lookup tables, etc.

Edit: When doing a PETSCII game, I used two queues, one for setting characters and one for resetting them to a procedural background, to be serviced in the VBLANK interval. However, this makes sense only, if there are significantly less updates per frame than screen locations.

Here’s the routine for drawing the character queue
(xor-ing the videoMask is for an optional reverse video mode)

; draws chars in charQueue of (screenCode, addrLo, addrHi)*
; self-modifying (sets address at .dcqScreen, sta xxxx)
drawCharQueue
                ldx charQueuePtr          ;get top-of-queue pointer
                beq .dcqDone              ;exit, if empty
                dex
.dcqLoop        lda charQueue, x          ;get screen address hi-byte
                sta .dcqScreen+2          ;fix-up
                dex
                lda charQueue, x          ;get screen address lo-byte
                sta .dcqScreen+1          ;fix-up
                dex
                lda charQueue, x          ;get screen code
                eor videoMask             ;adjust for normal/reverse video
.dcqScreen      sta $ffff                 ;store it (dummy address)
                dex
                bpl .dcqLoop
                lda #0                    ;reset top-of-queue pointer
                sta charQueuePtr
.dcqDone        rts

and the routine for pushing a character onto the queue, the idea is here that the routine takes care of any clipping, thus freeing the main routines from this burden.
(screenLinesHi and screenLinesLo are lookup tables for screen memory addresses per screen line)

; a single character 'sprite routine'
; pushes a screen code and address onto the charQueue, if on-screen
pushScreenCode
                lda qPosY
                bmi .pcqDone  ;negative
                cmp #25       ;gte 25 (off-screen to the bottom)?
                bcs .pcqDone
                lda qPosX
                bmi .pcqDone  ;negative
                cmp #40       ;gte 40 (off-screen to the right)?
                bcs .pcqDone

                ldx charQueuePtr
                lda qScreenCode
                sta charQueue, x
                inx
                ldy qPosY
                lda qPosX
                clc
                adc screenLinesLo, y
                sta charQueue, x
                inx
                lda #0
                adc screenLinesHi, y
                sta charQueue, x
                inx
                stx charQueuePtr

.pcqDone        rts
1 Like

Yeah, there’s no way to do anything on 4096 locations per frame; I’m not trying that. The display is only a small number of “dots” being updated (albeit a “dot” could be multiple characters for an object that’s relatively close to the player).

If you’re in a typical room, we’re talking 8 dots for the corners and 12 smaller dots for the edge midpoints, but not all of them will be visible in your field of view. There would also be more terrain dots for neighboring rooms. And then there would be dots for objects, including your bullet.

The point is, the screen isn’t full.

BTW, I’m also now thinking about whether or not it would be possible to do ray tracing. This would avoid all that matrix math. The idea is to NOT trace each character position, but rather only update characters neighboring the edges of the previous frame. This almost entirely eliminates matrix math, because you can calculate a neighboring character’s ray by adding or subtracting the camera DOWN or RIGHT vectors.

Hmm … the more I think about ray tracing, the more I like it. It gives filled polygons instead of sparse terrain dots, while also getting rid of a whole bunch of annoying matrix math.

Thing is … I can do a lot of cheap optimization based on camera movement/rotation. If the camera moves or rotates left, I only need to ray trace characters to the right of the last frame’s edges. If the camera moves forward, I only need to ray trace characters that are “outward” of the last frame’s edges. Hmm …

David Schmenk did some impressive work on a game he called “Escape From the Home Brew Computer Club”. Maybe he has some ideas on which you can riff:

1 Like

Awesome, thanks for the pointer! There’s similar stuff on the C64 also, which can replicate that Apple ][ mode using HIRES mode and only modifying the color map (each 8x8 pixel block has the bitmap set to 0 for the top 4 bytes and 255 for the bottom 4 bytes, so each 8x8 block becomes two “fat” pixels of background/foreground color).

Unfortunately, raycasting only works well because each vertical column as always aligned with a vertical z slice. With a 6 degree of freedom game where you can look up/down and roll freely … well, I tried to figure out ways to extend it and I couldn’t come up with anything.

My ray tracing idea is very different. There are no textures; each wall is a flat color. That way, I can try to optimize things by only tracing rays near the edges of color changes. This may still involve tracing more than 40 rays per frame, but it should be a LOT less than 1000 rays per frame.