High resolution timing on the PC - Don't try this at home

I’ve been working on a little project to create a stratum one time server running on a PCjr. For those of you not familiar with the technology, a stratum one time server is one hop removed from the source of time, which in my case would be the GPS satellite system.

Getting time from a GPS is fairly simple - with the right model of receiver it is as simple as reading data from the serial port and parsing it. GPS receivers generally send data to the computer once a second. On my receiver (a Garmin 18X LVC) at 4800 bps, sending just the timestamp data takes around 150 milliseconds, and that delay is highly variable, so you can’t get great timing resolution from the serial port alone.
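
A minimal sketch of what the parsing amounts to, assuming the timestamp arrives in an NMEA $GPRMC sentence (illustrative only - checksum and validity-flag handling are omitted, and the exact sentences depend on how the receiver is configured):

```c
#include <stdio.h>
#include <string.h>

/* Sketch: pull hh:mm:ss out of an NMEA $GPRMC sentence, e.g.
   "$GPRMC,123519,A,4807.038,N,01131.000,E,..."
   A real parser would also verify the 'A' (valid fix) field and
   the trailing checksum. Returns 0 on success, -1 otherwise. */
int parse_gprmc(const char *line, int *hh, int *mm, int *ss)
{
    unsigned long t;

    if (strncmp(line, "$GPRMC,", 7) != 0) return -1;
    if (sscanf(line + 7, "%6lu", &t) != 1) return -1;

    *hh = (int)(t / 10000UL);          /* 123519 -> 12:35:19 */
    *mm = (int)((t / 100UL) % 100UL);
    *ss = (int)(t % 100UL);
    return 0;
}
```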

There is an elegant fix for this. Some receivers (including the 18X LVC) include a 1PPS (one pulse per second) line, which goes high for 20 milliseconds at the top of each second once a GPS fix is made. So once you read the time you can wait for the pulse to know when the next second has just started. The 1PPS line is often wired to the Carrier Detect pin of a PC serial port, a pin that can be programmed to generate an interrupt.
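
Arming that interrupt on the 8250 is only a couple of port writes. A sketch, assuming COM1 at 0x3F8 and Borland-style port I/O (the actual IRQ handler and 8259 setup are left out):

```c
#include <dos.h>

#define COM_BASE 0x3F8            /* assuming COM1 */
#define IER (COM_BASE + 1)        /* interrupt enable register */
#define MCR (COM_BASE + 4)        /* modem control register    */
#define MSR (COM_BASE + 6)        /* modem status register     */

void arm_pps_interrupt(void)
{
    outportb(MCR, inportb(MCR) | 0x08);  /* OUT2: gate the UART IRQ to the PIC */
    outportb(IER, inportb(IER) | 0x08);  /* enable the modem status interrupt  */
}

/* In the serial ISR: a transition on Carrier Detect sets the
   "delta DCD" bit in the MSR, and reading the MSR clears it. */
int pps_rising_edge(void)
{
    int msr = inportb(MSR);
    return (msr & 0x08) && (msr & 0x80); /* DCD changed and is now high */
}
```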

Fast forward, I have the code that parses the GPS timestamp data and takes an interrupt from the 8250 UART when the 1PPS signal trips the carrier detect line. This is great, but knowing the top of each second isn’t good enough on this machine. Here is why.

The clock crystal used on the original IBM PC runs at 1,193,180 Hz. This is connected to timer 0 on the 8253 timer, which is programmed to fire an interrupt every 64K pulses from the clock. That results in an interrupt 18.2 times per second, or once every 54.92ms. That interrupt is wired to IRQ 0 and has an interrupt handler to maintain the date and time on the machine. So the timing resolution on the machine is at best around 55ms.

One approach to get better timing resolution is to reprogram the 8253 to interrupt at a faster rate. If you interrupt 64 times faster you get an interrupt every 0.86 milliseconds. Of course you chain to the standard interrupt handler, calling it once every 64 interrupts, so that the machine keeps the correct date and time. This approach is simple but it slows the machine down due to the extra interrupt overhead.
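
In Borland-style C the reprogramming looks roughly like this sketch (divisor 65536/64 = 1024, chaining to the original INT 8 handler every 64th tick so the BIOS timekeeping is untouched):

```c
#include <dos.h>

#define SPEEDUP 64                              /* 64x the stock 18.2 Hz */
#define DIVISOR ((unsigned)(65536UL / SPEEDUP))

static void interrupt (*old_int8)(void);
static unsigned subtick = 0;

static void interrupt fast_int8(void)
{
    /* ...high-resolution timestamping work goes here... */

    if (++subtick >= SPEEDUP) {
        subtick = 0;
        old_int8();               /* chain: BIOS updates time and sends EOI */
    } else {
        outportb(0x20, 0x20);     /* otherwise acknowledge IRQ 0 ourselves  */
    }
}

void speed_up_timer(void)
{
    disable();
    old_int8 = getvect(0x08);
    setvect(0x08, fast_int8);
    outportb(0x43, 0x36);                 /* counter 0, LSB then MSB, mode 3 */
    outportb(0x40, DIVISOR & 0xFF);
    outportb(0x40, (DIVISOR >> 8) & 0xFF);
    enable();
}
```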

Another approach is to latch the counter in the 8253, and then use that counter together with the standard BIOS ticks to compute the time. In this setup the BIOS tick count gives you 55ms resolution while the counter in the 8253 fills in the microseconds.
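
The latch itself is a single write to the 8253's control port. A sketch of the combined read (the BIOS tick count lives at 0040:006C, and counter 0 counts down from 65536) - which is not as airtight as it looks, as explained next:

```c
#include <dos.h>

/* Sketch: snapshot the BIOS tick count plus the 8253's position
   within the current 54.9ms tick. */
void snapshot_time(unsigned long *ticks, unsigned *us_into_tick)
{
    unsigned long far *bios_ticks = (unsigned long far *)MK_FP(0x40, 0x6C);
    unsigned count;

    disable();                              /* cli */
    *ticks = *bios_ticks;
    outportb(0x43, 0x00);                   /* latch command, counter 0 */
    count  = inportb(0x40);                 /* low byte first...        */
    count |= (unsigned)inportb(0x40) << 8;  /* ...then high byte        */
    enable();                               /* sti */

    /* pulses elapsed this tick, converted to microseconds:
       one pulse is 1/1.19318 us, roughly 0.8381 us */
    *us_into_tick = (unsigned)((65536UL - count) * 838UL / 1000UL);
}
```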

I’ve tried both approaches. The latter approach is great on paper, and I can get timings as accurate as 270 microseconds with my C code. However, every once in a while my timings are off by 55ms. Why? There is a fundamental flaw … to get the time accurately you have to freeze both the 8253 counter and the BIOS tick count at the same instant, and there is no way to do this. Even if you are clever and disable interrupts at the right time there is still a race condition: the counter can roll over between the two reads (or while interrupts are off), so the tick count and the latched counter disagree by a full tick and an extra BIOS tick gets recorded. I’ve minimized the window, but it still happens about 1.5% of the time.

And of course then I find out that Michael Abrash writes about this problem in his Graphics Programming Black Book. I should have read that first. ;-0

As for the PCjr, what a rotten machine. ;-0 It works hard just to keep up with a 4800 bps serial stream. Touching the keyboard fires the non-maskable interrupt, which can delay the serial port and 1PPS interrupts. And screen writes through the BIOS are awfully slow - I timed the BIOS call to move the cursor alone at around 300 microseconds.

It’s a fun project but I think I’ll settle for 1ms of accuracy and call it done. Which means going back to the first technique, where the 8253 interrupts 64x faster. It’s not great, but I don’t have the fundamental problem of trying to latch two counts atomically.

My next step: work the TCP/IP code in so that it can actually serve the time. Don’t worry, I won’t advertise it as a stratum one time server on the internet. :wink:

4 Likes

Not sure if this is already covered, but can you nudge the phase of the regular (crystal-driven) interrupt in an advantageous way? Some sort of delay-locked-loop could converge so that you get an interrupt when you want one: right at the top of the second, or right before, so you can poll the PPS, or at the half-second so that the ISR doesn’t get in the way.
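
A rough sketch of what one nudge could look like, with heavy hand-waving: the restore really belongs in the timer ISR, and the 8253's mode 3 reload rules have quirks this glosses over.

```c
#include <dos.h>

/* Sketch: one-shot phase nudge on 8253 counter 0. Load a shorter
   count once, let exactly one shortened cycle elapse, then put the
   full 65536 count back so the rate is unchanged. Advancing the
   phase by (65536 - D) pulses is the same as delaying it by D. */

static volatile int nudging = 0;

static void load_count(unsigned divisor)      /* 0 means 65536 */
{
    outportb(0x43, 0x36);                     /* counter 0, LSB/MSB, mode 3 */
    outportb(0x40, divisor & 0xFF);
    outportb(0x40, (divisor >> 8) & 0xFF);
}

void nudge_phase(unsigned advance_pulses)     /* 0 < advance_pulses < 65536 */
{
    nudging = 1;
    load_count((unsigned)(65536UL - advance_pulses));
}

/* ...and in the timer ISR, once the shortened tick has fired: */
void on_timer_tick(void)
{
    if (nudging) {
        nudging = 0;
        load_count(0);            /* back to the stock 18.2 Hz rate */
    }
}
```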

Here’s a copy of the black book, BTW, with a link within to a PDF:

3 Likes

I’m still working on this project … all good things take time. ;-0

The GPS code is solid. I went back to speeding up the BIOS tick rate instead of trying to measure the 8253, as that approach doesn’t work if you are trying to measure intervals longer than 55ms. (The race condition can’t be worked around without resorting to undocumented implementation differences between 8253 chips.)

Running the 8253 64x faster causes some dropped timer ticks on a PCjr. On a 386 class machine I can run the 8253 128x or 256x faster with no detectable timer interrupts being lost.

The SNTP server is running, but I did the bare minimum:

  • It only handles one request at a time.
  • The high-resolution GPS code is stubbed out.
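
For anyone following along, the packet itself is the standard 48-byte layout from RFC 4330 - something like the struct below, noting that everything is big-endian on the wire, so a PC has to byte-swap each field:

```c
/* The 48-byte SNTP packet (RFC 4330). All multi-byte fields are
   big-endian on the wire; "unsigned long" is 32 bits on a 16 bit
   DOS compiler. */
struct ntp_packet {
    unsigned char li_vn_mode;       /* leap (2) | version (3) | mode (3) */
    unsigned char stratum;          /* 1 = primary reference             */
    unsigned char poll;
    signed char   precision;        /* log2 seconds; -10 is about 1 ms   */
    unsigned long root_delay;
    unsigned long root_dispersion;
    unsigned long reference_id;     /* "GPS" for a GPS-disciplined clock */
    unsigned long ref_secs,  ref_frac;    /* when the clock was last set */
    unsigned long orig_secs, orig_frac;   /* client's transmit time      */
    unsigned long recv_secs, recv_frac;   /* server receive time         */
    unsigned long xmit_secs, xmit_frac;   /* server transmit time        */
};
```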

The biggest challenge with the SNTP code is the NTP timestamp format. An NTP timestamp consists of a 32 bit count of seconds since 1900 and a 32 bit fractional part. It is easy to convert from the NTP seconds to Coordinated Universal Time, but the fractional part needs some tricks if you are going to handle it without floating point and without killing the poor 16 bit machine. 32 bits of fraction within a single second gives you timing resolution down to roughly 233 picoseconds (2^-32 of a second); since I’m shooting for accuracy to within 0.5 milliseconds I made use of some integer division and bit shifting.
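
One way to do that conversion with nothing but 32-bit integer math looks like this sketch (the constant is 2^32/1000 rounded down; the shift-based version described above may differ):

```c
/* Sketch: convert milliseconds (0..999) to a 32-bit NTP fraction
   and back, using only 32-bit integer math. 2^32 / 1000 is
   4294967.296, so multiplying by 4294967 is accurate to well under
   a microsecond, and 999 * 4294967 still fits in 32 bits. */
unsigned long ms_to_ntp_frac(unsigned ms)
{
    return (unsigned long)ms * 4294967UL;
}

unsigned ms_from_ntp_frac(unsigned long frac)
{
    /* fractions within ~70 ns of the next second round up to 1000;
       clamp the result if that matters */
    return (unsigned)(frac / 4294967UL);
}
```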

Testing this has been hilariously good fun. I’m running DOS in two virtual machines and using my SNTP client to test the new SNTP server. I cut some corners on the SNTP client which I’m having to fix now. The “optimizations” were appropriate for getting to within 1 second on an old PC, but now that I’m trying to get to within 1 or 2 ms I have to undo them. I could use a PC running Linux instead, but I find it easier to debug with code that I already wrote and understand.

The next step is to merge the GPS code into the SNTP server so that it can issue high quality timestamps. And then to test it against other machines that have good NTP clients to ensure that I’m standards compliant. After that I’ll start optimizing the code so that I can get closer and closer to the accurate time without losing cycles on busy work in between.

1 Like

Phase lock the PC’s 4 x color burst clock (the 14.318 MHz master clock, four times the NTSC color burst frequency) to your 1 sec pulse output.
That way the PC is always in sync, even if it is slow.
(One of the many 3 am bathroom break great ideas)

I get those too once in a while, but making useful sense of them a few hours later is somewhat hit-and-miss. :upside_down_face:

Sorry, I like my CGA video output the way it is … but I did get the combined GPS code and SNTP server code working today, which pleased me greatly. I’ve had more progress on this project in the last weekend than in the last year.

Windows 10 has a tool (w32tm) that can measure the offset between the local machine and an NTP server, including showing you the NTP packet timestamps. Here are some observations:

  • Network delay is around 1.1 milliseconds, which is expected. (The 80386 that I’m testing against has a 10Mb/sec Ethernet card.)
  • When my laptop is synced to the SNTP server running on the 386, w32tm shows an offset between the two machines of around 2ms. I thought it would be tighter, so I need to investigate.
  • When my much larger/more expensive Dell workstation is synced to the SNTP server it starts off with a small offset (1 to 2 ms) but then the time drifts, and the offset grows to 50ms over a period of about 10 minutes. I’m not sure what is going wrong there, so that needs investigation too. (A Dell workstation with two Xeons should be pretty rock solid.)
  • Syncing the Windows machines to time.google.com or time.nist.gov and then comparing against the 386 SNTP server works out well; there is never more than 20ms of offset between them. (The network path is much longer, but the offset calculation, sketched below, does not include that.)
  • The PCjr has a Xircom parallel-port-attached Ethernet adapter, and its network latency is a bit higher. The PCjr is also more challenging because it drops interrupts, so I checked that the code runs there but I didn’t bother scrutinizing the network delay or time offset. (I’ll work on the fast machines first, and work my way up to the slow and difficult machines.)
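
For reference, the offset and delay reported here come from the standard NTP calculation over the four packet timestamps, which is why a longer (symmetric) network path cancels out of the offset. A sketch, with times in milliseconds for illustration:

```c
/* Standard NTP math over the four timestamps (RFC 4330):
   t1 = client transmit, t2 = server receive,
   t3 = server transmit, t4 = client receive. */
long ntp_offset(long t1, long t2, long t3, long t4)
{
    return ((t2 - t1) + (t3 - t4)) / 2;  /* server clock minus client clock */
}

long ntp_delay(long t1, long t2, long t3, long t4)
{
    return (t4 - t1) - (t3 - t2);        /* round-trip network delay */
}
```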

Overall it was a good day. Still a lot of work to do, but it’s finally running.

1 Like

One last update …

I spent some time optimizing the code and trying to prove to myself that the code was working properly. I have some cleanup to do, but I think it’s working well:

  • The 80386 machine is always within 1 ms of the time being served by publicly available servers, such as time.google.com or time.windows.com.
  • The PCjr, a fairly slow machine, is also usually within 1 ms of the public time servers. It has some variability though, so I need to do more optimization to give it more headroom.

The full details of the project can be found here: Building a Stratum 1 Time Server for 16 bit DOS

It’s been fun, and I’ve met my goal of getting within 2 ms of the publicly available servers. But it’s not terribly practical to leave a 35+ year old machine on to serve SNTP. :slight_smile:

3 Likes

Prwhat? I am unfamiliar with this word…