The wide-ranging performance of the PDP-11

I wanted to benchmark some not-quite-PDP-11 second processors fitted to my BBC Micro, and try to calibrate against a real PDP-11, and that opened a bit of a can of worms… (here are my results - of course, modern implementations outperform the historical ones.)

Wikipedia tells us

In total, around 600,000 PDP-11s of all models were sold, making it one of DEC’s most successful product lines
and
The first officially named version of Unix ran on the PDP-11/20 in 1970

The PDP-11 was sold for over 20 years, with many models and variants introduced in that time. The PDP-11 FAQ has this to say:

DEC eventually countered with its entry into this market segment - the PDP 11/20 [in June 1970]

Since then, the PDP-11 had 16 to 22 implementations, depending on how you count them, many with variants.

PDP-11 Relative Performance
From a chart in the 1978 "Computer Engineering"

        11/03 11/04 11/05 11/20 11/34 11/34c 11/40 11/60 11/45 11/55 11/70
        ----- ----- ----- ----- ----- -----  ----- ----- ----- ----- -----
Perf(1)   1     2.8   2.5   3.1   3.5   7.3    3.6   27          41   36
 core                                                      13
 mos                                                       23
 Bipolar                                                   41
Whets     26     18    13    20   204   262     57   592        725   671
 core                                                     260
 mos                                                      335
 Bipolar                                                  362

(1) performance is for the basic instruction set relative to the 11/03

From a chart in the 1987 "PDP-11 Systems Handbook"

                                   11/23 11/53 11/73 11/83 11/93
                                   ----- ----- ----- ----- -----
               CPU                 F-11  J-11  J-11  J-11  J-11
               Microcycle(ns)      300   267   267   222    222
               Clock (MHz)         ?     15    15    18     18
               Performance         0.2   0.5   0.7   1.2    ?
               (11/70 = 1)	 
               Cache               no    no    yes   yes    no
               Floating-Pt         opt   no    no    yes    yes
               Coprocessor

The FAQ goes on to do us the favour (in a fixed font) of graphing the performance, sorting by order of introduction:

P   r  45                                           _
e   e  40    -           -
r   l  35    |        -
f   a  30    |                       _
o   t  25    |                                -
r   i  20    |                                   -
m   v  15    _
a   e  10
n       9
c   t   8
e   o   7                               -  -
        6
f   1   5
a   1   4       _                 _
c   /   3 -        _           -
t   0   2
o   3   1                   -
r       0_________________M_o_d_e_l_s________________
          2  4  4  0  7  5  0  0  3  6  3  2  7  5  8
          0  5  0  5  0  5  3  4  4  0  4  3  3  3  3
                                        c

We learn, then, that from the original TTL KA11 CPU through the four-chip LSI and then two-chip single-package F11 and J11 processors, there’s a range of about 14x, and taking into account the cost-reduced low-performance models, a range of nearly 45x.
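
As a rough check on those two ranges, here is a minimal Python sketch that puts both quoted tables on one scale (11/03 = 1). It assumes the handbook's "11/70 = 1" figures can simply be multiplied by the 11/70's figure of 36 from the 1978 chart. That chains two different sources and probably two different benchmarks, so treat the output as approximate. The dictionary and function names are made up for illustration.

```python
# Relative performance, 11/03 = 1, from the 1978 "Computer Engineering" chart.
PERF_VS_1103 = {"11/03": 1.0, "11/20": 3.1, "11/70": 36.0}

# Relative performance, 11/70 = 1, from the 1987 "PDP-11 Systems Handbook" chart.
PERF_VS_1170 = {"11/23": 0.2, "11/53": 0.5, "11/73": 0.7, "11/83": 1.2}


def rescale_to_1103(perf_vs_1170, factor_1170=PERF_VS_1103["11/70"]):
    """Convert '11/70 = 1' figures to '11/03 = 1'.

    Assumption: the two scales can be chained through the 11/70.
    """
    return {model: value * factor_1170 for model, value in perf_vs_1170.items()}


later_models = rescale_to_1103(PERF_VS_1170)
fastest = later_models["11/83"]

# Range from the original 11/20 (KA11) to the fastest J11 system in the table.
range_from_1120 = fastest / PERF_VS_1103["11/20"]
# Range once the slow, cost-reduced 11/03 is included.
range_from_1103 = fastest / PERF_VS_1103["11/03"]

if __name__ == "__main__":
    for model, value in later_models.items():
        print(f"{model}: {value:.1f}x the 11/03")
    print(f"11/83 vs 11/20: {range_from_1120:.1f}x")
    print(f"11/83 vs 11/03: {range_from_1103:.1f}x")
```

On these assumptions the 11/83 lands at roughly 43 times the 11/03 and roughly 14 times the 11/20, which is consistent with the graph's top bucket and with the two ranges quoted above.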

I also learnt that a company called Mentec made even faster CPUs for the PDP-11, and indeed eventually got the rights to some of the OS flavours too. They seem to have started with overclocked J11s, then made their own (double performance claimed) using TI parts in a microsequenced implementation with an i960 for floating point assist, and finally they offered an ASIC version (80k gates, up to 80% faster.) Notably they only expected to make and sell 1000 of these faster CPUs. Lots of detail in that link about the project:

The total planned effort for the project was 166 person days. The actual project effort took 484 person days. The extra effort put in meant that the calendar finish for the project was completed in 12 months as originally estimated.

You seem to have missed the Pi version of the PDP-11. :)
Raw numbers may not be the only thing involved in the sale of a PDP-11. Price versus performance, size, and power are also factors. (In the 1980s, a CMOS PDP-8 could make a nice data logger that only needed to be serviced once a year.) Fixed disks went from 1500 rpm to 3600 rpm in the 1970s and seek times shrank as well: a lot of factors that never make the charts. Ben.

The 11/93 has no cache because all its memory is static. So all its memory is actually the cache! Therefore the performance of the 11/93 is almost the same as the 11/83 for most tasks.

Welcome! Yes, a large fast memory can be a win compared to a slower memory with a fast cache - one just needs the technology!

I see an updated version of the table above, here, which adds a figure for the 11/93:

From a chart in the 1987 “PDP-11 Systems Handbook”

                               11/23 11/53 11/73 11/83 11/93
                               ----- ----- ----- ----- -----
           CPU                 F-11  J-11  J-11  J-11  J-11
           Microcycle(ns)      300   267   267   222    222
           Clock (MHz)         ?     15    15    18     18
           Performance         0.2   0.5   0.7   1.2    1.4
           (11/70 = 1)	 
           Cache               no    no    yes   yes    no
           Floating-Pt         opt   no    no    yes    yes
           Coprocessor

NOTE: The 11/93’s memory was effectively all cache, so there was no need for a separate cache subsystem.

NOTE: The 11/93 vs. 11/70 factor was taken from the “PDP-11 20th Anniversary Systems and Options Catalog Supplement”, May 1990.

I also see a note here which says there are good reasons for thinking of the J11 as running at 4.5MHz, not 18MHz:

Note that the J11 system is listed with an effective microcycle rate of 4.5 MHz rather than the chip clock rate of 18 MHz. This is also consistent with Bob Supnik’s notes on the J11 where the J11 is classified as ‘4.5 MHz’. This gives a more meaningful value for the Dhry/MHz or ‘Dhrystone per MHz’ column.
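
The handbook table's Microcycle and Clock rows can be used to sanity-check that 4.5 MHz figure. The sketch below is only arithmetic on the numbers quoted in this thread; the divide-by-4 relationship is an assumption being tested, not something taken from DEC documentation, and the names are made up for illustration.

```python
# (clock in MHz, microcycle in ns) for the J-11 systems in the handbook table.
J11_SYSTEMS = {
    "11/53": (15.0, 267.0),
    "11/73": (15.0, 267.0),
    "11/83": (18.0, 222.0),
    "11/93": (18.0, 222.0),
}

# Assumption under test: one microcycle takes four clock periods.
ASSUMED_DIVIDER = 4


def microcycle_rate_mhz(microcycle_ns):
    """Microcycles per microsecond, i.e. the effective rate in MHz."""
    return 1000.0 / microcycle_ns


if __name__ == "__main__":
    for model, (clock_mhz, microcycle_ns) in J11_SYSTEMS.items():
        from_microcycle = microcycle_rate_mhz(microcycle_ns)
        from_clock = clock_mhz / ASSUMED_DIVIDER
        print(f"{model}: {from_microcycle:.2f} MHz from microcycle, "
              f"{from_clock:.2f} MHz from clock/{ASSUMED_DIVIDER}")
```

A 222 ns microcycle works out to about 4.5 MHz and a 267 ns microcycle to about 3.75 MHz, matching 18/4 and 15/4. Any per-MHz score computed against the effective rate is therefore four times the one computed against the chip clock.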

And here we find a tabulation of floating point performance:

    PDP-11 Model                VUP Rating
    ------------               -----------
    PDP-11/03                         0.05
    PDP-11/04                         0.11
    PDP-11/23                         0.12
    PDP-11/23+                        0.18
    PDP-11/24                         0.18
    PDP-11/34A                        0.21
    PDP-11/53                         0.29
    PDP-11/53+                        0.29
    PDP-11/44                         0.42
    PDP-11/73                         0.45
    PDP-11/70                         0.60
    PDP-11/84                         0.72
    PDP-11/83                         0.72
    PDP-11/94                         1.00
    PDP-11/93                         1.00
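
To compare this list with the handbook's "11/70 = 1" column, the VUP ratings can be rescaled so that the 11/70 is the baseline. This is a minimal sketch using only figures quoted in this thread; the two sources very likely used different workloads, so agreement or disagreement is suggestive at best. The names are hypothetical.

```python
# VUP ratings as tabulated above (subset that also appears in the handbook).
VUP = {
    "11/23": 0.12,
    "11/53": 0.29,
    "11/73": 0.45,
    "11/70": 0.60,
    "11/83": 0.72,
    "11/93": 1.00,
}

# Handbook figures, 11/70 = 1 (the 11/93 value is from the 1990 supplement).
HANDBOOK_VS_1170 = {
    "11/23": 0.2,
    "11/53": 0.5,
    "11/73": 0.7,
    "11/83": 1.2,
    "11/93": 1.4,
}


def relative_to(ratings, baseline):
    """Rescale a dict of ratings so that ratings[baseline] becomes 1.0."""
    base = ratings[baseline]
    return {model: value / base for model, value in ratings.items()}


vup_vs_1170 = relative_to(VUP, "11/70")

if __name__ == "__main__":
    for model, handbook in HANDBOOK_VS_1170.items():
        print(f"{model}: VUP-derived {vup_vs_1170[model]:.2f}, handbook {handbook:.1f}")
```

On these numbers the 11/23 and 11/83 agree with the handbook (0.2 and 1.2), the 11/53 and 11/73 are close, and the 11/93 comes out noticeably higher (about 1.67 against 1.4).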

Sort of, but not quite: one of my benchmark targets was PiTubeDirect, a Pi-based second processor for Acorn’s 8-bit micros. I was able to run it on a Pi Zero and a Pi 4. The C model in this case is not the same as SIMH, which is the one used for the PiDP-11, so I’d expect the performance to differ. I didn’t benchmark SIMH.

You may notice that for my pi-spigot the 11/70 is about 10% faster than the 11/93. That is because hardware division on the 11/70 is faster.
You may also notice strange results for the 11/83, which are better than those for the 11/93! I have two plausible explanations for this:

  1. RT-11SB was used on the 11/83, and this OS is lighter and faster than RSX-11 or Unix;
  2. the results for the 11/83 were obtained many years ago and I have had to approximate them several times, a procedure that is prone to accumulating errors.

The J11 is an interesting case. Maybe it divides the external clock by 4? That would be the complete opposite of the clock multiplication used in the Z280, R800, 80486DX2, 80486DX4, etc.
IMHO, among the most difficult cases for determining the CPU frequency are the TMS CPUs: the TMS9900 and TMS9995, used in the TI-99/4A and Geneve 9640 respectively.
I use the memory frequency as the common denominator.

I didn’t benchmark SIMH.

Unless a lot of work has been done on simh emulators since the last time I looked at them closely, my opinion of them is that they were the solid but ‘obvious’ code as may be written by your average computer scientist as opposed to a video-game emulator writer - i.e. with scope for maybe an easy 25% speedup immediately and a lot more with some serious effort (such as dynamic recompiling). Old computers generally don’t attract the sort of hackers who put years into speeding up video game emulators, and the simh emulators have benefitted more from Moore’s Law than they ever will from ongoing code improvement.