‘A Million Random Digits’ Was a Number-Cruncher’s Bible. Now One Has Exposed Flaws in It.


I can’t quite see how these numbers were wrong, and how it’s possible to hypothesise the punched card layout. Anyone able to help me get it?


could be the result of cosmic rays messing with the computer’s memory

That seems to be a very modern idea related to semiconductor memory. I don’t think the type of memory and circuitry used at the time (was it even core memory? We’re speaking of 1951-1955) was subject to this.

Regarding the punched cards, I don’t see how this would work: even if a series of random numbers is misordered by accident, the order is still random, as are the numbers on the card. I also fail to see how the number of digits stored on a card (e.g., 72 or 80) could have any effect.

I think this is more about expectations of randomness that are informed by an expectation of higher-order evenness at large scale. (Which is how you lose your money at the casino: 208 zeros in a row and then a single red are still random and do occur. Or how your guaranteed-to-be-secure one-time-pad-encoded messages are cracked, because you validated your pad numbers against a statistical model, like Mrs K, who draws the numbers and throws back the fifth “J”, since that isn’t random anymore.) Think of it: if real-life randomness had to adhere to a statistical model, it wouldn’t be that random at all.

Moreover, the random source, the voltage fluctuations, may not have been as random as expected (e.g., subject to some hidden meta-cycles, etc.; compare Mandelbrot’s work on the regularity of noise in communication lines, which led to the discovery of fractals), and, since these were transformed into numbers in a controlled fashion, this could have had some impact on the numbers.
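The Mrs K effect is easy to demonstrate: if you reject any digit that would extend a run past some cutoff, the resulting stream is detectably non-random. A minimal sketch, assuming a cutoff of four (the function name and parameters are mine, not from the article):

```python
import random

def filtered_stream(n, max_run=4, seed=1):
    """Draw random digits, but reject any digit that would extend a run
    of identical digits beyond max_run -- mimicking a dealer who
    'throws back the fifth J'."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        d = rng.randrange(10)
        if len(out) >= max_run and all(x == d for x in out[-max_run:]):
            continue  # rejected: the stream is no longer truly random
        out.append(d)
    return out

stream = filtered_stream(1_000_000)

# Find the longest run of identical digits in the filtered stream.
longest = cur = 1
for a, b in zip(stream, stream[1:]):
    cur = cur + 1 if a == b else 1
    longest = max(longest, cur)

print(longest)  # never exceeds 4 by construction
```

A truly random million-digit stream would almost surely contain runs of five or more, so a simple run-length check exposes the culling.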


Ah, I see now

Mr. Briggs obtained the original numbers
so any subsequent analysis is about collation and printing, not about randomness.

But note the size of the discrepancy from expectation:

The book contains 48 runs of four digits instead of 40, an astoundingly wide divergence in statistical terms that eluded any explanation he could conjure
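For scale, the baseline such a divergence is measured against can be estimated directly. A hedged sketch: this counts maximal runs of exactly four identical digits in a simulated million-digit stream and compares against the rough analytic expectation; Briggs’s “40 expected” presumably refers to a narrower definition or sample, which I don’t attempt to reproduce here.

```python
import random

def count_runs_of_four(digits):
    """Count maximal runs of exactly four identical digits."""
    count = 0
    i, n = 0, len(digits)
    while i < n:
        j = i
        while j < n and digits[j] == digits[i]:
            j += 1
        if j - i == 4:  # run is exactly four long
            count += 1
        i = j
    return count

random.seed(0)  # fixed seed so the sketch is reproducible
digits = [random.randrange(10) for _ in range(1_000_000)]
observed = count_runs_of_four(digits)

# Rough analytic expectation for a maximal run of exactly length 4
# starting at an interior position: (1/10)^3 * (9/10)^2 per position.
expected = 1_000_000 * (0.1 ** 3) * (0.9 ** 2)
print(observed, expected)
```

The point is only that the expected count is computable in closed form, so a 20% excess over it really is a wide divergence in statistical terms.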

Amusing (or alarming):

In 1949, the Rand newsletter cheerfully announced the Numerical Analysis Department, in a spring-cleaning frenzy, had sold 8,435 pounds of used IBM cards for scrap, bringing in $60. Rand’s archivist suspects the random-digit cards were in the recycling.

I don’t really see how these two things go together. If the original data was lost 70 years ago, how could it be obtained now? Even if the original binary stream still exists, there’s no guarantee that the documented procedures for converting bits into numbers actually match the procedures deployed at the time. (I guess even a small deviation would have an impact on the result, while the output would still be random.)

I’m not sure why this is such a big discovery. Table 6 of the introduction to the 2001 edition shows the anomaly quite clearly: A Million Random Digits with 100,000 Normal Deviates | RAND

The punched cards that were thrown out in 1949 could not have been the data set itself, as it was not published until a few years later