Everything I Know About The Fast Inverse Square Root Algorithm

Everything I Know About The Fast Inverse Square Root Algorithm


I do like this bit:

That sigma value was chosen by someone to optimise the approximation, but interestingly, it isn’t actually the optimal value (more on that later), and it isn’t actually known who came up with it!

Oh, yeah, this is an interesting one.

One note I’d like to make on it. This implementation is most likely no longer needed on modern hardware. And I actually benchmarked this myself to check when I implemented the ISQR assembly instruction for my virtual CPU. As such, I took Carmack’s implementation vis-a-vis the naive implementation and to my surprise, in benchmarks the naive one exceeded the performance of the fast inverse square root.

Just as a reference, these are the implementations (in C#) that I benchmarked. Unfortunately, I no longer have the benchmark results, that code was deleted when I took the decision to use the naive implementation.

Edit. Maybe it’s safer just to test, depending on the language/compiler used whether the FISR may actually bring some improvement or not.

If I remember correctly, the second (commented) y calculation line increases the precision by a small percentage. They deemed it’s “about enough” with the first calculation so there was no need to add a second iteration to make it more accurate.

Not really: the first iteration gives about 20x improved accuracy, and the second iteration would have improved by about 350x more. I think the thing is that the massive improvement in accuracy was of very little value - and of course it has a cost. The Chris Lomont paper is worth a look!

Ah, I see. I knew that the overarching philosophy here was to spend out performance in favor of what is needed within their specs. I did not know by how much.

However, I believe that the underlying lesson (developer wise) is to employ simplication of calculations by any mathematical mean that reduces the typical amount of CPU cycles to the least while keeping a reasonably accurate result. And for that, Carmack’s effort is brilliant.

This is why I also enjoy a lot of problems that challenge one to obtain a certain result with the minimum amount of computational effort.

This can never get old, just overlooked.

I wonder why he didn’t spell it ‘three halves’ rather than ‘three halfs’?

OT, but there’s a series of comedy shorts which covers this - all depends on whether you follow JRR Tolkien or CS Lewis.