x > -1 vs x >= 0, is there a performance difference

It is very much dependent on the underlying architecture, but any difference will be minuscule. If anything, I’d expect (x >= 0) to be slightly faster, as comparison with 0 comes for free on some instruction sets (such as ARM). Of course, any sensible compiler will choose the best implementation regardless of which variant is …
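
As a small illustration of why the choice rarely matters, here is a minimal sketch of the two spellings for a signed int. The function names are hypothetical and only serve the example; for int operands the two predicates are logically identical, so an optimizing compiler is free to emit the same instructions for both.

```c
#include <stdbool.h>

/* Hypothetical helper: the "x > -1" spelling of the test. */
bool is_nonneg_gt(int x) { return x > -1; }

/* Hypothetical helper: the "x >= 0" spelling of the same test. */
bool is_nonneg_ge(int x) { return x >= 0; }
```

Comparing the generated assembly for both functions (for example with -O2 -S) is the quickest way to check what a particular compiler does on a particular target.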

Why does breaking the “output dependency” of LZCNT matter?

This is simply a limitation in the micro-architecture of your Intel Haswell CPU and several previous CPUs. It has been fixed for tzcnt and lzcnt as of Skylake-S (client), but the issue remained for popcnt until it was fixed in Cannon Lake. On those micro-architectures the destination operand for tzcnt, lzcnt and popcnt is treated …
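
The usual way to break such a dependency is to zero the destination register immediately before the instruction. The following is a minimal sketch, assuming GCC or Clang inline assembly on x86-64 with popcnt support; the function name popcount_no_false_dep is hypothetical, and a portable loop is used on other targets.

```c
#include <stdint.h>

/* Hypothetical helper: popcnt preceded by a dependency-breaking xor. */
uint64_t popcount_no_false_dep(uint64_t x) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    uint64_t result;
    /* xor-zeroing the destination cuts the dependency on its old value.
       "=&r" (early clobber) keeps the output out of the input's register,
       because the output is written before the input is read. */
    __asm__("xorl %k0, %k0\n\t"
            "popcntq %1, %0"
            : "=&r"(result)
            : "r"(x)
            : "cc");
    return result;
#else
    /* Portable fallback: clear the lowest set bit until none remain. */
    uint64_t count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
#endif
}
```

In ordinary code this is seldom needed by hand, since compilers can insert the xor themselves when tuning for an affected CPU; the sketch only shows the shape of the workaround.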

Why are loops always compiled into “do…while” style (tail jump)?

Related: asm loop basics: While, Do While, For loops in Assembly Language (emu8086). Terminology: Wikipedia says “loop inversion” is the name for turning a while(x) into if(x) do{}while(x), putting the condition at the bottom of the loop where it belongs. Fewer instructions / uops inside the loop = better. Structuring the code outside the loop …
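
To make the transformation concrete, here is a minimal sketch of loop inversion written out by hand at the source level. The function names are hypothetical; a compiler performs the equivalent rewrite on its own, so the second form only shows what the generated code is shaped like.

```c
#include <stddef.h>

/* Source-level form: the condition is tested at the top of every iteration. */
long sum_while(const int *a, size_t n) {
    long total = 0;
    size_t i = 0;
    while (i < n) {
        total += a[i];
        i++;
    }
    return total;
}

/* Hand-inverted form: one guard before the loop, then the condition
   sits at the bottom, so each iteration ends in a single branch. */
long sum_inverted(const int *a, size_t n) {
    long total = 0;
    size_t i = 0;
    if (i < n) {
        do {
            total += a[i];
            i++;
        } while (i < n);
    }
    return total;
}
```

Both functions return the same result for every input, including n == 0, where the up-front guard skips the loop body entirely.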

Does calculating Sqrt(x) as x * InvSqrt(x) make any sense in the Doom 3 BFG code?

I can see two reasons for doing it this way: firstly, the “fast invSqrt” method (really Newton-Raphson) is now the method used in a lot of hardware, so this approach leaves open the possibility of taking advantage of such hardware (and doing potentially four or more such operations at once). This article discusses it …
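
The identity behind the question is sqrt(x) = x * (1 / sqrt(x)). Below is a minimal scalar sketch of that idea, not the Doom 3 BFG code: it assumes the well-known bit-level initial guess with the constant 0x5f3759df, followed by two Newton-Raphson refinement steps. The function names are hypothetical, and real code would more likely use a hardware reciprocal-square-root instruction as the starting estimate.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: approximate 1/sqrt(x) for x >= 0. */
float inv_sqrt_approx(float x) {
    float half = 0.5f * x;
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);       /* reinterpret the float's bits safely */
    bits = 0x5f3759dfu - (bits >> 1);     /* rough initial guess */
    float y;
    memcpy(&y, &bits, sizeof y);
    y = y * (1.5f - half * y * y);        /* Newton-Raphson step 1 */
    y = y * (1.5f - half * y * y);        /* Newton-Raphson step 2 */
    return y;
}

/* Hypothetical helper: sqrt(x) computed as x * (1/sqrt(x)). */
float sqrt_via_inv_sqrt(float x) {
    return x * inv_sqrt_approx(x);
}
```

The result is an approximation, so it suits cases where a small relative error is acceptable; inputs are assumed to be non-negative and finite.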

Avoiding the overhead of C# virtual calls

You can cause the JIT to devirtualize your interface calls by using a struct with a constrained generic.

public class SomeObject<TMathFunction> where TMathFunction : struct, IMathFunction
{
    // Stored as the concrete struct type, so calls need no interface dispatch.
    private readonly TMathFunction mathFunction_;

    public double SomeWork(double input, double step)
    {
        var f = mathFunction_.Calculate(input);
        var dv = mathFunction_.Derivate(input);
        return f - (dv * step);
    }
}

// … var …
