Tuesday, July 26, 2011

Fast Integer Square Root

Here's my attempt at writing a fast square root function for integers.  It's not as fast as the built-in sqrtf function (even with type conversions) but it's decent.  In the worst case it was twice as slow -- but that would downplay the significance of the learning experience.

First, I try to estimate a point near the root.  Consider a number in binary, it's square root will be between two to the power of half of the length of the binary string rounded up and double that value.  The lower bound can be made more accurate, but this is sufficient.

Then, with the maximum and minimum, the midpoint should be a good starting point or guess that I feed into Newton's method.  Newton's method is used to increment or decrement the initial guess to something a bit saner.

More speed can be obtained by replacing some division by bit-shifting (but that's dangerous).  The fastest (and smartest) way would be to use the built-in SSE functions to do the square root and inverse square root (properly pipelined).


//! Same, but this time do Newton's method
int sqNewton(int in_val)
{
int c = in_val;
if (c <= 0) return 0;
int lower = 1;
if (c >= (1 << 16))
{
c >>= 16;
lower <<= 8;
}
if (c >= (1 << 8))
{
c >>= 8;
lower <<= 4;
}
if (c >= (1 << 4))
{
c >>= 4;
lower <<= 2;
}
if (c >= (1 << 2))
{
c >>= 2;
lower <<= 1;
}
//Hit mid-bound
lower = lower*3/2;
//printf("%li %li %li\n", lower, in_val, lower*lower-in_val);
int numerator = lower*lower + in_val;
int denominator = 2*lower;
return numerator/denominator;
}

No comments:

Post a Comment