First, I try to estimate a point near the root. Consider a number in binary, it's square root will be between two to the power of half of the length of the binary string rounded up and double that value. The lower bound can be made more accurate, but this is sufficient.
Then, with the maximum and minimum, the midpoint should be a good starting point or guess that I feed into Newton's method. Newton's method is used to increment or decrement the initial guess to something a bit saner.
More speed can be obtained by replacing some division by bit-shifting (but that's dangerous). The fastest (and smartest) way would be to use the built-in SSE functions to do the square root and inverse square root (properly pipelined).
//! Same, but this time do Newton's method
int sqNewton(int in_val)
{
int c = in_val;
if (c <= 0) return 0;
int lower = 1;
if (c >= (1 << 16))
{
c >>= 16;
lower <<= 8;
}
if (c >= (1 << 8))
{
c >>= 8;
lower <<= 4;
}
if (c >= (1 << 4))
{
c >>= 4;
lower <<= 2;
}
if (c >= (1 << 2))
{
c >>= 2;
lower <<= 1;
}
//Hit mid-bound
lower = lower*3/2;
//printf("%li %li %li\n", lower, in_val, lower*lower-in_val);
int numerator = lower*lower + in_val;
int denominator = 2*lower;
return numerator/denominator;
}
No comments:
Post a Comment