Computing the Integer Square Root
Fred Akalin
1. The algorithm
Today I’m going to talk about a fast algorithm to compute the integer square root of a non-negative integer $n$, $\mathrm{isqrt}(n) = \lfloor \sqrt{n} \rfloor$, or in words, the greatest integer whose square is less than or equal to $n$.[1] Most sources that describe the algorithm take it for granted that it is correct and fast. This is far from obvious! So I will prove both correctness and speed below.
One simple fact is that $\mathrm{isqrt}(n) \le n$, so a straightforward algorithm is just to test every non-negative integer up to $n$. This takes $O(n)$ arithmetic operations, which is bad since it’s exponential in the size of the input. That is, letting $\mathrm{Bits}(n)$ be the number of bits required to store $n$ and letting $\lg n$ be the base-$2$ logarithm of $n$, $\mathrm{Bits}(n) = \lfloor \lg n \rfloor + 1$, and thus this algorithm takes $O(2^{\mathrm{Bits}(n)})$ arithmetic operations.
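As a rough illustration of the straightforward approach, here is a minimal sketch using native `BigInt` (an assumption about the runtime; the name `isqrtLinear` is made up for this example). It stops as soon as the next square would exceed the input, so it does about $\sqrt{n}$ iterations rather than $n$, which is still exponential in the number of bits.

```javascript
// Naive integer square root by linear search (illustrative sketch only).
// Assumes a runtime with native BigInt; isqrtLinear is a hypothetical name.
function isqrtLinear(n) {
  if (n < 0n) {
    throw new RangeError('negative radicand');
  }
  let x = 0n;
  // Advance while the next candidate's square is still <= n.
  while ((x + 1n) * (x + 1n) <= n) {
    x += 1n;
  }
  return x;
}

console.log(isqrtLinear(99n).toString()); // prints "9"
```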
We can do better by doing binary search; start with the range $[0, n]$ and adjust it based on comparing the square of an integer in the middle of the range to $n$. This takes $O(\lg n) = O(\mathrm{Bits}(n))$ arithmetic operations.
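The binary search could be sketched as follows, again assuming native `BigInt`; `isqrtBinary` and the half-open bracket `[lo, hi)` are choices made for this example rather than anything prescribed above.

```javascript
// Integer square root by binary search (illustrative sketch only).
// Maintains lo*lo <= n < hi*hi and halves the gap each iteration.
function isqrtBinary(n) {
  if (n < 0n) {
    throw new RangeError('negative radicand');
  }
  let lo = 0n;      // invariant: lo * lo <= n
  let hi = n + 1n;  // invariant: hi * hi > n
  while (hi - lo > 1n) {
    const mid = (lo + hi) / 2n; // BigInt division truncates
    if (mid * mid <= n) {
      lo = mid;
    } else {
      hi = mid;
    }
  }
  return lo;
}

console.log(isqrtBinary(99n).toString()); // prints "9"
```

Each iteration halves `hi - lo`, which starts at $n + 1$, so the loop runs about $\lg n$ times.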
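However, the following algorithm, which is based on Newton’s method, is even faster:[2]
- Step 1: If $n = 0$, return $0$.
- Step 2: Otherwise, set $i$ to $0$ and set $x_0$ to $2^{\lceil \mathrm{Bits}(n)/2 \rceil}$.
- Step 3: Repeat:
  - Step 3.1: Set $x_{i+1}$ to $\lfloor (x_i + \lfloor n/x_i \rfloor)/2 \rfloor$.
  - Step 3.2: If $x_{i+1} \ge x_i$, return $x_i$. Otherwise, increment $i$.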
Here is an implementation in JavaScript:[3]

    // isqrt returns the greatest number x such that x^2 <= n. The type of
    // n must behave like BigInteger (e.g.,
    // https://github.com/akalin/jsbn ), and n must be non-negative.
    //
    // Example (open up the JS console on this page and type):
    //
    // isqrt(new BigInteger("64")).toString()
    function isqrt(n) {
      var s = n.signum();
      if (s < 0) {
        throw new Error('negative radicand');
      }
      if (s === 0) {
        return n;
      }
      // x = 2^ceil(Bits(n)/2)
      var x = n.constructor.ONE.shiftLeft(Math.ceil(n.bitLength()/2));
      while (true) {
        // y = floor((x + floor(n/x))/2)
        var y = x.add(n.divide(x)).shiftRight(1);
        // Stop as soon as the sequence stops decreasing.
        if (y.compareTo(x) >= 0) {
          return x;
        }
        x = y;
      }
    }
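For runtimes that have native `BigInt`, the same loop might be ported roughly as below. This is only a sketch under that assumption; `isqrtBigInt` and `bitLength` are names invented here, and the bit length is computed from the binary string purely for simplicity.

```javascript
// Number of bits needed to store a positive BigInt (hypothetical helper).
function bitLength(n) {
  return n.toString(2).length;
}

// Sketch of the Newton-based integer square root using native BigInt.
function isqrtBigInt(n) {
  if (n < 0n) {
    throw new RangeError('negative radicand');
  }
  if (n === 0n) {
    return 0n;
  }
  // x = 2^ceil(Bits(n)/2)
  let x = 1n << BigInt(Math.ceil(bitLength(n) / 2));
  while (true) {
    // y = floor((x + floor(n/x))/2); BigInt division truncates.
    const y = (x + n / x) >> 1n;
    if (y >= x) {
      return x;
    }
    x = y;
  }
}

console.log(isqrtBigInt(64n).toString()); // prints "8"
```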
2. Correctness
The core of the algorithm is the iteration rule:
$$x_{i+1} = \left\lfloor \frac{x_i + \lfloor n/x_i \rfloor}{2} \right\rfloor$$
where the floor functions are there only because we’re using integer division. Define an integer-valued function $f(x)$ for the right side. Using basic properties of the floor function, you can show that you can remove the inner floor:
$$f(x) = \left\lfloor \frac{x + n/x}{2} \right\rfloor$$
which makes it a bit easier to analyze. Also, the properties of $f(x)$ are closely related to its equivalent real-valued function:
$$g(x) = \frac{x + n/x}{2}.$$
For starters, again using basic properties of the floor function, you can show that $f(x) \le g(x)$, and for any integer $m$, $m \le f(x)$ if and only if $m \le g(x)$.
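For readers who want the floor manipulations spelled out, here is one way they might go. The symbols are assumptions of this sketch: $f(x)$ denotes the integer iteration rule, $g(x) = (x + n/x)/2$ its real-valued counterpart, $x$ a positive integer, and $m$ an arbitrary integer.

```latex
\begin{align*}
\left\lfloor \frac{x + \lfloor n/x \rfloor}{2} \right\rfloor
  &= \left\lfloor \frac{\lfloor x + n/x \rfloor}{2} \right\rfloor
  && \text{since $x$ is an integer} \\
  &= \left\lfloor \frac{x + n/x}{2} \right\rfloor
  && \text{since } \lfloor \lfloor y \rfloor / k \rfloor = \lfloor y/k \rfloor
     \text{ for real $y$ and integer } k \ge 1 \\[1ex]
f(x) = \lfloor g(x) \rfloor &\le g(x)
  && \text{since } \lfloor y \rfloor \le y \\
m \le \lfloor g(x) \rfloor &\iff m \le g(x)
  && \text{since $m$ is an integer}
\end{align*}
```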
Finally, let’s give a name to our desired output: let $s = \mathrm{isqrt}(n) = \lfloor \sqrt{n} \rfloor$.[4]
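Lemma 1. For $x > 0$, $f(x) \ge s$.
Proof. By the basic properties of $f(x)$ and $g(x)$ above, it suffices to show that $g(x) \ge s$. $g'(x) = (1 - n/x^2)/2$ and $g''(x) = n/x^3$. Therefore, $g(x)$ is concave-up for $x > 0$; in particular, its single positive extremum at $x = \sqrt{n}$ is a minimum. But $g(\sqrt{n}) = \sqrt{n} \ge s$. ∎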
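Lemma 2. $x_0 > s$.
Proof. $\mathrm{Bits}(n) = \lfloor \lg n \rfloor + 1 > \lg n$. Therefore,
$$x_0 = 2^{\lceil \mathrm{Bits}(n)/2 \rceil} \ge 2^{\mathrm{Bits}(n)/2} > 2^{\lg n/2} = \sqrt{n} \ge s.$$
∎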
(Note that any integer that is at least $s$, say $n$ or $\lceil n/2 \rceil$, can be chosen for our initial guess without affecting correctness. However, the expression above is necessary to guarantee performance. Another possibility is $2^{\lceil \lceil \lg n \rceil / 2 \rceil}$, which has the advantage that if $n$ is an even power of $2$, then $x_0$ is immediately set to $\sqrt{n}$. However, this is usually not worth the cost of checking that $n$ is a power of $2$, as is required to compute $\lceil \lg n \rceil$.)
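To make the two candidate initial guesses concrete, here is a small sketch assuming native `BigInt`. The helper names (`bitLength`, `initialGuess`, `altGuess`) are invented for this example, and computing $\lceil \lg n \rceil$ as the bit length of $n - 1$ is just one convenient way to do it.

```javascript
// Number of bits needed to store a positive BigInt (hypothetical helper).
function bitLength(n) {
  return n.toString(2).length;
}

// x0 = 2^ceil(Bits(n)/2), for n > 0.
function initialGuess(n) {
  return 1n << BigInt(Math.ceil(bitLength(n) / 2));
}

// Alternative guess 2^ceil(ceil(lg n)/2), for n > 0.
// For n > 1, ceil(lg n) equals the bit length of n - 1.
function altGuess(n) {
  const ceilLg = n === 1n ? 0 : bitLength(n - 1n);
  return 1n << BigInt(Math.ceil(ceilLg / 2));
}

console.log(initialGuess(64n).toString()); // prints "16"
console.log(altGuess(64n).toString());     // prints "8", which is sqrt(64)
```

Since $s$ is the greatest integer whose square is at most $n$, checking $x_0 > s$ amounts to checking `x0 * x0 > n`.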
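An easy consequence of Lemmas 1 and 2 is the invariant $x_i \ge s$.
Theorem 1. If the algorithm terminates, it returns $s$.
Proof. Assume it terminates. If it terminates in step 1 (the $n = 0$ check), then we are done. Otherwise, it can only terminate in step 3.2 (the comparison inside the loop), where it returns $x_i$ such that $x_{i+1} = f(x_i) \ge x_i$. This implies that $g(x_i) = (x_i + n/x_i)/2 \ge x_i$. Rearranging yields $n \ge x_i^2$ and combining with our invariant we get $\sqrt{n} \ge x_i \ge s$. But $s + 1 > \sqrt{n}$, so that forces $x_i$ to be $s$, and thus the algorithm returns $s$ if it terminates. ∎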
Theorem 2. The algorithm terminates.
Proof. Assume it doesn’t terminate. Then we have a strictly decreasing infinite sequence of integers $\{x_0, x_1, \dotsc\}$. But this sequence is bounded below by $s$, so it cannot decrease indefinitely. This is a contradiction, so the algorithm must terminate. ∎
We are done proving correctness, but you might wonder if the check $x_{i+1} \ge x_i$ in step 3.2 (the comparison inside the loop) is necessary. That is, can it be weakened to the check $x_{i+1} = x_i$? The answer is “no”; to see that, let $k = n - s^2$. Since $n < (s+1)^2$, $k < 2s + 1$. On the other hand, consider the inequality $f(x_i) > x_i$. Since that would cause the algorithm to terminate and return $x_i$, that implies that $x_i = s$. Therefore, that inequality is equivalent to $f(s) > s$, which is equivalent to $f(s) \ge s + 1$, which is equivalent to $g(s) = (s + n/s)/2 \ge s + 1$. Rearranging yields $n \ge s^2 + 2s$. Substituting in $n = s^2 + k$, we get $s^2 + k \ge s^2 + 2s$, which is equivalent to $k \ge 2s$. But since $k < 2s + 1$, that forces $k$ to equal $2s$. That is the maximum value $k$ can be, so therefore $n$ must be one less than a perfect square. Indeed, for such numbers, weakening the check would cause the algorithm to oscillate between $s$ and $s + 1$. For example, $n = 99$ would yield the sequence $\{16, 11, 10, 9, 10, 9, \dotsc\}$.
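The oscillation is easy to observe by applying the iteration rule a fixed number of times without any termination check. The sketch below assumes native `BigInt`; `iterates` and `bitLength` are hypothetical helper names.

```javascript
// Number of bits needed to store a positive BigInt (hypothetical helper).
function bitLength(n) {
  return n.toString(2).length;
}

// Returns [x_0, x_1, ..., x_steps] for n > 0, applying the iteration rule
// blindly (no termination check), starting from x_0 = 2^ceil(Bits(n)/2).
function iterates(n, steps) {
  let x = 1n << BigInt(Math.ceil(bitLength(n) / 2));
  const xs = [x];
  for (let i = 0; i < steps; i++) {
    x = (x + n / x) >> 1n;
    xs.push(x);
  }
  return xs;
}

console.log(iterates(99n, 5).join(', ')); // prints "16, 11, 10, 9, 10, 9"
```

A check for equality would never fire on this sequence, while the check for "no longer decreasing" stops at the first 9.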
3. Run-time
We will show that the algorithm takes $O(\lg \lg n)$ arithmetic operations. Since each loop iteration does only a fixed number of arithmetic operations (with the division of $n$ by $x$ being the most expensive[5]), it suffices to show that our algorithm performs $O(\lg \lg n)$ loop iterations.
It is well known that Newton’s method converges quadratically sufficiently close to a simple root. We can’t actually use this result directly, since it’s not clear that the convergence properties of Newton’s method are preserved when using integer operations, but we can do something similar.
Define $\mathrm{Err}(x) = x^2/n - 1$ and let $\epsilon_i = \mathrm{Err}(x_i)$. Intuitively, $\mathrm{Err}(x)$ is a conveniently-scaled measure of the error of $x$: it is less than $1$ for most of the values we care about and it is bounded below for integers greater than our target $s$. Also, we will show that the $\epsilon_i$ shrink quadratically. These facts will then let us show our bound for the iteration count.
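Lemma 3. For a non-negative integer $x$, $x \ge s + 1$ if and only if $\mathrm{Err}(x) \ge 1/n$.
Proof. $n < (s+1)^2$, so $n + 1 \le (s+1)^2$, and therefore $(s+1)^2/n - 1 \ge 1/n$. But the expression on the left side is just $\mathrm{Err}(s+1)$. $x \ge s + 1$ if and only if $\mathrm{Err}(x) \ge \mathrm{Err}(s+1)$, and any $x \le s$ has $\mathrm{Err}(x) \le 0 < 1/n$, so the result immediately follows. ∎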
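Lemma 4. If $x_i \ge s + 1$, then $\epsilon_{i+1} < (\epsilon_i/2)^2$.
Proof. $\epsilon_{i+1}$ is just $\mathrm{Err}(f(x_i)) \le \mathrm{Err}(g(x_i))$, so it suffices to show that $\mathrm{Err}(g(x_i)) < (\epsilon_i/2)^2$. Inverting $\mathrm{Err}(x)$, we get that $x_i = \sqrt{(\epsilon_i + 1) \cdot n}$. Expressing $g(x_i)$ in terms of $\epsilon_i$ we get
$$g(x_i) = \frac{\sqrt{n}}{2} \left( \frac{\epsilon_i + 2}{\sqrt{\epsilon_i + 1}} \right)$$
and
$$\mathrm{Err}(g(x_i)) = \frac{(\epsilon_i/2)^2}{\epsilon_i + 1}.$$
Therefore, it suffices to show that the denominator is greater than $1$. But $x_i \ge s + 1$ implies $\epsilon_i > 0$ by Lemma 3, so that follows immediately and the result is proved. ∎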
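The quadratic shrinkage can be checked exactly in integer arithmetic. Taking the scaled error to be $x^2/n - 1$ (stated here as the assumption this sketch relies on), the inequality $\epsilon_{i+1} < (\epsilon_i/2)^2$ is equivalent, after multiplying through by $4n^2$, to $4n(x_{i+1}^2 - n) < (x_i^2 - n)^2$. The function and helper names below are invented for the example, and native `BigInt` is assumed.

```javascript
// Number of bits needed to store a positive BigInt (hypothetical helper).
function bitLength(n) {
  return n.toString(2).length;
}

// For n > 0, returns true if every iterate x_i with x_i^2 > n satisfies
// 4n(x_{i+1}^2 - n) < (x_i^2 - n)^2, i.e. eps_{i+1} < (eps_i / 2)^2.
function shrinksQuadratically(n) {
  let x = 1n << BigInt(Math.ceil(bitLength(n) / 2));
  while (x * x > n) {            // equivalent to x >= isqrt(n) + 1
    const y = (x + n / x) >> 1n; // next iterate
    const ex = x * x - n;        // n * eps_i
    const ey = y * y - n;        // n * eps_{i+1} (may be negative)
    if (!(4n * n * ey < ex * ex)) {
      return false;
    }
    x = y;
  }
  return true;
}

console.log(shrinksQuadratically(99n)); // prints true
```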
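Lemma 5. $x_0 \le 2s$, $\epsilon_0 \le 3$, and $\epsilon_1 \le 1$.
Proof. Let’s start with $x_0$:
$$x_0 = 2^{\lceil \mathrm{Bits}(n)/2 \rceil} = 2^{\lceil (\lfloor \lg n \rfloor + 1)/2 \rceil} = 2^{\lfloor \lg n / 2 \rfloor + 1} = 2 \cdot 2^{\lfloor \lg n / 2 \rfloor}.$$
Then $x_0/2 = 2^{\lfloor \lg n / 2 \rfloor} \le 2^{\lg n / 2} = \sqrt{n}$. Since $x_0/2$ is an integer, $x_0/2 \le \sqrt{n}$ if and only if $x_0/2 \le \lfloor \sqrt{n} \rfloor = s$. Therefore, $x_0 \le 2s$.
As for $\epsilon_0$:
$$\epsilon_0 = \mathrm{Err}(x_0) \le \mathrm{Err}(2s) = \frac{4s^2}{n} - 1.$$
Since $s^2 \le n$, $4s^2/n \le 4$ and thus $\epsilon_0 \le 3$.
Finally, $\epsilon_1$ is just $\mathrm{Err}(x_1)$. Using calculations from Lemma 4, and the fact that $(\epsilon/2)^2/(\epsilon + 1)$ is increasing for $\epsilon > 0$,
$$\epsilon_1 \le \mathrm{Err}(g(x_0)) = \frac{(\epsilon_0/2)^2}{\epsilon_0 + 1} \le \frac{(3/2)^2}{3 + 1} = \frac{9}{16}.$$
Therefore, $\epsilon_1 \le 1$. ∎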
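These initial bounds can also be checked exactly with integers. The sketch below assumes native `BigInt`, the initial guess $x_0 = 2^{\lceil \mathrm{Bits}(n)/2 \rceil}$, and the scaled error $x^2/n - 1$; it tests $(x_0/2)^2 \le n$ (that is, $x_0 \le 2s$), $x_0^2 \le 4n$ (that is, $\epsilon_0 \le 3$), and $16 x_1^2 \le 25 n$ (that is, $\epsilon_1 \le 9/16$). The function names are hypothetical.

```javascript
// Number of bits needed to store a positive BigInt (hypothetical helper).
function bitLength(n) {
  return n.toString(2).length;
}

// For n > 0, checks the three initial bounds using exact integer arithmetic.
function initialBoundsHold(n) {
  const x0 = 1n << BigInt(Math.ceil(bitLength(n) / 2));
  const x1 = (x0 + n / x0) >> 1n;
  const half = x0 >> 1n;                         // x0 is a power of 2, so exact
  const x0AtMost2s = half * half <= n;           // x0 / 2 <= isqrt(n)
  const eps0AtMost3 = x0 * x0 <= 4n * n;         // x0^2 / n - 1 <= 3
  const eps1Small = 16n * x1 * x1 <= 25n * n;    // x1^2 / n - 1 <= 9/16
  return x0AtMost2s && eps0AtMost3 && eps1Small;
}

console.log(initialBoundsHold(16n)); // prints true (here eps_1 is exactly 9/16)
```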
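Theorem 3. The algorithm performs $O(\lg \lg n)$ loop iterations.
Proof. Let $k$ be the number of loop iterations performed when running the algorithm for $n$ (i.e., $x_k \ge x_{k-1}$) and assume $k \ge 4$. Then $x_i \ge s + 1$ for $i < k - 1$. Since $\epsilon_1 \le 1$ by Lemma 5, and $\epsilon_2 \le 1/2$ and $\epsilon_i \le (\epsilon_2)^{2^{i-2}}$ for $2 \le i \le k - 2$ by Lemma 4, then $\epsilon_{k-2} \le 2^{-2^{k-4}}$. But $1/n \le \epsilon_{k-2}$ by Lemma 3, so $1/n \le 2^{-2^{k-4}}$. Taking logs to bring down the $k$ yields $k - 4 \le \lg \lg n$. Then $k \le \lg \lg n + 4$, and thus $k = O(\lg \lg n)$. ∎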
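As a sanity check on the iteration count, one can count loop iterations directly and compare against a bound of the form $k \le \lg \lg n + 4$, which for $k \ge 4$ is the same as $n \ge 2^{2^{k-4}}$. The sketch below assumes native `BigInt` and the initial guess $2^{\lceil \mathrm{Bits}(n)/2 \rceil}$; the function names are made up for this example, and the constant 4 is taken as an assumption of the sketch.

```javascript
// Number of bits needed to store a positive BigInt (hypothetical helper).
function bitLength(n) {
  return n.toString(2).length;
}

// Counts how many times the loop body runs for n > 0.
function countIterations(n) {
  let x = 1n << BigInt(Math.ceil(bitLength(n) / 2));
  let k = 0;
  while (true) {
    k++;
    const y = (x + n / x) >> 1n;
    if (y >= x) {
      return k;
    }
    x = y;
  }
}

// Exact form of k <= lg lg n + 4: either k < 4 or n >= 2^(2^(k-4)).
function withinBound(n) {
  const k = countIterations(n);
  return k < 4 || n >= 1n << (1n << BigInt(k - 4));
}

console.log(countIterations(99n)); // prints 4
console.log(withinBound(2n ** 1000n));
```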
4. The Initial Guess
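It’s also useful to show that if the initial guess is bad, then the run-time degrades to $\Theta(\lg n)$. We’ll do this by defining a variant of the algorithm that is identical except that it takes a function that is called with $n$ and whose result is assigned to $x_0$ in place of the usual initial guess. Then, we can treat $\epsilon_0$ as a function of $n$ and analyze how long $\epsilon_i$ stays above $1$ to show that the variant performs $O(\lg \epsilon_0(n) + \lg \lg n)$ loop iterations (Theorem 4).
Note that if the variant uses an initial guess such that $\epsilon_0(n) = O(1)$, then Theorem 4 reduces to Theorem 3 in that case. However, if $x_0$ is chosen to be $\Theta(n)$, e.g. the initial guess is just $n$ or $n/k$ for some $k$, then $\epsilon_0(n)$ will also be $\Theta(n)$, and so the run time will degrade to $\Theta(\lg n)$. So having a good initial guess is important for the performance of the algorithm!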
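The effect of the initial guess is easy to see empirically with a variant that takes the guess as a parameter. The following is a sketch assuming native `BigInt`; `isqrtWithGuess`, `goodGuess`, and `badGuess` are hypothetical names, and the supplied guess is assumed to be a positive integer no smaller than the true integer square root.

```javascript
// Number of bits needed to store a positive BigInt (hypothetical helper).
function bitLength(n) {
  return n.toString(2).length;
}

// Variant of the Newton loop for n > 0 that takes the initial guess as a
// function of n and also reports how many loop iterations were needed.
function isqrtWithGuess(n, guess) {
  let x = guess(n);
  let iterations = 0;
  while (true) {
    iterations++;
    const y = (x + n / x) >> 1n;
    if (y >= x) {
      return { root: x, iterations: iterations };
    }
    x = y;
  }
}

const goodGuess = (n) => 1n << BigInt(Math.ceil(bitLength(n) / 2));
const badGuess = (n) => n; // x_0 = n, so the initial error is about n

const n = 2n ** 200n;
const good = isqrtWithGuess(n, goodGuess);
const bad = isqrtWithGuess(n, badGuess);
console.log(good.iterations, bad.iterations);
```

With the bad guess, each early iteration roughly halves $x$, so it takes on the order of $\lg n$ iterations just to get near $\sqrt{n}$.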
Footnotes
[1] Aside from the Wikipedia article, the algorithm is described as Algorithm 9.2.11 in Prime Numbers: A Computational Perspective. ↩
[2] Note that only integer operations are used, which makes this algorithm suitable for arbitrary-precision integers. ↩
[3] Go and JS implementations are available on my GitHub. ↩
[4] Here, and in most of the article, we’ll implicitly assume that $n > 0$. ↩
[5] Dividing two $b$-bit integers takes $O(b^2)$ time using long division, but fancier division algorithms have better run-times. ↩