# Computing the Integer Square Root

## 1. The algorithm

Today I’m going to talk about a fast algorithm to compute
the *integer
square root* of a non-negative integer \(n\),
\(\mathrm{isqrt}(n) = \lfloor \sqrt{n} \rfloor\), or in words,
the greatest integer whose square is less than or equal to \(n\).^{[1]} Most
sources that describe the algorithm take it for granted that it is
correct and fast. This is far from obvious! So I will prove both
correctness and speed below.

One simple fact is that \(\mathrm{isqrt}(n) \le n/2\), so a
straightforward algorithm is just to test every non-negative integer
up to \(n/2\). This takes \(O(n)\) arithmetic operations, which is bad
since it’s exponential in the *size* of the input. That
is, letting \(\mathrm{Bits}(n)\) be the number of bits required
to store \(n\) and letting \(\lg n\) be the base-\(2\) logarithm of
\(n\), \(\mathrm{Bits}(n) = O(\lg n)\), and thus this algorithm
takes \(O(2^{\mathrm{Bits}(n)})\) arithmetic operations.

We can do better by doing binary search; start with the range \([0, n/2]\) and adjust it based on comparing the square of an integer in the middle of the range to \(n\). This takes \(O(\lg n) = O(\mathrm{Bits}(n))\) arithmetic operations.

However, the algorithm below is even faster:^{[2]}

- If \(n = 0\), return \(0\).
- Otherwise, set \(i\) to \(0\) and set \(x_0\) to \(2^{\lceil \mathrm{Bits}(n) / 2\rceil}\).
- Repeat:
- Set \(x_{i+1}\) to \(\lfloor (x_i + \lfloor n/x_i \rfloor) / 2 \rfloor\).
- If \(x_{i+1} \ge x_i\), return \(x_i\). Otherwise, increment \(i\).

Call this algorithm \(\mathrm{N{\small EWTON}\text{-}I{\small
SQRT}}\), since it’s based
on Newton’s
method. It’s not obvious, but this algorithm returns
\(\mathrm{isqrt}(n)\) using only \(O(\lg \lg n) =
O(\lg(\mathrm{Bits}(n)))\) arithmetic operations, as we will
prove below. But first, here’s an implementation of the
algorithm in Javascript:^{[3]}

```
// isqrt returns the greatest number x such that x^2 <= n. The type of
// n must behave like BigInteger (e.g.,
// https://github.com/jasondavies/jsbn ), and n must be non-negative.
//
//
// Example (open up the JS console on this page and type):
//
// isqrt(new BigInteger("64")).toString()
function isqrt(n) {
var s = n.signum();
if (s < 0) {
throw new Error('negative radicand');
}
if (s == 0) {
return n;
}
// x = 2^ceil(Bits(n)/2)
var x = n.constructor.ONE.shiftLeft(Math.ceil(n.bitLength()/2));
while (true) {
// y = floor((x + floor(n/x))/2)
var y = x.add(n.divide(x)).shiftRight(1);
if (y.compareTo(x) >= 0) {
return x;
}
x = y;
}
}
```

## 2. Correctness

The core of the algorithm is the iteration rule: \[ x_{i+1} = \left\lfloor \frac{x_i + \lfloor \frac{n}{x_i} \rfloor}{2} \right\rfloor \] where the floor functions are there only because we’re using integer division. Define an integer-valued function \(f(x)\) for the right side. Using basic properties of the floor function, you can show that you can remove the inner floor: \[ f(x) = \left\lfloor \frac{1}{2} (x + n/x) \right\rfloor \] which makes it a bit easier to analyze. Also, the properties of \(f(x)\) are closely related to its equivalent real-valued function: \[ g(x) = \frac{1}{2} (x + n/x)\text{.} \]

For starters, again using basic properties of the floor function, you can show that \(f(x) \le g(x)\), and for any integer \(m\), \(m \le f(x)\) if and only if \(m \le g(x)\).

Finally, let’s give a name to our desired output: let \(s =
\mathrm{isqrt}(n) = \lfloor \sqrt{n} \rfloor\).^{[4]}

Intuitively, \(f(x)\) and \(g(x)\) “average out” however far away their input \(x\) is from \(\sqrt{n}\). Conveniently, this “average” is never an undereestimate:

(Lemma 1.) For \(x \gt 0\), \(f(x) \ge s\).

Proof. By the basic properties of \(f(x)\) and \(g(x)\) above, it suffices to show that \(g(x) \ge s\). \(g'(x) = (1 - n/x^2)/2\) and \(g''(x) = n/x^3\). Therefore, \(g(x)\) is concave-up for \(x \gt 0\); in particular, its single positive extremum at \(x = \sqrt{n}\) is a minimum. But \(g(\sqrt{n}) = \sqrt{n} \ge s\). ∎

(You can also prove Lemma 1 without calculus; show that \(g(x) \ge s\) if and only if \(x^2 - 2sx + n \ge 0\), which is true when \(s^2 \le n\), which is true by definition.)

Furthermore, our initial estimate is always an overestimate:

(Lemma 2.) \(x_0 \gt s\).

Proof. \(\mathrm{Bits}(n) = \lfloor \lg n \rfloor + 1 \gt \lg n\). Therefore, \[ \begin{aligned} x_0 &= 2^{\lceil \mathrm{Bits}(n) / 2 \rceil} \\ &\ge 2^{\mathrm{Bits}(n) / 2} \\ &\gt 2^{\lg n / 2} \\ &= \sqrt{n} \\ &\ge s\text{.} \; \blacksquare \end{aligned} \]

(Note that any number greater than \(s\), say \(n\) or \(\lceil n/2 \rceil\), can be chosen for our initial guess without affecting correctness. However, the expression above is necessary to guarantee performance. Another possibility is \(2^{\lceil \lceil \lg n \rceil / 2 \rceil}\), which has the advantage that if \(n\) is an even power of \(2\), then \(x_0\) is immediately set to \(\sqrt{n}\). However, this is usually not worth the cost of checking that \(n\) is a power of \(2\), as is required to compute \(\lceil \lg n \rceil\).)

An easy consequence of Lemmas 1 and 2 is that the invariant \(x_i \ge s\) holds. That lets us prove partial correctness of \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}}\):

(Theorem 1.) If \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}}\) terminates, it returns the value \(s\).

Proof. Assume it terminates. If it terminates in step \(1\), then we are done. Otherwise, it can only terminate in step \(3.2\) where it returns \(x_i\) such that \(x_{i+1} = f(x_i) \ge x_i\). This implies that \(g(x_i) = (x_i + n/x_i) / 2 \ge x_i\). Rearranging yields \(n \ge x_i^2\) and combining with our invariant we get \(\sqrt{n} \ge x_i \ge s\). But \(s + 1 \gt \sqrt{n}\), so that forces \(x_i\) to be \(s\), and thus \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}}\) returns \(s\) if it terminates. ∎

For total correctness we also need to show that \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}}\) terminates. But this is easy:

(Theorem 2.) \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}}\) terminates.

Proof. Assume it doesn’t terminate. Then we have a strictly decreasing infinite sequence of integers \(\{ x_0, x_1, \ldots \}\). But this sequence is bounded below by \(s\), so it cannot decrease indefinitely. This is a contradiction, so \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}}\) must terminate. ∎

We are done proving correctness, but you might wonder if the check \(x_{i+1} \ge x_i\) in step \(3.2\) is necessary. That is, can it be weakened to the check \(x_{i+1} = x_i\)? The answer is “no”; to see that, let \(k = n - s^2\). Since \(n \lt (s+1)^2\), \(k \lt 2s + 1\). On the other hand, consider the inequality \(f(x_i) \gt x_i\). Since that would cause the algorithm to terminate and return \(x_i\), that implies that \(x_i = s\). Therefore, that inequality is equivalent to \(f(s) \gt s\), which is equivalent to \(f(s) \ge s + 1\), which is equivalent to \(g(s) = (s + n/s) / 2 \ge s + 1\). Rearranging yields \(n \ge s^2 + 2s\). Substituting in \(n = s^2 + k\), we get \(s^2 + k \ge s^2 + 2s\), which is equivalent to \(k \ge 2s\). But since \(k \lt 2s + 1\), that forces \(k\) to equal \(2s\). That is the maximum value \(k\) can be, so therefore \(n\) must be one less than a perfect square. Indeed, for such numbers, weakening the check would cause the algorithm to oscillate between \(s\) and \(s + 1\). For example, \(n = 99\) would yield the sequence \(\{ 16, 11, 10, 9, 10, 9, \ldots \}\).

## 3. Run-time

We will show that \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}}\) takes \(O(\lg \lg n)\) arithmetic operations. Since each loop iteration does only a fixed number of arithmetic operations (with the division of \(n\) by \(x\) being the most expensive), it suffices to show that our algorithm performs \(O(\lg \lg n)\) loop iterations.

It is well known that Newton’s method converges quadratically sufficiently close to a simple root. We can’t actually use this result directly, since it’s not clear that the convergence properties of Newton’s method are preserved when using integer operations, but we can do something similar.

Define \(\mathrm{Err}(x) = x^2/n - 1\) and let \(\epsilon_i = \mathrm{Err}(x_i)\). Intuitively, \(\mathrm{Err}(x)\) is a conveniently-scaled measure of the error of \(x\): it is less than \(1\) for most of the values we care about and it bounded below for integers greater than our target \(s\). Also, we will show that the \(\epsilon_i\) shrink quadratically. These facts will then let us show our bound for the iteration count.

First, let’s prove our lower bound for \(\epsilon_i\):

(Lemma 3.) \(x_i \ge s + 1\) if and only if \(\epsilon_i \ge 1/n\).

Proof. \(n \lt (s + 1)^2\), so \(n + 1 \le (s + 1)^2\), and therefore \((s + 1)^2/n - 1 \ge 1/n\). But the expression on the left side is just \(\mathrm{Err}(s + 1)\). \(x_i \ge s + 1\) if and only if \(\epsilon_i \ge \mathrm{Err}(s + 1)\), so the result immediately follows. ∎

Then we can use that to show that the \(\epsilon_i\) shrink quadratically:

(Lemma 4.) If \(x_i \ge s + 1\), then \(\epsilon_{i+1} \lt (\epsilon_i/2)^2\).

Proof. \(\epsilon_{i+1}\) is just \(\mathrm{Err}(f(x_i)) \le \mathrm{Err}(g(x_i))\), so it suffices to show that \(\mathrm{Err}(g(x_i)) \lt (\epsilon_i/2)^2\). Inverting \(\mathrm{Err}(x)\), we get that \(x_i = \sqrt{(\epsilon_i + 1) \cdot n}\). Expressing \(g(x_i)\) in terms of \(\epsilon_i\) we get \[ g(x_i) = \frac{\sqrt{n}}{2} \left( \frac{\epsilon_i + 2}{\sqrt{\epsilon_i + 1}} \right) \] and \[ \mathrm{Err}(g(x_i)) = \frac{(\epsilon_i/2)^2}{\epsilon_i+1}\text{.} \] Therefore, it suffices to show that the denominator is greater than \(1\). But \(x_i \ge s + 1\) implies \(\epsilon_i \gt 0\) by Lemma 3, so that follows immediately and the result is proved. ∎

Then let’s bound our initial values:

(Lemma 5.) \(x_0 \le 2s\), \(\epsilon_0 \le 3\), and \(\epsilon_1 \le 1\).

Proof. Let’s start with \(x_0\): \[ \begin{aligned} x_0 &= 2^{\lceil \mathrm{Bits}(n) / 2 \rceil} \\ &= 2^{\lfloor (\lfloor \lg n \rfloor + 1 + 1)/2 \rfloor} \\ &= 2^{\lfloor \lg n / 2 \rfloor + 1} \\ &= 2 \cdot 2^{\lfloor \lg n / 2 \rfloor}\text{.} \end{aligned} \] Then \(x_0/2 = 2^{\lfloor \lg n / 2 \rfloor} \le 2^{\lg n / 2} = \sqrt{n}\). Since \(x_0/2\) is an integer, \(x_0/2 \le \sqrt{n}\) if and only if \(x_0/2 \le \lfloor \sqrt{n} \rfloor = s\). Therefore, \(x_0 \le 2s\).

As for \(\epsilon_0\): \[ \begin{aligned} \epsilon_0 &= \mathrm{Err}(x_0) \\ &\le \mathrm{Err}(2s) \\ &= (2s)^2/n - 1 \\ &= 4s^2/n - 1\text{.} \end{aligned} \] Since \(s^2 \le n\), \(4s^2/n \le 4\) and thus \(\epsilon_0 \le 3\).

Finally, \(\epsilon_1\) is just \(\mathrm{Err}(f(x_0))\). Using calculations from Lemma 4, \[ \begin{aligned} \epsilon_1 &\le \mathrm{Err}(g(x_0)) \\ &= (\epsilon_0/2)^2/(\epsilon_0 + 1) \\ &\le (3/2)^2/(3 + 1) \\ &= 9/16\text{.} \end{aligned} \] Therefore, \(\epsilon_1 \le 1\). ∎

Finally, we can show our main result:

(Theorem 3.) \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}}\) performs \(O(\lg \lg n)\) loop iterations.

Proof. Let \(k\) be the number of loop iterations performed when running the algorithm for \(n\) (i.e., \(x_k \ge x_{k-1}\)) and assume \(k \ge 4\). Then \(x_i \ge s + 1\) for \(i \lt k - 1\). Since \(\epsilon_1 \le 1\) by Lemma 5, \(\epsilon_2 \le 1/2\) and \(\epsilon_i \le (\epsilon_2)^{2^{i-2}}\) for \(2 \le i \lt k - 1\) by Lemma 4, then \(\epsilon_{k-2} \le 2^{-2^{k-4}}\). But \(1/n \le \epsilon_{k-2}\) by Lemma 3, so \(1/n \le 2^{-2^{k-4}}\). Taking logs to bring down the \(k\) yields \(k - 4 \le \lg \lg n\). Then \(k \le \lg \lg n + 4\), and thus \(k = O(\lg \lg n)\). ∎

Note that in general, an arithmetic operation is not constant-time,
and in fact has run-time \(\Omega(\lg n)\). Since the most expensive
arithmetic operation we do is division, we can say that
\(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}}\) has run-time that is
both \(\Omega(\lg n)\) and \(O(D(n) \cdot \lg \lg n)\), where \(D(n)\)
is the run-time of dividing \(n\) by some number \(\le n\).^{[5]}

## 4. The Initial Guess

It’s also useful to show that if the initial guess \(x_0\) is bad, then the run-time degrades to \(\Theta(\lg n)\). We’ll do this by defining the function \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}'}\) to be like \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}}\) except that it takes a function \(\mathrm{I{\small NITIAL}-G{\small UESS}}\) that is called with \(n\) and assigned to \(x_0\) in step 1. Then, we can treat \(\epsilon_0\) as a function of \(n\) and analyze how long \(\epsilon_i\) stays above \(1\) to show that \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}'}\) takes \(\Theta(\lg \epsilon_0(n)) + O(\lg \lg n)\) arithmetic operations.

First, we need a lemma:

(Lemma 6.) If \(\epsilon_i \gt 1\), then \(\epsilon_{i+1}\) exists and \(\frac{1}{8}\epsilon_i \le \epsilon_{i+1}\le \frac{1}{4}\epsilon_i\).

Proof. First, if \(\epsilon_i \gt 1\), then \(\epsilon_i \ge 1/n\) and so \(x_i \ge s + 1\), which implies that the main loop doesn’t terminate with \(x_i\), and thus \(x_{i+1}\) and \(\epsilon_{i+1}\) exist.

Then, \[ \begin{aligned} \mathrm{Err}(g(x_i)) &= \frac{(\epsilon_i/2)^2}{\epsilon_i+1} \\ &= \frac{1}{4} \epsilon_i - \frac{\epsilon_i}{4 (\epsilon_i + 1)} \\ &= \frac{1}{8} \epsilon_i + \frac{\epsilon_i(\epsilon_i - 1)}{8 (\epsilon_i + 1)}\text{,} \end{aligned} \] at which point the result follows. ∎

Then we can show:

(Theorem 4.) \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}'}\) performs \(\Theta(\lg \epsilon_0(n)) + O(\lg \lg n)\) loop iterations.

Proof. Assume that \(\epsilon_0(n) \gt 1\), \(\epsilon_i \gt 1\) for \(i \lt j\), and \(\epsilon_j \le 1\). Using Lemma 6, \[ \left( \frac{1}{8} \right)^j \epsilon_0(n) \le \epsilon_j \le 1\text{,} \] which implies \[ j \ge \frac{1}{3} \lg \epsilon_0(n)\text{.} \]

Similarly,

\[ \left( \frac{1}{4} \right)^{j-1} \epsilon_0(n) \ge \epsilon_{j-1} \gt 1\text{,} \] which implies \[ j \lt \frac{1}{2} \lg \epsilon_0(n) + 1\text{.} \]Therefore, \(j = \Theta(\lg \epsilon_0(n))\). Then Theorem 3 can be adapted to show that only \(O(\lg \lg n)\) more iterations of the loop are taken. ∎

Since \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}}\) uses an initial guess such that \(\epsilon_0(n) = \Theta(1)\), then Theorem 4 reduces to Theorem 3 in that case. However, if \(x_0\) is chosen to be \(\Theta(n)\), e.g. the initial guess is just \(n\) or \(n/k\) for some \(k\), then \(\epsilon_0(n)\) will also be \(\Theta(n)\), and so the run time will degrade to \(\Theta(\lg n)\). So having a good initial guess is important for the performance of \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}}\)!

[1] Aside from the Wikipedia article, the algorithm is described as Algorithm 9.2.11 in Prime Numbers: A Computational Perspective. ↩

[2] Note that only integer operations are used, which makes this algorithm suitable for arbitrary-precision integers. ↩

[3] Here, and in most of the article, we’ll implicitly assume that \(n \gt 0\). ↩

[4] \(D(n)\) is \(\Theta(\lg^2 n)\) using long division, but fancier division algorithms have better run-times. ↩