# Computing Integer Roots

## 1. The algorithm

Today I’m going to talk about the generalization of the integer square root algorithm to higher roots. That is, given \(n\) and \(p\), computing \(\mathrm{iroot}(n, p) = \lfloor \sqrt[p]{n} \rfloor\), or the greatest integer whose \(p\)th power is less than or equal to \(n\). The generalized algorithm is straightforward, and it’s easy to generalize the proof of correctness, but the run-time bound is a bit trickier, since it has a dependence on \(p\).

First, the algorithm, which we’ll call \(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\):

1. If \(n = 0\), return \(0\).
2. If \(p \ge \mathrm{Bits}(n)\), return \(1\).
3. Otherwise, set \(i\) to \(0\) and set \(x_0\) to \(2^{\lceil \mathrm{Bits}(n) / p\rceil}\).
4. Repeat:
    1. Set \(x_{i+1}\) to \(\lfloor ((p - 1) x_i + \lfloor n/x_i^{p-1} \rfloor) / p \rfloor\).
    2. If \(x_{i+1} \ge x_i\), return \(x_i\). Otherwise, increment \(i\).

and its implementation in JavaScript:^{[1]}

```
// iroot returns the greatest number x such that x^p <= n. The type of
// n must behave like BigInteger (e.g.,
// https://github.com/jasondavies/jsbn ), n must be non-negative, and
// p must be a positive integer.
//
// Example (open up the JS console on this page and type):
//
//   iroot(new BigInteger("64"), 3).toString()
function iroot(n, p) {
  var s = n.signum();
  if (s < 0) {
    throw new Error('negative radicand');
  }
  if (p <= 0) {
    throw new Error('non-positive degree');
  }
  if (p !== (p|0)) {
    throw new Error('non-integral degree');
  }
  if (s === 0) {
    return n;
  }
  var b = n.bitLength();
  if (p >= b) {
    return n.constructor.ONE;
  }
  // x = 2^ceil(Bits(n)/p)
  var x = n.constructor.ONE.shiftLeft(Math.ceil(b / p));
  var pMinusOne = new n.constructor((p - 1).toString());
  var pBig = new n.constructor(p.toString());
  while (true) {
    // y = floor(((p-1)x + floor(n/x^(p-1)))/p). Note that
    // BigInteger.pow takes a plain integer exponent, so we pass
    // p - 1 rather than pMinusOne.
    var y = pMinusOne.multiply(x).add(n.divide(x.pow(p - 1))).divide(pBig);
    if (y.compareTo(x) >= 0) {
      return x;
    }
    x = y;
  }
}
```
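
For example, tracing the algorithm on the \(n = 64\), \(p = 3\) case from the comment above: \(\mathrm{Bits}(64) = 7\), so \(x_0 = 2^{\lceil 7/3 \rceil} = 8\). Then \(x_1 = \lfloor (2 \cdot 8 + \lfloor 64/8^2 \rfloor)/3 \rfloor = 5\), \(x_2 = \lfloor (2 \cdot 5 + \lfloor 64/5^2 \rfloor)/3 \rfloor = 4\), and \(x_3 = \lfloor (2 \cdot 4 + \lfloor 64/4^2 \rfloor)/3 \rfloor = 4 \ge x_2\), so the algorithm returns \(x_2 = 4\).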

This algorithm turns out to require \(\Theta(p) + O(\lg \lg n)\) loop iterations, with the run-time for a loop iteration depending on what kind of arithmetic operations are used.

## 2. Correctness

First, let \(g(x) = ((p - 1) x + n/x^{p-1})/p\) be the exact Newton iteration map, and let \(f(x) = \lfloor ((p - 1) x + \lfloor n/x^{p-1} \rfloor)/p \rfloor\) be its floored version from step \(4.1\), so that \(x_{i+1} = f(x_i)\). Since \((p - 1) x\) is an integer for integer \(x\), the nested-floor identity gives \(f(x) = \lfloor g(x) \rfloor\) for integer \(x \gt 0\). Finally, let’s give a name to our desired output: let \(s = \mathrm{iroot}(n, p) = \lfloor \sqrt[p]{n} \rfloor\).^{[2]}

Unsurprisingly, \(f(x)\) never underestimates:

(Lemma 1.) For integer \(x \gt 0\), \(f(x) \ge s\).

Proof. By the basic properties of \(f(x)\) and \(g(x)\) above, it suffices to show that \(g(x) \ge s\). \(g'(x) = (1 - 1/p) (1 - n/x^p)\) and \(g''(x) = (p - 1) (n/x^{p+1})\). Therefore, \(g(x)\) is concave-up for \(x \gt 0\); in particular, its single positive extremum at \(x = \sqrt[p]{n}\) is a minimum. But \(g(\sqrt[p]{n}) = \sqrt[p]{n} \ge s\). ∎

Also, our initial guess is always an overestimate:

(Lemma 2.) \(x_0 \gt s\).

Proof. \(\mathrm{Bits}(n) = \lfloor \lg n \rfloor + 1 \gt \lg n\). Therefore, \[ \begin{aligned} x_0 &= 2^{\lceil \mathrm{Bits}(n) / p \rceil} \\ &\ge 2^{\mathrm{Bits}(n) / p} \\ &\gt 2^{\lg n / p} \\ &= \sqrt[p]{n} \\ &\ge s\text{.} \; \blacksquare \end{aligned} \]

Therefore, we again have the invariant that \(x_i \ge s\), which lets us prove partial correctness:

(Theorem 1.) If \(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\) terminates, it returns the value \(s\).

Proof. Assume it terminates. If it terminates in step \(1\) or \(2\), then we are done. Otherwise, it can only terminate in step \(4.2\) where it returns \(x_i\) such that \(x_{i+1} = f(x_i) \ge x_i\). This implies \(g(x_i) = ((p-1)x_i + n/x_i^{p-1}) / p \ge x_i\). Rearranging yields \(n \ge x_i^p\) and combining with our invariant we get \(\sqrt[p]{n} \ge x_i \ge s\). But \(s + 1 \gt \sqrt[p]{n}\), so that forces \(x_i\) to be \(s\), and thus \(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\) returns \(s\) if it terminates. ∎

Total correctness is also easy:

(Theorem 2.) \(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\) terminates.

Proof. Assume it doesn’t terminate. Then we have a strictly decreasing infinite sequence of integers \(\{ x_0, x_1, \ldots \}\). But this sequence is bounded below by \(s\), so it cannot decrease indefinitely. This is a contradiction, so \(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\) must terminate. ∎

Note that, like \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}}\), the check in step \(4.2\) cannot be weakened to \(x_{i+1} = x_i\), as doing so would cause the algorithm to oscillate. In fact, as \(p\) grows, so does the number of values of \(n\) that exhibit this behavior, and so does the number of possible oscillation patterns. For example, \(n = 972\) with \(p = 3\) would yield the sequence \(\{ 16, 11, 10, 9, 10, 9, \ldots \}\), and \(n = 80\) with \(p = 4\) would yield the sequence \(\{ 4, 3, 2, 4, 3, 2, \ldots \}\).
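
To see the oscillation concretely, here's a small demonstration sketch (assuming the same jsbn-style BigInteger as above; irootWeakened and its safety cap maxIters are hypothetical, not part of any library) that logs the iterates under the weakened check:

```
// DEMONSTRATION ONLY: the Newton iteration with the termination check
// weakened to x_{i+1} == x_i. For inputs like n = 972, p = 3 the
// iterates eventually cycle (..., 10, 9, 10, 9, ...) without two
// consecutive iterates ever being equal, so maxIters is a safety cap
// to force termination.
function irootWeakened(n, p, maxIters) {
  // Input validation from iroot omitted for brevity; assumes
  // 0 < p < n.bitLength().
  var x = n.constructor.ONE.shiftLeft(Math.ceil(n.bitLength() / p));
  var pMinusOne = new n.constructor((p - 1).toString());
  var pBig = new n.constructor(p.toString());
  for (var i = 0; i < maxIters; i++) {
    console.log(x.toString());
    var y = pMinusOne.multiply(x).add(n.divide(x.pow(p - 1))).divide(pBig);
    if (y.compareTo(x) === 0) { // weakened check: equality only
      return x;
    }
    x = y;
  }
  throw new Error('no fixed point found');
}
// irootWeakened(new BigInteger("972"), 3, 10) logs
// 16, 11, 10, 9, 10, 9, 10, 9, 10, 9 before giving up.
```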

## 3. Run-time

We will show that \(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\) takes \(\Theta(p) + O(\lg \lg n)\) loop iterations. Then we will analyze a single loop iteration and the arithmetic operations used to get a total run-time bound.

Analogously to the square root case, define \(\mathrm{Err}(x) = x^p/n - 1\) and let \(\epsilon_i = \mathrm{Err}(x_i)\). First, let’s prove our lower bound for \(\epsilon_i\), which translates directly from the square root case:

(Lemma 3.) \(x_i \ge s + 1\) if and only if \(\epsilon_i \ge 1/n\).

Proof. \(n \lt (s + 1)^p\), so \(n + 1 \le (s + 1)^p\), and therefore \((s + 1)^p/n - 1 \ge 1/n\). But the expression on the left side is just \(\mathrm{Err}(s + 1)\). Since \(\mathrm{Err}(x)\) is increasing for \(x \gt 0\), \(x_i \ge s + 1\) implies \(\epsilon_i \ge \mathrm{Err}(s + 1) \ge 1/n\). Conversely, if \(\epsilon_i \ge 1/n\), then \(x_i^p \ge n + 1 \gt s^p\), and since \(x_i\) is an integer, \(x_i \ge s + 1\). ∎

Now for the next few lemmas we need to do some algebra and calculus. Inverting \(\mathrm{Err}(x)\), we get that \(x_i = \sqrt[p]{(\epsilon_i + 1) \cdot n}\). Expressing \(g(x_i)\) in terms of \(\epsilon_i\) and \(q = 1 - 1/p\) we get \[ g(x_i) = \sqrt[p]{n} \left( \frac{\epsilon_i q + 1}{(\epsilon_i + 1)^q} \right) \] and \[ \mathrm{Err}(g(x_i)) = \frac{(q \epsilon_i + 1)^p}{(\epsilon_i + 1)^{p-1}} - 1\text{.} \] Let \[ f(\epsilon) = \frac{(q \epsilon + 1)^p}{(\epsilon + 1)^{p-1}} - 1\text{.} \] Then computing derivatives, \[ \begin{aligned} f'(\epsilon) &= q \epsilon \frac{(q \epsilon + 1)^{p-1}}{(\epsilon + 1)^p}\text{,} \\ f''(\epsilon) &= q \frac{(q \epsilon + 1)^{p-2}}{(\epsilon + 1)^{p + 1}}\text{, and} \\ f'''(\epsilon) &= -q (2 + q (2 + 3 \epsilon)) \frac{(q \epsilon + 1)^{p-3}}{(\epsilon + 1)^{p + 2}}\text{.} \end{aligned} \] Note that \(f(0) = f'(0) = 0\), and \(f''(0) = q\). Also, for \(\epsilon > 0\), \(f'(\epsilon) \gt 0\), \(f''(\epsilon) \gt 0\), and \(f'''(\epsilon) < 0\).
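
As a sanity check, specializing to \(p = 2\) (so \(q = 1/2\)) recovers the square root case: \[ f(\epsilon) = \frac{(\epsilon/2 + 1)^2}{\epsilon + 1} - 1 = \frac{\epsilon^2}{4 (\epsilon + 1)}\text{,} \] which indeed satisfies \(f(0) = f'(0) = 0\) and \(f''(0) = 1/2 = q\).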

Now we’re ready to show that the \(\epsilon_i\) shrink quadratically:

(Lemma 4.) \(f(\epsilon) \lt (\epsilon/\sqrt{2})^2\) for \(\epsilon \gt 0\).

Proof. Taylor-expand \(f(\epsilon)\) around \(0\) with the Lagrange remainder form to get \[ f(\epsilon) = f(0) + f'(0) \epsilon + \frac{f''(0)}{2} \epsilon^2 + \frac{f'''(\xi)}{6} \epsilon^3 \] for some \(\xi\) such that \(0 \lt \xi \lt \epsilon\). Plugging in values, we see that \(f(\epsilon) = \frac{1}{2} q \epsilon^2 + \frac{1}{6} f'''(\xi) \epsilon^3\) with the last term being negative, so \(f(\epsilon) \lt \frac{1}{2} q \epsilon^2 \lt \frac{1}{2} \epsilon^2\). ∎

But this is only a useful upper bound when \(\epsilon_i \le 1\). In
the square root case this was okay, since \(\epsilon_1 \le 1\), but
that is not true for larger values of \(p\). In fact, in general, the
\(\epsilon_i\) start off shrinking *linearly*:

(Lemma 5.) For \(\epsilon \gt 1\), \(\epsilon/8 \lt f(\epsilon) \lt \epsilon/2\).

Proof. Since \(f(0) = f'(0) = 0\), and \(f''(\epsilon) \gt 0\) for \(\epsilon \ge 0\), \(f'(\epsilon)\) and \(f(\epsilon)\) are increasing, and thus \(f(1) \gt 0\) and \(f(\epsilon)\) is a concave-up curve.

Then \((0, 0)\) and \((1, f(1))\) are two points on a concave-up curve, and thus geometrically the line \(y = f(1) \epsilon\) must lie below \(y = f(\epsilon)\) for \(\epsilon \gt 1\), and thus \(f(\epsilon) \gt f(1) \epsilon\) for \(\epsilon \gt 1\). Algebraically, this also follows from the definition of (strict) convexity (with \(x_1 = 0\), \(x_2 = \epsilon\), and \(t = 1 - 1/\epsilon\)).

But \(f(1) = (2 - 1/p)^p/2^{p-1} - 1 = 2 \left(1 - \frac{1}{2p}\right)^p - 1\), which is always increasing as a function of \(p\), as you can see by calculating its derivative. Therefore, its minimum is at \(p = 2\), which is \(1/8\), and so \(f(\epsilon) \gt f(1) \epsilon \ge \epsilon/8\).

As for the upper bound, since \(f(\epsilon)\) is concave-up and \(f(0) = 0\), \(f(\epsilon)/\epsilon\) is non-decreasing, and expanding \(f(\epsilon)/\epsilon\) shows that it tends to \(q^p\) as \(\epsilon \to \infty\). Therefore, \(f(\epsilon) \le q^p \epsilon = (1 - 1/p)^p \epsilon \lt \epsilon/e \lt \epsilon/2\). ∎

Finally, let’s bound our initial values:

(Lemma 6.) \(x_0 \le 2s\) and \(\epsilon_0 \le 2^p - 1\).

Proof. This is a straightforward generalization of the equivalent lemma from the square root case. Let’s start with \(x_0\): \[ \begin{aligned} x_0 &= 2^{\lceil \mathrm{Bits}(n) / p \rceil} \\ &= 2^{\lfloor (\lfloor \lg n \rfloor + 1 + p - 1)/p \rfloor} \\ &= 2^{\lfloor \lg n / p \rfloor + 1} \\ &= 2 \cdot 2^{\lfloor \lg n / p \rfloor}\text{.} \end{aligned} \] Then \(x_0/2 = 2^{\lfloor \lg n / p \rfloor} \le 2^{\lg n / p} = \sqrt[p]{n}\). Since \(x_0/2\) is an integer, \(x_0/2 \le \sqrt[p]{n}\) if and only if \(x_0/2 \le \lfloor \sqrt[p]{n} \rfloor = s\). Therefore, \(x_0 \le 2s\).

As for \(\epsilon_0\): \[ \begin{aligned} \epsilon_0 &= \mathrm{Err}(x_0) \\ &\le \mathrm{Err}(2s) \\ &= (2s)^p/n - 1 \\ &= 2^p s^p/n - 1\text{.} \end{aligned} \] Since \(s^p \le n\), \(2^p s^p/n \le 2^p\) and thus \(\epsilon_0 \le 2^p - 1\). ∎
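
For example, for \(n = 972\) and \(p = 3\) (from the oscillation example above): \(s = 9\), \(x_0 = 16 \le 2s = 18\), and \(\epsilon_0 = 16^3/972 - 1 \approx 3.21 \le 2^3 - 1 = 7\).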

Now we’re ready to show our main result, which involves calculating how long the \(\epsilon_i\) shrink linearly:

(Theorem 3.) \(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\) performs \(\Theta(p) + O(\lg \lg n)\) loop iterations.

Proof. Assume that \(\epsilon_i \gt 1\) for \(i \le j\), \(\epsilon_{j+1} \le 1\), and \(j+k\) is the number of loop iterations performed when running the algorithm for \(n\) and \(p\) (i.e., \(x_{j+k} \ge x_{j+k-1}\)). Using Lemma 5, \[ \left( \frac{1}{8} \right)^{j+1} \epsilon_0 \lt \epsilon_{j+1} \le 1\text{,} \] which implies \[ j \gt \frac{\lg \epsilon_0}{3} - 1\text{.} \]

Similarly, using the upper bound of Lemma 5,

\[ \left( \frac{1}{2} \right)^j \epsilon_0 \ge \epsilon_j \gt 1\text{,} \] which implies \[ j \lt \lg \epsilon_0\text{.} \] Therefore, \(j = \Theta(\lg \epsilon_0)\), which is \(\Theta(p)\) by Lemma 6.

Now assume \(k \ge 5\). Since the algorithm doesn’t terminate until iteration \(j + k\), we must have \(x_i \ge s + 1\) for \(i \lt j + k - 1\); otherwise \(x_i = s\) would force termination at the next step. Since \(\epsilon_{j+1} \le 1\) by assumption, Lemma 4 gives \(\epsilon_{j+3} \le 1/2\) and \(\epsilon_i \le (\epsilon_{j+3})^{2^{i-j-3}}\) for \(j + 3 \le i \lt j + k - 1\), so \(\epsilon_{j+k-2} \le 2^{-2^{k-5}}\). But \(1/n \le \epsilon_{j+k-2}\) by Lemma 3, so \(1/n \le 2^{-2^{k-5}}\). Taking logs twice to bring down the \(k\) yields \(k - 5 \le \lg \lg n\). Then \(k \le \lg \lg n + 5\), and thus \(k = O(\lg \lg n)\).

Therefore, the total number of loop iterations is \(\Theta(p) + O(\lg \lg n)\). ∎

Note that \(p \le \lg n\) for inputs that reach the loop, so we can just say that \(\mathrm{N{\small EWTON}\text{-}I{\small ROOT}}\) performs \(O(\lg n)\) loop iterations. But that obscures rather than simplifies. Note that the proof above is very similar to the proof of the worse run-time of \(\mathrm{N{\small EWTON}\text{-}I{\small SQRT}'}\) where the initial guess varies. In this case, the error in our initial guess is magnified, since \(\mathrm{Err}(x)\) raises it to the \(p\)th power, and so that manifests as the \(\Theta(p)\) term.

Furthermore, unlike the square root case, the number of arithmetic operations in a loop iteration isn’t constant. In particular, the sub-step to compute \(x_i^{p-1}\) takes a number of arithmetic operations dependent on \(p - 1\). Using repeated squaring, this computation takes \(\Theta(\lg p)\) squarings and \(O(\lg p)\) multiplications.

If the cost of an arithmetic operation is constant, e.g., if we’re working with fixed-size integers, then the total run-time is the iteration bound above multiplied by \(\Theta(\lg p)\).

Otherwise, if the cost of an arithmetic operation depends on the length of its arguments, then a loop iteration turns out to cost only a constant number of arithmetic operations on numbers of order \(n\). If the cost of multiplying two numbers \(\le x\) is \(M(x) = O(\lg^k x)\), then the cost of computing \(x^p\) by repeated squaring is \(O((p \lg x)^k)\). But \(x\) is \(\Theta(n^{1/p})\), so the cost of computing \(x^p\) is \(O(\lg^k n)\), which is on the order of the cost of multiplying two numbers \(\le n\). Furthermore, note that we divide the result into \(n\), so we can stop computing \(x_i^{p-1}\) as soon as it exceeds \(n\). So in that case, we can treat a loop iteration as if it were performing a constant number of arithmetic operations on numbers of order \(n\), and so, like in the square root case, we pick up a factor of \(D(n)\), where \(D(n)\) is the run-time of dividing \(n\) by some number \(\le n\).
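
To sketch that last optimization (in the same jsbn-style BigInteger setting as above; powCapped is a hypothetical helper, not part of jsbn), here is repeated squaring with an early exit as soon as any intermediate value exceeds \(n\):

```
// powCapped computes x^e (for a BigInteger x >= 1 and a plain integer
// e >= 1) by repeated squaring, returning null as soon as any
// intermediate value exceeds cap. Since we bail out immediately,
// every multiplication has operands <= cap, i.e., numbers of order n
// when cap = n.
function powCapped(x, e, cap) {
  var result = x.constructor.ONE;
  var base = x;
  while (true) {
    if ((e & 1) === 1) {
      result = result.multiply(base);
      if (result.compareTo(cap) > 0) {
        return null; // x^e > cap
      }
    }
    e >>= 1;
    if (e === 0) {
      return result;
    }
    base = base.multiply(base);
    if (base.compareTo(cap) > 0) {
      // Some higher bit of e is still set, so the final result is at
      // least this base value, which already exceeds cap.
      return null;
    }
  }
}
```

In the loop body, a null return from powCapped(x, p - 1, n) would mean exactly that \(\lfloor n/x_i^{p-1} \rfloor = 0\), so the division can be skipped entirely.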