Inverse quadratic interpolation is the second of the two interpolation techniques that, combined with the bisection method, make up Brent’s Method of finding roots.

This method converges faster than linear (secant) interpolation when the approximations are “near” the actual root, but is slow to converge when they are “far away” from it.

In numerical analysis, inverse quadratic interpolation is an algorithm for solving equations of the form f(x) = 0. The idea is to approximate the inverse of f using quadratic interpolation. The algorithm is rarely used on its own, but it is an essential component of Brent’s method.

Like the secant method, inverse quadratic interpolation takes some initial points and constructs an equation to find the next estimate. Because it uses quadratics, however, it approximates the function we are trying to solve more closely, resulting in a more accurate estimate of the root.

A method of finding roots based on inverse quadratic interpolation is to use a polynomial \(p(y) \) to interpolate \(x_k \) as a function of \(y_k \). The next approximate solution is then \(p(0) \).
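For example, fitting a quadratic \(p \) through the three most recent points gives:

\(p(y_{k-2})=x_{k-2}, \quad p(y_{k-1})=x_{k-1}, \quad p(y_k)=x_k \)

and the next estimate is \(x_{k+1}=p(0) \).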

For inverse quadratic interpolation, we use \(x=p(y) \) instead of \(y=p(x) \). This approach guarantees that the parabola crosses the x-axis, at exactly one point \(x=p(0) \), which is easy to find. Three points define a parabola, as shown in the figure below.

[Figure: a parabola of the form \(x=p(y) \) defined by three points, crossing the x-axis]

Explanation

Higher-degree polynomials typically provide better estimates than the linear model used by the secant method, reducing the associated error. However, polynomials of degree 3 or higher are too expensive to reasonably compute at every iteration. Thus, quadratics are used, as they provide a suitable balance between complexity and efficiency in finding roots.

Nevertheless, standard quadratics of the form \(y=ax^2 + bx + c \) are not used because they have a minimum or maximum turning point and thus may not have a real root. In that case there is no x-intercept to use as the next estimate, which prevents convergence towards the root of the function we are finding.

Instead, an inverse quadratic of the form \(x=q(y) \), constructed using Lagrange polynomials, is preferred. As an inverse quadratic is guaranteed to intersect the x-axis, there will always be a solution at \(y=0 \).
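As a quick illustration of the difference: the standard quadratic \(y=x^2+1 \) never crosses the x-axis, so solving \(y=0 \) for x fails; an inverse quadratic \(x=q(y)=ay^2+by+c \), on the other hand, can always be evaluated at \(y=0 \), yielding the estimate \(x=c \).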

Worked Example

In the image below, the graph of \(f(x)=\sin(x)+\frac{3}{4x} \) in red is the same function used in the previous topics on linear interpolation and the bisection method. The graph in green is the inverse quadratic derived from the points \(x=2.5 \), \(x=3.5 \) and \(x=4.5 \); its formula is \(x=0.302268y^2-1.19653y+3.33104 \).

Inverse quadratic interpolation of \(y=\sin(x)+\frac{3}{4x} \) using points \(x=2.5 \), \(x=3.5 \) and \(x=4.5 \)

Note that after only 1 iteration, the approximated root (\(3.33104\ldots \), as highlighted by the green graph) is already extremely close to the actual root (\(3.36627\ldots \)). This is because the curvature of a quadratic allows it to approximate the shape of a non-linear function much more closely than a straight line can.
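Indeed, the first iterate can be read directly off the inverse quadratic above by substituting \(y=0 \), which leaves only the constant term:

\(q(0)=0.302268(0)^2-1.19653(0)+3.33104=3.33104 \)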

Efficiency

Iteration    x                      y
(initial)    4.5                    -0.8108634509984304
(initial)    2.5                    0.8984721441039565
(initial)    3.5                    -0.13649751340390556
1            3.3310456395302936     0.03683282875913396
2            3.366898729297604      -0.0006477897671345878
3            3.36627587617705       5.806422448406678e-07

As is evident from the table above, quadratics are extremely efficient at converging to the root: after just 2 iterations, the approximation was already accurate to 3 significant figures.

The math

Lagrange polynomials come in the form:

\(L(x)=\sum_{j=0}^{k} y_j l_j (x) \)

where the \(k+1 \) points \((x_0, y_0), \ldots, (x_k, y_k) \) define the function and \(l_j \) are the basis functions described below.

Just as defining a straight line requires 2 distinct points, defining a quadratic requires 3 distinct points.

Lagrange Basis Functions

Each Lagrange polynomial of degree \(n \) is made up of \(n+1 \) basis functions of the form:

\(L_{n, k}(x) = \prod_{i=0, i \neq k}^{n} \frac{x-x_i}{x_k-x_i} \)

We therefore need a second degree Lagrange polynomial using three basis functions – \(L_{2,0}, L_{2,1}, L_{2,2} \) – one for each of the points. 

\(L_{2,0} = \frac{(x-x_1)(x-x_2)}{(x_0-x_1)(x_0-x_2)} \)

\(L_{2,1} = \frac{(x-x_0)(x-x_2)}{(x_1-x_0)(x_1-x_2)} \)

\(L_{2,2} = \frac{(x-x_0)(x-x_1)}{(x_2-x_0)(x_2-x_1)} \)

Combining all of the basis functions into the main Lagrange function as outlined above, we get:

\(L(x) = y_0 L_{2,0}(x) + y_1 L_{2,1}(x) + y_2 L_{2,2}(x) \) 
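As a minimal sketch of how this combination can be evaluated in practice (plain Java, independent of any library; the helper name lagrange2 is illustrative):

// Evaluate the degree-2 Lagrange interpolant L(t) through the three
// points (x0, y0), (x1, y1), (x2, y2).
static double lagrange2(double x0, double y0,
                        double x1, double y1,
                        double x2, double y2,
                        double t) {
    double l0 = (t - x1) * (t - x2) / ((x0 - x1) * (x0 - x2)); // L_{2,0}(t)
    double l1 = (t - x0) * (t - x2) / ((x1 - x0) * (x1 - x2)); // L_{2,1}(t)
    double l2 = (t - x0) * (t - x1) / ((x2 - x0) * (x2 - x1)); // L_{2,2}(t)
    return y0 * l0 + y1 * l1 + y2 * l2;                        // L(t)
}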

Complete Derivation

Since we are calculating the inverse quadratic function, \(x=q(y) \), we need to swap the variables x and y, so the basis functions become polynomials in y built from the nodes \(y_0, y_1, y_2 \).

\(L(y)=x_0L_{2,0}(y)+x_1L_{2,1}(y)+x_2L_{2,2}(y) \) 

Furthermore, since we are only interested in the value of x at the specific point \(y=0 \), we can substitute \(y=0 \) into the function.

\(L(0)=x_0L_{2,0}(0)+x_1 L_{2,1}(0)+x_2L_{2,2}(0) \)

\(=x_0\frac{(0-y_1)(0-y_2)}{(y_0-y_1)(y_0-y_2)} +x_1\frac{(0-y_0)(0-y_2)}{(y_1-y_0)(y_1-y_2)}+x_2\frac{(0-y_0)(0-y_1)}{(y_2-y_0)(y_2-y_1)} \)

Since \(L(0) \) is an approximation of the value of x at the root, we can replace the oldest point with it and interpolate again to get an even closer estimate. More generally, the Lagrange polynomial can be rewritten as the iteration:

\(x_{i+1}=x_{i-2} \frac{y_{i-1} y_i}{(y_{i-2}-y_{i-1})(y_{i-2}-y_i)} \)

\(+ x_{i-1} \frac{y_{i-2} y_i}{(y_{i-1}-y_{i-2})(y_{i-1}-y_i)} \)

\(+ x_i \frac{y_{i-2} y_{i-1}}{(y_i-y_{i-2})(y_i-y_{i-1})} \)
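As a minimal sketch of this iteration in plain Java (the function and starting points are those of the worked example above; the class name is illustrative):

public class InverseQuadraticDemo {

    // f(x) = sin(x) + 3/(4x), the function from the worked example
    static double f(double x) {
        return Math.sin(x) + 3 / (4 * x);
    }

    public static void main(String[] args) {
        // The three most recent points: x_{i-2}, x_{i-1}, x_i
        double x0 = 4.5, x1 = 2.5, x2 = 3.5;
        double y0 = f(x0), y1 = f(x1), y2 = f(x2);

        for (int i = 1; i <= 3; i++) {
            // x_{i+1} = L(0), using the recurrence above
            double x3 = x0 * y1 * y2 / ((y0 - y1) * (y0 - y2))
                      + x1 * y0 * y2 / ((y1 - y0) * (y1 - y2))
                      + x2 * y0 * y1 / ((y2 - y0) * (y2 - y1));
            System.out.println(i + ": x = " + x3 + ", y = " + f(x3));

            // Drop the oldest point and slide the window forward
            x0 = x1; y0 = y1;
            x1 = x2; y1 = y2;
            x2 = x3; y2 = f(x3);
        }
    }
}

Running this should reproduce, up to rounding, the iterates listed in the Efficiency table above.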

Brent's Method - Complete

Brent’s Method combines all three algorithms – bisection, linear (secant) interpolation and inverse quadratic interpolation – using a process that chooses which method to use at each stage.

Just like the aforementioned methods, we start with a bracketing interval \((a_0, b_0) \) that contains the root, such that \(f(a_0)f(b_0)<0\). We assume that \(|f(b_0)|<|f(a_0)| \), so that \(b_0 \) is the better initial estimate; otherwise, the labels are swapped.

Each iteration has four points, one derived value and one pre-defined constant:

  1. \(a_k \): the contrapoint, chosen so that \(f(a_k) f(b_k)<0 \) and \(|f(b_k)|<|f(a_k)| \)
  2. \(b_k \): root estimate of the current iteration
  3. \(b_{k-1} \): root estimate of the previous iteration; initialised as \(a_0 \)
  4. \(b_{k-2}\): root estimate of the second-last iteration
  5. \(s \): an intermediate value derived in this iteration. It is given by interpolation (linear or quadratic, see below) if \(f(b_k) \neq f(b_{k-1}) \); otherwise it is given by the midpoint of \(a_k \) and \(b_k \).
  6. \(\varepsilon \): a pre-specified precision tolerance for convergence

Next, four inequalities are used to decide which method is used:

  1. \(|\varepsilon|<|b_k-b_{k-1}| \)
  2. \(|\varepsilon|<|b_{k-1}-b_{k-2}| \)
  3. \(|s-b_k|<\frac{1}{2} |b_k-b_{k-1}| \)
  4. \(|s-b_k|<\frac{1}{2} |b_{k-1}-b_{k-2}| \)

If any one of the below 5 conditions holds true, the bisection method is used. Otherwise, interpolation is used.

  1. the last iteration used the bisection method and inequality 1 is false
  2. the last iteration used the interpolation method and inequality 2 is false
  3. the last iteration used the bisection method and inequality 3 is false
  4. the last iteration used the interpolation method and inequality 4 is false
  5. the temporary value \(s \), calculated by interpolation, does not lie in \([\frac{3a_k + b_k}{4},b_k]\)

The method of interpolation depends on the number of distinct points: if all three points are distinct, inverse quadratic interpolation is used; otherwise, linear interpolation.
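The decision between bisection and interpolation can be sketched in code as follows (a simplified illustration, not the NM Dev source; variable names follow the notation above):

// Decide whether the current iteration should fall back to bisection.
// s:       the candidate computed by interpolation
// a, b:    the current bracket, where b is the current root estimate
// bPrev:   the previous estimate b_{k-1}
// bPrev2:  the second-last estimate b_{k-2}
// eps:     the pre-specified tolerance
// lastWasBisection: whether the previous iteration used bisection
static boolean useBisection(double s, double a, double b,
                            double bPrev, double bPrev2,
                            double eps, boolean lastWasBisection) {
    // condition 5: s must lie in [(3a + b)/4, b]
    double lo = Math.min((3 * a + b) / 4, b);
    double hi = Math.max((3 * a + b) / 4, b);
    if (s < lo || s > hi) return true;
    if (lastWasBisection) {
        // conditions 1 and 3: inequalities 1 and 3 must hold
        if (Math.abs(eps) >= Math.abs(b - bPrev)) return true;
        if (Math.abs(s - b) >= 0.5 * Math.abs(b - bPrev)) return true;
    } else {
        // conditions 2 and 4: inequalities 2 and 4 must hold
        if (Math.abs(eps) >= Math.abs(bPrev - bPrev2)) return true;
        if (Math.abs(s - b) >= 0.5 * Math.abs(bPrev - bPrev2)) return true;
    }
    return false; // accept the interpolated value s
}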

Explanation

Brent introduced the pre-specified tolerance value in 1973 because there were cases in which linear interpolation was performed at every iteration, resulting in very slow convergence to the root. Adding this extra comparison step ensures that a suitable algorithm is chosen at each iteration.

Implementation

In NM Dev, the class BrentRoot implements Brent’s method. Using it is a two-step procedure: construct a solver, then call its solve function. The signatures of the functions are:

/**
* Construct an instance of Brent's root finding algorithm.
*
* @param tol the convergence tolerance
* @param maxIterations the maximum number of iterations
*/
public BrentRoot(double tol, int maxIterations)
/**
* Search for a root, <i>x</i>, in the interval <i>[lower, upper]</i> such
* that <i>f(x) = 0</i>.
*
* @param f a univariate function
* @param lower the lower bound of the bracketing interval
* @param upper the upper bound of the bracketing interval
* @param guess an initial guess of the root within <i>[lower, upper]</i>.
* Note that {@code guess} is a {@code double[]}.
* This signature allows multiple initial guesses for certain types of
* uniroot algorithms, e.g., Brent's algorithm.
* @return an approximate root
* @throws NoRootFoundException when the search fails to find a root
*/
public double solve(
 UnivariateRealFunction f,
 double lower,
 double upper,
 double... guess
) throws NoRootFoundException;


Here is an example of calling BrentRoot to solve the function from the previous topics, \(f(x) = \sin(x) + \frac{3}{4x} \):

UnivariateRealFunction f = new AbstractUnivariateRealFunction(){
    @Override
    public double evaluate(double x) {
        return Math.sin(x) + 3 / (4 * x); // f(x) = sin(x) + 3/(4x)
    }
};
BrentRoot solver = new BrentRoot(1e-8, 10);
double root = solver.solve(f, 2.5, 4.5);
double fx = f.evaluate(root);
System.out.println(String.format("f(%f) = %f", root, fx));
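With this tolerance, the returned root should match the value found in the Efficiency section above, approximately \(3.36628 \), so the program prints something close to f(3.366276) = 0.000000 (the residual is smaller than the precision of the %f format).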