Solving System of Linear Equations

Introduction to Data Science Finding Roots of System of Equations Solving System of Linear Equations

Solving a System of Linear Equations

Currently, we have only been making use of graphical methods to obtain the solution, but there are ways to do it algebraically. We will be looking at how to obtain these solutions using Substitution and Gaussian Elimination.

Substitution Method

First, let’s take the example of this system of equations.

1. $3x+2y-z=1 $

2. $2x-2y+4z=-2 $

3. $-x+1/2 y-z=0 $

We can write $x $ in terms of other variables to substitute it into other equations. From the third equation,

$x=1/2y-z $

We then substitute this value into the second equation,

$2(1/2y-z)-2y+4z=-2 $

$2z-y=-2 $

We will get an equation of $y $ in terms of $z $.

$y=2z+2 $

Then we proceed to substitute the value of $y $ and $x $ into the first equation.

$3(1/2y-z)+2(2z+2)-z=1 $

$3/2y-3z+4z+4-z=1 $

$3/2(2z+2)+4=1 $

$3z+3+4=1 $

$z=-2 $

With this value of $z $, we can then solve for $y $ and then $x $ by substituting back the values of $z $.

$y=2(-2)+2 $

$y=-2 $

$x=1/2(-2)-(-2) $

$x=1 $

Therefore, the solution to this system is

$x=1 $

$y=-2 $

$z=-2 $

Gaussian Elimination Method

Gaussian elimination is an algorithm for solving systems of linear equations. It involves transforming a matrix, which will represent a linear system of equations, into Row Echleon form, where it can be easily used to solve a system.

Firstly, to represent a system with a matrix, we will follow the format of

$\begin{bmatrix}a_{1} & b_{1}& c_{1}&x \\a_{2} & b_{2}& c_{2}&y \\a_{3} & b_{3}& c_{3}&z\end{bmatrix} $

where the system of equations is

1. $a_{1}j+b_{1}q+c_{1}k=x $

2. $a_{2}j+b_{2}q+c_{2}k=y $

3. $a_{3}j+b_{3}q+c_{3}k=z $

The columns correspond to the individual variables (e.g. the first column is the coefficients for $j $, second for $q $, etc.) and the rows represent the equation(first row represents the first equation, etc.).

Row Echleon Form

By plotting the system of equations into a matrix, we are able to simplify the matrix into row echleon form to compute the solution. Firstly, what is row echleon form? A matrix is in row echleon form if it fulfills these conditions.

1. All rows consisting of only zeroes are at the bottom.

2. The leading coefficient (also called the pivot) of a nonzero row is always
to the right of the leading coefficient of the row above.

In summary, a matrix in row echleon form will look something like this:

$\begin{bmatrix}a_{1} & b_{1}& c_{1}&x \\0&b_{2}& c_{2}&y \\0&0&c_{3}&z\end{bmatrix} $

From this matrix, one can then determine the value of $c_{3}k=z $, and thus back substitute the value of $k $ into the second equation to solve for $q $, and so on, until you solve for all the variables.

Reduced Row Echleon Form

You can further simplfy the matrix into a reduced row echleon form when the matrix the leading entry in each nonzero row is a 1 (called a leading 1) and the values above a leading entry is 0. The matrix will therefore look like this:

$\begin{bmatrix}1 & 0& 0&a \\0&1& 0&b \\0&0&1&c\end{bmatrix} $

which will allow us to simply derive the values of $x $, $y $ and $z $ by just looking at the equation.

Thinking Time:

Which is more efficient to use, Row Echleon form or Reduced Row Echleon form? Would there be situations where the Row Echleon form would be enough to straight away solve a system of equations without further reducing to a reduced row echleon form?

Transformation Operations

There are mainly three ways to simplfy a matrix into Row Echleon form, which are

1. Row switching: switch one row with another row
2. Row multiplication: multiply a row by a non–zero constant
3. Row addition/subtraction: add the multiple of another row to a row.

Row Switching would simply switch the position of one row with another. In this case,

$\begin{bmatrix}1&1&1&6\\0&0&2&6\\0&2&1&7\end{bmatrix} $

We can switch Row 2 and Row 3 to turn the matrix into Row Elchleon Form, where it will look like

$\begin{bmatrix}1&1&1&6\\0&2&1&7\\0&0&2&6\end{bmatrix} $

Row Multiplication can be used to scale the rows by a factor of a non-zero constant,n.

$\begin{bmatrix}a_{1} & b_{1}& c_{1}&x \\2a_{2} & 2b_{2}& 2c_{2}&2y \\a_{3} & b_{3}& c_{3}&z\end{bmatrix} $

In this example, we scaled the second row by a factor of 2.

Row Addition/Subtraction can be used to add or subtract values from one row with that of another row. A matrix

$\begin{bmatrix}1&1&1&6\\0&2&1&7\\0&1&2&8\end{bmatrix} $

can be converted into Row Echleon form by subtracting Row 3 with 1/2 of Row 2, resulting in

$\begin{bmatrix}1&1&1&6\\0&2&1&7\\0&1-1/2*2&2-1/2*1&8-1/2*7\end{bmatrix}= \begin{bmatrix}1&1&1&6\\0&2&1&7\\0&0&1.5&4.5\end{bmatrix} $

Real-life Example

Lets use a real-life example to illustrate turning a system of equations into matrix form, then simplifying to row echleon form and reduced row echleon form.

Assume that you want to buy a computer from this peculiar computer shop, where you can freely customise all of the parts by removing or adding them to your laptop, which will add or subtract from the cost of the laptop. You are able to customise three main components, which are the CPU, SSD and GPU, which we will define as $x $, $y $ and $z $ respectively. You wanted to find out how much each component actually costs, and hence decided to do an experiment. You tested 3 combinations, where you either added or removed one of these three components from your laptop. In the first post, you added 2 CPUs and a SSD, while removing a GPU, which costed $7. In the second, you added 2 CPUs and 2 GPUs, while removing a SSD, which costed $6. While for the last, you added a CPU and a GPU, while removing 2 SSDs, costing $0.

You can write the three combinations as this system of equation,

1. $2x+y-z=7 $

2. $2x-y+2z=6 $

3. $x-2y+z=0 $

which can be represented in the matrix as

$\begin{bmatrix}2&1&-1&7\\2&-1&2&6\\1&-2&1&0\end{bmatrix} $

This system is now represented as a Matrix. To simplfy it to Row Echleon Form, we will first need to remove the leading coefficient in Row 2, which is a 2. We will simply perform the operation of subtracting Row 2 with Row 1 to turn the leading coefficient in Row 2 to a 0.

$\begin{bmatrix}2&1&-1&7\\2-2&-1-1&2-(-1)&6-7\\1&-2&1&0\end{bmatrix}=\begin{bmatrix}2&1&-1&7\\0&-2&3&-1\\1&-2&1&0\end{bmatrix} $

Next, we will have to turn the leading coefficient in Row 3 to a 0. This can be done by subtracting Row 3 with 1/2 of Row 1

$\begin{bmatrix}2&1&-1&7\\0&-2&3&-1\\1-1&-2-1/2&1-(-1/2)&0-7/2\end{bmatrix} = \begin{bmatrix}2&1&-1&7\\0&-2&3&-1\\0&-2.5&1.5&-3.5\end{bmatrix} $

Next, we will remove the leading coefficient from Row 3, which is now -2.5. This is done by subtracting Row 3 with 5/4 of Row 2

$\begin{bmatrix}2&1&-1&7\\0&-2&3&-1\\0&-2.5-(-2*5/4)&1.5-(3*5/4)&-3.5-(-1*5/4)\end{bmatrix} =\begin{bmatrix}2&1&-1&7\\0&-2&3&-1\\0&0&-9/4&-9/4\end{bmatrix} $

The matrix is now in Row Elchleon Form. We can also do the same by using the NM Dev class GaussianElimination, which performs elementary row operations to reduce a matrix to the row echleon form.

				
					%use s2
//var a is the matrix for the linear system
var a = DenseMatrix(arrayOf(
    doubleArrayOf(2.0, 1.0, -1.0, 7.0),
    doubleArrayOf(2.0, -1.0, 2.0 ,6.0),
    doubleArrayOf(1.0, -2.0, 1.0,0.0)
))
// GaussianElimination(A,usePivoting,epsilon) takes in 3 parameters:
//  A - a matrix
//  usePivoting - true if to use partial pivoting, e.g., for numerical stability.
//  epsilon - a precision parameter: when a number |x| ≤ ε, it is considered 0
val ops:GaussianElimination = GaussianElimination(a, true, 0.0)
//GaussianElimination has 4 methods, but we will only be looking at 1,
//which is U, as it returns to us a upper triangular matrix, 
//which is the matrix in row elchleon form
val ans:Matrix = ops.U()
print(ans)

output: 3x4
     [,1] [,2] [,3] [,4] 
[1,] 2.000000, 1.000000, -1.000000, 7.000000, 
[2,] 0.000000, -2.500000, 1.500000, -3.500000, 
[3,] 0.000000, 0.000000, 1.800000, 1.800000,

This will return a different matrix that is also in Row Echleon Form, which looks like

$\begin{bmatrix}2&1&-1&7\\0&-2.5&1.5&-3.5\\0&0&1.8&1.8\end{bmatrix} $

Now, we can begin deriving a solution from this simplified matrix. From the first matrix, we can already tell that

$1.8z = 1.8 $

$z = 1 $

Using the value $z = 1$, we can then derive the value of $y $ using Row 2,

$-2.5y+1.5z=-3.5 $

$-2.5y=-5 $

$y=2 $

Lastly, we can use $z = 1$ and $y = 2$ to derive the value of $x$ in Row 1,

$2x+y-z=7 $

$2x+2-1=7 $

$2x=6 $

$x=3 $

Therefore, using a matrix in Row Echleon Form, we have solved for the solution, where

$x=3 $

$y=2 $

$z=1 $

We can use NM Dev class BackwardSubstitution to implement a backward substitution algorithm that will compute the answer in the same way as our manual working. It works on an upper triangular matrix, which in our case means a matrix in Row Echleon Form, and starts first by computing $X_{n} $, then substitutes that backward into the next equation to solve for $X_{n-1} $, and repeats until $X_{1} $. An example of the code is shown below.

				
					val tri:UpperTriangularMatrix = UpperTriangularMatrix(
arrayOf(
    doubleArrayOf(2.0, 1.0, -1.0),
    doubleArrayOf(-2.5, 1.5),
    doubleArrayOf(1.8)
)) // matrix of the variables in upper triangular shape
val b:Vector = ans.getColumn(4) //column for constants
val solver:BackwardSubstitution  = BackwardSubstitution()
print(solver.solve(tri,b))

output:[3.000000, 2.000000, 1.000000]

This gives us the same answer of

$x=3 $

$y=2 $

$z=1 $

We have now deduced that a component of CPU costs $3, while SSD costs $2 and a GPU costs $1.

We are also able to further reduce the matrix into Reduced Row Echleon Form. Starting from this matrix,

$\begin{bmatrix}2&1&-1&7\\0&-2.5&1.5&-3.5\\0&0&1.8&1.8\end{bmatrix} $

we will try to manipulate the leading entry to a 1 while ensuring that each column containing a leading 1 has zeros in all its other entries. Firstly, we can divide Row 3 by 1.8,

$\begin{bmatrix}2&1&-1&7\\0&-2.5&1.5&-3.5\\0&0&1.8/1.8&1.8/1.8\end{bmatrix} = \begin{bmatrix}2&1&-1&7\\0&-2.5&1.5&-3.5\\0&0&1&1\end{bmatrix} $

then we can divide Row 2 by -2.5,

$\begin{bmatrix}2&1&-1&7\\0&-2.5/-2.5&1.5/-2.5&-3.5/-2.5\\0&0&1&1\end{bmatrix} = \begin{bmatrix}2&1&-1&7\\0&1&-0.6&1.4\\0&0&1&1\end{bmatrix} $

then we can subtract Row 1 with Row 2 to give us

$\begin{bmatrix}2&1-1&-1+0.6&7-1.4\\0&1&-0.6&1.4\\0&0&1&1\end{bmatrix} = \begin{bmatrix}2&0&-0.4&5.6\\0&1&-0.6&1.4\\0&0&1&1\end{bmatrix} $

From here, we can divide Row 1 by 2

$\begin{bmatrix}2/2&0&-0.4/2&5.6/2\\0&1&-0.6&1.4\\0&0&1&1\end{bmatrix} = \begin{bmatrix}1&0&-0.2&2.8\\0&1&-0.6&1.4\\0&0&1&1\end{bmatrix} $

Now, we will have to subtract Row 2 with -0.6 of Row 3

$\begin{bmatrix}1&0&-0.2&2.8\\0&1&-0.6+0.6&1.4+0.6\\0&0&1&1\end{bmatrix} = \begin{bmatrix}1&0&-0.2&2.8\\0&1&0&2\\0&0&1&1\end{bmatrix} $

And add 0.2 of Row 3 to Row 1

$\begin{bmatrix}1&0&-0.2+0.2&2.8+0.2\\0&1&0&2\\0&0&1&1\end{bmatrix}=\begin{bmatrix}1&0&0&3\\0&1&0&2\\0&0&1&1\end{bmatrix} $

From here, the Matrix is finally now in Reduced Row Echleon Form, and we can easily deduce the answer from looking at the matrix

$x=3 $

$y=2 $

$z=1 $

Though reducing the matrix into Reduced Row Echleon Form takes a longer time to compute, it is does not require backward substitution and the solution is obvious when the matrix is in such a form, while it may not be so for a matrix in Row Echleon Form. Taking the same matrix, we can use the NM Dev class GaussJordanElimination to simplfy our matrix into Reduced Row Echleon Form and obtain the answers for $x$, $y$ and $z$.

				
					%use s2
//var a the matrix for the linear system
var a = DenseMatrix(arrayOf(
    doubleArrayOf(2.0, 1.0, -1.0, 7.0),
    doubleArrayOf(2.0, -1.0, 2.0 ,6.0),
    doubleArrayOf(1.0, -2.0, 1.0,0.0)
))
// GaussJordanElimination(A,usePivoting,epsilon) takes in 3 parameters:
//  A - a matrix
//  usePivoting - true if to use partial pivoting, e.g., for numerical stability.
//  epsilon - a precision parameter: when a number |x| ≤ ε, it is considered 0
val ops:GaussJordanElimination = GaussJordanElimination(a, true, 0.0)
val ans:Matrix = ops.U()
print(ans)

output:3x4
	[,1] [,2] [,3] [,4] 
[1,] 1.000000, 0.000000, 0.000000, 3.000000, 
[2,] 0.000000, 1.000000, 0.000000, 2.000000, 
[3,] 0.000000, 0.000000, 1.000000, 1.000000,

Homogeneous and non-Homogeneous systems

Usually, in a consistent system of linear equations, the number of equations is smaller than or equal to the number of variables. Otherwise, it is an over determined system that usually results in no solutions. For example, a system with 4 equations that are linearly independent but only 3 variables is a overdetermined system. We can represent the system in a similar way,

$Ax=b $

where $A $ is the matrix of coefficients of unknowns for the system of equations, $x $ is the vector of the values of the unknowns, and $b $ is the constant vector. For a system

1. $a_{1}x+b_{1}y+c_{1}z=1 $

2. $a_{2}x+b_{2}y+c_{2}z=2 $

3. $a_{3}x+b_{3} y+c_{3}z=3 $

It can be represented in this form as

$\begin{bmatrix}a_{1} & b_{1} &c_{1} \\a_{2} & b_{2} &c_{2} \\a_{3} & b_{3} &c_{3} \end{bmatrix}\begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}1\\2\\3\end{bmatrix} $

We can view $A $ as a linear transformation or a function that maps from one linear space $V $ to another linear space $W $. The kernel of a linear map, $ker(A) $, is the linear subspace of the domain such that the function maps all its elements to 0. In essence, it maps all elements of $x $ where $A(x)=0 $.

For any homogeneous equation,

$Ax=0 $

there exists a solution where zero is assigned to the each of the variables. If $A $ is non-singular($det(A)\neq0 $), then it is the only solution. Otherwise, the dimensions or rank of kernel of $A $ is not $0 $, $ker(A)>0 $, there is a solution set with infinite number of solutions. Every vector in the kernel is a non-trivial solution. This solution set also have these properties:

1. If $u $ and $v $ are vectors representing solutions to the homogeneous system, then the vector sum $u+v$ is also a solution

2. If $u $ is a solution to the homogeneous system, then $cu $ is also a solution, provided $c $ is a scalar

For a non-homogeneous equation,

$Ax=b $

we can assume $p $ is a particular solution to the linear system such that $Ap=b $. In such a case, assuming $v $ is an element of $Ker(A) $, $p+v $ is the solution set of the homogeneous system. Geometrically, the solution set for $Ax=b $ is a translation from the solution set of $Ax=0 $. More specifically, the Euclidean subspace for the first system can be obtained by translating the linear subspace of the homogeneous system by the vector $p $. This only applies if the system has at least one solution and occurs if and only if vector $b $ lies in the image of the linear transformation $A $.

In NM Dev, the class LinearSystemSolver solves a system of equation in the form of $Ax=b $, which assumes, after row reduction, $A $ no more rows than columns. Hence the system must not be overdetermined. The system is solved in 2 parts:

1. Solve $Ax=0 $ for non-trivial solutions

2. Solve $Ax=b $ for particular solutions

If $A $ has full rank, it will then solve the system by LU decomposition. Otherwise, a particular solution is found by $p=Tb $, where $T $ is the transformation matrix of $A $ to reduced row echleon form.

Note: A matrix is not overdetermined as long as the number of linearly independent equations are lesser than the number of variables. For example, this matrix of 3 equations but 2 variables

$\begin{bmatrix}1 & 1 \\-1& 0 \\2&2\end{bmatrix}x=\begin{bmatrix}2\\-1\\4 \end{bmatrix} $

is not overdetermined, since the first and third equation are linearly dependent.

The final solution can be found by translating the nullspace of $A $ by the particular solution, $p $

$p+ker(A) $

To illustrate, we can use this linear system

$\begin{bmatrix}0&1&2&-1\\1&0&1&1\\-1&1&0&-1\\0&2&3&-1\end{bmatrix}x=\begin{bmatrix}1\\4\\2\\7\end{bmatrix} $

The basis for the kernel is such that

$\begin{bmatrix}0&1&2&-1\\1&0&1&1\\-1&1&0&-1\\0&2&3&-1\end{bmatrix}\begin{bmatrix}-2\\-1\\1\\1\end{bmatrix}=\begin{bmatrix}0\\0\\0\\0\end{bmatrix} $

while the particular solution is

$\begin{bmatrix}0&1&2&-1\\1&0&1&1\\-1&1&0&-1\\0&2&3&-1\end{bmatrix}\begin{bmatrix}9\\11\\-5\\0\end{bmatrix}=\begin{bmatrix}1\\4\\2\\7\end{bmatrix} $

Overall, the particular solution $p $, as well as all vectors of $p $ that is transformed by $ker(A) $, are solutions for the system.

				
					%use s2
var A = DenseMatrix(arrayOf(
    doubleArrayOf(0.0, 1.0, 2.0, -1.0),
    doubleArrayOf(1.0, 0.0, 1.0 , 1.0),
    doubleArrayOf(-1.0, 1.0, 0.0, -1.0),
    doubleArrayOf(0.0, 2.0, 3.0, -1.0)
))
var b = DenseVector(arrayOf(1.0, 4.0, 2.0 ,7.0))
// construct a linear system solver
val solver:LinearSystemSolver = LinearSystemSolver(1e-15) // precision
// solve the homogenous linear system
val soln: LinearSystemSolver.Solution = solver.solve(A)
// get a particular solution 
val p:Vector = soln.getParticularSolution(b)
println("Particular solution: $p")
//proof Ap=b
var Ap = A.multiply(p)
println("Ap = $Ap")
println("Ap = b: " + (Ap==b)) // hence p is a valid solution for the system Ax=b

// get the ker(A)
// kernel is stored as a list of different vectors which
// result in the value of Ax=0
val kernel:List = soln.getHomogeneousSoln()
println("Ker(A) = $kernel")
// from this, the kernel has only a size of 1
// another solution of A would be ker(A) + p
val k:Vector = kernel[0]
// proof that A(p+k) = b
var x = k.add(p)
println("p+k = $x")
var Ax = A.multiply(x)
println("A(p+k) = $Ax")
println("Ax = b: "+(Ax==b)) // since Ax==b, p+k is a valid solution for the system Ax=b

output:
Particular solution: [9.000000, 11.000000, -5.000000, 0.000000] 
Ap = [1.000000, 4.000000, 2.000000, 7.000000] 
Ap = b: true
Ker(A) = [[-2.000000, -1.000000, 1.000000, 1.000000] ]
p+k = [7.000000, 10.000000, -4.000000, 1.000000] 
A(p+k) = [1.000000, 4.000000, 2.000000, 7.000000] 
Ax = b: true

Note: the solve function of the class LinearSystemSolver returns an object of class LinearSystemSolver.Solution.

Previous Topic

Back to Lesson

Next Topic