Matrix Algebra - NM EDUCATION


A matrix is a way of writing similar things together so that they can be handled and manipulated easily. In Data Science, matrices are commonly used to store information such as the weights of an Artificial Neural Network during training. This should become clearer by the end of this article.

Technically, a matrix is a 2-D array of numbers (as far as Data Science is concerned). For example, look at the matrix A below.
\(A = \begin{bmatrix}1 & 2 & 3\\4 & 5 & 6\\7 & 8 & 9\end{bmatrix}\)

Generally, rows are denoted by ‘i’ and columns by ‘j’. Each element is indexed by its ‘i’th row and ‘j’th column. We denote the matrix by a letter, e.g. A, and its elements by \(A_{ij}\).

In the matrix above,
\(A_{12} = 2\)
To reach this result, go along the first row until you reach the second column.
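
As a minimal sketch of this indexing convention (plain Kotlin rather than S2; the helper name element is hypothetical), note that the text counts rows and columns from 1 while Kotlin arrays count from 0:

```kotlin
// The 3x3 matrix A from the text, stored as an array of rows.
val A = arrayOf(
    doubleArrayOf(1.0, 2.0, 3.0),
    doubleArrayOf(4.0, 5.0, 6.0),
    doubleArrayOf(7.0, 8.0, 9.0))

// Hypothetical helper: look up the element written A_ij in the text (1-based)
// by translating to Kotlin's 0-based array indices.
fun element(m: Array<DoubleArray>, i: Int, j: Int): Double = m[i - 1][j - 1]

fun main() {
    println(element(A, 1, 2)) // first row, second column: prints 2.0
}
```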

Terms related to Matrix
Order of matrix – If a matrix has 3 rows and 4 columns, the order of the matrix is 3×4, i.e. rows × columns.

Square matrix – The matrix in which the number of rows is equal to the number of columns.
Diagonal matrix – A matrix with all the non-diagonal elements equal to 0 is called a diagonal matrix.
Upper triangular matrix – Square matrix with all the elements below diagonal equal to 0.
Lower triangular matrix – Square matrix with all the elements above the diagonal equal to 0.
Scalar matrix – Square matrix with all the diagonal elements equal to some constant k and all the non-diagonal elements equal to 0.
Identity matrix – Square matrix with all the diagonal elements equal to 1 and all the non-diagonal elements equal to 0.
Column matrix – The matrix which consists of only 1 column. Sometimes, it is used to represent a vector.
Row matrix – A matrix which consists of only 1 row.
Trace – It is the sum of all the diagonal elements of a square matrix.
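
Several of these definitions can be checked mechanically. The following is a rough sketch in plain Kotlin (not the S2 API; all function names here are hypothetical), assuming a matrix is stored as an array of rows:

```kotlin
// Square: the number of rows equals the number of columns.
fun isSquare(m: Array<DoubleArray>): Boolean = m.all { it.size == m.size }

// Trace: sum of the diagonal elements of a square matrix.
fun trace(m: Array<DoubleArray>): Double {
    require(isSquare(m)) { "trace is only defined for square matrices" }
    return m.indices.sumOf { i -> m[i][i] }
}

// Diagonal: every element off the diagonal is 0.
fun isDiagonal(m: Array<DoubleArray>): Boolean =
    m.indices.all { i -> m[i].indices.all { j -> i == j || m[i][j] == 0.0 } }

// Upper triangular: square, and every element below the diagonal is 0.
fun isUpperTriangular(m: Array<DoubleArray>): Boolean =
    isSquare(m) && m.indices.all { i -> (0 until i).all { j -> m[i][j] == 0.0 } }

// Identity: square and diagonal, with every diagonal element equal to 1.
fun isIdentity(m: Array<DoubleArray>): Boolean =
    isSquare(m) && isDiagonal(m) && m.indices.all { i -> m[i][i] == 1.0 }

fun main() {
    val a = arrayOf(
        doubleArrayOf(1.0, 2.0, 3.0),
        doubleArrayOf(0.0, 5.0, 6.0),
        doubleArrayOf(0.0, 0.0, 9.0))
    println(trace(a))             // 1 + 5 + 9 = 15.0
    println(isUpperTriangular(a)) // true
    println(isDiagonal(a))        // false
}
```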

Matrix Operations

Matrix Addition

Addition – Adding matrices is very similar to ordinary arithmetic addition. The only requirement is that all the matrices being added have the same order. This will become obvious once you add two matrices yourself.

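Suppose we have two matrices ‘A’ and ‘B’, and the matrix resulting from their addition is ‘C’. Then
\(C_{ij} = A_{ij} + B_{ij}\)
For example, let’s take two matrices and add them.
\(A = \begin{bmatrix}1 & 2\\4 & 5\end{bmatrix}\), \(B = \begin{bmatrix}3 & 2\\8 & 6\end{bmatrix}\)

Then,
\(C = \begin{bmatrix}1+3 & 2+2\\4+8 & 5+6\end{bmatrix} = \begin{bmatrix}4 & 4\\12 & 11\end{bmatrix}\)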

				
%use s2
// define the given matrices
var A = DenseMatrix(arrayOf(
    doubleArrayOf(1.0, 2.0),
    doubleArrayOf(4.0, 5.0)))
var B = DenseMatrix(arrayOf(
    doubleArrayOf(3.0, 2.0),
    doubleArrayOf(8.0, 6.0)))

//Perform Addition
// C = A+B
val C = A.add(B)
println(C)

2x2
	[,1] [,2] 
[1,] 4.000000, 4.000000, 
[2,] 12.000000, 11.000000,
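
To see why the orders must match, here is a small sketch of element-wise addition written in plain Kotlin. It is not how S2 implements add; the function below is a hypothetical stand-in that simply applies the rule for each element of C and rejects matrices of different orders:

```kotlin
// Element-wise sum of two matrices stored as arrays of rows.
fun add(a: Array<DoubleArray>, b: Array<DoubleArray>): Array<DoubleArray> {
    // Every element of a needs a partner in b, so the orders must be equal.
    require(a.size == b.size && a.indices.all { a[it].size == b[it].size }) {
        "matrices must have the same order"
    }
    return Array(a.size) { i -> DoubleArray(a[i].size) { j -> a[i][j] + b[i][j] } }
}

fun main() {
    val a = arrayOf(doubleArrayOf(1.0, 2.0), doubleArrayOf(4.0, 5.0))
    val b = arrayOf(doubleArrayOf(3.0, 2.0), doubleArrayOf(8.0, 6.0))
    val c = add(a, b)
    println(c.joinToString("\n") { row -> row.joinToString(" ") })
    // 4.0 4.0
    // 12.0 11.0
}
```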

Scalar Multiplication

Scalar Multiplication – Multiplying a matrix by a scalar constant is called scalar multiplication. All we have to do is multiply each element of the matrix by the given constant. Suppose we have a scalar constant ‘c’ and a matrix ‘A’. Then multiplying ‘A’ by ‘c’ gives:
\(c[A_{ij}] = [c \cdot A_{ij}]\)
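
As a worked instance of this rule, take c = 2 and the 2×2 matrix A used in the addition example:
\(2\begin{bmatrix}1 & 2\\4 & 5\end{bmatrix} = \begin{bmatrix}2 \cdot 1 & 2 \cdot 2\\2 \cdot 4 & 2 \cdot 5\end{bmatrix} = \begin{bmatrix}2 & 4\\8 & 10\end{bmatrix}\)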

				
%use s2
// define matrix A
var A = DenseMatrix(arrayOf(
    doubleArrayOf(1.0, 2.0),
    doubleArrayOf(4.0, 5.0)))
// define the scalar c
val c = 2.0
// M = cA
val M = A.scaled(c)
println(M)

2x2
	[,1] [,2] 
[1,] 2.000000, 4.000000, 
[2,] 8.000000, 10.000000,

Matrix-Vector Multiplication

To define multiplication between a matrix A and a vector x (i.e., the matrix-vector product), we need to view the vector as a column matrix. We define the matrix-vector product only for the case when the number of columns in A equals the number of rows in x. So, if A is an m×n matrix (i.e., with n columns), then the product Ax is defined for n×1 column vectors x. If we let Ax=b, then b is an m×1 column vector. In other words, the number of rows in A (which can be anything) determines the number of rows in the product b.

For a clear understanding, refer to the example below:

If matrix \(A = \begin{bmatrix}1 & 2 & 3\\4 & 5 & 6\end{bmatrix}\) and vector \(v=\begin{bmatrix}1\\2\\3\end{bmatrix}\), then \(Av = 1\begin{bmatrix}1\\4\end{bmatrix}+2\begin{bmatrix}2\\5\end{bmatrix}+3\begin{bmatrix}3\\6\end{bmatrix} = \begin{bmatrix}14\\32\end{bmatrix}\).

Let us illustrate the same using S2.

				
%use s2
// define matrix A
var A = DenseMatrix(arrayOf(
    doubleArrayOf(1.0, 2.0, 3.0),
    doubleArrayOf(4.0, 5.0, 6.0)))
// define a vector
var v = DenseVector(arrayOf(1.0, 2.0, 3.0))

// B = Av
val B = A.multiply(v)
println(B)

[14.000000, 32.000000]
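
The example above builds Av as a combination of the columns of A. An equivalent view, sketched below in plain Kotlin (a hypothetical helper, not the S2 implementation), computes each entry of the result as the dot product of one row of A with the vector:

```kotlin
// Matrix-vector product: entry i of the result is (row i of A) . x
fun multiply(a: Array<DoubleArray>, x: DoubleArray): DoubleArray {
    // The number of columns of A must equal the number of rows of x.
    require(a.all { it.size == x.size }) {
        "number of columns in A must equal the length of x"
    }
    return DoubleArray(a.size) { i -> a[i].indices.sumOf { j -> a[i][j] * x[j] } }
}

fun main() {
    val a = arrayOf(
        doubleArrayOf(1.0, 2.0, 3.0),
        doubleArrayOf(4.0, 5.0, 6.0))
    val v = doubleArrayOf(1.0, 2.0, 3.0)
    val b = multiply(a, v)
    // A is 2x3 and v has 3 entries, so b has 2 entries.
    println(b.joinToString(prefix = "[", postfix = "]")) // [14.0, 32.0]
}
```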

Matrix-Matrix Multiplication

Matrix multiplication is one of the most frequently used operations in linear algebra. We will learn to multiply two matrices as well as go through its important properties.

Before getting to the algorithm, there are a few points to keep in mind.
The multiplication of two matrices of orders i×j and j×k results in a matrix of order i×k. Just keep the outer indices to get the order of the final matrix.
Two matrices are compatible for multiplication only if the number of columns of the first matrix equals the number of rows of the second.
The third point is that the order of multiplication matters.
Don’t worry if these points are not clear yet. You will be able to understand them by the end of this section.
Suppose we are given two matrices A and B to multiply. I will write the final expression first and then explain the steps.
\((AB)_{ij} = \sum_{k} A_{ik}B_{kj}\)

[Figure: illustration of matrix multiplication, from Wikipedia]

In the first illustration, we know that the order of the resulting matrix should be 3×3. So first of all, create a matrix of order 3×3. To determine \((AB)_{ij}\), multiply each element of the ‘i’th row of A with the corresponding element of the ‘j’th column of B, one at a time, and add all the terms.
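
The row-times-column procedure just described can be sketched as three nested loops. This is plain Kotlin written for illustration (the function and the sample matrices are my own assumptions, not the S2 implementation); a 3×2 matrix times a 2×3 matrix is used so that the result is of order 3×3:

```kotlin
// Matrix product: c[i][j] is the sum over k of a[i][k] * b[k][j].
fun multiply(a: Array<DoubleArray>, b: Array<DoubleArray>): Array<DoubleArray> {
    require(b.isNotEmpty()) { "B must have at least one row" }
    val n = b.size        // rows of B, which must equal columns of A
    val p = b[0].size     // columns of B
    require(a.all { it.size == n }) {
        "columns of the first matrix must equal rows of the second"
    }
    // Create the result first: (rows of A) x (columns of B), filled with zeros.
    val c = Array(a.size) { DoubleArray(p) }
    for (i in a.indices) {
        for (j in 0 until p) {
            for (k in 0 until n) {
                c[i][j] += a[i][k] * b[k][j]
            }
        }
    }
    return c
}

fun main() {
    val a = arrayOf(          // order 3x2
        doubleArrayOf(1.0, 2.0),
        doubleArrayOf(3.0, 4.0),
        doubleArrayOf(5.0, 6.0))
    val b = arrayOf(          // order 2x3
        doubleArrayOf(1.0, 0.0, 1.0),
        doubleArrayOf(0.0, 1.0, 1.0))
    val c = multiply(a, b)
    println("${c.size}x${c[0].size}") // 3x3
    println(c.joinToString("\n") { row -> row.joinToString(" ") })
}
```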

Properties of matrix multiplication

Matrix multiplication is associative, provided the given matrices are compatible for multiplication, i.e. (AB)C = A(BC).

Matrix multiplication is not commutative, i.e. AB and BA are not equal in general.
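
A small worked example makes the lack of commutativity concrete. Take \(A = \begin{bmatrix}1 & 2\\4 & 5\end{bmatrix}\) and \(B = \begin{bmatrix}3 & 4\\6 & 7\end{bmatrix}\). Then
\(AB = \begin{bmatrix}1 \cdot 3 + 2 \cdot 6 & 1 \cdot 4 + 2 \cdot 7\\4 \cdot 3 + 5 \cdot 6 & 4 \cdot 4 + 5 \cdot 7\end{bmatrix} = \begin{bmatrix}15 & 18\\42 & 51\end{bmatrix}\)
\(BA = \begin{bmatrix}3 \cdot 1 + 4 \cdot 4 & 3 \cdot 2 + 4 \cdot 5\\6 \cdot 1 + 7 \cdot 4 & 6 \cdot 2 + 7 \cdot 5\end{bmatrix} = \begin{bmatrix}19 & 26\\34 & 47\end{bmatrix}\)
so \(AB \neq BA\) for this pair.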

Matrix multiplication is used in linear and logistic regression when we calculate the value of the output variable by the parameterized vector method. Now that we have learned the basics of matrices, it’s time to apply them.
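
As a rough illustration, assuming the parameterized vector method refers to the usual form \(\hat{y} = X\theta\), where each row of \(X\) holds one observation (with a leading 1 for the intercept) and \(\theta\) is the parameter vector, a linear regression prediction is a single matrix-vector product. The numbers below are made up for the example:
\(\hat{y} = X\theta = \begin{bmatrix}1 & 1\\1 & 2\\1 & 3\end{bmatrix}\begin{bmatrix}0.5\\2\end{bmatrix} = \begin{bmatrix}2.5\\4.5\\6.5\end{bmatrix}\)
In logistic regression the same product \(X\theta\) is computed first, and the sigmoid function is then applied to each entry.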

If \(A\) is an \(m \times n\) matrix and \(B\) is an \(n \times p\) matrix, then the matrix product of these two matrices \(AB\) is an \(m \times p\) matrix \(C\) whose \(k^{th}\) column is defined to be the product of \(A\) and the \(k^{th}\) column of \(B\).

For example, if \(A = \begin{bmatrix}1 & 2\\4 & 5\end{bmatrix}\) and \(B = \begin{bmatrix}3 & 4\\6 & 7\end{bmatrix}\), then the matrix product of \(A, B\) is:

\(AB = C = \begin{bmatrix}\begin{bmatrix}1 & 2\\4 & 5\end{bmatrix}\begin{bmatrix}3\\6\end{bmatrix} & \begin{bmatrix}1 & 2\\4 & 5\end{bmatrix}\begin{bmatrix}4\\7\end{bmatrix}\end{bmatrix}=\begin{bmatrix}3\begin{bmatrix}1\\4\end{bmatrix}+6\begin{bmatrix}2\\5\end{bmatrix} & 4\begin{bmatrix}1\\4\end{bmatrix}+7\begin{bmatrix}2\\5\end{bmatrix}\end{bmatrix}\)

Therefore, \(C = \begin{bmatrix}15 & 18\\42 & 51\end{bmatrix}\)

Implementing using S2 is much easier as you can see below.

				
%use s2
// define matrices A & B
var A = DenseMatrix(arrayOf(
    doubleArrayOf(1.0, 2.0),
    doubleArrayOf(4.0, 5.0)))
var B = DenseMatrix(arrayOf(
    doubleArrayOf(3.0, 4.0),
    doubleArrayOf(6.0, 7.0)))

// C = AB
val C = A.multiply(B)
println(C)

2x2
	[,1] [,2] 
[1,] 15.000000, 18.000000, 
[2,] 42.000000, 51.000000,

The Inverse of a Matrix

The inverse of an invertible matrix is another matrix which on multiplication with the given matrix results in an identity matrix.

For an \(n \times n\) matrix \(A\), \(A^{-1}\) is its inverse, and these two matrices \(A\) and \(A^{-1}\) satisfy the equation \(AA^{-1} = A^{-1}A = I\), where \(I\) denotes the \(n \times n\) identity matrix, which has ones along the diagonal starting at the top left entry and zeros elsewhere.
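
For a 2×2 matrix the inverse can be written down by hand, which gives a quick way to check the definition. The closed-form rule used here is the standard one for 2×2 matrices and is not derived in this article: if \(A = \begin{bmatrix}a & b\\c & d\end{bmatrix}\) and \(ad - bc \neq 0\), then \(A^{-1} = \frac{1}{ad - bc}\begin{bmatrix}d & -b\\-c & a\end{bmatrix}\). For example, with \(A = \begin{bmatrix}1 & 2\\4 & 5\end{bmatrix}\) we have \(ad - bc = 1 \cdot 5 - 2 \cdot 4 = -3\), so
\(A^{-1} = \frac{1}{-3}\begin{bmatrix}5 & -2\\-4 & 1\end{bmatrix} = \begin{bmatrix}-\frac{5}{3} & \frac{2}{3}\\\frac{4}{3} & -\frac{1}{3}\end{bmatrix}\)
and indeed
\(AA^{-1} = \begin{bmatrix}1 \cdot (-\frac{5}{3}) + 2 \cdot \frac{4}{3} & 1 \cdot \frac{2}{3} + 2 \cdot (-\frac{1}{3})\\4 \cdot (-\frac{5}{3}) + 5 \cdot \frac{4}{3} & 4 \cdot \frac{2}{3} + 5 \cdot (-\frac{1}{3})\end{bmatrix} = \begin{bmatrix}1 & 0\\0 & 1\end{bmatrix} = I\)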

Invertible Matrix Theorem

Statement: Suppose that \(A\) is an \(n \times n\) matrix. Then the following are equivalent (that is, for a given matrix they are either all true or all false).

(1) The transformation \(x \rightarrow Ax\) from \(\mathbb{R}^n\) to \(\mathbb{R}^n\) is bijective.
(2) The range of \(A\) is \(\mathbb{R}^n\).
(3) The null space of \(A\) is \(\left\{0\right\}\).

Proof:

Let us begin by showing that (2) and (3) are equivalent.
The range of \(A\) is the span of its \(n\) columns. If the columns of \(A\) are linearly dependent, then the range of \(A\) is spanned by fewer than \(n\) vectors, so it cannot be all of \(\mathbb{R}^n\).
Therefore, if the range of \(A\) is \(\mathbb{R}^n\) (that is, the rank of \(A\) is equal to \(n\)), then the columns of \(A\) are linearly independent.
This implies that a linear combination of the columns is equal to the zero vector only if the weights are all zero. In other words, the only solution of the equation \(Ax = 0\) is the zero vector.
That is, the null space of \(A\) is \(\left\{0\right\}\).
Conversely, if the null space of \(A\) is \(\left\{0\right\}\), then the columns of \(A\) are linearly independent, so the rank of \(A\) is equal to \(n\) and the range of \(A\) is \(\mathbb{R}^n\).
Statement (2) says that the transformation is surjective, and statement (3) says that it is injective, since \(Ax = Ay\) implies \(A(x - y) = 0\). By definition of bijectivity, (2) and (3) together imply (1), and (1) implies (2) and (3). Therefore, the three given statements are equivalent.
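
To see the theorem at work on a matrix that is not invertible, consider the following small example of my own choosing: \(A = \begin{bmatrix}1 & 2\\2 & 4\end{bmatrix}\). Its second column is twice the first, so the columns are linearly dependent. All three statements fail together:
\(A\begin{bmatrix}2\\-1\end{bmatrix} = 2\begin{bmatrix}1\\2\end{bmatrix} - 1\begin{bmatrix}2\\4\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix}\)
so the null space contains a nonzero vector and (3) is false. The range is only the line spanned by \(\begin{bmatrix}1\\2\end{bmatrix}\), so (2) is false, and since both \(\begin{bmatrix}2\\-1\end{bmatrix}\) and \(\begin{bmatrix}0\\0\end{bmatrix}\) are sent to the zero vector, the transformation is not bijective and (1) is false.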

Computing inverse of an invertible matrix using S2:

				
%use s2

// Create a matrix
val A = DenseMatrix(arrayOf(
    doubleArrayOf(5.0, 4.0, 4.0, 1.0, 5.0, 4.0, 2.0, 4.0, 1.0, 1.0), 
    doubleArrayOf(4.0, 5.0, 2.0, 2.0, 1.0, 2.0, 4.0, 5.0, 5.0, 2.0), 
    doubleArrayOf(5.0, 5.0, 3.0, 3.0, 5.0, 2.0, 3.0, 4.0, 1.0, 3.0), 
    doubleArrayOf(1.0, 2.0, 5.0, 5.0, 3.0, 1.0, 4.0, 3.0, 3.0, 3.0), 
    doubleArrayOf(2.0, 2.0, 4.0, 2.0, 3.0, 1.0, 3.0, 5.0, 4.0, 4.0), 
    doubleArrayOf(5.0, 4.0, 5.0, 1.0, 1.0, 3.0, 2.0, 3.0, 3.0, 4.0), 
    doubleArrayOf(3.0, 4.0, 4.0, 3.0, 4.0, 3.0, 2.0, 5.0, 5.0, 5.0), 
    doubleArrayOf(3.0, 4.0, 3.0, 3.0, 2.0, 1.0, 4.0, 2.0, 2.0, 1.0), 
    doubleArrayOf(4.0, 1.0, 1.0, 1.0, 1.0, 4.0, 4.0, 2.0, 2.0, 1.0), 
    doubleArrayOf(1.0, 5.0, 5.0, 5.0, 1.0, 1.0, 2.0, 4.0, 1.0, 4.0)))

// Compute the inverse
val Ainv = Inverse(A)
println("inverse: $Ainv\n")

//verification
val I = A.multiply(Ainv) // A*A_inverse = I, the identity matrix
println("A*Ainv = I: $I\n")
val det: Double = MatrixMeasure.det(I)
println("determinant of I: $det")

inverse: 10x10
	[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] 
[1,] -0.053795, 0.540354, 0.510748, 0.754387, -0.463907, 0.379793, -0.429973, -0.927418, -0.199307, -0.302318, 
[2,] 0.001053, -0.302488, -0.254928, -0.616193, 0.184844, -0.173248, 0.320145, 0.765180, 0.035960, 0.192260, 
[3,] 0.147015, -0.039141, -0.159575, 0.114186, 0.055723, 0.143278, -0.125604, 0.033857, -0.110351, -0.006015, 
[4,] -0.038542, 0.511186, 0.375881, 0.827975, -0.618041, 0.145380, -0.229813, -0.892096, -0.138295, -0.131325, 
[5,] 0.043452, -0.227621, -0.015788, -0.167973, 0.124814, -0.146489, 0.205971, 0.313659, -0.016667, -0.069269, 
[6,] 0.105785, -0.243783, -0.286237, -0.295539, 0.006206, -0.156237, 0.295182, 0.273379, 0.264409, 0.178383, 
[7,] -0.086773, -0.457569, -0.250290, -0.616079, 0.523478, -0.228678, 0.128076, 0.843330, 0.287830, 0.162570, 
[8,] 0.124988, 0.342720, 0.122915, 0.217325, 0.045236, -0.040330, -0.304115, -0.571875, -0.039731, 0.070352, 
[9,] -0.012258, 0.205563, -0.049505, 0.243281, -0.199434, 0.080622, 0.113154, -0.106082, -0.144272, -0.205092, 
[10,] -0.239306, -0.394510, 0.010704, -0.450285, 0.297947, -0.029015, 0.238082, 0.378149, 0.185634, 0.129286, 

A*Ainv = I: 10x10
	[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] 
[1,] 1.000000, -0.000000, 0.000000, -0.000000, -0.000000, 0.000000, 0.000000, -0.000000, 0.000000, 0.000000, 
[2,] 0.000000, 1.000000, 0.000000, 0.000000, 0.000000, 0.000000, -0.000000, -0.000000, 0.000000, 0.000000, 
[3,] 0.000000, 0.000000, 1.000000, 0.000000, -0.000000, 0.000000, 0.000000, -0.000000, 0.000000, 0.000000, 
[4,] -0.000000, -0.000000, 0.000000, 1.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, -0.000000, 
[5,] 0.000000, 0.000000, 0.000000, 0.000000, 1.000000, 0.000000, 0.000000, 0.000000, -0.000000, 0.000000, 
[6,] 0.000000, -0.000000, 0.000000, 0.000000, 0.000000, 1.000000, 0.000000, -0.000000, 0.000000, 0.000000, 
[7,] 0.000000, -0.000000, 0.000000, 0.000000, -0.000000, -0.000000, 1.000000, -0.000000, 0.000000, 0.000000, 
[8,] 0.000000, -0.000000, 0.000000, -0.000000, 0.000000, -0.000000, 0.000000, 1.000000, 0.000000, 0.000000, 
[9,] 0.000000, -0.000000, 0.000000, -0.000000, 0.000000, -0.000000, -0.000000, 0.000000, 1.000000, 0.000000, 
[10,] 0.000000, -0.000000, 0.000000, 0.000000, 0.000000, 0.000000, -0.000000, -0.000000, -0.000000, 1.000000, 

determinant of I: 1.0000000000000016
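
The entries printed as -0.000000 and the determinant 1.0000000000000016 are typical of floating-point rounding: the computed product is only approximately the identity. A common way to verify such a result is to compare against the identity within a small tolerance. The sketch below is plain Kotlin with a hypothetical helper and made-up sample values, not part of S2:

```kotlin
import kotlin.math.abs

// True when m is square and every entry is within tol of the identity matrix.
fun isApproxIdentity(m: Array<DoubleArray>, tol: Double = 1e-9): Boolean =
    m.all { it.size == m.size } &&
        m.indices.all { i ->
            m.indices.all { j -> abs(m[i][j] - (if (i == j) 1.0 else 0.0)) <= tol }
        }

fun main() {
    // Made-up product with rounding noise of the kind seen in the output above.
    val product = arrayOf(
        doubleArrayOf(1.0000000000000002, -1.1e-16),
        doubleArrayOf(2.2e-16, 0.9999999999999998))
    println(product[0][1] == 0.0)      // false: an exact comparison fails
    println(isApproxIdentity(product)) // true: equal to I within the tolerance
}
```

The tolerance 1e-9 is an arbitrary choice for this sketch; a suitable value depends on the size and conditioning of the matrix.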
