Random Numbers are used widely – such as encrypting sensitive information eg. bank details sent to web servers! Many implementations of encryption algorithms use long, random sequence of numbers (ie. high entropy) as keys (inputs). 

For our random numbers to be secure 🔒 – ie. they should not be biased towards a certain range, making guessing them harder. This is also known as uniformity as each and every number (inclusively in a given series) has an equal chance of being chosen.

Linear Congruential Methods (LCM) / Generators

Our first type of pseudorandom number generator is the Linear Congruential Generator(LCG).

A LCG generates these numbers from a initial value – the seed, and which we use as a input into a recurrence relation. Similar to a recursive function, each call made to the function takes in the previously generated number, and returns a output in a long sequence of seemingly “random” numbers

A recurrence relation is a way of expressing each element of a sequence as a function of the previous ones

This is also related to the idea of recursion – a concept that solves a problem by solving a smaller version of itself each time. For a recursion to occur, there must be a base or terminating case, and defined in terms of calling the function itself.

Example of Recusion: What is 3! (factorial)?

The linear congruential generator is defined by the equation below:

\(X_{n+1} = \left( a X_n + c \right)\bmod m\)

  • \(X_n\) refers to a random number (part of a long random number sequence)
  • \(X_{n+1}\) refers to the next term in the series (in this case, the output of the equation i.e, the random number) 
  • \(m\) , where \(0<m\) , the modulus
  • \(a\) , where \(0<a<m\), the multiplier
  • \(c,\,0 \le c<m\) , where \(c\) is the increment
  • \(X_{n=0},\,0 \le X_0 < m\) which is the starting (aka “seed”) value

Observations: We realise that the power of the largest variable \(X_n\) is 1, indicating that this is a first degree polynomial. This is known as a linear recurrence relation.

According to the equation, we realise that the formula continuously cycles between \(0\) and \(m-1\). We can demonstrate this through the use of a trace program (useful for exploring recursive cases). First, let’s create a function LinearCongruentialGenerator based on the formula above.

				
					// LCGs take in a dataclass type consisting of the various parameters
data class VarNums(val a: Int, var seed:Int, val c:Int, val modulus:Int)

fun LinearCongruentialGenerator(vn: VarNums) {
    var count = 0
    while (count≤vn.modulus) {
        vn.seed = (vn.a * vn.seed + vn.c) % vn.modulus
        count++
        println(vn.seed)
    }
}

LinearCongruentialGenerator((VarNums(23,12,17,37)))
				
			
				
					
3
28
23
24
18
25
12
3
28
23
24
18
25
12
3
28
23
24
18
25
12
3
28
23
24
18
25
12
3
				
			

Notice that the numbers seem to be random at first (3, 28, 23…) and then they repeat. This is known as the period of the function. This period is important as it dictates how long an algorithm can continue to churn out unique sequences of numbers before returning to the first value in a sequence, as this affects the viability of the “pseudo” random number generator.

A popular adaptation of the LCM is the Lehmer random number generator. It sets \(m\), the modulus to a prime number (or multiples of it), and sets the increment to 0. Most importantly, the initial seed \(X_0\) must be coprime to the modulus \(m\) which is not required for other Linear Congruential generators

Recall: What is co-prime?

Co-prime describe a relationship between 2 numbers whereby the only common positive integer that divides them (aka, greatest common divisor) is 1.Notice in the image how the line does not intersect at any other point​

Other examples of Linear Congruential generators can be found here . We can generate a RNG using LCGs such as the Lehmer generator below!

				
					var rng1 = Lehmer()
rng1.seed(1234567890L)
rng1.nextLong()

				
			
				
					1435150771
				
			

The methods available for the Lehmer generator are found here under Method Summary.

  • In this example, the Long type is used to store numbers (64 bit integers = ±2^63 = ± 9.22 * 10^18) which exceed the range of integers (ranges from 32 bit integers = ±2^31 = ±2 147 483 648).
  • Notice the it is calculated to the power-1 (eg. for 64-bit \(2^{64-1}=2^{63}\)) as one of the bits is used to store the sign (+ or -) for signed integers.
  • Additionally, when we assign the Linear function to the variable rng1, we need not specify the type in kotlin, as kotlin has type inference.

Try generating numbers using other generators such as LEcuyer . To sample from the distribution, they scale integers to within the same unit interval.

Now, let us use another common algorithm, Mersenne Twister. It had a vastly longer period of 219937 -1 is much more than 232 of the Linear Congruential Methods. Thus, it is one of the most widely used.

We can make use of kotlin to measure the time needed.

				
					// remember to import s2 library for RNG/sampling functions!
%use s2

var rng = MersenneTwister()

//get computer's system time
val startTime: Long = System.nanoTime()
val N: Int = 1000000

//generate 1 million random numbers
for (i in 0..N) {
    rng.nextDouble()
} 

//take curerent time and subtract from earlier time
val endTime = System.nanoTime()
val duration = (endTime-startTime)
val ms = duration.toDouble()/1000000

println("%f millisecs to generate %d random numbers".format(ms, N))
				
			
				
					44.066598 millisecs to generate 1000000 random numbers
				
			

Practice time: Contrast this with the time needed for the Lehmer generator.

Which is shorter? 🤔

Be careful! Parameters of a LCG matter significantly

We can try to plot some of the generated random numbers using the Linear Congruential generator using a scatter plot.

A scatter plot can allow us to quickly visualise the spread of the random numbers and see if there are any any underlying patterns. For instance, if we were to pick random numbers without proper consideration, we can end up with obviously not “random numbers”.

%use s2, lets-plot
import java.util.Arrays;
// Parameters taken from https://www.mimuw.edu.pl/~apalczew/CFP_lecture3.pdf   val a = 1229 val modulus = 2048 val c = 5000 val seed = 1234567890   data class VarNums(val a: Int, var seed:Int, val c:Int, val modulus:Int)   fun LinearCongruentialGenerator(vn: VarNums): IntArray{   var rngs = IntArray(2048) var count = 0 // initialise increment in recurrence relation while (count < vn.modulus) { vn.seed = (vn.a * vn.seed + vn.c) % vn.modulus rngs[count] = vn.seed //println(vn.seed) //println(count) count++   } return rngs }   var rngs = LinearCongruentialGenerator((VarNums(a, seed, c, modulus)))   // Split data arbitrarily for plotting with x and y coords val data1 = mapOf( "xvar" to Arrays.copyOfRange(rngs, 0,1020).toList(), "yvar" to Arrays.copyOfRange(rngs, 1020,2040).toList() )   // construct a scatter plot val p2 = ggplot(data1) { x = "xvar"; y = "yvar"} + // supply data; set legends ggsize(500, 250) + // plot size // draw a scatter plot geom_point() p2

As you can see from the diagram below, a quick visualisation can allow to see if the numbers generated appear random. Patterns can be seen on the 2 charts below, meaning that the sequence of random number are anything but random!

Parameters used for 2 charts below:

  • a = 1229
  • modulus = 2048
  • c = 5000
  • seed = 1234567890
c=500
c = 3

The chart below uses different parameters, and the points below seem much more random

Parameters used for chart below:

  • a = 1103515245
  • modulus = 2 ^32
  • c = 19851
  • seed = 1234567890
c = 19851

In 2016, Samsung Pay was found to have a vulnerability in which tokens translating credit card data were not well randomised and could possibly be predicted. This allowed potential hackers to eavesdrop on these magnetic signals when a user activated their Samsung Pay and reverse engineer their credit card data for fradulent purposes.

While it is beyond the scope of this chapter to explore best practices for picking seeds or optimal parameters, one must always be careful and use tried and tested methods before picking their parameters, or using in built ones from NMDev to ensure that randomization occurs properly!