ARIMA Model - NM EDUCATION


What is ARIMA?

The acronym ARIMA refers to “autoregressive integrated moving average.”

In statistics and econometrics, ARIMA is a method for forecasting quantities that evolve over time. A series of historical observations is used to fit a model, which then predicts future values. It applies when a metric is recorded at regular intervals, such as every fraction of a second, every day, every week, or every month. The Box-Jenkins method is the standard procedure for identifying, fitting, and checking ARIMA models.

The Box-Jenkins model is a mathematical model that uses the observations of a given time series to forecast future values. For forecasting purposes, it can be applied to many kinds of time series data.

To produce forecasts, the model works with the differences between successive data points rather than the raw values. It combines autoregression, moving averages, and seasonal differencing to identify trends and generate forecasts. ARIMA models may be used, for example, to forecast future stock prices or earnings from a company’s past performance.

The moving-average part of ARIMA uses lagged forecast errors to smooth noise in the time series. Forecasting future security prices is a common application in technical analysis. Autoregressive models assume that the future will resemble the past, so in the event of a financial crisis or rapid technological change their forecasts can prove inaccurate.

In simple words, ARIMA is used as follows:

Historical data is analyzed to predict or forecast future outcomes. Data points from the past influence data points in the future, a concept known as serial correlation.
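Serial correlation can be measured directly. The following is a minimal sketch, not part of the original lesson: the function name `autocorrelation` and the sample figures are made up for illustration. It computes the sample autocorrelation of a series with a copy of itself shifted by `lag` steps.

```kotlin
// Lag-k sample autocorrelation: how strongly the series correlates with
// a copy of itself shifted by `lag` steps. The result lies between -1 and 1.
fun autocorrelation(series: List<Double>, lag: Int): Double {
    require(lag in 0 until series.size) { "lag must be between 0 and series.size - 1" }
    val mean = series.average()
    // Sum of squared deviations from the mean (the lag-0 term).
    var denominator = 0.0
    for (value in series) {
        denominator += (value - mean) * (value - mean)
    }
    require(denominator > 0.0) { "series must not be constant" }
    // Sum of products of deviations that are `lag` steps apart.
    var numerator = 0.0
    for (t in lag until series.size) {
        numerator += (series[t] - mean) * (series[t - lag] - mean)
    }
    return numerator / denominator
}

// Hypothetical daily sales figures, made up for illustration only.
val exampleSales = listOf(10.0, 12.0, 13.0, 15.0, 18.0, 20.0, 21.0, 24.0)
val lag1 = autocorrelation(exampleSales, 1)
```

A value of `lag1` well above zero suggests that each observation carries information about the next one, which is the situation ARIMA is designed to exploit.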

How does ARIMA forecasting work?

ARIMA forecasts the variable of interest from its own time series data. Statistical software checks the data for stationarity and determines how many lags to use and how much differencing is required. The output of the fitted model is often interpreted in much the same way as that of a multiple linear regression.

Understanding Autoregressive Integrated Moving Average

As a form of regression analysis, an autoregressive integrated moving average model measures the strength of one dependent variable relative to other changing variables. By examining the differences between values in a series instead of the actual values, the model tries to predict future securities or market moves.

Each component of an ARIMA model can be explained as follows:

Autoregression (AR): The variable is regressed on its own lagged (prior) values.
Integrated (I): The raw observations are differenced so that the time series becomes stationary (each data value is replaced with the difference between it and the prior value).
Moving average (MA): The model incorporates the dependency between an observation and the residual errors of a moving average model applied to lagged observations.

ARIMA combines autoregressive and moving-average terms. An AR(1) autoregressive process, for instance, is based on the single value preceding the current one, while an AR(2) process is based on the two preceding values. The moving-average terms dampen the influence of outliers by modeling the current value as a weighted combination of recent forecast errors. Together with differencing, this combination lets ARIMA models account for trends, cycles, seasonality, and other non-stationary behavior when making forecasts.
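To make the AR(1) idea concrete, here is a minimal sketch that fits x[t] ≈ intercept + slope * x[t-1] by ordinary least squares on the pairs of consecutive values. The names `Ar1Fit` and `fitAr1` are hypothetical helpers introduced only for this illustration; a full ARIMA fit would also estimate moving-average terms, which this sketch leaves out.

```kotlin
// Fitted AR(1) model: x[t] is approximated by intercept + slope * x[t-1].
data class Ar1Fit(val intercept: Double, val slope: Double) {
    // One-step-ahead forecast from the most recent observation.
    fun predictNext(previous: Double): Double = intercept + slope * previous
}

// Ordinary least squares on the pairs (x[t-1], x[t]).
fun fitAr1(series: List<Double>): Ar1Fit {
    require(series.size >= 3) { "need at least three observations" }
    val lagged = series.dropLast(1)  // x[t-1]
    val current = series.drop(1)     // x[t]
    val meanLagged = lagged.average()
    val meanCurrent = current.average()
    var sxy = 0.0
    var sxx = 0.0
    for (i in lagged.indices) {
        sxy += (lagged[i] - meanLagged) * (current[i] - meanCurrent)
        sxx += (lagged[i] - meanLagged) * (lagged[i] - meanLagged)
    }
    require(sxx > 0.0) { "lagged values must not all be equal" }
    val slope = sxy / sxx
    return Ar1Fit(meanCurrent - slope * meanLagged, slope)
}

// Synthetic series generated by x[t] = 2 + 0.5 * x[t-1], for illustration only.
val syntheticSeries = listOf(0.0, 2.0, 3.0, 3.5, 3.75, 3.875)
val ar1 = fitAr1(syntheticSeries)
```

Because the synthetic series follows the AR(1) rule without noise, the fit should recover an intercept close to 2 and a slope close to 0.5; real data would give noisier estimates.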

Parameters in ARIMA

ARIMA models are described with the standard notation ARIMA(p, d, q), where integer values are substituted for the three parameters to specify the model in use. The parameters are defined as:

p: the number of lagged observations included in the model (lag order).

d: the number of times the raw observations are differenced (degree of differencing).

q: the size of the moving average window (order of the moving average).

As in a linear regression model, these parameters determine the number and type of terms included. A value of 0 for a parameter means that the corresponding component is not used in the model. In this way, an ARIMA model can be configured to act as an ARMA model, or even as a simple AR, I, or MA model.
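The effect of zero-valued parameters can be sketched with a small helper type. `ArimaOrder` and `family` are hypothetical names used only for this illustration and do not come from any library.

```kotlin
// Hypothetical helper type (not from any library) holding the ARIMA(p, d, q) order.
data class ArimaOrder(val p: Int, val d: Int, val q: Int) {
    init {
        require(p >= 0 && d >= 0 && q >= 0) { "p, d and q must be non-negative" }
    }

    // Name of the simpler model family this order reduces to, if any.
    fun family(): String = when {
        d == 0 && p > 0 && q == 0 -> "AR($p)"
        d == 0 && p == 0 && q > 0 -> "MA($q)"
        d == 0 && p > 0 && q > 0 -> "ARMA($p, $q)"
        p == 0 && q == 0 && d > 0 -> "I($d)"
        p == 0 && d == 0 && q == 0 -> "white noise"
        else -> "ARIMA($p, $d, $q)"
    }
}

// With d = 0 and q = 0 only the autoregressive part remains.
val pureAr = ArimaOrder(1, 0, 0).family()
```

For example, `ArimaOrder(2, 0, 1).family()` describes an ARMA model, while `ArimaOrder(1, 1, 1).family()` keeps all three components.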

Explaining Autoregressive Integrated Moving Average (ARIMA) and Stationarity

In an autoregressive integrated moving average model, the data are differenced to make them stationary. A stationary series is one whose statistical properties, such as its mean and variance, remain constant over time. Most economic and market data show trends, so differencing is used to remove them.
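Differencing itself is a simple operation. The sketch below (the function name `difference` is an assumption for this illustration) replaces each value with its change from the previous value and repeats that `d` times, matching the d parameter of ARIMA.

```kotlin
// Applies first-order differencing `d` times: each pass replaces the series
// with the changes between consecutive values, shortening it by one element.
fun difference(series: List<Double>, d: Int = 1): List<Double> {
    require(d >= 0) { "d must be non-negative" }
    var current = series
    repeat(d) {
        current = current.zipWithNext { previous, next -> next - previous }
    }
    return current
}

// A series with a quadratic trend (1, 4, 9, 16, 25), for illustration only.
val quadraticTrend = listOf(1.0, 4.0, 9.0, 16.0, 25.0)
val onceDifferenced = difference(quadraticTrend, 1)
val twiceDifferenced = difference(quadraticTrend, 2)
```

One pass still leaves a linear trend in `onceDifferenced`, while the second pass in `twiceDifferenced` leaves a constant series, which is why trending data sometimes needs d = 2.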

Seasonality, meaning regular and predictable data patterns that repeat over the course of a calendar year, can also harm the regression model. If a trend is present and stationarity is not evident, many of the computations in the process cannot be carried out effectively.
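A common way to handle such repeating patterns is seasonal differencing, where each value is compared with the value one full season earlier. The following is a minimal sketch; the function name `seasonalDifference` and the quarterly figures are assumptions made for illustration.

```kotlin
// Seasonal differencing: subtracts from each value the value observed
// `period` steps earlier (for example period = 4 for quarterly data).
fun seasonalDifference(series: List<Double>, period: Int): List<Double> {
    require(period in 1 until series.size) { "period must be between 1 and series.size - 1" }
    return List(series.size - period) { i -> series[i + period] - series[i] }
}

// Hypothetical quarterly data: a repeating 10/20/30/40 pattern that grows by 1 each year.
val quarterly = listOf(
    10.0, 20.0, 30.0, 40.0,
    11.0, 21.0, 31.0, 41.0,
    12.0, 22.0, 32.0, 42.0
)
val deseasonalized = seasonalDifference(quarterly, 4)
```

In this made-up example the repeating pattern cancels out and only the year-over-year change remains in `deseasonalized`.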

In an ARIMA model, a one-time shock continues to affect future values indefinitely. As a result, today’s autoregressive models still carry the legacy of the financial crisis.
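The persistence of a shock can be illustrated with the simplest case, an AR(1) process x[t] = phi * x[t-1] + e[t]. In that model, a unit shock at time 0 contributes phi^k to the value k steps later. The sketch below uses a hypothetical function name, `impulseResponse`, and covers only this AR(1) case.

```kotlin
// Effect of a one-time unit shock on an AR(1) process x[t] = phi * x[t-1] + e[t].
// Element k of the result is the remaining effect k steps after the shock, phi^k.
fun impulseResponse(phi: Double, steps: Int): List<Double> {
    require(steps >= 0) { "steps must be non-negative" }
    val response = ArrayList<Double>(steps + 1)
    var effect = 1.0
    for (k in 0..steps) {
        response.add(effect)
        effect *= phi
    }
    return response
}

// phi = 0.5: the shock fades geometrically but never reaches exactly zero.
val fadingShock = impulseResponse(0.5, 3)
// phi = 1.0 (a random walk, the kind of series that needs differencing): the shock is permanent.
val permanentShock = impulseResponse(1.0, 3)
```

With |phi| below 1 the effect shrinks at every step without ever vanishing completely; with phi equal to 1 it does not shrink at all.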

Case Study: How ARIMA made a difference

In ARIMA models, current and future values are assumed to depend on the residual effects of past values. An investor using an ARIMA model to forecast stock prices, for example, assumes that recent market transactions will influence new buyers and sellers when they decide how much to offer or accept for a security.

In many circumstances this assumption holds, but not always. In the years before the 2008 Financial Crisis, for example, most investors did not realize the risks posed by the mortgage-backed securities (MBS) held by many financial institutions.

Investors using an autoregressive model at the time would have had good reason to predict that U.S. financial stocks would continue to trend upward. Once it became public knowledge that many financial institutions were at risk of imminent collapse, investors began to care far less about recent stock prices and far more about the underlying risk in each company. A model based on autoregression would have been completely confounded.

Dataset

The dataset contains sales data for almost 1,000 products over a period of almost 120 days. It also identifies around 100 key products, which are the most popular and are distributed among around 13,490 customers.

Coding

Let’s dive into the coding part for our ARIMA example, which predicts the sales of key products over a specific period of time. We begin by importing the libraries we need.

				
%use s2
// import the required libraries
import java.io.*
import java.util.StringTokenizer
import org.apache.commons.math3.stat.regression.SimpleRegression

Here we predict the sales of each product. The class below reads the training dataset, tokenizes each line, and passes the resulting series to the model.

				
class prediction {
    // reads the training set and writes a forecast for every product
    @Throws(IOException::class)
    fun main() {
        val output = File("output.txt") // the output file
        output.delete() // start from an empty output file
        val first = 88 // first row used for prediction
        val last = 117 // end of the prediction range (exclusive)
        val inputfile = IntArray(118) // array for the full series of days
        val sales = Array(117) { DoubleArray(2) } // (previous day, next day) pairs
        forecasttotal() // forecast the total over all products first
        val file = BufferedReader(FileReader("product_distribution_training_set.txt")) // training data
        try {
            var product = file.readLine()
            while (product != null) {
                var tokens = 0
                val token = StringTokenizer(product) // split the line on whitespace
                val ID = token.nextToken().toInt() // the first token is the product ID
                while (token.hasMoreTokens()) { // the remaining tokens are daily sales
                    inputfile[tokens++] = token.nextToken().toInt()
                }
                // pair each day's sales with the following day's sales
                var l = 0
                while (l < tokens - 1) {
                    sales[l][0] = inputfile[l].toDouble()
                    sales[l][1] = inputfile[l + 1].toDouble()
                    l++
                }
                Ar1model(sales, ID, first, last) // fit the AR(1) model and write the predictions
                product = file.readLine()
            }
        } catch (except: Exception) { // report any parsing or I/O problem
            println(except)
        } finally {
            file.close()
        }
    }

In this section we apply the same approach to the test data: the daily sales of all products are summed, and the model is run on the totals.

				
    // sums the daily sales of all products and forecasts the total
    @Throws(IOException::class)
    fun forecasttotal() {
        val file = BufferedReader(FileReader("./test.txt")) // reading the test file
        val inputfile = IntArray(118) // daily sales of one product
        val total = IntArray(118) // daily sales summed over all products (starts at 0)
        val sales = Array(118) { DoubleArray(2) }
        try {
            var product = file.readLine()
            while (product != null) {
                var tokens = 0
                val token = StringTokenizer(product)
                token.nextToken() // skip the product ID
                while (token.hasMoreTokens()) {
                    inputfile[tokens++] = token.nextToken().toInt()
                }
                for (i in 0 until tokens) { // add this product's sales to the totals
                    total[i] += inputfile[i]
                }
                product = file.readLine()
            }
        } catch (except: Exception) {
            println(except)
        } finally {
            file.close()
        }
        // pair each day's total with the following day's total
        for (k in 0 until total.size - 1) {
            sales[k][0] = total[k].toDouble()
            sales[k][1] = total[k + 1].toDouble()
        }
        // Ar1model writes the ID itself, so no separate writer is needed here
        Ar1model(sales, 0, 88, 117) // ID 0 marks the total over all products
    }

In this block, a simple linear regression of each day's sales on the previous day's sales serves as the AR(1) model. We fit it to the data and append the predicted values to the output file.

				
    // fits an AR(1) model to the (previous, next) pairs and writes the predictions
    @Throws(IOException::class)
    fun Ar1model(product: Array<DoubleArray>, ID: Int, startpred: Int, lastpred: Int) {
        val writer = BufferedWriter(FileWriter("output.txt", true)) // append to the output file
        writer.append(Integer.toString(ID))
        val training = SimpleRegression() // simple linear regression as the AR(1) model
        training.addData(product)
        // predict the next value for every row in the prediction range
        var i = startpred
        while (i < lastpred) {
            val value = training.predict(product[i][1])
            writer.append("\t")
            writer.append(Integer.toString(Math.floor(value).toInt()))
            i++
        }
        writer.append(System.lineSeparator())
        writer.close()
    }
}

In this last section we create an instance of the class and call main() to run the prediction.

val neural = prediction() // create an instance of the prediction class
// one call is enough: main() deletes output.txt and rewrites it from scratch each time
neural.main()

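Output

The program writes its predictions to output.txt, one line per series: the product ID (0 for the total over all products) followed by the tab-separated predicted sales values.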

Summarizing ARIMA

Many economic sectors have come to value the ability to forecast. Manufacturing companies, for instance, use the ARIMA model for business planning. A company’s supply chain and production activities can be severely disrupted if its forecast is wrong, while accurate predictions make it easier to lower costs and meet customer expectations.

The ARIMA model can also be used in climate studies and for studying weather patterns that could lead to severe storms, such as those associated with greenhouse gases.
