Principal Component Analysis of Non-stationary Time Series Data

Nabeel Ahmed, Dr Zahid Hussain Shaikh.

1. Abstract
This article explores the use of Principal Component Analysis (PCA) on non-stationary time series data. Understanding
the underlying patterns in such data is essential because dynamic and complex data sets are increasingly common in
areas such as healthcare, environmental studies, and finance. The study uses PCA as a statistical method to identify
the main characteristics of non-stationary time series data and to reduce their complexity. PCA transforms the original
variables into new variables that are mutually uncorrelated [1]; these new variables are then used to find the main
patterns of variation within the data. We describe the preparatory and analytical stages for a dynamic data set, the
Karachi Stock Exchange 100 Index (KSE-100), and review the theoretical foundations of PCA in the context of moving
averages.

Keywords: PCA, Non-stationary, VAR, KSE-100.

2. Introduction
Financial markets change often and quickly, presenting investors, policymakers, and analysts with both opportunities
and challenges. Advanced tools and methodologies are needed to study and predict these markets because their
turbulence depends on many factors, such as economic statistics, political events, and investor sentiment.
The KSE-100 Index, a stock market index representing Pakistan's major stock exchange, is an important dataset for
investigating the properties of non-stationary time series in financial markets. This article proposes an
approach to analyzing and forecasting values of the KSE-100 Index using Principal Component Analysis (PCA)
in conjunction with the Vector Autoregression (VAR) model.

Non-stationary time series data pose a considerable challenge for conventional time series analysis techniques.
More often than not, real financial time series do not satisfy the stationarity assumed by common methods, which
leads to imprecise predictions and evaluations. To address this challenge, our study employs Principal Component
Analysis (PCA), which offers a comparatively simple method for reducing dimensionality [1], to identify and extract
the important features of the KSE-100 Index data. Converting the original dataset into a smaller number of
uncorrelated variables, the principal components, simplifies the analysis while improving its quality. This
allows for a more manageable and informative examination of the data's fundamental structure and behavior. Building
on the PCA step, we then fit a VAR model to the processed data set. Because a VAR model captures linear associations
among several time series variables, it is well suited to forecasting financial indices, whose components are often
strongly correlated. We use the transformed components as the input variables of the VAR model so that the
non-stationarity of the original KSE-100 Index data is carried into the forecasting step, with the aim of improving
the precision and dependability of our forecasts.

3. Literature Review
This project uses an AR(1) model, a VAR model, and Principal Component Analysis. We provide concise
explanations of each of these models and techniques.
Autoregressive (AR) Model [2]
An AR(1) model, or autoregressive model of order one, is a basic time series model that
represents the current value of a variable as a linear function of its value one time period
earlier. Mathematically, an AR(1) model may be expressed as:

$$X_t = \phi X_{t-1} + \epsilon_t$$

where

• $X_t$ is the value of the time series at time $t$,
• $\phi$ is the autoregressive parameter, representing the influence of the previous time period on the current one,
• $X_{t-1}$ is the value of the time series at the previous time (lag 1),
• $\epsilon_t$ is the white noise or error term at time $t$, representing random shocks or disturbances.

The autoregressive parameter $\phi$ determines the strength and direction of the relationship between the
immediate past value and the current value. If $|\phi| < 1$, the model is stable and the influence of
earlier values decays over time. The AR(1) model is widely used in time series analysis to forecast
processes that depend linearly on their immediate past value.
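As a minimal sketch of the AR(1) equation above, the following Python snippet simulates a series and recovers the autoregressive parameter by least squares. The function names, the value phi = 0.6, the sample size, and the standard normal noise are all illustrative assumptions and are not taken from the study.

```python
import random

def simulate_ar1(phi, n, seed=0):
    """Simulate X_t = phi * X_{t-1} + eps_t with standard normal white noise."""
    rng = random.Random(seed)
    x = [0.0]  # assumed starting value
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x

def estimate_phi(x):
    """Least-squares estimate of phi from a regression of X_t on X_{t-1} (no intercept)."""
    num = sum(prev * cur for prev, cur in zip(x[:-1], x[1:]))
    den = sum(prev * prev for prev in x[:-1])
    return num / den

# Hypothetical parameter values chosen only for illustration
series = simulate_ar1(phi=0.6, n=5000)
phi_hat = estimate_phi(series)
print(f"estimated phi = {phi_hat:.3f}")
```

With |phi| < 1 the simulated series fluctuates around zero, and the estimate should land close to the value used in the simulation.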
VAR model [3]
A Vector Autoregressive (VAR) model is a statistical model used to analyze the interdependencies among
several time series variables. It is a multivariate extension of the univariate AR
model that allows several time series variables to be analyzed simultaneously.

Below is an analysis of the fundamental elements and attributes of a VAR model:

Vector: The name "vector" signifies that the model processes multiple time series
variables. These variables are arranged in a vector in which each element is an individual variable.

Autoregressive: Every variable in a VAR model is regressed on its own lagged values as well as on the lagged
values of the other variables in the system, just as a variable is regressed on its own lags in the univariate
autoregressive model.

Multivariate: VAR models are multivariate and can process a variety of related variables simultaneously,
allowing their interdependence and feedback effects to be analyzed.

Order: The order p of a VAR model is the number of lagged values of each variable that enter every equation.

Parameter estimation: Parameters in a VAR model are often estimated through approaches such as
ordinary least squares (OLS), maximum likelihood estimation (MLE), or Bayesian methods.

Impulse response functions: Impulse response functions are computed from VAR models to analyze the
dynamic reactions of the system to shocks.

Granger Causality [4]: VAR models may be used to examine Granger causality, a method that evaluates
whether previous values of one variable help to forecast another variable.

A Vector Autoregressive (VAR) model may be formally defined as follows:

Let p be the order of the VAR model, and k the number of time series variables. Then a p-order
vector autoregressive (VAR) model with k variables may be represented in the following manner.

For each variable $y_i$, where $i = 1, 2, 3, \ldots, k$, the p-order VAR model can be written as:

$$y_{i,t} = c_i + \sum_{j=1}^{k} \sum_{l=1}^{p} A^{l}_{i,j}\, y_{j,t-l} + \epsilon_{i,t}$$

where,
$y_{i,t}$ represents the value of the ith variable at time $t$.
$c_i$ is the intercept term for the ith variable.
$A^{l}_{i,j}$ is the coefficient on the lth lag of the jth variable in the equation for the ith variable.
$p$ is the number of lagged values in the VAR model.
$\epsilon_{i,t}$ is the ith variable's error term at time $t$, capturing the part of $y_{i,t}$ that is not explained
by the lagged values of the variables.
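The VAR equation above can be illustrated with a small sketch for the simplest case, p = 1 and k = 2. It simulates a bivariate VAR(1), estimates the intercepts and the coefficient matrix by ordinary least squares with NumPy, and produces a one-step-ahead forecast. The coefficient values, sample size, and variable names are hypothetical and are not the model fitted in this study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical VAR(1) with k = 2 variables: y_t = c + A y_{t-1} + eps_t
c_true = np.array([0.1, -0.2])
A_true = np.array([[0.5, 0.1],
                   [0.2, 0.3]])

n = 5000
y = np.zeros((n, 2))
for t in range(1, n):
    y[t] = c_true + A_true @ y[t - 1] + rng.normal(scale=1.0, size=2)

# Regress y_t on [1, y_{t-1}] by ordinary least squares, one column per equation
X = np.column_stack([np.ones(n - 1), y[:-1]])
Y = y[1:]
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)

c_hat = coef[0]      # estimated intercepts c_i
A_hat = coef[1:].T   # A_hat[i, j] multiplies y_{j,t-1} in the equation for variable i

# One-step-ahead forecast from the last observation
forecast = c_hat + A_hat @ y[-1]
print("A_hat =\n", A_hat.round(3))
print("forecast =", forecast.round(3))
```

For higher orders, the regressor matrix would simply gain one block of lagged columns per additional lag.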

Principal Component Analysis [1]


Principal component analysis (PCA) is a statistical method for extracting the key features of a complex
dataset. It is frequently used in various domains, such as machine learning, pattern recognition, and image
analysis. The main objective of PCA is to transform a dataset of possibly correlated variables into a new set of
variables that are uncorrelated; these new variables are called principal components. Principal components
are constructed as linear combinations of the original variables and are ranked by the amount
of variation each explains in the data.
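The construction described above can be sketched with an eigen-decomposition of the sample covariance matrix. This is one common way to compute PCA and is shown only as an illustration; the function name `pca` and the synthetic three-variable dataset are assumptions, not the data or code used in the study.

```python
import numpy as np

def pca(data):
    """PCA via eigen-decomposition of the sample covariance matrix.

    Returns eigenvalues (descending), loadings (one column per component),
    and the component scores of the centered data.
    """
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder so the largest comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs
    return eigvals, eigvecs, scores

# Hypothetical data: two strongly related columns plus one independent column
rng = np.random.default_rng(0)
z = rng.normal(size=(300, 1))
data = np.hstack([
    z + 0.1 * rng.normal(size=(300, 1)),
    2 * z + 0.1 * rng.normal(size=(300, 1)),
    rng.normal(size=(300, 1)),
])

eigvals, loadings, scores = pca(data)
explained = eigvals / eigvals.sum()
print("share of variance per component:", explained.round(3))
```

The covariance matrix of the scores is diagonal, which is the sense in which the principal components are uncorrelated.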

4. Methodology
Data Preprocessing
Identify and acquire non-stationary time series data from suitable sources. Perform data cleaning and
preparation, including resolving missing values and outliers. Detrend the time series data using suitable
procedures to make it more stationary.
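The text does not name the detrending procedure that was used. As one possible illustration, first differencing is a common way to remove a trend; the sketch below applies it to a hypothetical price path with a purely linear trend. The function name `difference` and the sample values are assumptions.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: d_t = x_t - x_{t-lag}."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical price path with a deterministic linear trend
prices = [100.0 + 2.0 * t for t in range(10)]
detrended = difference(prices)
print(detrended)
```

Differencing a linear trend leaves a constant series, so the trend no longer appears in the transformed data.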

Data Engineering
Initially, we took five variables.
We then created the following variables.

a) P7: the price after one week.

b) Momentum: the speed at which a security's price is changing, measured here as the price
change over one week.
$$M = P_7 - P$$
c) Volatility:
$$V = \mathrm{Std}(P)$$
d) Price Return:
$$\text{Price Return} = \frac{P_{\text{close}} - P_{\text{open}}}{P_{\text{open}}} \times 100$$
e) Range:
$$\text{Range} = \ln(\text{Open}) - \ln(\text{Close})$$
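The derived variables listed above can be sketched in plain Python as follows. The horizon of seven observations for "one week" and the rolling window used for the standard deviation are assumptions made for illustration, since the text does not state them; the function names and sample prices are likewise hypothetical.

```python
import math
import statistics

def price_return(open_price, close_price):
    """Percentage return from open to close."""
    return (close_price - open_price) / open_price * 100.0

def log_range(open_price, close_price):
    """Range as defined in the text: ln(Open) - ln(Close)."""
    return math.log(open_price) - math.log(close_price)

def momentum(prices, horizon=7):
    """M_t = P_{t+horizon} - P_t (horizon of 7 observations is an assumption)."""
    return [prices[t + horizon] - prices[t] for t in range(len(prices) - horizon)]

def rolling_volatility(prices, window=7):
    """Sample standard deviation of prices over a trailing window (window is an assumption)."""
    return [statistics.stdev(prices[t - window + 1:t + 1])
            for t in range(window - 1, len(prices))]

# Hypothetical closing prices
prices = [100.0 + t for t in range(10)]
print(momentum(prices))
print(rolling_volatility(prices))
print(price_return(100.0, 110.0), log_range(100.0, 110.0))
```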

Correlation matrix
We then examined the correlations among the variables by computing their variance-covariance matrix.

Figure 1

Some variables are highly correlated. We dropped these and introduced new
variables that are uncorrelated.
Figure 2
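The screening step described above can be sketched as computing a correlation matrix and flagging pairs whose absolute correlation exceeds a threshold. The threshold of 0.9, the function name, and the synthetic columns `a`, `b`, and `c` are assumptions for illustration; the text does not state which cutoff was applied.

```python
import numpy as np

def highly_correlated_pairs(data, names, threshold=0.9):
    """Return the correlation matrix and the variable pairs with |corr| above threshold."""
    corr = np.corrcoef(data, rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return corr, pairs

# Hypothetical data: b is nearly a copy of a, c is independent of both
rng = np.random.default_rng(1)
a = rng.normal(size=500)
b = a + 0.05 * rng.normal(size=500)
c = rng.normal(size=500)
data = np.column_stack([a, b, c])

corr, pairs = highly_correlated_pairs(data, ["a", "b", "c"])
print(corr.round(3))
print(pairs)
```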

Component selection
We select the number of components using a scree plot.

Figure 3

The scree plot indicates that three components should be retained.
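A scree plot displays the eigenvalues of the correlation (or covariance) matrix in descending order. The following sketch computes those values and the cumulative share of variance for a synthetic dataset; it does not reproduce Figure 3, and the function name and generated data are assumptions.

```python
import numpy as np

def scree_values(data):
    """Eigenvalues of the correlation matrix (descending) with variance shares."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)[::-1]   # eigvalsh returns ascending order
    proportion = eigvals / eigvals.sum()
    return eigvals, proportion, np.cumsum(proportion)

# Hypothetical data: five variables driven by two underlying factors plus noise
rng = np.random.default_rng(2)
factors = rng.normal(size=(400, 2))
mixing = rng.normal(size=(2, 5))
data = factors @ mixing + 0.3 * rng.normal(size=(400, 5))

eigvals, proportion, cumulative = scree_values(data)
for i, (ev, cp) in enumerate(zip(eigvals, cumulative), start=1):
    print(f"component {i}: eigenvalue = {ev:.3f}, cumulative share = {cp:.3f}")
```

The number of components is usually read off at the point where the eigenvalues level out, which remains a judgment call.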

Principal Component Analysis (PCA)


After selecting the components, we compute their loadings.
Figure 4

PCA gave these loadings for the respective components.

Figure 5

Through this analysis, we concluded that Factor 1 (comprising Price and its Range) is stationary.
We then proceeded to forecast the Price Return of the Karachi Stock Market, using a VAR
model for the prediction.

5. Results and Discussion


We forecasted the results by applying our selected model (i.e., VAR).
The study showed that PCA is well suited to analyzing non-stationary time series data. Its ability to lower
dimensionality while bringing out essential patterns is especially advantageous for complex and dynamic
datasets such as those in finance, healthcare, or environmental studies.

The study emphasized PCA's ability to recognize and separate the major patterns of variation in the data. This
is particularly important for non-stationary time series, since ordinary time series analysis techniques might
not reveal such patterns owing to the dynamic nature of the data.

For the KSE-100 index, for instance, the overall trends and periodic fluctuations can be identified, which
gives investors some guidance on their investment decisions and on how to manage risk.

References
[1]. Lansangan, J. R. G., & Barrios, E. B. (2009). Principal components analysis of nonstationary time series
data. Statistics and Computing, 19, 173-187.
[2]. Hamilton, J. D. (2020). Time series analysis. Princeton University Press.
[3]. Lütkepohl, H. (2005). New introduction to multiple time series analysis. Springer Science & Business Media.
[4]. Granger, C. W. (1969). Investigating causal relations by econometric models and cross-spectral methods.
Econometrica: Journal of the Econometric Society, 424-438.
