Time Series: Introduction

A starting point for time-series data analysis

Idit Cohen
6 min read · May 2, 2022
Image by anncapictures from Pixabay

A time series is a series of data points ordered in time [9]. It adds an explicit order dependence between observations: a time dimension [6]. In a typical machine learning dataset, the observations are treated equally when the future is being predicted. In a time series, the order of the observations is a critical source of information that should be analyzed and used in the prediction process.

Time series are typically assumed to be generated at regularly spaced intervals of time (e.g. daily temperature); these are called regular time series. The data does not have to arrive at regular intervals, however. In an irregular time series, the data follows a temporal sequence, but the measurements might not occur at regular time intervals: the data might be generated in bursts or with varying gaps between observations [1]. Account deposits or withdrawals from an ATM are examples of irregular time series.
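
As a rough illustration of the regular/irregular distinction, the sketch below checks whether all gaps between consecutive timestamps are identical. The function name `is_regular` and both example series are made up for this illustration; real data usually needs a tolerance rather than exact equality.

```python
from datetime import datetime, timedelta

def is_regular(timestamps):
    """Return True if consecutive timestamps are all separated by the same interval."""
    gaps = {later - earlier for earlier, later in zip(timestamps, timestamps[1:])}
    return len(gaps) <= 1

start = datetime(2022, 5, 1)

# Hypothetical daily temperature readings: one observation every 24 hours.
daily_readings = [start + timedelta(days=i) for i in range(5)]

# Hypothetical ATM withdrawals: they arrive in bursts with varying gaps.
atm_withdrawals = [start + timedelta(minutes=m) for m in (0, 3, 4, 95, 600)]

print(is_regular(daily_readings))   # True
print(is_regular(atm_withdrawals))  # False
```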

Time series can have one or more variables that change over time.

If only one variable varies over time, the series is called a univariate time series. If there is more than one variable, it is called a multivariate time series. A tri-axial accelerometer is an example: it produces three acceleration variables, one for each axis (x, y, z), and they vary simultaneously over time.
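
One simple way to picture the difference is in how the data is stored. The values below are invented for illustration: a univariate series holds one number per time step, while a multivariate series holds several.

```python
# Univariate: one value per time step (hypothetical daily temperatures in °C).
temperature = [21.5, 22.1, 19.8, 20.4]

# Multivariate: several values per time step (hypothetical tri-axial
# accelerometer samples, one (x, y, z) tuple per time step).
acceleration = [
    (0.01, -0.02, 9.81),
    (0.03, -0.01, 9.79),
    (0.02, 0.00, 9.80),
]

# Each axis can be pulled out as its own univariate series.
x_axis, y_axis, z_axis = (list(axis) for axis in zip(*acceleration))
print(z_axis)  # [9.81, 9.79, 9.8]
```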

Time-series applications

Time series are used in various fields such as mathematical finance, manufacturing, event data (e.g. clickstreams and application events), IoT data, and generally in any domain of applied science and engineering that involves temporal measurements. Time series DBMS (database management systems) is the fastest-growing segment in the database industry [2], which testifies to the growing need for time series forecasting in the industry.

The trend of the last 24 months [2]. Image by author

Time series analysis

Time series analysis extracts meaningful statistics and other characteristics of the dataset in order to understand it. Time series analysis can help to make better predictions, but this is not necessarily the main goal of the analysis. In practice, a suitable model is fitted to a given time series, and (in the case of supervised learning) the corresponding parameters are estimated using the known data values [3]. The time series analysis process comprises methods that attempt to understand the nature of the series and is often useful for future forecasting and simulation.
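
As a small example of the kind of statistics such an analysis might extract, the sketch below computes the mean, the variance, and the sample autocorrelation at a given lag. The `autocorrelation` helper and the repeating series are assumptions made for this illustration, not a method taken from the references.

```python
from statistics import fmean, pvariance

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    mean = fmean(series)
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean)
        for t in range(lag, len(series))
    )
    return numerator / denominator

# Hypothetical series that repeats every 4 steps.
series = [1.0, 3.0, 5.0, 3.0] * 6

print(fmean(series), pvariance(series))
print(autocorrelation(series, 4))  # positive: values one full cycle apart move together
print(autocorrelation(series, 2))  # negative: values half a cycle apart move oppositely
```

A strong autocorrelation at some lag is a hint that past values carry information about future ones, which is what forecasting models try to exploit.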

Time series forecasting

Time series forecasting involves taking models that fit historical data (the training set) and using them to predict future observations (the test set). In the first step, past observations are collected and analyzed to develop a suitable mathematical model that captures the underlying data-generating process of the series. In the second step, future events are predicted using the model. This approach is particularly useful when there is no satisfactory explanatory model. In the classical statistical handling of time series data, making predictions about the future is called extrapolation; more modern fields focus on the topic and refer to it as time series forecasting. The skill of a time series forecasting model is determined by its performance at predicting the future. This often comes at the expense of being able to explain why a specific prediction was made, of confidence intervals, and of a better understanding of the underlying causes behind the problem [6]. Time series forecasting has important applications in various fields, and over the past several decades researchers have put considerable effort into developing and improving suitable forecasting models.
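
The two steps can be sketched with a deliberately simple model. The example below assumes a naive forecaster that repeats the last observed value, and scores it with the mean absolute error; the data, the split point, and the helper name are all invented for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical monthly observations, oldest first.
series = [20, 22, 25, 24, 27, 30, 29, 33, 35, 34, 38, 40]

# Split chronologically: never shuffle, because the order carries information.
split = 9
train, test = series[:split], series[split:]

# Step 1: "fit" the model on the training set.
# The naive model only remembers the last observed value.
last_observed = train[-1]

# Step 2: predict the future observations and score them on the test set.
forecast = [last_observed] * len(test)
mae = mean_absolute_error(test, forecast)
print(forecast, mae)  # [35, 35, 35] 3.0
```

A naive forecast like this is mostly useful as a baseline: a more elaborate model should at least beat it on the test set.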

Types of Time Series

Deterministic time series

A deterministic time series is one that can be expressed explicitly by an analytic expression. It has no random or probabilistic aspects. In mathematical terms, it can be described exactly for all time in terms of a Taylor series expansion provided that all its derivatives are known at some arbitrary time. Its past and future are completely specified by the values of these derivatives at that time. If so, then we can always predict its future behavior and state how it behaved in the past.
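
To make the Taylor series statement concrete, the sketch below takes x(t) = sin(t) as an assumed example of a deterministic series. Knowing only its derivatives at t = 0, it recovers values in both the future and the past. The expansion is truncated at 20 terms, so the result is an approximation, although a very close one at this distance.

```python
import math

def taylor_forecast(derivatives_at_t0, dt):
    """Evaluate a Taylor expansion dt time units away from the expansion point.

    derivatives_at_t0[k] is the k-th derivative at t0 (index 0 is the value itself).
    """
    return sum(
        d * dt**k / math.factorial(k) for k, d in enumerate(derivatives_at_t0)
    )

# For x(t) = sin(t) at t0 = 0, the derivatives cycle through 0, 1, 0, -1.
derivatives = [(0.0, 1.0, 0.0, -1.0)[k % 4] for k in range(20)]

future = taylor_forecast(derivatives, 1.5)   # "predict" x(1.5)
past = taylor_forecast(derivatives, -1.5)    # "recover" x(-1.5)

print(future, math.sin(1.5))
print(past, math.sin(-1.5))
```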

Non-deterministic time series

A non-deterministic time series is one that cannot be described by an analytic expression. It has some random aspect that prevents its behavior from being described explicitly. A time series may be non-deterministic because:

  1. Not all the information necessary to describe it explicitly is available, although it might be in principle.
  2. The nature of the generating process is inherently random.

Since non-deterministic time series have a random aspect, they follow probabilistic laws. Thus, the data is described in statistical terms, i.e. by probability distributions and averages of various forms, such as means and variances.
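
A short sketch of this idea, assuming Gaussian noise as the generating process: two realizations of the same process differ point by point, yet their means and variances come out close to each other. The function name, parameters, and seeds are chosen only for this illustration.

```python
import random
from statistics import fmean, pvariance

def noisy_series(n, mean, std, seed):
    """Draw n independent Gaussian observations from a seeded generator."""
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(n)]

# Two realizations of the same hypothetical process (true mean 10, variance 4).
first = noisy_series(5000, mean=10.0, std=2.0, seed=1)
second = noisy_series(5000, mean=10.0, std=2.0, seed=2)

print(first[:3], second[:3])                # individual values differ
print(fmean(first), fmean(second))          # both close to 10
print(pvariance(first), pvariance(second))  # both close to 4
```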

Stationary time series

A stationary time series is one whose statistical properties, such as mean, variance, and autocorrelation, do not depend on time. A stationary series is relatively easy to predict: you simply forecast that its statistical properties will be the same in the future as they have been in the past. Thus, most statistical forecasting methods are based on the assumption that the time series is approximately stationary.

More precisely, most statistical forecasting methods assume that the series can be rendered (approximately) stationary through mathematical transformations.
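
Differencing is one common transformation of this kind. The sketch below builds a hypothetical random walk with drift, whose level keeps rising, and shows that its first difference has roughly the same mean in both halves of the series. Comparing half means is only a crude check chosen for this illustration, not a formal stationarity test.

```python
import random
from statistics import fmean

def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every valid t."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def half_means(series):
    """Mean of the first half and of the second half of the series."""
    middle = len(series) // 2
    return fmean(series[:middle]), fmean(series[middle:])

# Hypothetical non-stationary series: a random walk with an upward drift of 0.5.
rng = random.Random(0)
walk = [0.0]
for _ in range(999):
    walk.append(walk[-1] + 0.5 + rng.gauss(0.0, 1.0))

print(half_means(walk))              # the level differs strongly between halves
print(half_means(difference(walk)))  # both values are close to the drift of 0.5
```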

Non-stationary time series

A non-stationary series is one whose statistical properties change over time. There are infinitely many ways for a time series to be non-stationary, such as changing variance, level shifts, seasonality in the 6th moment, etc. Here are the most common non-stationarity patterns:

Trend component: The trend shows the general tendency of the data to increase or decrease over a long period of time. A trend is a smooth, general, long-term, average tendency. The increase or decrease does not have to be in the same direction throughout the whole period [8]. If a time series does not show an increasing or decreasing pattern, then the series is stationary in the mean.

Time series with trend [4]
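
One simple way to estimate a trend, assuming it is roughly linear, is an ordinary least-squares line fitted against the time index. The helper `fit_linear_trend` and the example series are made up for this sketch; a real trend may well be non-linear.

```python
def fit_linear_trend(series):
    """Least-squares slope and intercept of the series against t = 0, 1, 2, ..."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    covariance = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    t_variance = sum((t - t_mean) ** 2 for t in range(n))
    slope = covariance / t_variance
    return slope, y_mean - slope * t_mean

# Hypothetical series: an upward trend of 2 per step plus a small wiggle.
series = [10 + 2 * t + (1 if t % 2 else -1) for t in range(20)]

slope, intercept = fit_linear_trend(series)

# Trend removal: subtract the fitted line from every observation.
detrended = [y - (intercept + slope * t) for t, y in enumerate(series)]

print(slope, intercept)
print(min(detrended), max(detrended))  # what remains stays near zero
```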

Cyclical component: Any pattern showing an up and down movement around a given trend is identified as a cyclical pattern. In a cyclical pattern, the up and down movements do not occur at constant time intervals, so they cannot be easily predicted.

Seasonal component: If the peaks and troughs of the series occur at regular intervals, the pattern is called a seasonal pattern (e.g. sales of ice cream).

Random component: The residual is what is left over when all the patterns have been removed. Residuals are random fluctuations. You can think of them as a noise component.

Left: seasonal pattern. Right: random residual [4]
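
The components above can be separated with a classical additive decomposition. The sketch below is a minimal version under stated assumptions: the series is the sum of trend, seasonal, and residual parts, the seasonal period is known and odd, and the trend is estimated with a centered moving average. The function name and the test series are invented for illustration.

```python
from statistics import fmean

def decompose_additive(series, period):
    """Split a series into trend, seasonal and residual parts (odd period only)."""
    half = period // 2
    n = len(series)
    # Trend: centered moving average; undefined (None) near the edges.
    trend = [
        fmean(series[t - half : t + half + 1]) if half <= t < n - half else None
        for t in range(n)
    ]
    # Seasonal: average detrended value at each position in the cycle,
    # shifted so that the seasonal effects sum to zero over one period.
    detrended = [(t, series[t] - trend[t]) for t in range(n) if trend[t] is not None]
    seasonal_means = [
        fmean([value for t, value in detrended if t % period == position])
        for position in range(period)
    ]
    center = fmean(seasonal_means)
    seasonal = [seasonal_means[t % period] - center for t in range(n)]
    # Residual: what is left after removing trend and seasonality.
    residual = [
        series[t] - trend[t] - seasonal[t] if trend[t] is not None else None
        for t in range(n)
    ]
    return trend, seasonal, residual

# Hypothetical series: a linear trend plus a seasonal pattern of period 3, no noise.
pattern = [2.0, -1.0, -1.0]
series = [0.5 * t + pattern[t % 3] for t in range(15)]

trend, seasonal, residual = decompose_additive(series, period=3)
print(seasonal[:3])  # recovers the seasonal pattern
print(residual[1:4]) # close to zero, since this series has no noise
```

With real data the residual would not vanish; inspecting it is a quick way to judge whether the trend and seasonal estimates have captured the structure of the series.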

Summary

Most statistical forecasting methods are based on the assumption that the time series is approximately stationary. A stationary series is relatively easy to predict: you simply forecast that its statistical properties will be the same in the future as they have been in the past. Analyzing the patterns in a time series is the first step in converting non-stationary data into stationary data (for example, by trend removal) so that statistical forecasting methods can be applied [5]. There are three fundamental steps in building a quality time series forecasting model: making the data stationary, selecting the right model, and evaluating model accuracy.
