How do weather forecasters predict tomorrow’s weather, or stock market analysts analyze future market trends? It all comes down to a powerful statistical technique known as time series forecasting.
By analyzing past observations of a time series, this method predicts its future values. It has found wide-ranging applications in fields such as finance, economics, medicine, weather forecasting, earthquake prediction and more.
Implementing time series forecasting techniques can empower businesses to make informed decisions, anticipate customer demands well in advance, and gain a competitive edge in their respective markets.
Time series forecasting can provide insights into a wide range of questions, such as:
Here, we’ll dive deeper into the fascinating world of time series forecasting, learning the steps taken to make forecasts with time series data, and which methods are most commonly used.
Preparing the data for time series forecasting is a critical step in the modeling process. Properly preparing the data can involve different people or teams, depending on the complexity of the data and the organization.
Data scientists or analysts may collect, clean, and prepare the data in some cases. Domain or subject matter experts may also assist in identifying relevant variables and formatting the data appropriately. Data engineers may help set up data pipelines and automate data collection and storage. A cross-functional team, including individuals with expertise in data science, domain knowledge and engineering, may be responsible for preparing data for time series forecasting.
Careful preparation helps to improve the accuracy of the model and ensures that it can make useful predictions on new data.
Data should be collected at regular intervals, and the interval you choose (hourly, daily, monthly and so on) should match the horizon you need to forecast. Gaps in the record can be handled by imputing or interpolating the missing values.
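As a minimal sketch of the interpolation idea, assuming missing readings are marked as `None` in an evenly spaced series, the gaps can be filled by drawing a straight line between the nearest known neighbors. The function name `interpolate_missing` and the sample readings are illustrative only.

```python
from typing import List, Optional


def interpolate_missing(values: List[Optional[float]]) -> List[float]:
    """Fill None gaps by linear interpolation between the nearest known values.

    Gaps at the start or end are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(values)
    # Edge gaps: carry the first/last observation outward.
    for i in range(known[0]):
        filled[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):
        filled[i] = values[known[-1]]
    # Interior gaps: straight line between the surrounding observations.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


# Hypothetical hourly readings with three missing values.
hourly_readings = [10.0, None, None, 16.0, 18.0, None]
print(interpolate_missing(hourly_readings))
# [10.0, 12.0, 14.0, 16.0, 18.0, 18.0]
```

Linear interpolation is only one option; for strongly seasonal data, filling a gap from the same point in the previous cycle may be a better assumption.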
To prepare the data for time series forecasting, you need to follow these steps:
It is also important to split the data into training and testing sets. The training set is used to fit the model, while the testing set is used to evaluate the performance of the model on new data. This can help to ensure that the model is not overfitting to the training data and can generalize well to new data.
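A rough sketch of such a split follows. For time series the split is normally chronological rather than random, so that the model is always evaluated on observations that come after the ones it was fitted on. The function name `chronological_split` and the 20% test fraction are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    series: Sequence[float], test_fraction: float = 0.2
) -> Tuple[List[float], List[float]]:
    """Keep the earliest observations for training and the latest for testing."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    n_test = max(1, int(round(len(series) * test_fraction)))
    cut = len(series) - n_test
    return list(series[:cut]), list(series[cut:])


# Hypothetical monthly sales figures, oldest first.
monthly_sales = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
train, test = chronological_split(monthly_sales, test_fraction=0.2)
print(train)  # [112, 118, 132, 129, 121, 135, 148, 148]
print(test)   # [136, 119]
```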
Once the data is prepared, it’s time to choose a forecasting method. The method you use to analyze the data and make forecasts depends on the problem you are trying to solve and the nature of the data.
Time series forecasting uses a variety of statistical and machine learning-based methods. Statistical methods typically involve modeling the underlying patterns and trends in the data, while machine learning methods use algorithms to learn patterns and make predictions.
Some popular statistical methods for time series forecasting include:
Machine learning methods for time series forecasting include:
When selecting a time series forecasting method, it is important to weigh accuracy against interpretability. Some methods offer greater accuracy but are more intricate and harder to interpret; simpler techniques are easier to understand but may give up some accuracy.
Each method carries advantages and disadvantages. Method selection is based on factors such as data characteristics, forecasting horizon (i.e., the length of time into the future being forecast), and computational complexity. Here’s a look at some of the most commonly used methods.
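ARIMA (autoregressive integrated moving average) is a popular statistical method for time series forecasting that models the autocorrelation of the data using three components: autoregression (AR), integration (I), which is implemented by differencing, and moving average (MA).
The AR component captures the dependence of the current value on previous values, while the MA component captures the dependence on previous error terms. The I component differences the series, subtracting each value from the one before it, to remove trends and make the series stationary. Plain ARIMA does not model seasonality; that requires a seasonal extension.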
ARIMA is a flexible method that can handle a wide range of time-series patterns, making it popular in fields such as finance, economics and marketing.
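To make the AR and I components concrete, here is a minimal sketch in plain Python of a deliberately simplified ARIMA(1,1,0): one differencing pass, one autoregressive coefficient, no intercept and no MA term. The helper names `difference`, `fit_ar1` and `forecast_arima_110` are illustrative and do not come from any library; in practice a statistics package would estimate all the terms together.

```python
from typing import List


def difference(series: List[float], lag: int = 1) -> List[float]:
    """The I step: subtract the value `lag` steps earlier to remove a trend."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def fit_ar1(series: List[float]) -> float:
    """The AR step: least-squares estimate of phi in x[t] = phi * x[t-1] + error."""
    numerator = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    denominator = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    return numerator / denominator


def forecast_arima_110(series: List[float]) -> float:
    """One-step forecast: predict the next change, then add it to the last value."""
    changes = difference(series)
    phi = fit_ar1(changes)
    return series[-1] + phi * changes[-1]


# Hypothetical series whose period-to-period change halves at every step.
observations = [0.0, 8.0, 12.0, 14.0, 15.0]
print(difference(observations))          # [8.0, 4.0, 2.0, 1.0]
print(forecast_arima_110(observations))  # 15.5
```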
SARIMA (seasonal autoregressive integrated moving average) is a variation of the ARIMA model that is specifically designed to handle time series data with seasonality. It includes the same three components as the ARIMA model (AR, I, and MA) but also includes additional seasonal components.
The seasonal autoregressive component captures the dependence of the current value on values from the same season in earlier cycles (for example, the same month of the previous year). The seasonal differencing component removes the seasonal patterns from the data, and the seasonal moving average component captures the dependence on the previous error terms for the seasonal component.
However, SARIMA models can be more complex than standard ARIMA models and may require more data and computational resources to train.
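Seasonal differencing is the easiest of these components to show in isolation. The sketch below assumes quarterly data (a period of 4) and uses an illustrative helper named `seasonal_difference`; it is not a full SARIMA fit.

```python
from typing import List


def seasonal_difference(series: List[float], period: int) -> List[float]:
    """Subtract the value from the same season one full cycle earlier."""
    if period < 1 or period >= len(series):
        raise ValueError("period must be between 1 and len(series) - 1")
    return [series[i] - series[i - period] for i in range(period, len(series))]


# Hypothetical quarterly data: a repeating pattern that rises by 1 each year.
quarterly = [10, 20, 30, 20, 11, 21, 31, 21, 12, 22, 32, 22]
print(seasonal_difference(quarterly, period=4))
# [1, 1, 1, 1, 1, 1, 1, 1]
```

After differencing at the seasonal lag, the repeating pattern is gone and only the year-over-year change remains, which is what the non-seasonal part of the model then works with.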
Advantages of an ARIMA Model
Disadvantages of an ARIMA Model
Exponential smoothing is a simple time series forecasting method that assigns more weight to recent data points while gradually decreasing the weight for older data points. It is a popular method for short-term forecasting, as it can quickly adapt to changes in the data.
Several different techniques fall under the umbrella of exponential smoothing, including simple exponential smoothing, double exponential smoothing, and triple exponential smoothing.
Simple exponential smoothing is the most basic form of this method, and it is used to forecast a time series that does not exhibit any trend or seasonality. Simple exponential smoothing uses a single smoothing parameter, alpha, which controls the weight given to the past observations. The forecast for the next period is a weighted average of the past observations, with more weight given to the most recent observation.
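A minimal sketch of simple exponential smoothing, assuming the level is initialized with the first observation (other initializations are common). The function name is illustrative.

```python
from typing import Sequence


def simple_exponential_smoothing(series: Sequence[float], alpha: float) -> float:
    """Return the one-step-ahead forecast after smoothing the whole series."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # assumed initialization: start at the first observation
    for observation in series[1:]:
        # New level = weighted average of the newest value and the old level.
        level = alpha * observation + (1 - alpha) * level
    return level


print(simple_exponential_smoothing([10.0, 12.0, 14.0], alpha=0.5))  # 12.5
```

A larger alpha makes the forecast react faster to recent observations; with alpha equal to 1 the forecast is simply the last observed value.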
Double exponential smoothing, also known as Holt’s method, is used to forecast a time series that exhibits a trend but no seasonality. Double exponential smoothing uses two smoothing parameters, alpha and beta, which control the weights given to the past observations and the past trends, respectively. The forecast for the next period is a weighted average of past observations and past trends.
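The following sketch shows Holt's method under two assumptions: the level starts at the first observation and the trend starts as the difference between the first two observations. The name `holt_forecast` is illustrative.

```python
from typing import List, Sequence


def holt_forecast(
    series: Sequence[float], alpha: float, beta: float, horizon: int = 1
) -> List[float]:
    """Double exponential smoothing: track a level and a trend, then extrapolate."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]  # assumed initialization
    for observation in series[1:]:
        previous_level = level
        # Blend the new observation with where level + trend expected it to be.
        level = alpha * observation + (1 - alpha) * (previous_level + trend)
        # Blend the newest change in level with the old trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


# A perfectly linear series should be extrapolated along the same line.
print(holt_forecast([3.0, 5.0, 7.0, 9.0], alpha=0.5, beta=0.5, horizon=2))
# [11.0, 13.0]
```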
Triple exponential smoothing, also known as the Holt-Winters method, is used to forecast a time series that exhibits both trend and seasonality. Triple exponential smoothing uses three smoothing parameters — alpha, beta and gamma — that control the weights given to previous observations, trends and seasonal variations, respectively. The forecast for the next period is a weighted average of past observations, trends and seasonal variations.
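Below is a rough sketch of one common additive formulation of Holt-Winters. The initialization (level from the mean of the first season, trend from the change between the first two seasons, seasonal effects from the first season's deviations) is a simple assumption, and the function name is illustrative. Libraries typically optimize the starting values and smoothing parameters instead.

```python
from typing import List, Sequence


def holt_winters_additive(
    series: Sequence[float],
    period: int,
    alpha: float,
    beta: float,
    gamma: float,
    horizon: int,
) -> List[float]:
    """Triple exponential smoothing with additive trend and seasonality."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    first, second = series[:period], series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2  # average per-step change
    seasonal = [y - level for y in first]
    for t in range(period, len(series)):
        observation, season = series[t], seasonal[t % period]
        previous_level = level
        level = alpha * (observation - season) + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (observation - level) + (1 - gamma) * season
    n = len(series)
    # Forecast = extrapolated level plus the seasonal effect for that position.
    return [
        level + h * trend + seasonal[(n + h - 1) % period]
        for h in range(1, horizon + 1)
    ]


# Hypothetical series that alternates between a low and a high season.
print(holt_winters_additive([10, 20, 10, 20, 10, 20], 2, 0.5, 0.5, 0.5, horizon=2))
# [10.0, 20.0]
```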
The TBATS Model
TBATS (trigonometric seasonality, Box-Cox transformation, ARMA errors, trend, and seasonal components) is a state-of-the-art time series forecasting model that extends the basic exponential smoothing framework. It’s a hybrid model that combines the strengths of exponential smoothing and other time series forecasting techniques, such as ARIMA and Fourier analysis, to capture a wide range of temporal patterns in the data.
The TBATS model is based on exponential smoothing in the sense that it uses a similar framework of exponentially weighted averages to model the level, trend and seasonality of the time series. However, TBATS adds components for more complex patterns: trigonometric (Fourier) terms for multiple or non-integer seasonal periods, a Box-Cox transformation for non-linearity in the series, and ARMA errors for the autocorrelation that remains in the residuals.
Advantages of Exponential Smoothing
Disadvantages of Exponential Smoothing
Seasonal decomposition is a method that separates the time series data into its trend, seasonal and residual components. The trend component represents the long-term pattern in the data, while the seasonal component represents the repeating patterns over time. The residual component represents the noise or irregular variation in the data.
Seasonal decomposition can provide insights into the underlying patterns and trends in the data, making it useful for understanding the seasonality and trend of a time series.
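A minimal sketch of a classical additive decomposition follows, assuming the series equals trend plus seasonal plus residual and the seasonal period is known. The trend is estimated with a centered moving average, so the first and last few points have no trend estimate (`None`). The function names are illustrative.

```python
from typing import List, Optional, Sequence, Tuple


def centered_moving_average(series: Sequence[float], period: int) -> List[Optional[float]]:
    """Trend estimate; for an even period the two end points get half weight."""
    half = period // 2
    trend: List[Optional[float]] = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            total -= 0.5 * (window[0] + window[-1])
        trend[t] = total / period
    return trend


def decompose_additive(
    series: Sequence[float], period: int
) -> Tuple[List[Optional[float]], List[float], List[Optional[float]]]:
    """Split a series into trend, seasonal and residual components."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    trend = centered_moving_average(series, period)
    detrended = [None if tr is None else y - tr for y, tr in zip(series, trend)]
    # Average the detrended values at each position within the cycle.
    means = []
    for position in range(period):
        values = [d for i, d in enumerate(detrended)
                  if d is not None and i % period == position]
        means.append(sum(values) / len(values))
    offset = sum(means) / period  # center the seasonal effects on zero
    effects = [m - offset for m in means]
    seasonal = [effects[i % period] for i in range(len(series))]
    residual = [None if tr is None else y - tr - s
                for y, tr, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual


# Hypothetical quarterly series: a rising line plus a repeating pattern.
pattern = [3, -1, -3, 1]
series = [t + pattern[t % 4] for t in range(12)]
trend, seasonal, residual = decompose_additive(series, period=4)
print(seasonal[:4])  # [3.0, -1.0, -3.0, 1.0]
```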
Facebook Prophet is an open source, time series forecasting library published by Facebook that is based on decomposable models, specifically trends, seasonality, and holidays. Prophet is designed to be flexible, scalable and easy to use, and it can be applied to a wide range of time series forecasting problems.
Prophet uses a generalized additive model (GAM) framework, which allows for non-linear relationships between the predictors and the response variable. The model includes several components:
Prophet also provides several features that can help with model selection and tuning, including automatic selection of changepoints, hyperparameter optimization, and cross-validation. Additionally, Prophet provides uncertainty estimates for the forecasts, which can help with decision-making and risk management.
One of the key advantages of Prophet is its ease of use and its ability to handle complex time series data without requiring extensive domain knowledge or data preprocessing.
The vision for the future of Facebook Prophet is outlined in this article published in February on Medium, by Cuong Duong, a staff data scientist at Canva who is also a Prophet maintainer.
Advantages of Seasonal Decomposition
Disadvantages of Seasonal Decomposition
In traditional time series forecasting, a univariate time series is used to make predictions, which means that only one variable or feature is considered. However, in many real-world applications, there are often multiple variables that are interdependent and can affect each other.
Neural networks are a type of machine learning model that consists of interconnected layers of nodes, each of which performs a specific mathematical operation on the input data. They are popular in time-series forecasting as they can capture non-linear patterns and interactions between variables. Neural networks can be trained to learn the patterns in the data and then use those patterns to make forecasts.
Long short-term memory (LSTM) and gated recurrent unit (GRU) are both types of recurrent neural networks (RNNs) that are capable of processing sequential data. They are designed to handle the vanishing gradient problem that can occur in traditional RNNs, where the gradients used for training shrink as they are propagated back through many time steps, so the network struggles to learn long-range dependencies.
LSTM and GRU networks use gated units to selectively remember or forget information from previous time steps. This allows them to effectively capture long-term dependencies and patterns in the time series.
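The gating idea can be illustrated with a single GRU step on scalar values. This is only a sketch: the weights below are hand-set, hypothetical numbers rather than trained parameters, real networks use vectors and matrices, and some references swap the roles of `z` and `1 - z`.

```python
import math
from typing import Dict


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x: float, h_prev: float, p: Dict[str, float]) -> float:
    """One scalar GRU update: gates decide how much old state to keep."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])  # reset gate
    candidate = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    # z near 0 keeps the old state; z near 1 replaces it with the candidate.
    return (1.0 - z) * h_prev + z * candidate


# Hypothetical parameters that force the update gate shut or open.
base = {"wz": 0.0, "uz": 0.0, "wr": 0.0, "ur": 0.0, "br": 0.0,
        "wh": 1.0, "uh": 0.0, "bh": 0.0}
remember = dict(base, bz=-50.0)   # gate closed: carry the old state forward
overwrite = dict(base, bz=50.0)   # gate open: adopt the new candidate

print(gru_step(1.0, 0.7, remember))   # stays very close to 0.7
print(gru_step(1.0, 0.7, overwrite))  # moves to about tanh(1.0)
```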
Convolutional neural networks (CNNs) are a type of neural network architecture commonly used in image processing. Their one-dimensional variant, the 1D CNN, can be applied to time series forecasting by sliding filters along the time axis, in effect treating the series as a 1D image. The convolutional layers learn to extract features from the time series, such as local trends and recurring shapes, which can then be used to make predictions.
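What a single convolutional filter does to a series can be sketched in a few lines. In a trained network the kernel weights are learned from data; the two kernels below are hand-picked assumptions chosen so the extracted features are easy to read.

```python
from typing import List, Sequence


def conv1d(series: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide the kernel along the series with no padding ("valid" mode).

    Like deep learning layers, this does not flip the kernel.
    """
    width = len(kernel)
    return [
        sum(series[start + j] * kernel[j] for j in range(width))
        for start in range(len(series) - width + 1)
    ]


signal = [1, 2, 4, 7, 7]
# A difference kernel responds to how fast the series is changing.
print(conv1d(signal, [-1, 1]))      # [1, 2, 3, 0]
# An averaging kernel responds to the local level.
print(conv1d(signal, [0.5, 0.5]))   # [1.5, 3.0, 5.5, 7.0]
```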
To use these neural network architectures for multivariate time series forecasting, the input data must be structured appropriately. Each observation in the time series should be represented as a vector of features, and the target variable should be included as one of these features. The input data can then be split into training and testing sets, and the neural network can be trained on the training data using a suitable loss function and optimization algorithm.
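One common way to structure the input, sketched here under the assumption of a fixed look-back window, is to turn the series into pairs of (recent feature vectors, next target value). The function name `make_windows` and the temperature/sales columns are hypothetical.

```python
from typing import List, Sequence, Tuple


def make_windows(
    rows: Sequence[Sequence[float]], window: int, target_index: int
) -> Tuple[List[List[List[float]]], List[float]]:
    """Build supervised samples from a multivariate series.

    Each input is `window` consecutive feature vectors; each label is the
    target feature at the time step that immediately follows.
    """
    inputs, labels = [], []
    for start in range(len(rows) - window):
        inputs.append([list(row) for row in rows[start:start + window]])
        labels.append(rows[start + window][target_index])
    return inputs, labels


# Hypothetical daily observations: [temperature, sales]; sales is the target.
observations = [[20, 100], [21, 110], [19, 105], [23, 120], [25, 130]]
X, y = make_windows(observations, window=2, target_index=1)
print(X[0])  # [[20, 100], [21, 110]]
print(y)     # [105, 120, 130]
```

The windows should be built before the chronological train/test split is applied to the samples, or built separately on each side of the split, so that no test-period value leaks into a training input.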
Once the neural network is trained, it can be used to make predictions on the test data. The predicted values can then be compared to the actual values to evaluate the performance of the model. In multivariate time series forecasting, it is important to use appropriate evaluation metrics that take into account the interdependence of the variables.
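Additionally, 1D CNNs are often faster to train than recurrent networks, because convolutions can be computed in parallel across time steps. They are still more computationally expensive than ARIMA or exponential smoothing on a single short series, but one network can be trained across many series in a large dataset.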
Advantages of Neural Networks
Disadvantages of Neural Networks
Deep learning models are a subset of neural networks that consist of multiple layers of interconnected nodes, with each layer learning more abstract representations of the input data. They are popular in time series forecasting as they can capture very complex patterns and relationships in the data.
Deep learning models require a large amount of training data and computational resources, but they can provide highly accurate forecasts.
Besides RNNs and CNNs, several techniques are commonly used in deep learning models, including:
Autoencoders
Autoencoders are a type of deep learning model that can be used for unsupervised learning and dimensionality reduction. Autoencoders learn to reconstruct the input data from a compressed representation, allowing them to capture the most important features of the data.
In time series forecasting, autoencoders can be used for tasks such as anomaly detection and noise reduction.
Deep Belief Networks (DBNs)
DBNs are a type of deep learning model that consists of multiple layers of Restricted Boltzmann Machines (RBMs). DBNs can be used for unsupervised learning and feature extraction, and they have been applied to time series forecasting tasks.
Transformer Models
Transformers are a type of deep learning model originally developed for natural language processing tasks, but they have also been successfully applied to time series forecasting. Transformer models use self-attention mechanisms to capture dependencies between different time steps, making them well-suited for forecasting tasks with long-term dependencies.
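The core of self-attention can be sketched without any framework. This simplified version assumes the queries, keys and values are the raw time-step vectors themselves; a real transformer adds learned projections, positional information, multiple heads and, for forecasting, usually a mask that hides future steps.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(steps: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scaled dot-product attention where every time step attends to all steps."""
    dim = len(steps[0])
    outputs = []
    for query in steps:
        # Similarity of this step to every step, scaled by sqrt(dimension).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in steps
        ]
        weights = softmax(scores)
        # Output is a weighted average of all time steps.
        outputs.append([
            sum(w * step[j] for w, step in zip(weights, steps))
            for j in range(dim)
        ])
    return outputs


# Hypothetical 2-feature series: steps 0 and 2 resemble each other.
series = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
for row in self_attention(series):
    print([round(v, 3) for v in row])
```

Each output row leans toward the time steps most similar to its own input, regardless of how far apart they are in time, which is why attention copes well with long-range dependencies.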
Advantages of Deep Learning Models
Disadvantages of Deep Learning Models
Now that you’ve selected your forecasting method, here are the crucial steps to take to generate predictions for future periods.
In addition to MAE, RMSE and MAPE, other metrics can be used to evaluate the accuracy of time series forecasts. For example, mean absolute scaled error (MASE) divides the forecast's mean absolute error by the mean absolute error of a naive forecast on the training data, typically one that repeats the previous observation (or the value from one season earlier). This provides a measure of how well the forecast model is performing compared to a simple baseline: a value below 1 means the model beats the naive forecast.
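These metrics are short enough to write out directly. The sketch below assumes the naive baseline for MASE is the previous-observation forecast measured on the training data; the sample numbers are hypothetical.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Root mean squared error; penalizes large misses more than MAE."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error; fails if any actual value is zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mase(actual: Sequence[float], forecast: Sequence[float],
         train: Sequence[float]) -> float:
    """MAE scaled by the in-sample MAE of a previous-observation forecast."""
    naive_mae = sum(abs(train[i] - train[i - 1])
                    for i in range(1, len(train))) / (len(train) - 1)
    return mae(actual, forecast) / naive_mae


train = [10.0, 12.0, 14.0, 16.0]
actual = [18.0, 20.0]
forecast = [17.0, 22.0]
print(mae(actual, forecast))          # 1.5
print(mase(actual, forecast, train))  # 0.75
print(round(rmse(actual, forecast), 3), round(mape(actual, forecast), 2))
```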
Another metric is the symmetric mean absolute percentage error (SMAPE), which expresses each error as a percentage of the average magnitude of the forecast and the actual value. Unlike MAPE, which divides by the actual value alone, SMAPE gives the same result if the forecast and the actual value are swapped, and it stays finite when a single actual value is zero.
However, SMAPE has limitations of its own: it is undefined when the actual value and the forecast are both zero, and despite its name it penalizes an under-forecast more heavily than an over-forecast of the same size. Therefore, it is important to use multiple metrics to evaluate the accuracy of time series forecasts.
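A small sketch comparing the two percentage metrics, using the common definition of SMAPE with the sum of absolute values in the denominator. Treating a zero actual paired with a zero forecast as zero error is a convention assumed here, not a universal rule.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, relative to the actual value only."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric MAPE: error relative to the average size of actual and forecast."""
    total = 0.0
    for a, f in zip(actual, forecast):
        denominator = abs(a) + abs(f)
        # Assumed convention: a zero forecast of a zero actual counts as no error.
        total += 0.0 if denominator == 0 else 2.0 * abs(f - a) / denominator
    return 100.0 * total / len(actual)


# Swapping actual and forecast changes MAPE but not SMAPE.
print(round(mape([100.0], [150.0]), 1), round(mape([150.0], [100.0]), 1))    # 50.0 33.3
print(round(smape([100.0], [150.0]), 1), round(smape([150.0], [100.0]), 1))  # 40.0 40.0
# A zero actual value is still usable with SMAPE.
print(smape([0.0], [5.0]))  # 200.0
```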
Visualizing the forecast results can also provide additional insights into the data. Time series plots can show any trends or patterns in the data, while residual plots can reveal any systematic errors or biases in the forecast model. Quantile-quantile plots can also be used to check whether the forecast residuals follow a normal distribution.
Interpreting the forecast results involves considering the context and purpose of the forecast. For example, a short-horizon forecast may need to be precise at each time step, while a longer-term forecast may be more concerned with identifying general trends and patterns.
It is also important to consider any external factors or events that may impact the forecast, such as changes in market conditions, new competitors or unexpected events.
Several additional considerations should be taken into account when performing time series forecasting:
Data quality. The accuracy and completeness of the data can have a significant impact on the accuracy of the forecast. It is important to ensure that the data is clean, consistent and representative of the underlying phenomena being modeled.
Domain knowledge. Understanding the underlying domain and the factors that may influence the variable being forecast can help to inform the choice of method and parameter selection. For example, in demand forecasting for a seasonal product, knowledge of seasonal patterns and trends may inform the choice of a forecasting method.
Uncertainty and risk. Forecasting inherently involves some degree of uncertainty, and it is important to consider the potential risks associated with inaccurate forecasts. Sensitivity analysis and scenario planning can help to identify potential risks and mitigate their impact.
Updating the model. As new data becomes available, the forecasting model should be updated to incorporate the latest information. This may involve retraining the model with the updated data or using adaptive methods that can update the model in real time.
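As an illustration of the adaptive option, the sketch below wraps exponential smoothing in a small class that folds in one observation at a time and records how far off each forecast was before the update. The class and function names are hypothetical, and any model with a cheap incremental update could take the smoother's place.

```python
from typing import Iterable, List, Optional


class OnlineSmoother:
    """Exponential smoothing that is updated one observation at a time."""

    def __init__(self, alpha: float) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.level: Optional[float] = None

    def forecast(self) -> Optional[float]:
        """Prediction for the next observation (None before any data arrives)."""
        return self.level

    def update(self, observation: float) -> None:
        if self.level is None:
            self.level = observation
        else:
            self.level = self.alpha * observation + (1 - self.alpha) * self.level


def walk_forward_errors(stream: Iterable[float], alpha: float) -> List[float]:
    """Forecast each new value, record the absolute error, then update."""
    model = OnlineSmoother(alpha)
    errors = []
    for observation in stream:
        prediction = model.forecast()
        if prediction is not None:
            errors.append(abs(observation - prediction))
        model.update(observation)
    return errors


print(walk_forward_errors([10.0, 12.0, 14.0], alpha=0.5))  # [2.0, 3.0]
```

Tracking these errors over time also gives an early signal that the model has drifted and a full retraining may be needed.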