The pandemic that broke out in March 2020 disrupted many businesses across the globe, resulting in a huge change in the behavior of the KPIs (key performance indicators) they measure. For example, the revenue generated by airline companies dropped while e-commerce grew faster than ever.
Since the 2021.Q1 quarterly release, we have been improving the way a Predictive Scenario analyzes time series data. Our goal is to provide relevant and accurate predictive forecasts, including when there are disruptions. Our first improvement (2021.Q1 QRC) was to leverage the most recent data to reduce the effect of the disruption. In our roadmap, you can also see other improvements such as the detection of change points and the introduction of piece-wise trends. In this article, I describe how we included a new class of algorithm, named exponential smoothing, in our automated time series forecasting technique. This feature is now available in the 2021.Q2 QRC release for Smart Predict and Predictive Planning.
Why Exponential Smoothing?
The exponential smoothing technique was introduced to adapt to significant, recent changes in the data. Its smoothing factors give more weight to recent data, which is exactly what we need for the disrupted data mentioned above.
The implementation of Exponential Smoothing inside a Predictive Scenario is optimized to:
- Produce multiple predictive forecast periods.
- Test multiple seasonalities internally.
- Supply better results when the amplitude of the cycle component of the predictive model varies.
- Automate the predictive model creation process and keep transparency for the SAP Analytics Cloud end user.
Now let's see exponential smoothing in action inside a time series forecasting scenario. As an example, I use the evolution of house prices in Dallas. I collected monthly historical data since 2006 to forecast the 2021 prices.
I created a time series forecasting Predictive Scenario and requested twelve forecasts. The Horizon-Wide MAPE (Mean Absolute Percentage Error) of the predictive model is 3.10%, which means that the forecast errors are small relative to the actual values. Predictive forecasts are shown in the figure below.
Fig 1: Predictive forecasts for 2021
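To make the MAPE figure more concrete, here is a minimal Python sketch of a plain mean absolute percentage error. The function name and the sample values are illustrative assumptions (they are not the Dallas dataset), and the way Smart Predict aggregates the error over the forecast horizon to obtain the Horizon-Wide MAPE is not reproduced here.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, as a fraction (0.031 means 3.1%)."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    # Each term is the absolute error relative to the actual value.
    errors = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors)


# Illustrative values only, not the data used in this article.
actuals = [200.0, 210.0, 220.0, 250.0]
forecasts = [190.0, 210.0, 231.0, 240.0]
print(f"MAPE = {mape(actuals, forecasts):.2%}")  # MAPE = 3.50%
```

A value of a few percent means that, on average, each forecast is off by only a few percent of the actual value.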
In the Explanation tab, you can see the breakdown of the predictive model. The predictive model has been obtained using a smoothing technique.
Fig 2: Predictive model based on an exponential smoothing technique
Predictive scenarios combine transparent results with accurate models. Trustworthy, self-explained forecasts are presented to the SAP Analytics Cloud users so that they can understand the predictive forecasts easily and take relevant actions and decisions for their business.
The dataset of this example is available here.
Understand Exponential Smoothing
In this blog, I give only an overview of exponential smoothing, as many papers already describe it in detail. You can find such references at the end. This blog mostly explains the principle with simple formulas.
Usually, three types of exponential smoothing techniques are used: simple, double, and triple. Let me explain their principles.
Simple Exponential Smoothing
It is a progressive and additive formula where the predictive forecast at time t (noted F_{t}) depends on the actual value at time t-1 (noted A_{t-1}) along with the predictive forecast at time t-1 (noted F_{t-1}).
If there is no prediction for the first data point of the time series measured at t_{0}, we assume that the predictive forecast at t_{0} (noted F_{0}) is equal to the actual value at t_{0} (noted A_{0}).
Using the above statement, the initial condition can be written as:
F_{0} = A_{0}
The formula of the simple exponential smoothing is described as shown below:
The predictive forecast at time t is the sum of the predictive forecast at time t-1 and the error on the previous predictive forecast multiplied by a smoothing factor named α whose value is between 0 and 1.
F_{t} = F_{t-1} + α(A_{t-1} – F_{t-1})
If we regroup the terms in F_{t-1}, we get the following formula:
F_{t} = αA_{t-1} + (1 – α) F_{t-1}
From this formula, we see that a smoothing factor near 1 reduces the weight of the oldest data and gives greater importance to recent data. On the other hand, a factor near 0 increases the smoothing and reduces the weight of recent values. Thus, when the dataset contains a disruption, a smoothing factor near 1 is a sensible choice.
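The effect of the smoothing factor can be seen with a few lines of Python. This is a minimal sketch of the recursion F_{t} = αA_{t-1} + (1 – α)F_{t-1} with F_{0} = A_{0}. The function name and the sample series are assumptions made for illustration, and this is not the implementation used inside a Predictive Scenario.

```python
def simple_exponential_smoothing(actuals, alpha):
    """Return the forecasts F_0..F_n for the actual values A_0..A_{n-1}.

    Implements F_t = alpha * A_{t-1} + (1 - alpha) * F_{t-1} with F_0 = A_0.
    The last element, F_n, is the forecast for the next, unseen period.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    forecasts = [actuals[0]]  # initial condition F_0 = A_0
    for previous_actual in actuals:
        previous_forecast = forecasts[-1]
        forecasts.append(alpha * previous_actual + (1 - alpha) * previous_forecast)
    return forecasts


# Illustrative series with a sudden drop from 100 to 50 (not real data).
series = [100.0, 100.0, 100.0, 100.0, 50.0, 50.0, 50.0, 50.0]
reactive = simple_exponential_smoothing(series, alpha=0.9)
smooth = simple_exponential_smoothing(series, alpha=0.1)
print(f"alpha=0.9 -> next forecast {reactive[-1]:.3f}")
print(f"alpha=0.1 -> next forecast {smooth[-1]:.3f}")
```

With α = 0.9, the next forecast is already very close to the new level of 50, while with α = 0.1 it is still above 80 four periods after the drop.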
The previous formula provides the explanation of the term “exponential” in this technique. If we replace the predictive forecast at time t-1 by its value in the formula, we obtain the following formula:
F_{t} = αA_{t-1} + α (1 – α)A_{t-2} + (1 – α)^{2} F_{t-2}
If we continue these substitutions down to F_{1}, which is equal to A_{0} because F_{0} = A_{0}, we obtain:
F_{t} = α[A_{t-1} + (1 – α)A_{t-2} + (1 – α)^{2} A_{t-3} + … + (1 – α)^{t-2}A_{1}] + (1 – α)^{t-1} A_{0}
And because the weight of the actual data decreases following an exponential function, the term “exponential” appears in the name of the technique.
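As a quick numerical check, the sketch below computes the weight that each actual value receives when the recursion is fully unrolled and compares the weighted sum with the recursive forecast. The function names and sample values are illustrative assumptions. It relies on F_{0} = A_{0}, which implies F_{1} = A_{0}.

```python
def ses_recursive(actuals, alpha):
    """Forecast F_t for t = len(actuals), computed recursively with F_0 = A_0."""
    forecast = actuals[0]
    for actual in actuals:
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast


def ses_weights(t, alpha):
    """Weights applied to A_0..A_{t-1} in the unrolled form of F_t (t >= 1)."""
    # A_i receives alpha * (1 - alpha)**(t - 1 - i) for i = 1..t-1.
    recent = [alpha * (1 - alpha) ** (t - 1 - i) for i in range(1, t)]
    # A_0 absorbs the initial condition and receives (1 - alpha)**(t - 1).
    return [(1 - alpha) ** (t - 1)] + recent


actuals = [12.0, 15.0, 11.0, 14.0, 18.0]
alpha = 0.5
weights = ses_weights(len(actuals), alpha)
expanded = sum(w * a for w, a in zip(weights, actuals))

print(weights)                        # [0.0625, 0.0625, 0.125, 0.25, 0.5]
print(expanded)                       # 15.5625
print(ses_recursive(actuals, alpha))  # 15.5625
```

The weights sum to 1, and each step back in time multiplies the weight by (1 – α), which is the geometric decay that gives the technique its name.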
Double Exponential Smoothing
One limitation with the simple exponential smoothing is that it does not provide reliable results when there is a trend in the historical data. The predictive forecasts are either underestimated or overestimated depending on the slope of the trend. The goal of the double exponential smoothing is to smooth this trend to minimize the effect on the predictive forecasts.
Here, we use the same initial assumption:
F_{0} = A_{0}
Consider that the predictive forecast has two components:
- A level (noted L) to measure how high the time series is and,
- A slope (noted T)
With this notation, we can write:
F_{t+h} = L_{t} + h · T_{t}
where h is the number of periods ahead that we want to forecast (it is the horizon).
The level and the slope at time t are given by these formulas:
L_{t} = α A_{t} + (1 – α)(L_{t-1} + T_{t-1})
T_{t} = β(L_{t} – L_{t-1}) + (1 – β)T_{t-1}
α and β are smoothing factors whose values are between 0 and 1.
To compute the level and the slope, it is necessary to add new initial conditions:
L_{1} = A_{1}
T_{1} = A_{1} – A_{0}
To be consistent with the existing terminology used in the Explanation panel of a Predictive Scenario, the component shown there is called a trend. In the context of exponential smoothing, it is defined as a linear combination of the slope at time t and the level.
Trend_{t} = aT_{t} * t + L_{t}
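The level and slope recursions of double exponential smoothing can be sketched in Python as follows. The code uses the initial conditions L_{1} = A_{1} and T_{1} = A_{1} – A_{0} and forecasts h periods after the last observation with the last level plus h times the last slope. The function name, argument names, and sample series are assumptions for illustration only, not the product implementation.

```python
def double_exponential_smoothing(actuals, alpha, beta, horizon):
    """Level/slope smoothing with L_1 = A_1 and T_1 = A_1 - A_0.

    Returns (levels, slopes, forecasts). forecasts[h - 1] is the forecast
    h periods after the last actual value: last level + h * last slope.
    """
    if len(actuals) < 2:
        raise ValueError("need at least two observations to initialise the slope")
    level = actuals[1]
    slope = actuals[1] - actuals[0]
    levels, slopes = [level], [slope]
    for actual in actuals[2:]:
        previous_level = level
        # L_t = alpha * A_t + (1 - alpha) * (L_{t-1} + T_{t-1})
        level = alpha * actual + (1 - alpha) * (previous_level + slope)
        # T_t = beta * (L_t - L_{t-1}) + (1 - beta) * T_{t-1}
        slope = beta * (level - previous_level) + (1 - beta) * slope
        levels.append(level)
        slopes.append(slope)
    forecasts = [level + h * slope for h in range(1, horizon + 1)]
    return levels, slopes, forecasts


# A perfectly linear illustrative series: the trend should be extended as is.
series = [10.0, 12.0, 14.0, 16.0, 18.0]
levels, slopes, forecasts = double_exponential_smoothing(series, 0.5, 0.5, horizon=3)
print(forecasts)  # [20.0, 22.0, 24.0]
```

On a purely linear series the slope stays constant, so the forecasts simply continue the line, which simple exponential smoothing cannot do.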
Triple Exponential Smoothing
If the historical data has seasonal cycles, double exponential smoothing will not provide reliable results. To address this, triple exponential smoothing smooths both the trend and the cycles to minimize their effects on the predictive forecasts.
Consider that the predictive model has three components:
- A level (noted L),
- A slope (noted T),
- A cycle (noted S).
With this notation, we can write:
F_{t+k} = (L_{t} + kT_{t})S_{t-M+k}
Where:
- k goes from 1 to the horizon h that we want to predict and,
- M is a seasonality parameter that stands for the size of a cycle (number of data points in a cycle). Here, we try various seasonalities such as every day, every month, and every quarter. Refer to the online help to get the complete list.
The level, the slope, and the cycle at time t are given by these formulas:
L_{t} = α A_{t}/S_{t-M} + (1 – α)(L_{t-1} + T_{t-1})
T_{t} = β(L_{t} – L_{t-1}) + (1 – β)T_{t-1}
S_{t} = δ A_{t}/L_{t} + (1 – δ)S_{t-M}
α, β, and δ are smoothing factors whose values are between 0 and 1.
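The three recursions above can be sketched in Python as shown below. The article does not state initial conditions for the triple case, so the initialisation used here is an assumption: the level starts at the mean of the first cycle, the slope at the per-period difference between the means of the first two cycles, and the cycle indices at the ratio of each value of the first cycle to that mean. The function and argument names are also illustrative, and the horizon is limited to one cycle because S_{t-M+k} is only known for k up to M.

```python
def triple_exponential_smoothing(actuals, cycle_length, alpha, beta, delta, horizon):
    """Multiplicative level/slope/cycle smoothing; returns `horizon` forecasts."""
    m = cycle_length
    if len(actuals) < 2 * m:
        raise ValueError("need at least two full cycles of data")
    if not 1 <= horizon <= m:
        raise ValueError("horizon must be between 1 and cycle_length")

    # Assumed initialisation (not taken from the article).
    first_mean = sum(actuals[:m]) / m
    second_mean = sum(actuals[m:2 * m]) / m
    level = first_mean
    slope = (second_mean - first_mean) / m
    seasonals = [a / first_mean for a in actuals[:m]]  # S_0..S_{M-1}

    for t in range(m, len(actuals)):
        previous_level = level
        # L_t = alpha * A_t / S_{t-M} + (1 - alpha) * (L_{t-1} + T_{t-1})
        level = alpha * actuals[t] / seasonals[t - m] + (1 - alpha) * (previous_level + slope)
        # T_t = beta * (L_t - L_{t-1}) + (1 - beta) * T_{t-1}
        slope = beta * (level - previous_level) + (1 - beta) * slope
        # S_t = delta * A_t / L_t + (1 - delta) * S_{t-M}
        seasonals.append(delta * actuals[t] / level + (1 - delta) * seasonals[t - m])

    last = len(actuals) - 1
    # F_{t+k} = (L_t + k * T_t) * S_{t-M+k}
    return [(level + k * slope) * seasonals[last - m + k] for k in range(1, horizon + 1)]


# Illustrative series: a flat level of 100 with a repeating 4-period cycle.
series = [80.0, 100.0, 120.0, 100.0] * 3
forecasts = triple_exponential_smoothing(series, 4, alpha=0.3, beta=0.2, delta=0.4, horizon=4)
print([round(f, 2) for f in forecasts])
```

On this series the level stays at 100 and the slope at 0, so the forecasts reproduce the cycle 80, 100, 120, 100 up to floating-point rounding.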
To conclude this section, the Excel file “Exponential Smoothing Calculations.xlsx” is provided for your reference. It shows examples of these three exponential smoothing methods. You can see the following:
- The initial conditions
- The formulas
- And the graphical representations of the predictive forecasts, the level, the trend, and the cycle.
Integration of Exponential Smoothing in a Predictive Scenario
In a Predictive Scenario, time series data can come from either a planning model or a dataset. This blog explains how historical data is analyzed, prior to the introduction of the exponential smoothing technique.
The Predictive Scenario splits the historical dataset into a training dataset (first 75% of the data) and a validation dataset (last 25% of the data). Then it breaks down the training dataset into three components to find simple predictive models for the:
- Trend – there are three types of trends: differential, linear, and polynomial
- Cycles – these are defined based on periodicity and on seasonality
- Fluctuation – it is obtained by building an auto-regression model on what remains of the training dataset when the trend and cycles have been removed from it.
The Predictive Scenario builds combinations of these simple predictive models. These combinations are evaluated against the validation dataset to determine the one which gives the best accuracy.
Finally, the best combination is refitted and adjusted on the complete historical dataset to consider the most recent data.
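The split, evaluate, and refit steps described above can be sketched in Python. The two candidate models here are deliberately simple, hypothetical stand-ins (they are not the trend, cycle, and fluctuation models that a Predictive Scenario builds), and all names are assumptions made for illustration.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, as a fraction."""
    return sum(abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)) / len(actuals)


def naive_last_value(history, horizon):
    """Hypothetical candidate: repeat the last observed value."""
    return [history[-1]] * horizon


def linear_drift(history, horizon):
    """Hypothetical candidate: extend the average step of the history."""
    step = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + step * h for h in range(1, horizon + 1)]


def select_and_refit(series, candidates, horizon, train_fraction=0.75):
    """Pick the candidate with the lowest validation MAPE, then refit it."""
    split = int(len(series) * train_fraction)
    training, validation = series[:split], series[split:]
    # Evaluate every candidate on the last 25% of the data.
    scores = {
        name: mape(validation, model(training, len(validation)))
        for name, model in candidates.items()
    }
    best_name = min(scores, key=scores.get)
    # Refit on the complete history so the most recent data is considered.
    return best_name, scores, candidates[best_name](series, horizon)


series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0]
candidates = {"naive_last_value": naive_last_value, "linear_drift": linear_drift}
best_name, scores, forecasts = select_and_refit(series, candidates, horizon=3)
print(best_name, forecasts)  # linear_drift [26.0, 28.0, 30.0]
```

The same pattern applies whatever the candidates are: each one is fitted on the training part, scored on the validation part, and only the winner is refitted on the full history.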
This process is kept with the exponential smoothing technique.
With the introduction of exponential smoothing, we have new predictive models based on level, trend, and cycles. In a first step, predictive models built with double exponential smoothing and with triple exponential smoothing compete with each other, evaluated over several period lengths based on the time granularity. Note that simple exponential smoothing does not enter this competition. In a second step, the winner of the first step enters the internal automated machine learning competition as described in the section “Model selection and quality of model” of this blog. The model with the highest accuracy becomes the winning predictive model presented to the SAP Analytics Cloud end user.
We make these two techniques compete to best adapt to the shape of the time series. If there are disruptions in the data, we might get better results with exponential smoothing.
Conclusion
Predictive Scenarios are constantly evolving to bring you the best combination of accuracy, trust, and simplicity, as illustrated by the delivery of this new feature in 2021.Q2 QRC. This means that SAP Analytics Cloud Business Intelligence (BI) and Planning users benefit from the transparency of this feature and get accurate, relevant predictive forecasts and explanations. I hope this serves your business needs in disrupted times.
If you appreciated reading this blog, I would be grateful if you can like, share, and provide your comments. Thank you.
References: