Technology Blogs by SAP
Learn how to extend and personalize SAP applications. Follow the SAP technology blog for insights into SAP BTP, ABAP, SAP Analytics Cloud, SAP HANA, and more.
Product and Topic Expert


Question: what is Financial Planning and Analysis (FP&A)?

Answer: read more here:

Question: what is XP&A and how does it compare to FP&A?

Answer: read more here:

Question: What are leading analysts and thought leaders saying about FP&A?

Answer: read these white papers:

More resources can be found on SAP FP&A page.

Question: What is SAP’s strategic direction for Financial Planning & Analysis (FP&A)?

Answer: SAP Analytics Cloud is the recommended solution for organizations looking for a financial planning and analysis solution. It is our one solution for Collaborative Enterprise Planning, aligning finance, HR, marketing, sales, and supply chain plans so that organizations can respond faster to market changes.

Read the Deep Dive Strategy for Enterprise Planning blog by Matthias Kraemer (Head of SAP Analytics Cloud for Planning) 

Question: What are the major analysts saying about SAP Analytics Cloud FP&A capabilities?

Answer: please see the latest insights from IDC, Gartner and BARC.

Question: What is Predictive Planning?

Answer: Smart Predict as part of SAP Analytics Cloud offers time series forecasting, classification & regression scenarios to augment stories with predictions.

Predictive Planning in SAP Analytics Cloud is the ability to run time series forecasting scenarios directly on top of planning-enabled models to offer a smart baseline for the forecasting activities.

Check the intro video below, or read more about this here.

Question: What is the difference between Smart Predict and Predictive Planning?

Answer: Predictive Planning is the ability to run Smart Predict's automated time series techniques on top of data stored in SAP Analytics Cloud planning models, and to write the predictive forecasts back to the same planning models to help with the forecasting activities. To make a long story short, we call this Predictive Planning :-).

It's also worth noting that it's not possible at this stage to run Smart Predict classification & regression techniques directly on top of SAC planning models.

Question: is Predictive Planning already available in SAP Analytics Cloud?

Answer: yes, it was delivered with the third quarterly release of SAP Analytics Cloud in 2020.

Since then, a number of incremental improvements have been delivered:

  • In 2020.Q4, support was added for local currencies and for user-friendly descriptions when displaying entity names. Read more about the currency support here.

  • In 2021.01 and 2021.Q1 QRC, time series models were improved to better leverage the most recent data points.

  • In 2021.04, data smoothing techniques were made "first-class citizens" to increase the overall model accuracy, provide simpler models and better handle disrupted data conditions.

  • More improvements have been delivered in 2021, along the lines of being smart, self-service & trusted. Stay tuned, and refer to the SAP Analytics Cloud roadmap explorer to know more.

Question: Shall we treat Predictive Planning as a new module when it comes to planning?

Answer: no. The Predictive Planning experience is integrated as closely as possible with the planning experience in SAP Analytics Cloud: it can use planning models as a source and write back to private versions, making this the shortest possible loop.

Question: How much statistical knowledge is needed to use this tool effectively?

Answer: You do not need advanced knowledge to use and benefit from Predictive Planning.

Complexity is hidden so business users can create predictions without the support of a data scientist. The way the results are surfaced makes them understandable and trustworthy.

The only concept that needs to be understood is the HW-MAPE (Horizon-Wide Mean Absolute Percentage Error), which measures the accuracy of a time series model and represents the average error that the predictive model is likely to commit when used in the future.
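
To make the indicator a bit more concrete, here is a minimal Python sketch. It assumes that the HW-MAPE is the plain average of the MAPE values computed separately for each forecast horizon; the exact computation used inside SAP Analytics Cloud is not described in this FAQ, and the function names and sample numbers are made up for illustration.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error over paired values (actuals must be non-zero)."""
    pairs = list(zip(actuals, forecasts))
    return sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def horizon_wide_mape(validation_by_horizon):
    """Average the MAPE computed separately for each horizon 1..H (assumed definition)."""
    per_horizon = [mape(a, f) for a, f in validation_by_horizon]
    return sum(per_horizon) / len(per_horizon)


# Hypothetical validation data: one (actuals, forecasts) pair per horizon.
validation = [
    ([100.0, 200.0], [110.0, 180.0]),  # forecasts made 1 period ahead
    ([100.0, 200.0], [120.0, 160.0]),  # forecasts made 2 periods ahead
]

hw_mape = horizon_wide_mape(validation)
print(f"HW-MAPE: {hw_mape:.1%}")
```

Read this way, an HW-MAPE of 15% means the forecasts are expected to be off by about 15% on average across the whole forecast horizon.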

Question: I am a SAP Analytics Cloud customer. Do I need a specific license to leverage Smart Predict or Predictive Planning?

Answer: Smart Predict and Predictive Planning capabilities are available on SAP Analytics Cloud deployments running on Cloud Foundry. Depending on the world region, the hyperscaler can be AWS (Amazon Web Services), Alibaba Cloud or others. These capabilities are not available on SAP Analytics Cloud Neo deployments.

All new SAP Analytics Cloud subscriptions purchased in regions where a Cloud Foundry deployment option is available have Smart Predict provisioned as part of their subscription by default.

The official availability note is here.

The capabilities are available to every BI or Planning license in SAP Analytics Cloud, on the Cloud Foundry deployments. Please refer to the Pricing page to know more.

Predictive Planning time series forecasting is available for SAP Analytics Cloud planning-enabled models. Planning models can only be created by end-users with access to Planning Professional licenses.

Actuals and predictive forecasts can be reported on using any type of SAC BI / Planning license.

You can refer to the different features available per license type for planning models here.

Question: Is Smart Predict/Predictive Planning available to customers through SAP partners & re-sellers?


  • Partners who are authorized to resell SAC benefit from Smart Predict/Predictive Planning, which is provided as part of the standard package with Cloud Foundry licenses.

  • The Cloud Foundry version of the SAC Test & Development (T&D) license has been available since October 2019.

Question: I have a specific need that's not served by the existing Predictive Planning capabilities. How can I submit enhancement requests to SAP?

Answer: any SAP Analytics Cloud enhancement request should be raised via our Influence portal. Please search for what others have already entered, up-vote enhancement requests and enter your very own enhancement requests if they do not exist today. SAP Analytics Cloud product management teams welcome your ideas and thoughts! Read more from my colleague christian.happel here

Here are some existing Predictive Planning enhancement requests that you might be interested in voting for:

Question: I want to learn more about Predictive Planning. Where can I find more information?


Question: Is it possible to view Predictive Planning demos?

Answer: Yes, please see:

Also this one from Xavier Hacking (SAP Partner Interdobs)

Question: what are the major Predictive Planning use cases?

Answer: The use cases most often seen with Predictive Planning so far are

  • Expense & Cost Planning

  • Revenue and Sales Planning

  • Headcount planning

Other less recurrent use cases include resource management, cash flow forecasting, volume forecasting... basically whenever you have sufficient historical data and want to use predictive (time series) to support your planning activities.

Question: are there examples of customers using Predictive Planning today?

Answer: Yes.

You can refer to the SAP Corporate Controlling testimony here.

You can also read more about how Roche is using Predictive Planning here.

(SAP Innovation Awards 2021 submission)

Question: are there any known restrictions? How much historical data is required for Predictive Planning to be effective?

Answer: Please see the detailed documentation here.

Some important points:

  • You can generate your forecasting entities including a maximum of 5 dimensions and/or attributes at a given time.

  • You can create up to 1000 time series models in one go and a maximum of 500 forecasts.

  • To be in the sweet spot, you need five times as many actuals as the forecasts you request. For example, to forecast January to December 2021, you should ideally provide actuals from January 2016 to December 2020. If you have less history, predictive forecasts will still be generated, but you will receive warnings because you can be less confident in some of the predictive forecasts. For instance, if you have 40 months of history and you want to predict 12 months, then you can place less confidence in the predictions for month 11 and month 12.
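
The 5-to-1 rule of thumb from the last bullet can be sketched in a few lines of Python. The function names are hypothetical, and the ratio is the guideline described above, not a hard product limit.

```python
IDEAL_RATIO = 5  # guideline from the text: 5 historical points per forecast point


def max_recommended_horizon(history_points, ratio=IDEAL_RATIO):
    """Largest number of forecasts that still respects the ratio."""
    return history_points // ratio


def has_enough_history(history_points, horizon, ratio=IDEAL_RATIO):
    """True when the history is at least `ratio` times the requested horizon."""
    return history_points >= ratio * horizon


print(max_recommended_horizon(60))  # 60 months of actuals -> 12
print(has_enough_history(60, 12))   # True
print(has_enough_history(40, 12))   # False: forecasts still possible, with warnings
```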

Please refer to the best practice here to handle the cases where you need to generate 1000+ entities.

Question: can you please tell more about the automated time series forecasting used behind the scenes? Is it possible to understand the  algorithm used?

Answer: The logic of the algorithm used for automated time series is described in this blog.

It has been crafted and tuned to be robust and accurate while yielding results that are explainable to business users. It is the result of 20+ years of product investment and is proven by customer success.

Similar to the modern car industry, our focus is not so much on explaining the logic of the engine to the end-user but rather on the driving pleasure. Here, that means a simple, user-friendly interface and full transparency on the predictive models that have been detected (trends, cycles etc.), so that more end-users can spend more time experimenting, validating the business value and serving their use cases.

The logic we use behind the scenes is constantly evolving and is being improved over time as evidenced by two recently delivered user stories - which form part of an ongoing product investment plan to mitigate the disruption caused by the COVID-19 pandemic to the ability to plan & predict.

  • In 2021.01 and 2021.Q1 QRC, time series models were improved to better leverage the most recent data points.

  • In 2021.04, data smoothing techniques were made "first-class citizens" to increase the overall model accuracy, provide simpler models and better handle disrupted data conditions.

Customers cannot replace the automated time series forecasting logic with custom algorithms (R or Python). This is actually on purpose since Predictive Planning is geared towards business users.

Such users will typically not have the skill set to delve into data science details & algorithms, but are happy to be able to solve complex and labor-intensive forecasting problems by themselves without requiring help by data scientists who are themselves scarce in most organizations.

There are other SAP solutions, like SAP Data Intelligence or SAP HANA, that support custom algorithms & approaches to time series. You can read more about the overall SAP AI landscape in vriddhishetty's excellent blog.

Question: which data sources can be used in context of SAP Analytics Cloud Planning?

Answer: SAP Analytics Cloud can connect to various on-premise and cloud data sources including SAP HANA, SAP BW, SAP S/4HANA, SAP BPC, OData, Google BigQuery, SQL and more.

Since Predictive Planning is based on SAC planning models, all data sources that can be leveraged by SAC planning models are automatically supported, with the exception of BPC. 

This includes data from many on-premise data sources (SAP BW, SAP HANA, S/4, Universes, SQL databases, file servers and OData services) as well as cloud data sources from SAP (like SuccessFactors, Fieldglass, Concur, Hybris, ByDesign) and from external providers (like Google Drive or Google BigQuery).

Check out our connecting to data page for more information. Find help on setting up your connections in the connection guide.

You might also find the following resources useful:

S/4HANA Cloud and SAP Analytics Cloud Planning

Financial Planning combining SAP S/4HANA Cloud and SAP Analytics Cloud

SAP Integrated Business Planning and SAP Analytics Cloud Planning

Question: I would like to write-back data from SAC / Planning to source systems. On which systems is this possible? 

Answer: please refer to the main page "Exporting Models and Data" in the official product documentation. This covers different options including:

  • File


  • Odata services (BW/4HANA 2.0 and BW 750)


  • SAP Integrated Business Planning

Main SAP note:

SAP Business Warehouse (BW) 7.5. See

SAP BW/4HANA 2.0. See (part 1) (part 2)

As a take-away all these data sources can be used in combination with SAC / Planning and Predictive Planning:

  • data can be imported and scheduled refreshes can be put in place

  • it's then possible to plan, predict and visualize all in SAP Analytics Cloud

  • it's possible to write back to the underlying source systems

As mentioned in an earlier question, we have proven examples and success stories of customers doing this on top of SAP BW, as one example.

Also please note the following planned enhancement in the product roadmap.

Question: is data being replicated into SAP Analytics Cloud (aka acquired) or live?

Answer: All planning models in SAP Analytics Cloud are based on acquired/replicated data with the exception of SAP BPC Embedded (Business Planning and Consolidation). This means Predictive Planning also deals with acquired/replicated data.

Scheduled data refresh is possible for many of these data sources. Refer here.

Please note that Smart Predict offers a live integration with SAP HANA on-premise since Q4 2019. This can be beneficial to support complementary predictive and planning scenarios. Refer here and  there.

Please also note that SAP Analytics Cloud planning models based on BPC are not supported by Predictive Planning, regardless of the BPC version.

Question: I am a SAP BPC customer, why would I want to extend my investment with SAP Analytics Cloud Planning?

Answer: please read more here and see the top 10 reasons here

You can also read this interesting comparison between SAP Analytics Cloud and SAP BPC.

Question: What are the data acquisition and data preparation limits when it comes to datasets, models and stories?

Answer: please refer to the official help page here.

Question: how to integrate outcomes from Smart Predict classification and regression models into the planning process?

Answer: there are multiple ways this can be done.

Question: can I run Predictive Planning on SAC / Planning models that use fiscal year concepts?

Answer: Yes.

Question: does Predictive Planning support local currencies? is it possible to predict and plan using local currencies?

Answer: Yes. Please see the detailed blog here:

Question: How can I influence the time series predictive scenario with other datasets, say sales data plus temperature data from another file?

Answer: Please see the detailed blog here:

Here is the suggested approach to validate the business benefits of using influencers when creating forecasts:

Based on this comparison, determine whether it's worth including influencers, as data models with influencers are by nature more complex to maintain over time.

We do plan to offer support of influencers in Predictive Planning in the second half of 2021.

Question: Will it be possible to use influencers for predictive planning in later releases?

Answer: When Smart Predict time series is using datasets as an underlying data foundation, additional drivers (aka influencers) can be used and might bring a benefit to the predictive forecasting activity. We do plan to add this capability to Predictive Planning in the course of 2021.

Question: On relevant data I quickly run into the 1000-entity limit. Customer/product combinations are common, but they easily exceed this limit for predictive. Why is there a limit?

Answer: The limit of 1000 has been defined to avoid consuming too much processing power of SAP Analytics Cloud for one given predictive run and keep the ability to display per-entity results in the user interface.

If you face this limit in the context of your use cases, you can use the best practice to proceed with multiple, independent runs that you can then combine in a single private version.
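
The linked best practice is not reproduced here, but the arithmetic of splitting a large entity list into independent runs can be sketched as follows. This is plain Python for illustration only: the entity IDs are invented, and in SAP Analytics Cloud the split itself would be expressed through the entity definition and filters of each predictive scenario rather than through code.

```python
MAX_ENTITIES_PER_RUN = 1000  # limit mentioned in the answer above


def split_into_runs(entities, limit=MAX_ENTITIES_PER_RUN):
    """Partition the entity list into consecutive batches of at most `limit`."""
    return [entities[i:i + limit] for i in range(0, len(entities), limit)]


# Hypothetical customer/product combinations.
entities = [f"entity_{n}" for n in range(2500)]
runs = split_into_runs(entities)

for number, run in enumerate(runs, start=1):
    print(f"run {number}: {len(run)} entities")
```

Each batch stays within the limit, and every entity ends up in exactly one run, so the results can be combined afterwards without overlap.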


Question: Are you considering making the meaning of Trend, Cycles etc. even clearer in Predictive Planning?

Answer: Short answer is yes :-). Long answer is we feel Predictive Planning is about combining and not compromising on three pillars: Smart, Self-Service and Trust:

  • Smart means being accurate, and we are investing in this area in the first half of 2021.

  • Self-service means being intimately integrated in the end-to-end planning experience. While we have excellent support for all core concepts today, we will continue to extend it, for instance with the support of parent-child hierarchies planned for 2021.

  • To your question, trust arises when models are clearly understood and acted upon. We will continue to bring the maximum transparency. For instance, in the Q2 2021 QRC release, we'll let planners publish predictive forecasts for past periods to stories, in order to compare the predictions to the actuals more easily.

You can find the official roadmap for the entire SAP Analytics Cloud capabilities here

This specific filter will show you the roadmap plans for Augmented Analytics capabilities including Smart Predict / Predictive Planning.

Question: Is there any info about the accuracy of the predictive model?

Answer: yes, we do offer a simple measure of model accuracy named the HW-MAPE (standing for Horizon-Wide Mean Absolute Percentage Error), which measures the accuracy of a time series model and represents the average error that the predictive model is likely to commit when used in the future.

In addition it's possible to deliver predictive forecasts into stories where they can be freely compared with actuals, plans and budget using the strong calculation capabilities of SAP Analytics Cloud.

Question: Besides Horizon-Wide MAPE, are other indicators of predictions validation available, e.g. R Square (R2), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), mean absolute percentage error (MAPE), others?

Answer: Other indicators are not available natively in Predictive Planning. The guidance here would be to use story capabilities and create custom indicators as appropriate; this can help with post-validation of the predictive forecasts.
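
For reference, the standard definitions of some of these indicators are sketched below in Python. In SAP Analytics Cloud you would build the equivalent with story calculations on actuals and predictive forecasts; the sample numbers here are invented.

```python
import math


def mae(actuals, forecasts):
    """Mean Absolute Error."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def rmse(actuals, forecasts):
    """Root Mean Square Error."""
    squared = sum((a - f) ** 2 for a, f in zip(actuals, forecasts))
    return math.sqrt(squared / len(actuals))


def mape(actuals, forecasts):
    """Mean Absolute Percentage Error (actuals must be non-zero)."""
    return sum(abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)) / len(actuals)


# Hypothetical actuals and predictive forecasts for three past periods.
actuals = [100.0, 200.0, 400.0]
forecasts = [110.0, 190.0, 380.0]

print(f"MAE:  {mae(actuals, forecasts):.2f}")
print(f"RMSE: {rmse(actuals, forecasts):.2f}")
print(f"MAPE: {mape(actuals, forecasts):.2%}")
```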

Question: Let's say I have too little historical data. For instance I want to predict the whole of 2021 and 2022 at monthly level but I only have 5 years of history (60 months) from 2016 to 2020. Should I predict 2021 first, then predict 2022 combining actuals from 2016 to 2020 and predictions from 2021?

Answer: while this may sound like a workable workaround, I would not suggest going down this route. The 2022 forecasts would be based on 2021 predictions, so uncertainty might amplify in a way that's difficult to evaluate beforehand. The sweet spot for Predictive Planning monthly forecasting scenarios is probably between 6 and 18 months, depending on the historical data that's available and respecting the 5 to 1 ratio (between actuals and predictions) that was explained earlier on.

Question: How does the machine learning engine use the historical data to predict future forecasts? Does this continue to learn, if it predicts for a few periods and then gets actual data, does it use that new data to make better new forecasts?

Answer: The overall mechanism is detailed in the blog here. To your point, each time there are new actuals, you should ideally refresh your predictive model and thus the forecasts. In the case of monthly predictions, the predictive model should be refreshed every month.

Question: How much data is needed to predict e.g. 3 months, and how good will that prediction be compared to HI (human intelligence)?

Answer: Typically the ideal ratio (historical data to predictive forecasts) is 5 to 1. In my view we humans should always make sure that human intelligence rules over the machine 🙂 Joking apart, it's about taking the current planning processes in place, with their accuracy, speed and efficiency, and comparing them to a new process "augmented" with the use of predictive. The good thing is that SAP Analytics Cloud stories make it easy to perform a data-driven comparison.

Question: What are the differences between SAP Analytics Cloud Predictive Planning (aka Smart Predict on planning models) and SAP Analytics Cloud Smart Predict using datasets?

Answer: SAC Smart Predict has initially been part of SAP Analytics Cloud (SAC),  integrating with SAC's datasets which are themselves flat, tabular data structures unlike the more OLAP-oriented SAC BI models or SAC planning models.

With datasets, SAC Smart Predict supports regression, classification and time-series forecasting scenarios. For time-series, candidate influencers are supported. For datasets, only a subset of data sources is supported as compared to the list of data sources for creating BI or planning models. For a detailed list of restrictions, check the documentation.

For Predictive Planning, i.e. the integration of SAC Smart Predict with SAC planning models, only time-series forecasting is supported. This makes ample sense semantically since planning is all about the development of KPIs over time, i.e. time series. In Predictive Planning, candidate influencers are not yet supported, but their support is part of the roadmap. There are ways forecasts including influencers can be reported on in the context of stories - see

Question: How does Smart Predict deal with top-down and bottom-up approaches for budgeting?  

Answer: You are free to do both. Predictions can be made at leaf level and aggregated up; alternatively, customers are free to forecast at an intermediate or top level of the dimension hierarchy by leveraging dimension attributes to define the forecasting entity. Forecast results would then be spread down to leaf members through disaggregation, which can itself be influenced by the customer. The video Using Entities in the advanced knowledge learning track goes into great detail on the different possibilities.
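
As a rough illustration of the top-down direction, the sketch below spreads one aggregated forecast down to leaf members in proportion to their historical values. This is only one possible disaggregation rule, chosen here as an assumption; how SAP Analytics Cloud disaggregates in a given model depends on its configuration, and the member names and numbers are invented.

```python
def spread_forecast(total_forecast, history_by_leaf):
    """Allocate a top-level forecast to leaves proportionally to their history."""
    total_history = sum(history_by_leaf.values())
    return {
        leaf: total_forecast * value / total_history
        for leaf, value in history_by_leaf.items()
    }


# Hypothetical leaf members with their historical totals.
history = {"Product A": 300.0, "Product B": 100.0}

leaf_forecasts = spread_forecast(1000.0, history)
print(leaf_forecasts)  # Product A gets 75% of the total, Product B 25%
```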

Question: Is there a way to simulate different future scenarios with Predictive Planning?  

Answer: First of all, simulation and forecasting are really two different cups of tea and should not be mixed. Therefore the straight answer is that this is only possible indirectly and should be used with caution. If you still want to do this, you have two options:

  1. within Predictive Planning, you can consider preparing a private version of the planning model that describes your scenario consistently in the past. Predictive Planning can use this base data for training and hence generate the respective forecast for it

  2. If your scenario involves a second variable (how much ice cream will I sell if this summer is hot? How much if it's cold?), you are required to go back to time-series based on datasets. This is necessary in order to leverage that second variable (warm/cold weather) as a "candidate influencer". You could therefore prepare several datasets with different values of the candidate influencer to represent the different scenarios. After saving the forecasts back to a dataset, you can transfer the predictions back to the planning model by virtue of data actions with a copy step as described in this blog. Another way of doing this is this:

Question: How can I refresh my predictions at a later time?  What automation possibilities exist? 

Answer: Once new data is in your planning model, you can always go back to your predictive scenario and retrain the predictive model on the latest data. Since all the configurations are preserved, this boils down to just pressing a button to retrain the model and save the predictive forecasts.

It is currently not possible to fully automate the end-to-end process through e.g. scheduling. This is not necessarily a big problem, since

  • Predictive forecasts are typically not generated every day but rather every other month, according to the schedule of the controlling department. Since there is already a high level of forecasting automation by generating many forecasts at the same time along the chosen forecasting dimensions, pressing a single button every other month should not be an issue.

  • Controllers would typically want to double-check the output of the predictive planning process in detail before saving results to the planning model. Fully automating the process without any user intervention is not always desirable.

Question: what is the time granularity for which predictive forecasts get generated? 

Answer: Let's consider the most simple cases first.

I create a planning model and the granularity of the date dimension can be either Year, Quarter, Month or Day (read more about the Date Dimension here).

Let's also assume the data is stored at the same level. (In some cases the Date Dimension is defined at one level while the data is stored at another, for example a daily model whose data is stored on a weekly basis; these cases are covered further below.)

In the simple case, Predictive Planning will display the proper time granularity and produce forecasts with the corresponding granularity.

Again let's continue on the same simple example:

  • The granularity of the date dimension of my planning model is defined as monthly.

  • I have data for every month from January 2016 to December 2020, i.e. 60 months.

  • I ask Predictive Planning for 12 forecasts.

In this case Predictive Planning will generate forecasts from January 2021 to December 2021.

Similarly if I have a date dimension at daily level, I have data from January 1st 2016 to December 31st, 2020 and ask for 365 forecasts ahead, I will then get predictive forecasts from January 1st, 2021 to December 31st, 2021.
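
The two simple cases above boil down to continuing the calendar right after the last actual. Here is a small Python sketch of that arithmetic; the helper names are hypothetical and this is not how Predictive Planning is implemented, it only reproduces the date ranges from the examples.

```python
from datetime import date, timedelta


def next_months(last_year, last_month, count):
    """Return the `count` (year, month) pairs following the last actual month."""
    periods = []
    year, month = last_year, last_month
    for _ in range(count):
        month += 1
        if month > 12:
            year, month = year + 1, 1
        periods.append((year, month))
    return periods


def next_days(last_day, count):
    """Return the `count` calendar days following the last actual day."""
    return [last_day + timedelta(days=offset) for offset in range(1, count + 1)]


monthly = next_months(2020, 12, 12)       # actuals end in December 2020
daily = next_days(date(2020, 12, 31), 365)  # actuals end on December 31st, 2020

print(monthly[0], monthly[-1])  # (2021, 1) (2021, 12)
print(daily[0], daily[-1])      # 2021-01-01 2021-12-31
```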

The description above is nicely summarized in the official help here: "Time granularity: The time series predictive model is trained and applied based on the level of time granularity available in the planning model data source. If the planning model lowest level is daily, then Smart Predict will create daily predictive forecasts."

Let's now move to somewhat more complex cases where the time granularity in the date dimension and the effective time granularity of the data differ.

One example could be a model with a daily time granularity defined in the date dimension but the data is effectively stored every month or every week for instance. In this scenario Predictive Planning will give priority to the effective granularity of the data.

Let's take two concrete examples there:

Example 1: I have a model with daily granularity in the Date dimension - from January 1st, 2016 to December 31st, 2021. In practice though the data is stored at monthly level - I have one row of data for January 2016, one for February 2016 etc. If I ask for 12 forecasts, they will be generated for January 2021 to December 2021, not for January 1st, 2021 to January 12th, 2021.

Example 2: I have a model with daily granularity in the Date dimension - from January 1st, 2016 to December 31st, 2021. In practice though the data is stored at weekly level - I have one row of data for January 1st, 2016, one for January 8th, 2016 etc. If I ask for 12 forecasts, they will be generated for the first 12 weeks of 2021 (depending on when the end of the last week of 2020 falls!)

Now in the most complex scenarios it could be that the data granularity is different from regular patterns. For instance I might have data stored every 10 days. Again Predictive Planning will identify the data granularity and reproduce it for the future, generating one predictive forecast every 10 days.
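
To picture what "identify the data granularity and reproduce it" means, here is a minimal Python sketch for the 10-day case. It assumes a constant spacing measured in days; the way Predictive Planning actually detects granularity is not documented in this FAQ, and calendar-based spacings such as months would need month arithmetic instead of a fixed number of days.

```python
from datetime import date


def infer_step(dates):
    """Return the constant gap between consecutive dates, or fail if irregular."""
    gaps = {later - earlier for earlier, later in zip(dates, dates[1:])}
    if len(gaps) != 1:
        raise ValueError("dates are not evenly spaced")
    return gaps.pop()


def project(dates, count):
    """Continue the observed spacing for `count` future data points."""
    step = infer_step(dates)
    return [dates[-1] + step * offset for offset in range(1, count + 1)]


# Hypothetical actuals stored every 10 days in a daily model.
actual_dates = [date(2020, 12, 1), date(2020, 12, 11), date(2020, 12, 21)]

future_dates = project(actual_dates, 3)
print(future_dates)  # one forecast every 10 days after the last actual
```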

Finally, the “hourly” time granularity is currently not supported (see the corresponding SAC enhancement request). Weekly granularity is in the works - see here.

Question: is it possible to write back predictive forecasts for past periods? 

Answer: in the Q2 2021 QRC release, we plan to provide planners with the ability to publish predictive forecasts for past periods to stories, in order to compare the predictions to the actuals more easily.

Question: is it possible to exclude certain entities with data issues, be it for lack of information, low number of training records, or large gaps in time series?

Answer: exclusion can be handled in multiple ways:

  • Using specific attributes to remove specific dimension members. You can refer to the video here to know more. In 2021.H2 we delivered hierarchy support when defining the entity to facilitate this exclusion / filtering.

  • The source version in the planning model used to train the predictive model can be pre-processed to filter out some parts of the data and prevent the corresponding entities from being created.

also see

Question: what do you suggest as a way to control the quality of the data before using it to forecast, as the quality of the data may vary strongly across entities?

Answer:  The suggested best practice is to make ample use of stories before forecasts are generated to pre-validate the quality of the data and exclude what should be excluded from the predictive forecasting scope up-front (pre-processing).

In post-processing, there are different ways to gauge the quality of the predictions before they are exposed to planners. The Horizon-Wide MAPE indicator in the predictive scenario directly measures the expected error. This can be complemented by creating custom indicators, calculations or variance analysis in stories.

Question: when writing back the predictive forecasts to the private version, I received an error that not all forecasts could be written back and that the planning model time range should be extended. I am not sure why I receive this error and what to do. Can you please help?

Answer: irregular time series might cause the predictive forecasts to extend beyond the range the end-user expects. If I have only one data point filled every two years, Predictive Planning will reproduce this pattern in the future. The typical way to prevent this message from appearing is to exclude the entities causing the problem up-front, or to fix the fact that the data is too sparse. Such issues can be spotted via a dedicated SAC story to pre-analyze the data, or by using the “Row Count” per entity once the Smart Predict model has been trained.
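
The row-count check mentioned above amounts to a simple filter. The sketch below shows the idea in Python with invented entity names and an arbitrary threshold; in practice you would read the “Row Count” per entity from the predictive scenario or from a story and pick a threshold that suits your data.

```python
def sparse_entities(row_counts, min_rows):
    """Return the entities whose number of data points is below `min_rows`."""
    return sorted(entity for entity, rows in row_counts.items() if rows < min_rows)


# Hypothetical row counts per entity over a 60-month history.
row_counts = {"DE / Bikes": 60, "FR / Bikes": 58, "US / Helmets": 3}

# Assumed threshold for illustration: at least half of the 60 months filled.
to_review = sparse_entities(row_counts, min_rows=30)
print(to_review)  # candidates to exclude or to fix before forecasting
```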

Question: As a SAP Analytics Cloud user, what role is required to create predictive forecasts with Smart Predict / Predictive Planning?


  • You need to have a "Predictive Content Creator" role in SAP Analytics Cloud

  • Help reference

Question: I want to use predictive capabilities for strategic planning and forecast multiple years ahead. Is this possible using Predictive Planning?


Answer: It is important to understand that there is a limit to what predictive can do when it comes to long-term forecasting.

Predictive is well suited for forecasts in the range of 6 to 18 months, assuming sufficient historical data is provided (with the ideal 5:1 ratio, this means 30 to 90 months of history).

If you provide 5 years of history and ask for a 4-year forecast using predictive, then the accuracy for years 2/3/4 will be very low.

If the multi-year requirement is fixed, you would be better off forecasting the first year, then using formulas to extend to years 2/3/4. You cannot expect predictive to deliver accurate predictions multiple years ahead.

Question: I am not sure I fully understand the concept of entities, or whether I should be predicting at a high (aggregated) level or at a detailed level using entities. Where can I better understand this concept?

Answer: the general rule of thumb, and the question you have to ask yourself, is at which level you need accurate predictive forecasts. Is it more important to get accurate predictions globally, per country, or per country and product you sell? Based on this information you can then determine at which level you need the data to be aggregated and at which level you want to generate the predictive forecasts.

To understand more about the concept of entities and how they are used in the process just described, please refer to the help links - link 1 and link 2.

Question: how to best choose the right level of data aggregation to create the predictive forecasts?

Answer: This is a tricky question, as we face both choices and constraints when making this decision:

  • We can generate up to 1000 predictive models maximum

  • The more detailed we go, the more likely we are to face the “curse of multidimensionality” (for example, products that are less frequently sold in certain areas of the business), and data quality issues sometimes increase at the lowest level

  • If we go too aggregated (then spread to the lowest levels) our local forecasts might lose precision.
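The first constraint can be checked before any training by counting how many entities each candidate definition would produce. The sketch below is a hypothetical helper working on exported records; the dimension names `region` and `product` and the sample data are assumptions, while the limit of 1000 is the figure quoted above.

```python
MAX_ENTITIES = 1000  # limit on predictive models quoted in this FAQ


def count_entities(records, entity_dims):
    """Count distinct member combinations for the chosen entity dimensions."""
    return len({tuple(rec[dim] for dim in entity_dims) for rec in records})


def candidate_levels(records, dim_options, limit=MAX_ENTITIES):
    """Map each candidate entity definition to (entity_count, fits_limit)."""
    result = {}
    for dims in dim_options:
        n = count_entities(records, dims)
        result[dims] = (n, n <= limit)
    return result


# Hypothetical data set: 40 regions selling 50 products each.
records = [
    {"region": f"R{r}", "product": f"P{p}"}
    for r in range(40)
    for p in range(50)
]

levels = candidate_levels(
    records, [("region",), ("product",), ("region", "product")]
)
for dims, (n, fits) in levels.items():
    print(dims, n, "ok" if fits else "too many entities")
```

In this made-up case the region * product combination yields 2000 entities, so the forecast would have to be produced at region or product level, or the scope reduced.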

Ultimately, the balance has to be found through trial and error (experiments): finding where forecast accuracy is maximized while still being able to generate robust predictions in an acceptable time frame.

Forecasting at an intermediate level does not prevent us from obtaining numbers at the lowest level, as we can use the trick here to meaningfully allocate the numbers to the lowest levels.
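To make the idea of allocation concrete, here is a minimal Python sketch of one common approach: spreading a parent-level forecast to its children in proportion to their historical values. This is an illustrative assumption and not necessarily the technique described in the linked post; the function name and the sample figures are hypothetical.

```python
def allocate_by_history(parent_forecast, child_history):
    """Split a parent forecast across children using historical shares.

    child_history: {child: historical total}. If the history sums to zero,
    fall back to an equal split (an assumption made for this sketch).
    """
    if not child_history:
        raise ValueError("at least one child is required")
    total = sum(child_history.values())
    if total == 0:
        share = parent_forecast / len(child_history)
        return {child: share for child in child_history}
    return {
        child: parent_forecast * value / total
        for child, value in child_history.items()
    }


# A region-level forecast of 1000 spread over three products.
history = {"P1": 600.0, "P2": 300.0, "P3": 100.0}
allocation = allocate_by_history(1000.0, history)
print(allocation)
```

The allocated values always sum back to the parent forecast, so the intermediate-level prediction is preserved.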

The data science literature is clear, and there is a consensus, that we will usually get the most precision when we use the data at the level where we want the predictive accuracy to be maximized.

Examples: if my main goal is for the predicted numbers to be optimized for every product in every region, then it’s probably better to use data (and create entities) for each product * region combination. If it’s more important to do this at the region level, then I should use data for the region, aggregating the values for all products. Conversely, it might be more interesting to maximize accuracy for all products worldwide, etc.

In my view, where the business can help is by telling us at what level they plan and what is most important for them, to guide us in this process. At the same time, the final say should come from those operating the Predictive Planning part, so that they can offer the best possible outcome.

As always, 80% of the key activities when it comes to data science actually relate to the data part and getting this right – we face this here as well.

Question: at what stage would you recommend involving the business owners?

Answer: it’s always beneficial to involve “power users” or “early adopters” from the business as early as possible, once we feel ready to share initial insights with them of course. They can guide us on their expectations, on how they would use the experience, on the level at which they need the predicted numbers most, etc. It's preferable to show an imperfect or work-in-progress experience early in order to gain feedback.

Question: Is predictive model training data accessible in stories, e.g. training overview, forecast, and signal analysis?

Answer: partly yes. In wave 2021.06 (which will be part of the 2021.Q2 QRC, the May release) it will be possible to write back predictions for past periods in stories.

Outliers, error min / max and signal analysis will remain accessible only in the Smart Predict user interface, as we do not have a solution to technically store such data in the planning model.

Question: Can data intersections (entities) with issues be excluded, e.g., zero volumes, low number of training records, large gaps in time series?

Answer: In practice, exclusion can be handled in multiple ways:

  • Using specific attributes to remove specific dimension members. In H2 2021 we plan to offer support for parent-child hierarchies in the entity definition field to facilitate the definition of the forecasting scope.

  • The source version for the planning model can be pre-processed to not include certain areas in the data scope - I think we named this “rationalizing the actuals” or “pre-processing” in the call.

Some customers massively rely on data actions during this phase to prepare actuals so that they are in the best possible shape for predictive activities.

Right now it’s not possible to exclude entities at an individual level.

Question: Is an Excel export of valid predictions the way to control the data source, i.e. exclude/include data for final predictions variance to plan?

Answer: the suggested best practice is to make ample use of stories beforehand to pre-validate the quality of the data and to exclude up front (pre-processing) whatever should not be part of the predictive forecasting scope. In post-processing, there are different ways to gauge the quality of the predictions: using the HW-MAPE in Smart Predict, or using custom indicators, calculations or variance analysis in stories (like the R-Square example).
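For readers who want to see what such custom indicators compute, here is a minimal Python sketch of a plain MAPE and an R-Square between actuals and predictions. It is a simplified illustration on made-up numbers: it is not the HW-MAPE calculation performed by Smart Predict, and in practice these indicators would be built as calculations in a story rather than in Python.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, skipping periods where actual is 0."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actuals to compare against")
    return sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def r_squared(actuals, forecasts):
    """Coefficient of determination: 1 - residual sum of squares / total."""
    mean_actual = sum(actuals) / len(actuals)
    ss_res = sum((a - f) ** 2 for a, f in zip(actuals, forecasts))
    ss_tot = sum((a - mean_actual) ** 2 for a in actuals)
    if ss_tot == 0:
        raise ValueError("actuals are constant, R-Square is undefined")
    return 1 - ss_res / ss_tot


# Made-up actuals and predictions for four past periods.
actuals = [100.0, 120.0, 80.0, 100.0]
forecasts = [110.0, 114.0, 84.0, 95.0]

print(f"MAPE: {mape(actuals, forecasts):.2%}")
print(f"R-Square: {r_squared(actuals, forecasts):.3f}")
```

A low MAPE together with an R-Square close to 1 suggests the predictions track the actuals well; either indicator alone can be misleading on sparse or near-constant series.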

Question: can I use level-based hierarchies as entities?

Answer: yes, please see

Question: can I forecast calculated accounts?

Answer: no, you should forecast the base accounts that form part of the calculation - see

Question: can I forecast accounts in local currencies?

Answer: yes, see this blog

Question: do you have best practices to handle the life-cycle management of versions?

Answer: please see

Question: do you have best practices and recommendations to compare the accuracy of forecast models?

Answer: please see

Question: do you have best practices and recommendations to compare actuals and predictive forecasts?

Answer: please see

Question: how to best influence the disaggregation of predictive forecasts to the lowest levels?

Answer: please see

Question: do you have some tutorials to recommend for me to go hands-on?


Last updated: December 21st, 2021