on 2021 Sep 22, 10:51 PM
We are currently on SAP HANA 2.0 SP05 and planning to implement the HANA machine learning algorithms (specifically PAL).
What guidelines are available for sizing the capacity for the SAP Hana PAL implementation?
Hi Shankar,
SAP HANA Cloud requires an additional vCPU (3 instead of 2) for running the scriptserver process (as does the document store). The Application Function Libraries (AFL), of which the Predictive Analysis Library (PAL) is part, are a database-side implementation of machine learning and statistical analysis. PAL does not require GPUs; the usual sizing recommendations apply.
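For an on-premise tenant database, the scriptserver mentioned above is not running by default. As a hedged sketch (the tenant name `MYDB` is illustrative), it can be started from the system database like this:

```sql
-- Run in the SYSTEMDB: start the scriptserver for tenant MYDB
-- (illustrative name), which hosts AFL/PAL execution in its own process.
ALTER DATABASE MYDB ADD 'scriptserver';
```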
To my knowledge there is no sizing document specific to PAL, but let me ask around a bit.
Thank you for your response and for checking further whether any sizing guideline is available specifically for PAL.
Hi Shankar,
One step back: we interact with relational database management systems (RDBMS) using SQL (Structured Query Language). Often, processing logic is required (IF ... THEN ... ELSE). For this, each RDBMS adds its own procedural language: SQLScript for SAP HANA, T-SQL for Microsoft SQL Server, PL/SQL for Oracle, etc.
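The kind of procedural logic meant here can be sketched in SQLScript like this (procedure and parameter names are illustrative, not from the thread):

```sql
-- Minimal SQLScript sketch of IF ... THEN ... ELSE logic inside SAP HANA.
CREATE PROCEDURE check_threshold (
  IN  in_value  INTEGER,
  OUT out_label NVARCHAR(10)
)
LANGUAGE SQLSCRIPT AS
BEGIN
  IF :in_value > 100 THEN
    out_label := 'HIGH';
  ELSE
    out_label := 'LOW';
  END IF;
END;
```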
Sometimes, processing requires too many resources (i.e. either you have to wait a long time or you have to spend a lot of money on the request). For those scenarios, some RDBMS make it possible to execute compiled code (C/C++). The side effect: a minor bug can crash the database kernel, and production is down. For this reason, external procedures are executed in a separate process. In the case of SAP HANA, we call these Application Function Libraries (AFL). PAL is a subset of AFL.
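To make this concrete, here is a hedged sketch of calling a PAL procedure from the `_SYS_AFL` schema; the input table `MY_DATA` is an illustrative name, and the parameter-table layout follows the standard PAL convention:

```sql
-- PAL parameter table in the usual PAL layout
-- (NAME, INT_VALUE, DOUBLE_VALUE, STRING_VALUE).
CREATE LOCAL TEMPORARY COLUMN TABLE #PAL_PARAMETER_TBL (
  "PARAM_NAME"   NVARCHAR(256),
  "INT_VALUE"    INTEGER,
  "DOUBLE_VALUE" DOUBLE,
  "STRING_VALUE" NVARCHAR(1000)
);
INSERT INTO #PAL_PARAMETER_TBL VALUES ('GROUP_NUMBER', 3, NULL, NULL);

-- MY_DATA: illustrative input table with an ID column plus feature columns.
-- The call runs in the scriptserver process, so a crash in the AFL code
-- does not take down the index server.
CALL _SYS_AFL.PAL_KMEANS (MY_DATA, #PAL_PARAMETER_TBL, ?, ?);
```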
To answer your question:
As documented
See also

For a gentle introduction into the topic of SAP HANA architecture, see

Hi Shankar,
Sizing of a machine learning scenario depends on many factors: algorithms, input data size / type / cardinality, expected performance, concurrency of algorithm invocations, and so on, as you can see from the APL sizing (see here). As PAL offers many more algorithms, it is hard to give a prediction as generic as your inquiry asks for.
There are SAP application scenarios, as well as customer and partner application scenarios, where the PAL (or APL) processing goes almost unnoticed on the system, and others where there is, for example, a monthly one-day peak in forecasting workload that requires significant additional resources.
One approach would be to schedule the PAL workload for times when the system has the required processing capacity available. Alternatively, since I understand you are looking at an on-premise HANA 2.0 installation, such workload could potentially be offloaded to a HANA Cloud instance providing the additional capacity. Furthermore, you can certainly guardrail the PAL/APL processing workload using HANA workload management (e.g. limit the threads available to a PAL invocation).
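Guardrailing via HANA workload management can be sketched as follows; the class name, mapping name, application name, and limit values are illustrative assumptions, not recommendations:

```sql
-- Hedged sketch: cap the threads and memory a PAL-heavy statement may use.
CREATE WORKLOAD CLASS "PAL_BATCH"
  SET 'STATEMENT THREAD LIMIT' = '8',     -- illustrative value
      'STATEMENT MEMORY LIMIT' = '50';    -- in GB, illustrative value

-- Route sessions of a (hypothetical) forecasting job to that class.
CREATE WORKLOAD MAPPING "PAL_BATCH_MAPPING"
  WORKLOAD CLASS "PAL_BATCH"
  SET 'APPLICATION NAME' = 'PAL_FORECAST_JOB';
```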
Inference using the PAL_*_PREDICT functions in supervised learning scenarios (regression, classification) should not impact your sizing, unless you intend to serve a larger number of PAL models in parallel using PAL model state (here).
I recommend starting to prototype your scenario more specifically (regarding data, algorithms, etc.); that would be the basis and input for your sizing efforts.
Best regards,
Christoph
Thank you for your response - I fully understand that multiple factors come into play with Machine Learning Algorithm performance.
Our team seems to have received an SAP guideline earlier to maintain 35% to 50% free memory capacity for our BW on HANA implementation. I wanted to understand whether any such guideline is available specifically for a PAL implementation in terms of overall CPU / memory, apart from the needs that arise from other factors like data volume, usage scenario, etc.
There is a plethora of information about SAP HANA sizing, including (in particular) for SAP BW/4HANA, see
Documentation, tools, specialised training, etc.
For greenfield (completely new) implementations in the (very) early days of SAP HANA, rule-of-thumb T-shirt sizing was not uncommon. But it is always best to size, i.e. prototype, your scenario, as explained by Christoph.
The advice to keep 35-50% of a very expensive machine unused is curious (but makes sense when you are selling hardware, for example).