Technology Blog Posts by SAP
ChristophMorgen
Product and Topic Expert

With the 2024 Q2 database release, several new features have been released in the SAP HANA Cloud Predictive Analysis Library (PAL). An enhancement summary is available in the What’s new document for SAP HANA Cloud database 2024.14 (QRC 2/2024).

The feature highlights of the current release are described in more detail below.

Classification and Regression enhancements

A new multi-task multilayer perceptron (MLP) function is introduced, enabling multi-label classification or multi-target regression scenarios. Using a multi-task learning neural network, a single ML model can predict multiple related target columns at once, because the same model captures both the features common across tasks and the task-specific information.

  • It leverages the commonalities between related tasks to improve the performance,
    generalization, and training efficiency, and moreover enables efficient use of data,
    better feature extraction, knowledge transfer, regularization, and end-to-end learning.
  • Furthermore, the function supports early stopping using validation data to avoid overfitting.
  • Users of the new multi-task MLP also benefit from shorter MLP training times
    and models with improved accuracy.
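
The idea of shared and task-specific parts can be illustrated with a minimal, untrained sketch in plain Python. It is not the PAL implementation: the class name `MultiTaskMLP`, the target names `AVG_PRICE` and `MIN_PRICE`, and the layer sizes are assumptions chosen for illustration only. One hidden layer is shared by all tasks, and each target column gets its own small output head.

```python
import math
import random


def dense(inputs, weights, biases):
    """Affine layer: one output value per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_out, n_in, rng):
    """Random weights and zero biases for a layer with n_in inputs and n_out outputs."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskMLP:
    """Hypothetical multi-target regressor: one shared hidden layer, one head per target."""

    def __init__(self, n_features, n_hidden, targets, seed=0):
        rng = random.Random(seed)
        # Shared layer: learns a representation common to all tasks.
        self.shared = init_layer(n_hidden, n_features, rng)
        # Task-specific heads: one linear output per target column.
        self.heads = {name: init_layer(1, n_hidden, rng) for name in targets}

    def predict(self, features):
        hidden = [math.tanh(v) for v in dense(features, *self.shared)]
        return {name: dense(hidden, *head)[0] for name, head in self.heads.items()}


model = MultiTaskMLP(n_features=4, n_hidden=8, targets=["AVG_PRICE", "MIN_PRICE"])
prediction = model.predict([0.2, -1.0, 0.5, 3.0])
print(prediction)  # one (untrained) value per target column
```

Because every head reads the same hidden representation, a training signal for one target would also update the shared layer, which is how related tasks can benefit from each other.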

The new function provides unique prediction model capabilities for scenarios such as

  • automated multi-field value proposals or pre-filling of forms
    (e.g. Sales Order Automation)
  • or predicting multiple price-/sales-targets (average, minimum, etc.)
    in a single model.

Figure: Multi-target predictions using the multi-task MLP function

For a more detailed introduction to the new algorithm, see the blog post Advancing to Multi-task Multilayer Perceptron: a new Neural Network design in HANA Machine Learning ...

 

To further improve regression models, a new outlier detection for regression function is added.

  • It detects point outliers in the data used to train linear or tree-based regression models with the MLR and HGBT regressors.
  • Outliers in the training data are identified through residual analysis and outlier-score evaluation; they can then be excluded from model training, allowing improved regression models to be built.
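
The general residual-based approach can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the PAL algorithm: it fits a simple one-variable least-squares line instead of an MLR or HGBT model, and it uses a modified z-score (based on the median absolute deviation) as a stand-in outlier score. The function names and the threshold of 3.5 are chosen for the example.

```python
from statistics import mean, median


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def residual_outliers(xs, ys, threshold=3.5):
    """Indices of points whose residual has a modified z-score above `threshold`."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    med = median(residuals)
    mad = median([abs(r - med) for r in residuals])
    if mad == 0:  # all residuals identical: nothing to flag
        return []
    scores = [0.6745 * abs(r - med) / mad for r in residuals]
    return [i for i, score in enumerate(scores) if score > threshold]


# Roughly y = 2x + 1, with one corrupted observation at index 6.
xs = list(range(10))
ys = [1.1, 2.8, 5.0, 7.15, 8.9, 11.05, 40.0, 14.95, 17.2, 18.85]

flagged = residual_outliers(xs, ys)
clean = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i not in flagged]
print(flagged)
print(fit_line(*zip(*clean)))  # refit without the flagged rows
```

Refitting on the remaining rows moves the slope back towards the underlying trend, which is the effect the outlier exclusion step is meant to achieve.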

Figure: Regression outlier detection

Text Processing

A new ML model-based text classification function is introduced. It

  • uses Random Decision Tree (RDT) models underneath,
  • delivers classification results much faster, especially when classifying large numbers of new text documents,
  • and provides significantly improved text classification accuracy.

A detailed introduction to this new function is provided in the blog post Inference Acceleration - Random Decision Tree Models for Text Classification.

Python ML client (hana-ml) enhancements

The full list of new methods and enhancements in hana_ml 2.21 is summarized in the changelog for hana-ml 2.21.240618, which is part of the documentation. The key enhancements in this release include:

Dataframe methods

  • create_dataframe_from_pandas now fully supports creating and upserting to SAP HANA Cloud tables with columns of type REAL_VECTOR, so vector embeddings prepared in a pandas DataFrame can easily be imported into SAP HANA Cloud
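
A possible upload helper might look like the sketch below. Several details are assumptions rather than confirmed behavior: that the column type can be passed through `table_structure` as the string `REAL_VECTOR(<dim>)`, that `primary_key` together with `upsert=True` is the right combination for upserting, and the helper and column names themselves, which are hypothetical. Please check the hana-ml documentation for `create_dataframe_from_pandas` before relying on it. The hana_ml import is placed inside the function so the helper that builds the column mapping can be tried without a database connection.

```python
def vector_table_structure(key_col, vector_col, dim):
    """Column-type mapping for a table with an integer key and one embedding column.

    Assumption: REAL_VECTOR columns are declared as 'REAL_VECTOR(<dim>)'.
    """
    return {key_col: "INTEGER", vector_col: f"REAL_VECTOR({dim})"}


def upload_embeddings(connection_context, pandas_df, table_name,
                      vector_col, dim, key_col="ID"):
    """Hypothetical helper: upsert a pandas DataFrame with embeddings into SAP HANA Cloud.

    `connection_context` is expected to be a hana_ml ConnectionContext and
    `pandas_df` a DataFrame whose `vector_col` holds one list of floats per row.
    """
    # Imported lazily so this module loads even where hana_ml is not installed.
    from hana_ml.dataframe import create_dataframe_from_pandas

    return create_dataframe_from_pandas(
        connection_context,
        pandas_df,
        table_name,
        table_structure=vector_table_structure(key_col, vector_col, dim),
        primary_key=key_col,  # assumption: upsert matches rows on this key
        upsert=True,
    )


structure = vector_table_structure("ID", "EMBEDDING", 3)
print(structure)
```

With a live connection, a call such as `upload_embeddings(cc, df, "DOC_EMBEDDINGS", "EMBEDDING", dim=3)` would be the intended usage, where `cc`, `df`, and the table name are placeholders.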

AutoML configuration and methods enhancements

  • Fine-tuning option for the best pipeline in the AutoML scenario
  • Pipeline explainability enhanced with SHAPGlobal surrogate, a light-weight model for faster explanation of AutoML and pipeline model prediction results
  • Visual editor support for the AutoML scenario configuration

 

Text processing

  • New text classification with model function, supporting RDT-based text classification
  • Massive Text Mining, implicit parallel analysis of multiple row vectors

 

Financial analysis methods

  • New Hull-White simulation function, a popular financial mathematics model for interest rates that is vital in modeling various financial instruments and risk
  • Benford analysis function, a popular algorithm for detecting anomalies in numerical datasets such as financial transactions
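
To make the idea behind a Benford analysis concrete: Benford's law states that in many naturally occurring datasets the leading digit d appears with probability log10(1 + 1/d), so large deviations from that distribution can hint at anomalies. The following plain-Python sketch compares observed and expected leading-digit frequencies. It is only an illustration of the principle, not the PAL or hana-ml function, and the function names are chosen for this example.

```python
import math
from collections import Counter


def leading_digit(value):
    """First significant digit of a non-zero number."""
    # Scientific notation puts the first significant digit at position 0.
    return int(f"{abs(value):.15e}"[0])


def benford_expected(digit):
    """Probability of `digit` as leading digit under Benford's law."""
    return math.log10(1 + 1 / digit)


def benford_table(values):
    """Map each digit 1..9 to (observed share, expected share)."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    total = len(digits)
    return {d: (counts[d] / total, benford_expected(d)) for d in range(1, 10)}


# Powers of two are a classic sequence that follows Benford's law closely.
table = benford_table([2 ** n for n in range(100)])
for digit, (observed, expected) in table.items():
    print(f"{digit}: observed={observed:.3f} expected={expected:.3f}")
```

For transaction data, the same table could be computed on the amounts, and digits whose observed share departs strongly from the expected share would be candidates for closer inspection.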

A Python Jupyter notebook example illustrating the highlighted feature enhancements is available here: 24QRC02_2.21.ipynb.