
Creating a Multi-input and Multi-output Model

Kangkana
Product and Topic Expert

SAP offers the Data Attribute Recommendation (DAR) service on SAP BTP, which supports multi-input, multi-output models. However, some financial companies handle extremely sensitive financial data, including personal information and transaction details.
They may be concerned about data privacy and security, and they are often subject to strict regulatory requirements on data handling and storage.
In such a scenario, where they do not want to transfer any data to another platform, they can build a comparable model in-house using the following steps.

Let's understand each stage of the code to achieve this feature.

1. Data Preparation

First, you need to prepare your data. Let's assume you have a dataset with both numerical and categorical features.
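
As a minimal sketch of this step, assuming the inputs and outputs arrive in one table, the data can be split into features and targets, and the numerical and categorical input columns identified by dtype. The name `invoices` and the reduced column set are illustrative assumptions, not part of the original setup.

```python
import pandas as pd

# Hypothetical combined table; in practice it might be loaded from a file or database
invoices = pd.DataFrame({
    'Company_code': ['A', 'B', 'A'],
    'Vendor': ['X', 'Y', 'Z'],
    'Amount': [100, 200, 150],
    'HKONT': ['Account1', 'Account2', 'Account1'],
    'KOSTL': ['Cost1', 'Cost2', 'Cost1'],
})

# Columns the model reads (inputs) and columns it should predict (outputs)
input_columns = ['Company_code', 'Vendor', 'Amount']
output_columns = ['HKONT', 'KOSTL']

X = invoices[input_columns]
y = invoices[output_columns]

# Split the inputs by dtype so each group can get its own preprocessing later
numerical_features = X.select_dtypes(include='number').columns.tolist()
categorical_features = X.select_dtypes(exclude='number').columns.tolist()
print(numerical_features, categorical_features)
```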

2. Preprocessing

You need to preprocess your data so that the model can consume it: categorical features are converted into numerical ones with an encoder such as OneHotEncoder, and numerical features are standardized with StandardScaler.

Both OneHotEncoder and StandardScaler are available in Python through the scikit-learn library.
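
To make the two transformations concrete, here is a small sketch on toy data. The values are made up for illustration, and `handle_unknown='ignore'` is an optional setting chosen here so that categories unseen during fitting are encoded as all zeros rather than raising an error.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data, for illustration only
df = pd.DataFrame({'Vendor': ['X', 'Y', 'Z'], 'Amount': [100, 200, 150]})

# One-hot encoding: one column per category, exactly one 1 per row
encoder = OneHotEncoder(handle_unknown='ignore')
vendor_encoded = encoder.fit_transform(df[['Vendor']]).toarray()

# Standardization: rescale to zero mean and unit variance
scaler = StandardScaler()
amount_scaled = scaler.fit_transform(df[['Amount']])

# A vendor that was not seen during fitting becomes a row of zeros
unseen_encoded = encoder.transform(pd.DataFrame({'Vendor': ['W']})).toarray()

print(encoder.categories_)
print(vendor_encoded)
print(amount_scaled.ravel())
print(unseen_encoded)
```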

You can install scikit-learn, together with pandas (which the code in this post also uses), using pip:

pip install scikit-learn pandas

3. Model Training

Next, you create a pipeline that includes both the preprocessing steps and the model. This ensures that the same transformations are applied during both training and inference.
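
The following sketch, on a reduced and hypothetical set of columns, shows one way to inspect such a pipeline after fitting. With MultiOutputClassifier, scikit-learn fits one copy of the base estimator per output column, and each copy keeps its own list of known labels. The `random_state=0` argument is only an assumption added to make the toy run repeatable.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Toy data with two inputs and two outputs (illustrative values)
X = pd.DataFrame({'Vendor': ['X', 'Y', 'Z', 'X'], 'Amount': [100, 200, 150, 110]})
y = pd.DataFrame({'HKONT': ['Account1', 'Account2', 'Account1', 'Account1'],
                  'KOSTL': ['Cost1', 'Cost2', 'Cost2', 'Cost1']})

pipeline = Pipeline(steps=[
    ('preprocessor', ColumnTransformer(transformers=[
        ('num', StandardScaler(), ['Amount']),
        ('cat', OneHotEncoder(handle_unknown='ignore'), ['Vendor']),
    ])),
    ('classifier', MultiOutputClassifier(DecisionTreeClassifier(random_state=0))),
])

# Fitting the pipeline fits the preprocessing and the classifier together
pipeline.fit(X, y)

# One fitted decision tree per output column
fitted = pipeline.named_steps['classifier']
for name, estimator in zip(y.columns, fitted.estimators_):
    print(name, list(estimator.classes_))
```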

4. Model Inference

For inference, you need to preprocess the input data in the same way as during training, make predictions, and then convert the predictions back to their original categorical values.
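
A small sketch of this stage, again on hypothetical toy data: because the classifiers here are trained directly on string labels, `predict` already returns the original category values, so the remaining work is mostly attaching the output field names. The confidence table built from `predict_proba` is an optional extra and not part of the original flow.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Toy training data (illustrative subset of the fields used in this post)
X_train = pd.DataFrame({'Vendor': ['X', 'Y', 'Z'], 'Amount': [100, 200, 150]})
y_train = pd.DataFrame({'HKONT': ['Account1', 'Account2', 'Account1'],
                        'KOSTL': ['Cost1', 'Cost2', 'Cost1']})

model = Pipeline(steps=[
    ('preprocessor', ColumnTransformer(transformers=[
        ('num', StandardScaler(), ['Amount']),
        ('cat', OneHotEncoder(handle_unknown='ignore'), ['Vendor']),
    ])),
    ('classifier', MultiOutputClassifier(DecisionTreeClassifier(random_state=0))),
])
model.fit(X_train, y_train)

# Raw, unprocessed rows go straight into the pipeline; 'W' is an unseen vendor
X_new = pd.DataFrame({'Vendor': ['X', 'W'], 'Amount': [120, 180]})

# predict returns one column per output, already holding the original labels
predictions = pd.DataFrame(model.predict(X_new), columns=y_train.columns)
print(predictions)

# predict_proba returns one probability array per output field
probabilities = model.predict_proba(X_new)
confidence = pd.DataFrame(
    {name: proba.max(axis=1) for name, proba in zip(y_train.columns, probabilities)}
)
print(confidence)
```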

Here is the complete code:

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.multioutput import MultiOutputClassifier
from sklearn.tree import DecisionTreeClassifier

# Example input data
X_train = pd.DataFrame({
    'Company_code': ['A', 'B', 'A'],
    'Vendor': ['X', 'Y', 'Z'],
    'Amount': [100, 200, 150],
    'InvoiceDocumenttype': ['Type1', 'Type2', 'Type1'],
    'Fiscalyear': [2021, 2022, 2021]
})

# Example output data
y_train = pd.DataFrame({
    'HKONT': ['Account1', 'Account2', 'Account1'],
    'KOSTL': ['Cost1', 'Cost2', 'Cost1'],
    'Profitcenters': ['Center1', 'Center2', 'Center1'],
    'Paymentterms': ['Term1', 'Term2', 'Term1'],
    'Partnerbanktype': ['Bank1', 'Bank2', 'Bank1'],
    'Taxcode': ['Tax1', 'Tax2', 'Tax1']
})

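# Preprocessing for numerical and categorical data
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), ['Amount', 'Fiscalyear']),
        # handle_unknown='ignore' encodes categories unseen during training as all zeros
        # instead of raising an error at inference time
        ('cat', OneHotEncoder(handle_unknown='ignore'), ['Company_code', 'Vendor', 'InvoiceDocumenttype'])
    ])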

# Multi-output model
model = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('classifier', MultiOutputClassifier(DecisionTreeClassifier()))
])

# Fit the model
model.fit(X_train, y_train)

# Example input for prediction
X_test = pd.DataFrame({
    'Company_code': ['A'],
    'Vendor': ['X'],
    'Amount': [120],
    'InvoiceDocumenttype': ['Type1'],
    'Fiscalyear': [2021]
})

# Predict
predictions = model.predict(X_test)

# Convert predictions to a readable format
fields = ['HKONT', 'KOSTL', 'Profitcenters', 'Paymentterms', 'Partnerbanktype', 'Taxcode']
prediction_dict = dict(zip(fields, predictions[0]))
print(prediction_dict)
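
Since training and inference usually run at different times in an in-house setup, the fitted pipeline may need to be stored between the two. One possible approach, sketched here as an assumption rather than as part of the original code, is to persist it with joblib (installed as a dependency of scikit-learn). The file name and the toy data are hypothetical.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Toy data (illustrative values)
X = pd.DataFrame({'Vendor': ['X', 'Y', 'Z'], 'Amount': [100, 200, 150]})
y = pd.DataFrame({'HKONT': ['Account1', 'Account2', 'Account1'],
                  'KOSTL': ['Cost1', 'Cost2', 'Cost1']})

pipeline = Pipeline(steps=[
    ('preprocessor', ColumnTransformer(transformers=[
        ('num', StandardScaler(), ['Amount']),
        ('cat', OneHotEncoder(handle_unknown='ignore'), ['Vendor']),
    ])),
    ('classifier', MultiOutputClassifier(DecisionTreeClassifier(random_state=0))),
])
pipeline.fit(X, y)

# Save the whole fitted pipeline (preprocessing + model) and load it back
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, 'multi_output_model.joblib')  # hypothetical file name
    joblib.dump(pipeline, model_path)
    restored = joblib.load(model_path)

# The restored pipeline should reproduce the original predictions
same = (restored.predict(X) == pipeline.predict(X)).all()
print(same)
```

Files written by joblib should only be loaded from trusted locations, since loading them can execute arbitrary code.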

 

 
