End-to-End Low-Code Machine Learning with Ludwig and SAP AI Core

Ludwig is a popular open-source low-code framework for building custom AI models such as LLMs and other deep neural networks. It simplifies the machine learning process with a low-code approach, enabling non-ML roles (application developers, business users, etc.) and empowering ML professionals (ML engineers, data scientists, etc.) to quickly experiment, build, train, and deploy models. In this blog post, I would like to share my experiences and findings from exploring low-code machine learning with Ludwig and its integration with SAP AI Core, using some custom machine learning models from my previous work stream. The second part of the blog post is here.

Blog post series of Low-Code Machine Learning with Ludwig and SAP AI Core
Part 1 - End-to-End Low-Code Machine Learning with Ludwig and SAP AI Core (this blog post)
Part 2 - Defect Detection with Image Segmentation and Sound-based Predictive Maintenance

What is Ludwig AI?

Below is the introduction from the official Ludwig website:
Ludwig is a low-code framework for building custom AI models like LLMs and other deep neural networks.

  • 🛠Build custom models with ease: a declarative YAML configuration file is all you need to train a state-of-the-art LLM on your data. Support for multi-task and multi-modality learning. Comprehensive config validation detects invalid parameter combinations and prevents runtime failures.
  • Optimized for scale and efficiency: automatic batch size selection, distributed training (DDP, DeepSpeed), parameter efficient fine-tuning (PEFT), 4-bit quantization (QLoRA), and larger-than-memory datasets.
  • 📐Expert level control: retain full control of your models down to the activation functions. Support for hyperparameter optimization, explainability, and rich metric visualizations.
  • 🧱 Modular and extensible: experiment with different model architectures, tasks, features, and modalities with just a few parameter changes in the config. Think building blocks for deep learning.
  • 🚢Engineered for production: prebuilt Docker containers, native support for running with Ray on Kubernetes, export models to Torchscript and Triton, upload to HuggingFace with one command.

Ludwig is hosted by the Linux Foundation AI & Data and released under Apache-2.0 license.

Here are some samples of custom ML models defined in YAML to give you a first impression; each declaratively defines an ML model architecture. Yes, with just a declarative YAML configuration file and its associated dataset, you can run the ludwig command line to experiment with, hyperparameter-optimize, train, and serve a machine learning model. Sounds too good to be true?

YatseaLi_0-1726640394890.gif

End-to-end Low-Code Machine Learning with Ludwig

Okay, seeing is believing. Let's test it out. There are some well-documented getting-started guides and examples.

I would like to replicate some custom ML models from our previous work stream with Ludwig, and compare low-code machine learning with Ludwig against pro-code machine learning with Python.

In short, I am quite amazed at the ease and efficiency of experimenting, training and serving a custom ML model with a given dataset and business objective. In this blog post, I will walk you through the end-to-end journey of low-code machine learning with Ludwig and SAP AI Core based on the first use case about book genre classification. The other two use cases about defect detection and sound-based predictive maintenance are covered in the second blog post.

Use Case#1: Book Genre Classification

The original use case from our BTP Data-to-Value workshop is about book genre clustering; however, clustering (unsupervised learning) is a known limitation and not supported in current Ludwig. So I have adapted the use case to book genre classification instead, based on a book dataset from Kaggle with a CC0: Public Domain license.

Dataset: EDA & Data Pre-Process

Let's have a glimpse at the book dataset saved as dataset.csv. The objective of the model is to predict the genre (Category) of a book based on its description (Description).

YatseaLi_0-1726664470067.png

Around 32% of the Description values and 25% of the Category values are [null]; these rows should be eliminated from the dataset during data pre-processing. This can be achieved easily with Ludwig in a declarative way, as we'll see in the next section.

In addition, the Category is a string of multiple genres separated by "<SPACE>,<SPACE>", e.g. "Poetry , General" or "General , Fiction", in which " General" and "General " are wrongly treated as two different book genres even though both semantically refer to the genre "General", degrading model performance. Therefore, we need to replace "<SPACE>,<SPACE>" with "," during data pre-processing by code.

 

import pandas as pd

# Load the dataset
df = pd.read_csv("books_dataset_kaggle.csv")

# Function to process the Category column
def process_categories(categories):
    if isinstance(categories, str):  # Check if the input is a string
        # Step 1: Strip whitespace for each category
        categories = ",".join([category.strip() for category in categories.split(",")])
    return categories  # Return the processed or original value

# Apply the function to the Category column
df["Category"] = df["Category"].apply(process_categories)

# Save the processed dataset
df.to_csv("dataset.csv", index=False)

 

As a result, "Peotry , General", "General , Fiction" etc in Catogery will be transformed into "Peotry,General", "General,Fiction", which can be easily pre-processed by separating by comma with ludwig in next step.

A declarative configuration as Config.yaml

Here is the config.yaml, which defines:

  • Description of books as the input feature of type text; the data are pre-processed by dropping a book's row if the value of the Description column is missing, and the book description text is encoded with the parallel_cnn encoder.
  • Category of books as the output feature of type set, a list of book genres separated by commas.

Please refer to Ludwig's documentation for details about the configuration.

 

input_features:
  - name: Description
    type: text
    preprocessing:
      missing_value_strategy: drop_row
    encoder:
      type: parallel_cnn

output_features:
  - name: Category
    type: set
    preprocessing:
      missing_value_strategy: drop_row
      tokenizer: comma

trainer:
  epochs: 100
  early_stop: 10

 

Model Experiment (Training, Testing and Evaluation)

Based on the config.yaml above, we can start the experiment with the command:

 

ludwig experiment --config config.yaml --dataset dataset.csv --model_name bgc_parallel_cnn

 

After around 2 hours of training on a Mac Pro (M1), the best model checkpoint has been saved as model_weights, model_hyperparameters.json and training_set_metadata.json as shown below.

YatseaLi_0-1726705109392.png
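Before deploying anywhere, you can also sanity-check the saved checkpoint locally through the Ludwig Python API. Below is a minimal sketch, assuming the results path produced by the experiment command above:

import pandas as pd
from ludwig.api import LudwigModel

# Load the best checkpoint produced by the `ludwig experiment` run above
model = LudwigModel.load("results/experiment_bgc_parallel_cnn/model")

# Predict on a tiny ad-hoc DataFrame using the same input feature name (Description)
sample = pd.DataFrame({"Description": ["A detective investigates a series of murders in a small coastal town."]})
predictions, _ = model.predict(dataset=sample)
print(predictions)  # contains the predicted Category set and the per-genre probabilities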

Review and visualize the logs and metrics of training, testing and validation with TensorBoard via the command:

 

tensorboard --logdir results/experiment_bgc_parallel_cnn/model/logs

 

YatseaLi_1-1726748245864.png

For the output feature Category, a set feature, the following metrics are calculated every epoch:

  • jaccard (the number of elements in the intersection of prediction and label divided by the number of elements in the union)
  • the loss.

Note that the available metrics vary with the type of output feature, and Ludwig also provides visualization capabilities on the metrics for analysis and comparison.
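To make the jaccard metric concrete, here is a tiny self-contained illustration of how it is computed for a single prediction (intersection over union of the predicted and true genre sets):

def jaccard(predicted: set, actual: set) -> float:
    """Jaccard index: |intersection| / |union| of the predicted and actual label sets."""
    if not predicted and not actual:
        return 1.0
    return len(predicted & actual) / len(predicted | actual)

# Two of the three predicted genres match the ground truth: 2 / 3 ≈ 0.67
print(jaccard({"General", "Fiction", "Science Fiction"}, {"Fiction", "Science Fiction"}))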

Model Serving and Inference

Once the model training or experiment is completed, we can serve the model with the ludwig serve command:

 

ludwig serve --model_path=results/experiment_bgc_parallel_cnn/model

# As a result, the inference server is up and running at default port 8000
INFO:     Started server process [22710]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

 

Now, we can test the inference API with a Python HTTP client program on one of my favorite science-fiction books, "The Three-Body Problem", which is not part of the dataset.

YatseaLi_0-1727187566860.jpeg

 

import requests

print("\n=============================================================")
print("API Test#1: Predict the genres of a book with its description")

url = "http://0.0.0.0:8000/predict"
headers = {"content-type": "application/x-www-form-urlencoded"}
book_1_descr = "Three Body Problem: A past, present, and future wherein Earth encounters an alien civilization from a nearby system of three Sun-like stars orbiting one another, a representative example of the three-body problem in orbital mechanics."

# The form field name must match the input feature name (Description)
payload = {"Description": book_1_descr}

response = requests.post(url, data=payload, headers=headers)
print(f"Book Description:\n{book_1_descr}\nPredicted Genres:\n{response.json()}")

 

The prediction output is shown below. It correctly classifies the genres of "Three Body Problem" as General (99.4%), Fiction (99.9%) and Science Fiction (78.5%).

 

{
  "Category_predictions": [
    "General",
    "Fiction",
    "Science Fiction"
  ],
  "Category_probabilities": [
    0.9944374561309814,
    0.9987589120864868,
    0.7850141525268555
  ]
}

 

As you can see, we can train and serve an ML model for book genre classification within just a few hours, given a proper dataset. The only "code" I have written is the config.yaml. Behind the scenes, based on my configuration, Ludwig does the heavy lifting for me, such as:

  • pre-processing the data: dropping the row if the Description of the book is missing, and splitting the dataset into training, test and validation sets.
  • generating the metadata of the dataset, such as the idx2str mappings for Description and Category (book genre); see the snippet after this list.
  • performing feature engineering on the input feature (book Description text) and the output feature (book Category/genre).
  • generating the model architecture and initiating the configuration for the training.
  • creating and launching a server for the inference API.
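For instance, the generated dataset metadata mentioned above can be inspected directly. A small sketch, assuming the results path from the experiment above (the exact keys inside the JSON may differ between Ludwig versions):

import json

# training_set_metadata.json is one of the files saved alongside the trained model (see above)
with open("results/experiment_bgc_parallel_cnn/model/training_set_metadata.json") as f:
    metadata = json.load(f)

# Top-level entries are keyed by feature name; the Category entry carries the genre
# vocabulary mappings such as idx2str (assumed key names, may vary by Ludwig version)
print(list(metadata.keys()))
print(list(metadata.get("Category", {}).keys()))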

Fast Iterations of Model Experiments toward Production-Ready

In a real-life ML project, it doesn't stop here. You will need to quickly iterate the experiments until reaching the desired metrics of the model candidates. The faster, the better.

  • In some cases, complicated data pre-processing with code in a data pipeline (out of Ludwig's scope) may be required, though Ludwig can help with some basic data pre-processing.
  • Experiment with different model architectures and configurations.
  • Hyperparameter optimization and tuning of model performance.
  • Train and serve in production with GPU acceleration at scale.

All of these, except complicated data pre-processing, can be achieved easily in a low-code fashion with a config YAML file and ludwig CLI commands.

For example, the choice of text encoder for Description has a big influence on model performance and is worth experimenting with. I consulted ChatGPT by asking: "I would like to build a book genre classification model, input with book description, output with book genre. My book dataset only has around 10000 records, please kindly recommend the text encoder, and if possible, provide a full config.yaml.

Here are the text encoders available in ludwig:
'embed', 'parallel_cnn', 'stacked_cnn', 'stacked_parallel_cnn', 'rnn', 'cnnrnn', 'transformer', 'albert', 'xlmroberta', 'bert', 'deberta', 'gpt', 'gpt2', 'roberta', 'transformer_xl', 'xlnet', 'distilbert', 'camembert', 't5', 'flaubert', 'electra', 'longformer', 'auto_transformer', 'tf_idf', 'llm'"

YatseaLi_0-1726727631985.png

ChatGPT recommended parallel_cnn and DistilBERT, and also responded with a config.yaml. With some small revisions to the config.yaml, I can quickly iterate the process with DistilBERT as below and compare its results with the parallel_cnn encoder, as well as with a variant where the trainable property is set to false.

 

input_features:
  - name: Description
    type: text
    preprocessing:
      missing_value_strategy: drop_row
    encoder:
      type: distilbert
      trainable: true
output_features:
  - name: Category
    type: set
    preprocessing:
      tokenizer: comma
trainer:
  epochs: 10
  learning_rate: auto

 

Hyperparameter Optimization for Model Tuning

Alongside the trainable parameter of the DistilBERT encoder, there is a huge search space for hyperparameter optimization to find the optimal parameters, resulting in optimal training efficiency and model performance. The following sample configuration searches over trainer.learning_rate, trainer.optimizer and combiner.num_fc_layers with the distilbert text encoder for Description, aiming to maximize the jaccard metric for Category.

 

input_features:
  - name: Description
    type: text
    preprocessing:
      missing_value_strategy: drop_row
    encoder:
      type: distilbert
      trainable: false

combiner:
  type: concat
  num_fc_layers: 1

output_features:
  - name: Category
    type: set
    preprocessing:
      missing_value_strategy: drop_row
      tokenizer: comma

trainer:
  learning_rate: 0.001
  optimizer:
    type: adam

hyperopt:
  goal: maximize
  output_feature: Category
  metric: jaccard
  split: validation
  parameters:
    trainer.learning_rate:
      space: loguniform
      lower: 0.0001
      upper: 0.1
    trainer.optimizer.type:
      space: choice
      categories: [adam, adamw, adagrad]
    combiner.num_fc_layers:
      space: randint
      lower: 1
      upper: 5
    # Description.encoder.trainable:
    #   space: choice
    #   categories: [True, False]

  search_alg:
    type: variant_generator

  executor:
    type: ray
    num_samples: 10
    time_budget_s: 360000
    cpu_resources_per_trial: 1
    gpu_resources_per_trial: 1

 

Then we can run the hyperparameter optimization task with the command:

 

ludwig hyperopt --config hyperopt.yaml --dataset dataset.csv --experiment_name hyperopt --model_name bgc_disbert --logging_level debug
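Once the run finishes, Ludwig writes the trial results into the output directory. Below is a hedged sketch for inspecting them from Python; the file name hyperopt_statistics.json and the keys used are assumptions based on the Ludwig user guide and may differ between versions:

import json

# Path and structure are assumptions (see above); adjust to your output directory
with open("results/hyperopt_statistics.json") as f:
    stats = json.load(f)

# Print each trial's sampled parameters together with its achieved metric score
for trial in stats.get("hyperopt_results", []):
    print(trial.get("parameters"), trial.get("metric_score"))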

 

 

Ludwig API

Alongside the ludwig CLI commands (ludwig experiment/train/serve/visualize etc.), Ludwig also provides a Python API to access the same functions programmatically. Here is an example about text classification with both the ludwig CLI and the Python API for your reference. With the Ludwig API, Ludwig can be integrated with third-party solutions or platforms.

For example, the ludwig experiment can be accessed programmatically through the experiment method of the LudwigModel class with exactly the same inputs and outputs. The Ludwig configuration is prepared as a dictionary and passed on LudwigModel initialization, and the dataset is prepared as a pandas.DataFrame and passed to LudwigModel.experiment(), as in the sketch below.
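Here is a minimal sketch of that API-based equivalent of the earlier CLI experiment; the config dictionary mirrors the config.yaml used above:

import logging
import pandas as pd
from ludwig.api import LudwigModel

# Same configuration as config.yaml above, expressed as a Python dictionary
config = {
    "input_features": [{
        "name": "Description",
        "type": "text",
        "preprocessing": {"missing_value_strategy": "drop_row"},
        "encoder": {"type": "parallel_cnn"},
    }],
    "output_features": [{
        "name": "Category",
        "type": "set",
        "preprocessing": {"missing_value_strategy": "drop_row", "tokenizer": "comma"},
    }],
    "trainer": {"epochs": 100, "early_stop": 10},
}

model = LudwigModel(config, logging_level=logging.INFO)

# Programmatic equivalent of `ludwig experiment --config config.yaml --dataset dataset.csv`;
# returns the evaluation/training statistics and the output directory
results = model.experiment(dataset=pd.read_csv("dataset.csv"))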

Support of Cloud Storage

Ludwig provides out-of-the-box support for reading and writing the dataset or model to cloud object storage systems like Amazon S3, Azure Blob Storage, and Google Cloud Storage. For example, the ludwig commands can work with a config file and dataset stored in Amazon S3. Please refer to this for more detail.
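As a rough, unverified sketch, the same should apply when using the Python API, assuming it accepts remote s3:// paths the same way the CLI does; credentials are read from the standard environment variables (bucket and key names below are placeholders):

import os
from ludwig.api import LudwigModel

# Credentials are picked up from the standard AWS environment variables (placeholders)
os.environ["AWS_ACCESS_KEY_ID"] = "<YOUR_AWS_ACCESS_KEY_ID>"
os.environ["AWS_SECRET_ACCESS_KEY"] = "<YOUR_AWS_SECRET_ACCESS_KEY>"

# Remote s3:// paths used in place of local ones (bucket and paths are placeholders)
model = LudwigModel(config="s3://<YOUR-S3-BUCKET>/book-genre-classification/config/config_parallel_cnn.yaml")
model.experiment(
    dataset="s3://<YOUR-S3-BUCKET>/book-genre-classification/data/dataset.csv",
    output_directory="s3://<YOUR-S3-BUCKET>/book-genre-classification/results/",
)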

 

Integrate Ludwig with SAP

In this section, we will explore the integration possibilities of Ludwig with SAP products for low-code machine learning, such as using SAP HANA Cloud or SAP Datasphere as the data source for model training with Ludwig, and integrating and running Ludwig's AI workloads on SAP AI Core.

Integrate SAP HANA Cloud or SAP Datasphere with Ludwig

As mentioned above, a Ludwig experiment accesses its dataset (training/test/validation) as an instance of pandas.DataFrame. In an SAP customer's system landscape, SAP HANA Cloud or SAP Datasphere is often used for central data management. With the hana_ml Python library and hana_ml.dataframe, we can query and retrieve data from SAP HANA Cloud or SAP Datasphere and feed it into Ludwig model training workflows without needing to extract and move the data outside of SAP.

In our original use case, the books data of the bookshop solution are stored in SAP Datasphere. Here is some sample code for retrieving the books from SAP Datasphere and using them in a Ludwig model experiment.

 

import hana_ml
from hana_ml import dataframe
import logging
from ludwig.api import LudwigModel

conn = hana_ml.dataframe.ConnectionContext(
    '<YOUR_HANA_CLOUD_HOST>',
    '<YOUR_HANA_CLOUD_PORT>',
    '<YOUR_HANA_CLOUD_USER>',
    '<YOUR_HANA_CLOUD_PASSWORD>',
    encrypt='true',
    sslValidateCertificate='false')

# Assume that the books data are in table Books of schema XXXX in SAP HANA Cloud or SAP Datasphere
df_hana = (conn.table('Books', schema='XXXX'))
# Convert hana_ml.dataframe to pandas.DataFrame
df_books = df_hana.select(['Description', 'Category']).collect()

# Prepare the ludwig config dict
config = {
  "input_features": [
    {
      "name": "Description",      # The name of the input column
      "type": "text",             # Data type of the input column
      "encoder": {
        "type": "parallel_cnn",   # The model architecture
      }                          
    }
  ],
  "output_features": [
    {
      "name": "Category",
      "type": "set",
    }
  ]
}

# Constructs Ludwig model from config in dictionary
model = LudwigModel(config, logging_level=logging.INFO)

# Trains the model with dataset retrieved from SAP HANA Cloud or SAP Datasphere above. 
model.experiment(dataset=df_books)

 

Integrate ludwig with SAP AI Core for low-code machine learning

SAP AI Core is the cornerstone of SAP Business AI for AI workloads, designed to handle the execution and operations of your AI assets in a standardized, scalable, and hyperscaler-agnostic way. Thanks to its openness, we can integrate Ludwig with SAP AI Core and run Ludwig's AI workloads on SAP AI Core.

Business Values

The business values of combining low-code machine learning with ludwig and SAP AI Core:

  • Accelerated innovation with enterprise AI: Ludwig simplifies the machine learning process with a low-code approach, enabling business users and data scientists to quickly experiment, build, train, and deploy models. With SAP AI Core, these machine learning operations can be streamlined and managed at scale, fostering faster AI-driven innovation across the enterprise.

  • Improved Accessibility and Productivity: Ludwig’s no-code/low-code approach makes machine learning more accessible to non-technical users. When paired with SAP AI Core, this enables broader adoption of AI across teams, empowering citizen data scientists and business users to create ML solutions without deep coding expertise.

  • Enterprise Readiness: With SAP AI Core, we can make Ludwig enterprise-ready with enterprise-grade scalability, security, and compliance.
  • Seamless Integration of Custom Intelligence into Business Processes: With SAP AI Core, models built with Ludwig can be easily integrated with SAP solutions, such as SAP S/4HANA Cloud, other cloud solutions, or even your custom application on SAP BTP.

Solution Architecture

01-solution-architecture.jpg

Technical Detail

As indicated in the solution architecture, we'll have one workflow template named ludwig-cli for running the ludwig commands (ludwig experiment/train/hyperopt etc.), and one serving template named ludwig-serve for serving the models. Let's have a look at the technical details.

Prerequisites

If you would like to try it out hands-on, please go through the prerequisites below.

  • If you are new to SAP AI Core, please get yourself familiar with it first by following this tutorial.
  • You will need a GitHub account, a Docker account, and an AWS (or Azure/GCS etc.) account for cloud object storage as described in this tutorial.
Step 1: Adapted serving code for SAP AI Core

In the model serving and inference section above, the default endpoints look like http://0.0.0.0:8000/predict. In the ludwig-serve serving template, we'll launch the inference server through an adapted ai_core_serve.py, created with the steps below:

  • Copy serve.py from the Ludwig GitHub repo and save it as ai_core_serve.py (to be used in the Docker image)
  • Simply change the endpoints (illustrated in the sketch after this list): 
    • @app.post("/predict") => @app.post("/v1/predict")
    • @app.post("/batch_predict") => @app.post("/v1/batch_predict")
Step 2: Build and Push the custom Docker Image

We'll build a custom Docker image based on the official ludwig-ray-gpu:master image, adapted for SAP AI Core, which will be used in both the ludwig-cli workflow template and the ludwig-serve serving template. Here is the Dockerfile for the ludwig-ray-gpu:ai-core image. Please ensure the ai_core_serve.py file is placed in the same folder as the Dockerfile below when building the image.

 

 

# Specify the base layers (default dependencies) to use
ARG BASE_IMAGE=ludwigai/ludwig-ray-gpu:master
FROM ${BASE_IMAGE}

RUN sudo apt-get install -y openmpi-bin

# Update and install dependencies of cloud storages: https://ludwig.ai/latest/user_guide/cloud_storage/
# s3fs 0.4.8 is already installed in the base image, so no further installation is required for using AWS S3.
# To use Azure Blob Storage, please install adlfs with a compatible version of fsspec[http]<=2023.10.0.
# To use Google Cloud Storage, please install gcsfs with a compatible version of fsspec[http]<=2023.10.0.
RUN pip install --upgrade --no-cache-dir pip
RUN pip install --no-cache-dir "s3fs>=2023.6.0, <=2023.10.0" "aiobotocore>=2.5.4,<2.6.0" "botocore==1.31.17" "awscli==1.29.17" "boto3==1.28.17" "mpi4py==3.1.5"
RUN pip install --upgrade --no-cache-dir transformers bitsandbytes

# Create directory for user nobody in SAP AI Core run-time
RUN sudo mkdir -p /nonexistent/ludwig/results && \
    sudo mkdir -p /nonexistent/ludwig/data && \
    sudo mkdir -p /nonexistent/matplotlib-conf && \
    sudo mkdir -p /nonexistent/hf-home && \
    sudo mkdir -p /home/ray/ray_results && \
    sudo mkdir -p /home/ray/.triton && \
    sudo mkdir -p /home/ray/.cache && \
    sudo chown -R nobody:nogroup /nonexistent /home/ray/ray_results /home/ray/.triton /home/ray/.cache && \
    sudo chmod -R 770 /nonexistent /home/ray/ray_results /home/ray/.triton /home/ray/.cache

# Copy adapted ai_core_serve.py for SAP AI Core serving
COPY ai_core_serve.py /nonexistent/ludwig/ai_core_serve.py
ENV HF_HOME=/nonexistent/hf-home
ENV MPLCONFIGDIR=/nonexistent/matplotlib-conf

 

 

Build and push your own docker image to docker hub with commands below:

 

# 1.Login to docker hub
docker login -u <YOUR_DOCKER_USER> -p <YOUR_DOCKER_ACCESS_TOKEN>

# 2.Build the docker image
docker build --platform=linux/amd64 -t docker.io/<YOUR_DOCKER_USER>/ludwig-ray-gpu:ai-core .

# 3.Push the docker image to docker hub
docker push docker.io/<YOUR_DOCKER_USER>/ludwig-ray-gpu:ai-core 

 

Once the Docker image is pushed, note down the Docker image URI docker.io/<YOUR_DOCKER_USER>/ludwig-ray-gpu:ai-core, which will be used when creating the configurations in SAP AI Core later.

Step 3: Dataset and Model Management in Cloud Object Storage

As mentioned above, Ludwig can access cloud storage for the dataset and model directly; therefore, it is not required to register Dataset and Model artifacts in SAP AI Core. For the book-genre-classification use case, I have created an S3 bucket named ludwig-custom-models with the folder structure below:

 

 

ludwig-custom-models (s3 bucket)
    /book-genre-classification
        /config
            -config_parallel_cnn.yaml
            -config_distilbert.yaml           
        /data
            -dataset.csv
        /results

 

Step 4: Create a GitHub repo for hosting the templates below.
1). ludwig-cli Workflow Template

Here is the yaml file ludwig-cli-template.yaml for the general workflow of running ludwig commands with direct access to cloud object storage. Please replace the placeholder <REPLACE_WITH_YOUR_DOCKER_SECRET>.

 

apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: ludwig-cli
  annotations:
    scenarios.ai.sap.com/description: "General ludwig cli workflow with direct access to cloud storage"
    scenarios.ai.sap.com/name: "ludwig-cli"
    executables.ai.sap.com/description: "ludwig cli"
    executables.ai.sap.com/name: "ludwig-cli"
  labels:
    scenarios.ai.sap.com/id: "ludwig-cli"
    executables.ai.sap.com/id: "ludwig-cli"
    ai.sap.com/version: "1.0.0"
spec:
  imagePullSecrets:
    - name: <REPLACE_WITH_YOUR_DOCKER_SECRET>
  entrypoint: ludwig-cli
  arguments:
    parameters:
      - name: image
        value: "docker.io/ludwigai/ludwig-ray-gpu:master"
        description: "Your custom docker image for ludwig."
      - name: exportCommand
        value: "AWS_ACCESS_KEY_ID=<YOUR_AWS_ACCESS_KEY_ID> AWS_SECRET_ACCESS_KEY=<YOUR_AWS_SECRET_ACCESS_KEY>"
        description: "Setup environment viable(s) with export command: export <exportCommand>. For example, environment viable for cloud storage: https://ludwig.ai/latest/user_guide/cloud_storage/"
      - name: ludwigCommand
        value: "experiment --config s3://<YOUR-S3-BUCKET/PATH/TO/YOUR_CONFIG>.yaml --dataset s3://<YOUR-S3-BUCKET/PATH/TO/YOUR_DATASET>.csv --experiment_name <YOUR_EXPERIMENT_NAME> --model_name <YOUR_MODEL_NAME> --logging_level info"
        description: "Setup environment viable(s) with export command: export <exportCommand>. For example, environment viable for cloud storage: https://ludwig.ai/latest/user_guide/cloud_storage/, or environment viable HF_TOKEN for hugging face if you need to finetune a gated model from hugging face"
      - name: resourcePlan
        value: "starter"
        description: "Resource Plan of SAP AI Core. More detail available here: https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/choose-resource-plan-c58d4e584a5b40a2992265beb9b6be3c"
  templates:
    - name: ludwig-cli
      metadata:
        labels:
          ai.sap.com/resourcePlan: "{{workflow.parameters.resourcePlan}}"
      container:
        image: "{{workflow.parameters.image}}"
        imagePullPolicy: Always
        command: ["/bin/sh", "-c"]
        args:
          - >
            set -e && 
            echo "---Start Ludwig Cli Command---" && 
            export {{workflow.parameters.exportCommand}} &&
            ludwig {{workflow.parameters.ludwigCommand}} &&
            echo "---End Ludwig Cli Command---"

 

2). ludwig-serve Serving Template

Option 1: ludwig serve with direct model access in cloud storage, with no model artifact registration in SAP AI Core. This option is used in the rest of the steps. Please replace the placeholder <REPLACE_WITH_YOUR_DOCKER_SECRET>.

 

apiVersion: ai.sap.com/v1alpha1
kind: ServingTemplate
metadata:
  name: ludwig-serve
  annotations:
    scenarios.ai.sap.com/description: "ludwig serve on SAP AI Core"
    scenarios.ai.sap.com/name: "ludwig-serve"
    executables.ai.sap.com/description: "ludwig serve on SAP AI Core"
    executables.ai.sap.com/name: "ludwig-serve"
  labels:
    scenarios.ai.sap.com/id: "ludwig-serve"
    ai.sap.com/version: "0.0.1"
spec:
  inputs:
    parameters:
    - name: image
      type: "string"
      default: "docker.io/ludwigai/ludwig-ray-gpu:master"
      description: "Define the url of the Docker image of which you have built for ludwig."
    - name: exportCommand
      type: "string"      
      default: "AWS_ACCESS_KEY_ID=<YOUR_AWS_ACCESS_KEY_ID> AWS_SECRET_ACCESS_KEY=<YOUR_AWS_SECRET_ACCESS_KEY>"
      description: "Setup environment viable(s) with export command: export <exportCommand>. For example, environment viable for cloud storage: https://ludwig.ai/latest/user_guide/cloud_storage/"
    - name: modelName
      type: "string"
      default: "my-finetune-model"
      description: "Only relevant for llm finetune models compatible with SAP Generative AI Hub SDK"
    - name: modelPath
      type: "string"
      default: "s3://<YOUR-S3-BUCKET/PATH/TO/YOUR_EXPERIMENT_RUN/model"
      description: "The target model path. Local path as /mnt/models or model path in cloud storage. https://ludwig.ai/latest/user_guide/cloud_storage/"
    - name: loggingLevel
      type: "string"
      default: "info"
      description: "logging level to be used to serve with ludwig serve --logging_level"
    - name: resourcePlan
      type: "string"
      default: "starter"
      description: "Resource plans are used to select resources in workflow and serving templates."
    - name: minReplicas
      type: "string"
      default: "1"
      description: "The lower limit for the number of replicas to which the autoscaler can scale down."
    - name: maxReplicas
      type: "string"
      default: "1"
      description: "The upper limit for the number of replicas to which the autoscaler can scale down."
  template:
    apiVersion: "serving.kserve.io/v1beta1"
    metadata:
      annotations: |
        autoscaling.knative.dev/metric: concurrency
        autoscaling.knative.dev/target: 1
        autoscaling.knative.dev/targetBurstCapacity: -1
        autoscaling.knative.dev/window: "10m"
        autoscaling.knative.dev/scaleToZeroPodRetentionPeriod: "10m"
      labels: |
        ai.sap.com/resourcePlan: "{{inputs.parameters.resourcePlan}}"
    spec: |
      predictor:
        imagePullSecrets:
        - name: <REPLACE_WITH_YOUR_DOCKER_SECRET>
        minReplicas: {{inputs.parameters.minReplicas}}
        maxReplicas: {{inputs.parameters.maxReplicas}}
        containers:
        - name: kserve-container
          image: "{{inputs.parameters.image}}"
          ports:
            - containerPort: 8000
              protocol: TCP
          command: ["/bin/sh", "-c"]
          args:
            - >
              set -e && 
              echo "-------------Starting ludwig serve--------------" &&
              export {{inputs.parameters.exportCommand}} &&
              python /nonexistent/ludwig/ai_core_serve.py
              --model_path={{inputs.parameters.modelPath}}
              --logging_level={{inputs.parameters.loggingLevel}}

 

Option 2: Model artifact registration in SAP AI Core.
You need to register the model artifact manually, as no model output artifact is defined in the training with the ludwig-cli template.
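Here is a hedged sketch of registering the model artifact programmatically with the AI API client SDK; the artifact name, scenario id and the ai:// path (which assumes an object-store secret named "default" pointing to the S3 bucket above) are placeholders, and you can equally register the artifact through SAP AI Launchpad:

from ai_api_client_sdk.ai_api_v2_client import AIAPIV2Client
from ai_api_client_sdk.models.artifact import Artifact

# Client configuration comes from the service key of SAP AI Core (placeholders below)
ai_api_client = AIAPIV2Client(
    base_url="<AI_API_URL_FROM_SERVICE_KEY>/v2/lm",
    auth_url="<URL_FROM_SERVICE_KEY>/oauth/token",
    client_id="<CLIENT_ID>",
    client_secret="<CLIENT_SECRET>",
    resource_group="default")

# Register the trained model folder in the object store as a MODEL artifact,
# so it can be passed to the serving template as the customModel input artifact
artifact = ai_api_client.artifact.create(
    name="bgc-parallel-cnn-model",
    kind=Artifact.Kind.MODEL,
    url="ai://default/book-genre-classification/results/parallel_cnn_bgc/model",
    scenario_id="ludwig-serve",
    description="Ludwig book genre classification model")
print(artifact.id)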

 

apiVersion: ai.sap.com/v1alpha1
kind: ServingTemplate
metadata:
  name: ludwig-serve
  annotations:
    scenarios.ai.sap.com/description: "ludwig serve on SAP AI Core"
    scenarios.ai.sap.com/name: "ludwig-serve"
    executables.ai.sap.com/description: "ludwig serve on SAP AI Core"
    executables.ai.sap.com/name: "ludwig-serve"
  labels:
    scenarios.ai.sap.com/id: "ludwig-serve"
    ai.sap.com/version: "0.0.1"
spec:
  inputs:
    artifacts:
      - name: customModel
    parameters:
    - name: image
      type: "string"
      default: "docker.io/ludwigai/ludwig-ray-gpu:master"
      description: "Define the url of the Docker image of which you have built for ludwig."
    - name: exportCommand
      type: "string"      
      default: "AWS_ACCESS_KEY_ID=<YOUR_AWS_ACCESS_KEY_ID> AWS_SECRET_ACCESS_KEY=<YOUR_AWS_SECRET_ACCESS_KEY>"
      description: "Setup environment viable(s) with export command: export <exportCommand>. For example, environment viable for cloud storage: https://ludwig.ai/latest/user_guide/cloud_storage/"
    - name: modelName
      type: "string"
      default: "my-finetune-model"
      description: "Only relevant for llm finetune models compatible with SAP Generative AI Hub SDK"
    - name: modelPath
      type: "string"
      default: "/mnt/models"
      description: "The target model path. Local path as /mnt/models or model path in cloud storage. https://ludwig.ai/latest/user_guide/cloud_storage/"
    - name: loggingLevel
      type: "string"
      default: "info"
      description: "logging level to be used to serve with ludwig serve --logging_level"
    - name: resourcePlan
      type: "string"
      default: "starter"
      description: "Resource plans are used to select resources in workflow and serving templates."
    - name: minReplicas
      type: "string"
      default: "1"
      description: "The lower limit for the number of replicas to which the autoscaler can scale down."
    - name: maxReplicas
      type: "string"
      default: "1"
      description: "The upper limit for the number of replicas to which the autoscaler can scale down."
  template:
    apiVersion: "serving.kserve.io/v1beta1"
    metadata:
      annotations: |
        autoscaling.knative.dev/metric: concurrency
        autoscaling.knative.dev/target: 1
        autoscaling.knative.dev/targetBurstCapacity: -1
        autoscaling.knative.dev/window: "10m"
        autoscaling.knative.dev/scaleToZeroPodRetentionPeriod: "10m"
      labels: |
        ai.sap.com/resourcePlan: "{{inputs.parameters.resourcePlan}}"
    spec: |
      predictor:
        imagePullSecrets:
        - name: <REPLACE_WITH_YOUR_DOCKER_SECRET>
        minReplicas: {{inputs.parameters.minReplicas}}
        maxReplicas: {{inputs.parameters.maxReplicas}}
        containers:
        - name: kserve-container
          image: "{{inputs.parameters.image}}"
          ports:
            - containerPort: 8000
              protocol: TCP
          command: ["/bin/sh", "-c"]
          args:
            - >
              set -e && 
              echo "-------------Starting ludwig serve--------------" &&
              export {{inputs.parameters.exportCommand}} &&
              python /nonexistent/ludwig/ai_core_serve.py
              --model_path={{inputs.parameters.modelPath}}
              --logging_level={{inputs.parameters.loggingLevel}}
          env:
            - name: STORAGE_URI
              value: "{{inputs.artifacts.customModel}}"

 

Step 5: Onboard your github repo to your SAP AI Core

Please follow this document to add a git repo through SAP AI Launchpad.

Step 6: Create an Application and Synchronize with github repo

Please follow this document to create an application through SAP AI Launchpad.

Step 7: Create a configuration for the model experiment (Training, Testing and Evaluation)

10-ludwig-exp-config.jpg

resourcePlan: If you are working with the free plan of SAP AI Core, the only option is starter, which will take around 20 hours to complete the experiment. Otherwise, you may choose infer.s or train.l with GPU acceleration, which takes around 30 minutes.

ludwigCommand: experiment --config s3://ludwig-custom-models/book-genre-classification/config/config_parallel_cnn.yaml --dataset s3://ludwig-custom-models/book-genre-classification/data/dataset.csv --experiment_name parallel_cnn --model_name bgc --output_directory s3://ludwig-custom-models/book-genre-classification/results/

Please follow this document for more detail about creating a configuration for the ludwig-cli scenario through SAP AI Launchpad.

Step 8: Create an execution

Create an execution for the configuration from step 7 by clicking the "Create Execution" button in the last screenshot. Then you can check the logs of the execution. Please follow this document to create an execution through SAP AI Launchpad.
10-ludwig-exp-training-start.jpg

Once the execution is completed, it will produce the results in s3://ludwig-custom-models/book-genre-classification/results/parallel_cnn_bgc/model/ as in the screenshot below. The three files highlighted in red will be used to serve the inference API in the next steps.
10-ludwig-exp-output.jpg

Step 9: Create a configuration of ludwig-serve serving template

20-ludwig-serve-config.jpg
resourcePlan: As it is a small model, starter is enough for testing.
modelPath: Please make sure modelPath points to the model output path produced by the execution above (s3://ludwig-custom-models/book-genre-classification/results/parallel_cnn_bgc/model).

Step 10: Create a deployment for serving

Create a deployment for the configuration from step 9 by clicking the "Create Deployment" button in the last screenshot, and note down the deployment id to be used later. Then you can check the logs of the deployment. Please follow this document to create a deployment through SAP AI Launchpad. At the end, the deployment is running, and the logs indicate that the inference server has started.
21-ludwig-serve-deployment.jpg

Step 11: Consume the inference API of model

Finally, we can consume the inference API from step 10. Here is the sample code.

 

import requests, json
from ai_api_client_sdk.ai_api_v2_client import AIAPIV2Client

# Please replace the resource_group if it is not default
resource_group="default" 

# The following configuration comes from the service key of SAP AI Core
ai_api_client = AIAPIV2Client(
    base_url= "<AI_API_URL_FROM_SERVICE_KEY>/v2/lm",
    auth_url= "<URL_FROM_SERVICE_KEY>/oauth/token",
    client_id= "<CLIENT_ID>",
    client_secret="<CLIENT_SECRET>",
    resource_group=resource_group)

token = ai_api_client.rest_client.get_token()
headers = {
        "Authorization": token,
        'ai-resource-group': resource_group,
        "Content-Type": "application/json"}

# prepare the inference base url
deployment_id = "<REPLACE_WITH_YOUR_DEPLOYMENT_ID>"
deployment = ai_api_client.deployment.get(deployment_id)
inference_base_url = f"{deployment.deployment_url}/v1"
headers["Content-Type"] = "application/x-www-form-urlencoded"

predict_endpoint = f"{inference_base_url}/predict"

# predict the book genre with its description
json_data = {
    "Description": "Three Body Problem: A past, present, and future wherein Earth encounters an alien civilization from a nearby system of three Sun-like stars orbiting one another, a representative example of the three-body problem in orbital mechanics."
}

response = requests.post(predict_endpoint, headers=headers, data=json_data)
print('Result:', response.text)

 

Conclusion

As we have seen, Ludwig simplifies the machine learning process with a low-code approach, enabling business users and data scientists to quickly experiment, build, train, and deploy models. With the integration with SAP AI Core, these machine learning operations can be streamlined and managed at scale, fostering faster AI-driven innovation across the enterprise with enterprise-grade security, scalability and compliance. In the second blog post, we'll replicate the use cases of defect detection with image segmentation and sound-based predictive maintenance with Ludwig's low-code approach, and compare them with their original pro-code implementations.

P.S. The publication of the full sample code under github.com/sap-samples is work in progress. Please stay tuned.