Late last year, we introduced SAP AI Core & SAP AI Launchpad, which were both made generally available via the SAP Business Technology Platform (aka BTP). With these 2 products you can productise and operate AI models that natively integrate with SAP applications.
Specifically, SAP AI Core is the service with which you can train, deploy, and monitor ML models. SAP AI Launchpad, on the other hand, is a Software as a Service application that provides a visual interface to manage AI scenarios, whether they exist in SAP AI Core or not. These AI scenarios could exist in SAP solutions (like AI Business Services) or outside the SAP ecosystem.
This blog summarises the end-to-end ML lifecycle process and is intended as a visual introduction to ML Operations (ML Ops) in SAP AI Core & SAP AI Launchpad. If you’d like step-by-step tutorials for simple to advanced use cases, please follow the tutorials provided on our developers portal here.
From configure to manage - what to expect
For an ML Ops engineer, the end-to-end process of setting up your models looks something like the flow below.
Figure 1: Configuring to managing - the end to end process flow
Before we jump into what the actual process looks like, it is pertinent to understand a few terms here. SAP AI Core connects to 3 external systems.
GitHub, a Microsoft product, is used to store and collaborate on code for your AI workflows. You can think of a workflow as a sequence of steps to be executed, for example: read the data, train the model, and save the model to the cloud. These instructions are stored as a YAML file, which serves as a configuration file for SAP AI Core.
Docker is used to maintain the latest ML code and its dependencies.
Lastly, SAP AI Core provides only ephemeral (short-lived) storage while training a model or running inference. Data, models, and any other collateral required at run time are stored in cloud storage such as an Amazon S3 bucket.
That said, the rest of this blog progresses in short 30-60 second videos of what each step entails. Note: There are many ways of completing these steps. For the sake of brevity, this blog largely covers the SAP AI Launchpad way of doing things. Alternative approaches include using the AI APIs or the AI Core SDK to interact with SAP AI Core, both of which are described in detail in the tutorials mentioned above.
To start using SAP AI Core, you will need a BTP account. To start your free tier subscription or free trial account look here.
You can follow the steps here to provision an SAP AI Core instance on BTP. There are video tutorials for the onboarding process here.
In this blog, as in our tutorials, we reference the California Housing dataset. We use a combination of the features below to predict the median house value for California districts, expressed in units of $100,000. Below is a sample preview of what the dataset looks like.
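For orientation, the sketch below lists the column names used by the scikit-learn copy of the California Housing dataset and converts a target value into dollars. That the tutorials use exactly this copy and these column names is an assumption, and the helper `to_dollars` is hypothetical.

```python
# Column names as published in scikit-learn's copy of the dataset
# (assumption: the tutorial data uses the same columns).
FEATURES = [
    "MedInc", "HouseAge", "AveRooms", "AveBedrms",
    "Population", "AveOccup", "Latitude", "Longitude",
]
TARGET = "MedHouseVal"  # median house value, in units of $100,000


def to_dollars(median_house_value: float) -> int:
    """Convert a target value (units of $100,000) into whole dollars."""
    return round(median_house_value * 100_000)


print(to_dollars(2.5))  # 250000
```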
Step 1.1 Get Service Keys
We start by retrieving the service keys for SAP AI Core, which can be accessed from the BTP cockpit.
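If you take the AI API route instead of SAP AI Launchpad, the service key is what you use to obtain an access token. The sketch below only builds the token request and does not send it. The key fields shown (`clientid`, `clientsecret`, `url`, `serviceurls.AI_API_URL`) and the `/oauth/token` endpoint reflect my understanding of a typical service key and should be checked against your own key; all values are placeholders.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumed shape of a service key; every value here is a placeholder.
SERVICE_KEY_JSON = """
{
  "clientid": "my-client-id",
  "clientsecret": "my-client-secret",
  "url": "https://example.authentication.eu10.hana.ondemand.com",
  "serviceurls": {"AI_API_URL": "https://api.ai.example.com"}
}
"""


def build_token_request(service_key: dict) -> urllib.request.Request:
    """Build (but do not send) an OAuth client-credentials token request."""
    credentials = f"{service_key['clientid']}:{service_key['clientsecret']}"
    basic = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    body = urllib.parse.urlencode({"grant_type": "client_credentials"})
    return urllib.request.Request(
        service_key["url"].rstrip("/") + "/oauth/token",
        data=body.encode("ascii"),
        headers={"Authorization": f"Basic {basic}"},
        method="POST",
    )


service_key = json.loads(SERVICE_KEY_JSON)
token_request = build_token_request(service_key)
ai_api_url = service_key["serviceurls"]["AI_API_URL"]  # base URL for later API calls
print(token_request.get_method(), token_request.full_url)
```

Sending the request with `urllib.request.urlopen(token_request)` would return a JSON document whose `access_token` is then passed as a Bearer token on subsequent calls.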
Step 1.2 Create workspace
A workspace is always tied to a dedicated resource group. In this case we use the default resource group. SAP AI Core uses resource groups to isolate ML resources & workloads.
Step 1.3 Connect to GitHub
To ensure SAP AI Core syncs with the workflows in GitHub, you first need to register your GitHub repo in SAP AI Core. This means entering the GitHub repository URL together with your Git credentials (username and access token). Subsequently, you create an application in SAP AI Core that uses these credentials and syncs the YAML files in the GitHub path to SAP AI Core. Currently, the sync runs every 3 minutes.
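For readers using the AI APIs rather than SAP AI Launchpad, the two registrations above boil down to two JSON request bodies. The sketch below builds them. The field names and the endpoint paths mentioned in the comments are my understanding of the AI API and should be treated as assumptions; the repository URL, user, and application name are placeholders.

```python
import json


def repository_payload(repo_url: str, username: str, access_token: str) -> dict:
    """Body for registering a Git repository (assumed: POST /v2/admin/repositories)."""
    return {"url": repo_url, "username": username, "password": access_token}


def application_payload(app_name: str, repo_url: str, path: str,
                        revision: str = "HEAD") -> dict:
    """Body for creating the syncing application (assumed: POST /v2/admin/applications)."""
    return {
        "applicationName": app_name,
        "repositoryUrl": repo_url,
        "revision": revision,  # branch or commit to sync
        "path": path,          # folder in the repo that holds the YAML workflows
    }


repo_url = "https://github.com/example-user/aicore-workflows"  # placeholder
repo_body = repository_payload(repo_url, "example-user", "<access-token>")
app_body = application_payload("house-price-app", repo_url, "workflows")
print(json.dumps(app_body, indent=2))
```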
Step 1.4 Connect to Docker
First, you create an access token in your Docker repository. Then you add the credentials that SAP AI Core needs to pull the Docker image from your chosen repository.
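When done through the AI APIs, the Docker credentials are wrapped in a Docker config document and stored as a registry secret. The following is a minimal sketch of that body. The `.dockerconfigjson` layout is the standard Docker config format; the outer field names, the endpoint named in the comment, and all values are assumptions or placeholders.

```python
import json


def docker_registry_secret_payload(name: str, registry: str,
                                   username: str, access_token: str) -> dict:
    """Body for a Docker registry secret (assumed: POST /v2/admin/dockerRegistrySecrets)."""
    docker_config = {
        "auths": {registry: {"username": username, "password": access_token}}
    }
    # The Docker config is embedded as a JSON string, not as a nested object.
    return {"name": name, "data": {".dockerconfigjson": json.dumps(docker_config)}}


secret_body = docker_registry_secret_payload(
    name="my-docker-secret",             # referenced later by the workflow
    registry="https://index.docker.io",  # placeholder registry URL
    username="example-user",
    access_token="<access-token>",
)
print(json.dumps(secret_body, indent=2))
```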
Step 1.5 Connect to S3
In this demo we connect to S3, but you can also connect to other cloud storage systems. We create an object store secret with the details of our S3 bucket. This step is currently not supported in SAP AI Launchpad, so we use the AI APIs to demonstrate it. Note: Currently, SAP AI Core writes only to the object store registered under the name default, but it can read from an object store registered under any name. You can use default for both reading and writing.
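As a rough illustration of what the AI API call carries, the sketch below assembles an object store secret body and the resource group header. Field names, the header name, and the endpoint in the comment are my understanding of the AI API and are assumptions here; bucket, region, endpoint, and credentials are placeholders.

```python
def object_store_secret_payload(bucket: str, region: str, endpoint: str,
                                path_prefix: str, access_key_id: str,
                                secret_access_key: str,
                                name: str = "default") -> dict:
    """Body for an object store secret (assumed: POST /v2/admin/objectStoreSecrets)."""
    return {
        "name": name,                # "default" so it can be used for writing too
        "type": "S3",
        "bucket": bucket,
        "endpoint": endpoint,
        "region": region,
        "pathPrefix": path_prefix,   # folder in the bucket that acts as the root
        "data": {
            "AWS_ACCESS_KEY_ID": access_key_id,
            "AWS_SECRET_ACCESS_KEY": secret_access_key,
        },
    }


# The secret is created inside a resource group, selected via a request header.
headers = {"AI-Resource-Group": "default"}
s3_body = object_store_secret_payload(
    bucket="example-bucket",
    region="eu-central-1",
    endpoint="s3.eu-central-1.amazonaws.com",
    path_prefix="house-price",
    access_key_id="<access-key-id>",
    secret_access_key="<secret-access-key>",
)
print(s3_body["name"], s3_body["bucket"], s3_body["pathPrefix"])
```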
The process of training and serving is nearly identical, with some slight edits.
Step 2.1 Register training artifact
In this step we register the training dataset with SAP AI Core. The data required for training is already available in an S3 bucket. Notice that the url field references the S3 bucket registered previously through the object store secret (Step 1.5). Starting from the home directory, we include the folder path to the dataset itself. As of the date this blog was written, this step is not supported in SAP AI Launchpad, so we use the AI APIs to demonstrate it. This feature is on the roadmap, so expect to see it in upcoming releases.
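To make the url field concrete, here is a small sketch of how the artifact body could be put together. The `ai://<secret name>/<path>` scheme, the field names, and the endpoint in the comment are my understanding of the AI API and should be read as assumptions; the artifact name, scenario ID, and folder are placeholders.

```python
def artifact_url(object_store_secret_name: str, relative_path: str) -> str:
    """Build the artifact URL: secret name first, then the path below its root."""
    return f"ai://{object_store_secret_name}/{relative_path.strip('/')}"


def dataset_artifact_payload(name: str, scenario_id: str, url: str,
                             description: str = "") -> dict:
    """Body for registering a dataset artifact (assumed: POST /v2/lm/artifacts)."""
    return {
        "name": name,
        "kind": "dataset",
        "url": url,
        "description": description,
        "scenarioId": scenario_id,
    }


artifact_body = dataset_artifact_payload(
    name="california-housing-train",
    scenario_id="house-price",  # placeholder scenario ID
    url=artifact_url("default", "/data/"),
    description="Training data for the house price model",
)
print(artifact_body["url"])  # ai://default/data
```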
Step 2.2 Publish train Docker image
We now build a Docker image on our local system that contains 3 files:
The ML code, in which the training data is read, pre-processing is performed, and the model is fit and saved to the S3 bucket.
A file called requirements.txt, listing the packages that the ML code depends on.
A Dockerfile with instructions on how the image should be created (create folders for data & model, copy the code to the relevant folders, install the dependencies in the image & grant permissions to execute files in the folders).
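The first of these files, the ML code, mainly has to honour a simple contract: read the hyperparameter from the environment, read data from one folder, and write the model to another. The sketch below illustrates that contract only. It is not the tutorial's training code: a mean-value baseline stands in for the decision tree so the example needs no third-party packages, and the file names `train.csv` and `model.pkl` are hypothetical.

```python
import csv
import os
import pickle
import tempfile
from pathlib import Path
from statistics import mean


def train(data_dir: Path, model_dir: Path, max_depth: int) -> Path:
    """Read train.csv from data_dir, fit a stand-in model, save it in model_dir."""
    with open(data_dir / "train.csv", newline="") as handle:
        rows = list(csv.DictReader(handle))
    targets = [float(row["MedHouseVal"]) for row in rows]
    # Stand-in for the real estimator: always predicts the mean target.
    model = {"kind": "mean-baseline", "prediction": mean(targets),
             "max_depth": max_depth}
    model_dir.mkdir(parents=True, exist_ok=True)
    model_path = model_dir / "model.pkl"
    with open(model_path, "wb") as handle:
        pickle.dump(model, handle)
    return model_path


# The hyperparameter arrives as an environment variable; default to 3 locally.
max_depth = int(os.environ.get("DT_MAX_DEPTH", "3"))

# Local dry run on a two-row CSV. Inside the container, train() would be
# called with the image's data and model folders instead.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"
    data_dir.mkdir()
    (data_dir / "train.csv").write_text("MedInc,MedHouseVal\n8.3,4.0\n7.2,2.0\n")
    saved_path = train(data_dir, Path(tmp) / "model", max_depth)
    with open(saved_path, "rb") as handle:
        saved_model = pickle.load(handle)

print(saved_model)
```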
Step 2.3 Create train workflow
To understand this step, you need a high-level understanding of how YAML works. As mentioned earlier, a YAML file is used as the configuration for the sequence of steps SAP AI Core needs to follow at the time of execution. This YAML file is also referred to as the AI Workflow. A few things to note here:
We provide a name, which is an executable ID that should be unique across all workflows referenced in your GitHub repository.
We provide a scenario name & executable name that you will reference later during configuration to identify this AI workflow.
imagePullSecrets/name: We provide the name of the docker registry you added in the Configure step.
arguments/parameters/name: We declare the variable DT_MAX_DEPTH (local to the AI Workflow), whose value we expect to be filled in at the time of execution (more details in the next step). This is the hyperparameter that decides the depth of the decision tree used in the ML model.
templates/inputs/artifacts/path: We provide the folder in the Docker image in which our ML code expects the input data. SAP AI Core takes care of moving the data from S3 into this folder at run time.
templates/outputs/artifacts/path: We provide the folder in the Docker image into which our ML code writes the final model. SAP AI Core takes care of moving the model from this folder to S3, again at run time.
templates/container/image: We provide the reference to the Docker image in our repository.
env/value: We pass the local variable DT_MAX_DEPTH to the variable of the same name inside the ML code.
args: We provide the name of the file containing the ML code that will be executed at run time.
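To see how the keys listed above fit together, the sketch below lays out a workflow as a Python dictionary and prints it as JSON, which is also valid YAML. The overall shape (an Argo `WorkflowTemplate` with SAP-specific annotations and labels) is my understanding of the format and is an assumption here; a real template may need further annotations. Every name, ID, path, and image reference is a placeholder.

```python
import json

workflow = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "WorkflowTemplate",
    "metadata": {
        "name": "house-price-train",  # executable ID, unique across workflows
        "annotations": {
            "scenarios.ai.sap.com/name": "House Price Prediction",
            "executables.ai.sap.com/name": "Train decision tree",
        },
        "labels": {
            "scenarios.ai.sap.com/id": "house-price",
            "ai.sap.com/version": "1.0",
        },
    },
    "spec": {
        # Name of the Docker registry secret added during configuration.
        "imagePullSecrets": [{"name": "my-docker-secret"}],
        "entrypoint": "train",
        # Declared here, filled in when the configuration is created.
        "arguments": {"parameters": [{"name": "DT_MAX_DEPTH"}]},
        "templates": [{
            "name": "train",
            "inputs": {"artifacts": [
                {"name": "housedataset", "path": "/app/data/"},
            ]},
            "outputs": {"artifacts": [
                {"name": "housepricemodel", "path": "/app/model/"},
            ]},
            "container": {
                "image": "docker.io/example-user/house-price:01",
                "command": ["/bin/sh", "-c"],
                # Hand the workflow parameter to the ML code as an env variable.
                "env": [{
                    "name": "DT_MAX_DEPTH",
                    "value": "{{workflow.parameters.DT_MAX_DEPTH}}",
                }],
                "args": ["python /app/src/main.py"],
            },
        }],
    },
}

print(json.dumps(workflow, indent=2))
```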
Step 2.4 Train using configuration
In this step we create the configuration & the execution:
The configuration is a set of binding information between your workflow and datasets. The workflow itself contains all the information about your data, the Docker image, and the ML code inside the image. We provide the following information during the setup step:
The scenario name, template name, and version number defined in the YAML file (aka AI Workflow)
Values for the parameters defined in the YAML file and Docker image
A mapping of the registered data artifact that will be passed to the Docker image, based on the information in the YAML file
Once the configuration is created, we create an execution. An execution is just an instance of your configuration running. The output of the execution in this case is a model which will be saved in S3. The run time status indicates when the execution is completed. The logs indicate errors, if any, and results of any print statements that the ML code contains.
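Expressed as AI API request bodies, the configuration and the execution could look like the sketch below. The field names and the endpoints in the comments are my understanding of the AI API and are assumptions; the scenario ID, executable ID, artifact key, and the artifact and configuration IDs are placeholders.

```python
import json


def configuration_payload(name: str, scenario_id: str, executable_id: str,
                          parameters: dict, input_artifacts: dict) -> dict:
    """Body for a configuration (assumed: POST /v2/lm/configurations)."""
    return {
        "name": name,
        "scenarioId": scenario_id,
        "executableId": executable_id,
        # Parameter values are sent as strings, keyed by the workflow parameter name.
        "parameterBindings": [
            {"key": key, "value": str(value)} for key, value in parameters.items()
        ],
        # Maps each input artifact name in the workflow to a registered artifact ID.
        "inputArtifactBindings": [
            {"key": key, "artifactId": artifact_id}
            for key, artifact_id in input_artifacts.items()
        ],
    }


def execution_payload(configuration_id: str) -> dict:
    """Body for starting an execution (assumed: POST /v2/lm/executions)."""
    return {"configurationId": configuration_id}


config_body = configuration_payload(
    name="house-price-depth-3",
    scenario_id="house-price",
    executable_id="house-price-train",
    parameters={"DT_MAX_DEPTH": 3},
    input_artifacts={"housedataset": "<artifact-id>"},
)
run_body = execution_payload("<configuration-id>")
print(json.dumps(config_body, indent=2))
```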
What happens next
Thus far, we have seen how to configure & train ML models. Our ML model is now ready and available in S3 for use. If you are deploying the model on a device (say, an industrial camera or a microcontroller), you can now move the model physically to the end device. However, if a client application (say, a web or mobile application) needs to access the model and receive a prediction for a new data point, you need to serve or deploy the model.
You can read these details in Part 2 of this blog, which covers serving the model & managing it.
sureshkumar.raju for reviewing the videos and suggesting very useful edits.
paul-pinard for help with the videos. You can find many similar videos about products in our SAP AI portfolio on our SAP AI channel on YouTube.
priyanshu.srivastava5 for help with understanding concepts of the product.
dhrubajyoti.paul for help with understanding the serving tutorials.