Technology Blogs by SAP
Learn how to extend and personalize SAP applications. Follow the SAP technology blog for insights into SAP BTP, ABAP, SAP Analytics Cloud, SAP HANA, and more.
felixbartler
Product and Topic Expert

When deploying machine learning models on SAP AI Core, they are exposed at a deployment URL for consumption. In this blog post we will look into easy options to make these endpoints static, so that they are easy to integrate into applications.

(Image: felixbartler_0-1717074164671.png)

Background:

SAP AI Core is the AI workload management service on SAP BTP. Aside from training workflows, AI Core is mainly used to host models for productive inference. When issuing such a deployment, the resulting URL contains the deployment ID, like this: <host>/v2/inference/deployments/<deployment_id>/v2/<defined_path>

As a result, each new iteration of the deployment comes with a new URL. When integrating this endpoint into a business application, we would ideally refer to a static URL for our deployment.
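To make the URL scheme above tangible, here is a small helper (function and parameter names are my own, not part of any SDK) that composes the deployment-specific inference URL from its parts:

```python
def build_inference_url(host: str, deployment_id: str, path: str) -> str:
    """Compose <host>/v2/inference/deployments/<deployment_id>/v2/<defined_path>."""
    return f"{host}/v2/inference/deployments/{deployment_id}/v2/{path.lstrip('/')}"
```

Every time the deployment is recreated, `deployment_id` changes, so every consumer that hard-codes this URL has to be touched.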

In this blog post we look at two ways to accomplish this:

1. Patching an existing deployment:

Some time ago, AI Core introduced an interesting feature that allows us to PATCH a deployment. This effectively gives us a native way to provide a new configuration to an already existing deployment.

Since configurations hold the parameters (environment variables) and the artifacts, this means we can gracefully introduce new artifacts on the fly, for example by modifying a top_n parameter or another value used in the process.

Upon trying it out, I found that we can even throw in a new version of the Docker image we registered. AI Core then effectively starts up a new instance of our workload in a separate environment. As I observed, this switchover happens within a few minutes and with near zero downtime.

Let's have a look at how this works in action:

Demo:

To demonstrate how this works for the client, I ran a script polling my /hello endpoint every 3 seconds.

import os

from fastapi import FastAPI

app = FastAPI()

@app.post("/v2/hello/")
def hello():
    return {"prediction": "this_is_the_updated_code", "envexample": os.environ.get("envexample", "none")}

It outputs a hard-coded string and the environment variable set via a configuration.
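The polling script itself can be as simple as the following sketch (the deployment URL and token are placeholders you would fill from your own environment):

```python
import time


def summarize(status_code: int, body: str) -> str:
    """Format one log line per poll, like the print output shown in the screenshot."""
    return f"{status_code}: {body}"


def poll(url: str, token: str, interval: float = 3.0, attempts: int = 20) -> None:
    """Call the /hello endpoint repeatedly; errors are printed instead of raised."""
    import requests  # imported here so summarize() stays dependency-free

    headers = {"Authorization": f"Bearer {token}"}
    for _ in range(attempts):
        try:
            response = requests.post(url, headers=headers, timeout=5)
            print(summarize(response.status_code, response.text))
        except requests.RequestException as exc:
            print(f"request failed: {exc}")
        time.sleep(interval)
```

Catching the request exception instead of crashing is what lets us observe the short error window during the switchover.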

As you can see in the screenshot below, showing the printed endpoint responses, the switchover works nicely. Interesting to note is that both the code change and the configuration change took place. We do have to mention that there seem to be a few seconds where our API call results in an error. Overall, this looks very promising!

(Screenshot: felixbartler_2-1717073192027.png)

Now let's have a look at how we can update a deployment:

In the AI Launchpad there is a dedicated "Update" button for deployments:

(Screenshot: felixbartler_0-1717158015726.png)

In the menu we can select a previously created configuration to patch it.

With the ai_core_sdk it is as easy as running the single line below. You provide the ID of an existing deployment and a new configuration ID.

 

deployment_response = ai_api_v2_client.deployment.modify(existing_deployment_id, None, new_config_id)
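If you prefer not to use the SDK, the same can be done against the AI API directly. Below is a sketch (function names are my own) that assembles the PATCH request; base_url is the AI API base URL ending in /v2, like the AICORE_BASE_URL variable used in the destination script further down:

```python
def build_patch_request(base_url: str, deployment_id: str, configuration_id: str,
                        resource_group: str, token: str):
    """Assemble url, headers and body for patching a deployment via the AI API."""
    url = f"{base_url}/lm/deployments/{deployment_id}"
    headers = {
        "Authorization": f"Bearer {token}",
        "AI-Resource-Group": resource_group,
        "Content-Type": "application/json",
    }
    body = {"configurationId": configuration_id}
    return url, headers, body


def patch_deployment(base_url, deployment_id, configuration_id, resource_group, token):
    import requests  # imported lazily so build_patch_request has no dependencies

    url, headers, body = build_patch_request(
        base_url, deployment_id, configuration_id, resource_group, token)
    response = requests.patch(url, json=body, headers=headers)
    response.raise_for_status()
    return response.json()
```

This is the same operation the SDK's modify() call and the Launchpad "Update" button perform under the hood.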

 

Because this is quite helpful, I integrated this functionality into the full CI/CD example on GitHub, so that an existing deployment is updated instead of a new one being created. Check out my other blog, where I describe how I use a deployment config.json to specify the target state of what should be deployed: I added a field "existing_deployment_id" to the deployments section, and if it is filled, the deployment is updated instead of created.
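Sketched in config.json, such a deployment entry could look like this (all field names except existing_deployment_id are illustrative, not the exact schema from the repository):

```json
{
  "deployments": [
    {
      "scenario_id": "my-scenario",
      "configuration_name": "serving-config-v2",
      "existing_deployment_id": "d1234567890"
    }
  ]
}
```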

Note: Kubernetes provides the possibility to specify the imagePullPolicy for containers, which should be taken into account. I tested switching over code with both the values Always and IfNotPresent, and it seems to work in both cases. Obviously, the updated Docker image does have to carry the same tag specified in the serving template 😉
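For orientation, the relevant fragment of a KServe-style serving template might look like this (image name and tag are placeholders; the point is that the tag stays the same across updates):

```yaml
spec:
  template:
    apiVersion: serving.kserve.io/v1beta1
    spec:
      predictor:
        containers:
          - name: kserve-container
            image: docker.io/myrepo/my-model:latest  # same tag reused on update
            imagePullPolicy: Always                  # IfNotPresent also worked in my test
```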

2. Maintaining a Destination:

A BTP-native way to handle the changing deployment URL is the use of the Destination service. App developers typically register all their external services as destinations to ease the management of dependencies across environments. Destinations are usually created per subaccount and thus prove to be a good way to feed the service details per stage to the business application.

As ML engineers, we can provide a destination for our inference endpoint either by creating one manually and updating it whenever we want an updated deployment, or by creating it automatically via the API.

(Screenshot: felixbartler_0-1717071331807.png)

In the BTP Cockpit under the respective Subaccount > Connectivity > Destinations, we can create one using the Create Destination button. There we choose the protocol HTTP and the authentication type OAuth2ClientCredentials. Upon completion, we can verify the details using the Check Connection button.

In my other blog post, I show how to set up a CI/CD pipeline for the automatic deployment of code and templates to AI Core. For that purpose, it is ideal to also update the destinations automatically. Below is a little Python script that can be integrated into a CI/CD flow:

 

import os
import logging
import requests
from requests.auth import HTTPBasicAuth

logging.basicConfig(level=logging.INFO, format='%(message)s')

AICORE_AUTH_URL = os.environ["AICORE_AUTH_URL"]
AICORE_BASE_URL = os.environ["AICORE_BASE_URL"]
AICORE_CLIENT_ID = os.environ["AICORE_CLIENT_ID"]
AICORE_CLIENT_SECRET = os.environ["AICORE_CLIENT_SECRET"]
AICORE_RESOURCE_GROUP = os.environ["AICORE_RESOURCE_GROUP"]

DESTINATION_AUTH_URL = os.environ["DESTINATION_AUTH_URL"]
DESTINATION_BASE_URL = os.environ["DESTINATION_BASE_URL"]
DESTINATION_CLIENT_ID = os.environ["DESTINATION_CLIENT_ID"]
DESTINATION_CLIENT_SECRET = os.environ["DESTINATION_CLIENT_SECRET"]


def update_deployment_destination(destination_name, deployment_id):
    """Create or update a subaccount-level destination for a deployment ID."""

    logging.info(f"CREATE DESTINATION {destination_name}")

    auth_response = requests.post(
        f"{DESTINATION_AUTH_URL}/oauth/token?grant_type=client_credentials",
        auth=HTTPBasicAuth(DESTINATION_CLIENT_ID, DESTINATION_CLIENT_SECRET),
    )

    # Check if the token request was successful
    if auth_response.status_code != 200:
        logging.info(f"{auth_response.status_code}: {auth_response.content}")
        raise Exception("DESTINATION LOGIN ERROR")

    destination_body = {
        'Description': 'desc',
        'Type': 'HTTP',
        'clientId': AICORE_CLIENT_ID,
        'Authentication': 'OAuth2ClientCredentials',
        'Name': destination_name,
        'tokenServiceURL': f"{AICORE_AUTH_URL}/oauth/token",
        'ProxyType': 'Internet',
        'URL': f"{AICORE_BASE_URL}/inference/deployments/{deployment_id}",
        'tokenServiceURLType': 'Dedicated',
        'clientSecret': AICORE_CLIENT_SECRET
    }

    headers = {
        "Authorization": "Bearer " + auth_response.json()["access_token"],
        "Content-Type": "application/json",
        "Accept": "application/json",
    }

    response = requests.post(f"{DESTINATION_BASE_URL}/destination-configuration/v1/subaccountDestinations", json=destination_body, headers=headers)

    if response.status_code == 409:  # a destination with this name already exists
        response = requests.put(f"{DESTINATION_BASE_URL}/destination-configuration/v1/subaccountDestinations", json=destination_body, headers=headers)

    if response.status_code in (200, 201):
        logging.info(f"DESTINATION {destination_name} SUCCESSFULLY CREATED/UPDATED")
    else:
        logging.info(f"{response.status_code}: {response.content}")
        raise Exception("DESTINATION CREATE/UPDATE ERROR")

 

The script uses the Destination service's RESTful API to create or update the destination.

Note: As a prerequisite we need to create an instance of the Destination service on BTP and create a service key for it. Those credentials are used to access the API and are set up as environment variables.

Final Words:

Both approaches presented are good solutions for achieving a static URL for deployments in SAP AI Core. There are alternatives, like hosting your own small Cloud Foundry app to redirect traffic or using API rules in third-party solutions. Especially the first approach can be interesting for developers working on long-running deployments: not only does it give us a static URL, but more interestingly, we can change existing deployments in various ways.

I hope this blog post was interesting. Feel free to check out the GitHub repository and leave a comment. 😉