lars_gregori
Advisor
In my previous blog post about image retraining, a question came up about the data structure for text retraining. I answered it there, but I think it also makes sense to write a blog post about it. So I decided to write a small series with at least one more blog post.

For the text retraining I will use Twitter Sentiment Analysis data, which classifies sentences as positive or negative. SAP Leonardo trains the machine learning model, which can then be deployed and used by the text classification service. Here is an example:



Most of the procedure is similar to the image retraining, so I will refer to that post and describe only the differences in more detail.

What do you need?

Step 1 – SAP Leonardo Machine Learning instance


Take a look at Step 1 of the Image Retraining to create an SCP trial account and a Service Key:
{
  "clientid": "sb-42mn3-3z7p-96r-3c79-0x1pm0l!p216|klgco-lag-vitas!d66",
  "clientsecret": "5ghsdYM/Z567N5LoQ7nrXBkZ0BV=",
  "serviceurls": {
    "TEXT_LINEAR_RETRAIN_API_URL": "https://mlftrial-retrain-text-linear-api.cfapps.eu10.hana.ondemand.com/api/v2/text/retraining",
    "TEXT_CLASSIFIER_URL": "https://mlftrial-text-classifier.cfapps.eu10.hana.ondemand.com/api/v2/text/classification"
  },
  "url": "https://p2000894545trial.authentication.eu10.hana.ondemand.com"
}

Note: Instead of the IMAGE_RETRAIN_API_URL and IMAGE_CLASSIFICATION_URL the TEXT_LINEAR_RETRAIN_API_URL and TEXT_CLASSIFIER_URL are important.
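
The Postman calls further down need a bearer token. Here is a minimal Python sketch of how the service key could be turned into a token request. It assumes the common OAuth client-credentials endpoint `/oauth/token` on the service key's `url`; that path, the helper name `build_token_request` and the placeholder key values are assumptions for illustration and are not taken from this post. The request is only built, not sent.

```python
import base64
import urllib.request


def build_token_request(service_key: dict) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request.

    Assumption: the token endpoint is <url>/oauth/token; check the image
    retraining post for the exact call used there.
    """
    token_url = (
        service_key["url"].rstrip("/")
        + "/oauth/token?grant_type=client_credentials"
    )
    # Client id and secret are sent as HTTP Basic credentials.
    credentials = f'{service_key["clientid"]}:{service_key["clientsecret"]}'
    basic = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        token_url,
        headers={"Authorization": f"Basic {basic}"},
        method="GET",
    )


# Placeholder values; use the ones from your own service key.
service_key = {
    "clientid": "my-client-id",
    "clientsecret": "my-client-secret",
    "url": "https://example.authentication.eu10.hana.ondemand.com",
}
request = build_token_request(service_key)
print(request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` should return a JSON document; the token is then used in the Authorization header of the calls below.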

Don't mess it up like I did. 😖

Step 2 – Storage


Same procedure as Step 2 of the image retraining.

If you already have a storage, just run the POST again to get the endpoint, accessKey and secretKey.

Step 3 – Training data


The following steps are necessary to create the training data:

The script creates a sentiment_100.zip file with the following structure (see also SAP Help – Uploading Data):
sentiment
├── test
│   ├── negative
│   └── positive
├── training
│   ├── negative
│   └── positive
└── validation
    ├── negative
    └── positive
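
To make the layout more concrete, here is a small Python sketch that writes a handful of made-up sentences into a zip file with this folder structure. It is not the script mentioned above. The helper `build_dataset_zip`, the one-sentence-per-.txt-file format and the file names (0.txt, 1.txt, ...) are assumptions for illustration; check SAP Help for the exact file format the service expects.

```python
import tempfile
import zipfile
from pathlib import Path


def build_dataset_zip(samples: dict, zip_path: Path, dataset: str = "sentiment") -> list:
    """Write samples[split][label] (lists of sentences) into a zip archive.

    Assumption: each sentence is stored as its own UTF-8 .txt file; the
    naming scheme is made up for this sketch.
    """
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for split, labels in samples.items():
            for label, sentences in labels.items():
                for index, sentence in enumerate(sentences):
                    name = f"{dataset}/{split}/{label}/{index}.txt"
                    archive.writestr(name, sentence)
                    written.append(name)
    return written


# Tiny made-up sample, only to show the folder layout.
samples = {
    "training": {"positive": ["I love this"], "negative": ["I hate this"]},
    "validation": {"positive": ["So good"], "negative": ["So bad"]},
    "test": {"positive": ["Great day"], "negative": ["Awful day"]},
}
zip_path = Path(tempfile.mkdtemp()) / "sentiment_100.zip"
names = build_dataset_zip(samples, zip_path)
print("\n".join(sorted(names)))
```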

Step 4 – Upload


Take a look at Step 4 of the image retraining and run this to upload the sentiment data set:
mc cp sentiment_100.zip saps3/data/sentiment

Step 5 – Training


After uploading the training data, start the training with Postman.

Training


URL: TEXT_LINEAR_RETRAIN_API_URL/jobs
Doc: https://api.sap.com/api/text_linear_retrain_api/resource

POST
https://mlftrial-retrain-text-linear-api.cfapps.eu10.hana.ondemand.com/api/v2/text/retraining/jobs

Headers:
Authorization: {{Bearer Token}}
Content-Type: application/json

Body:
{
  "dataset": "sentiment",
  "modelName": "sentiment",
  "preprocessingLanguage": "en",
  "completionTime": 24,
  "memory": 8192
}

Result:
{
  "id": "sentiment-2018-12-01t2235z745432"
}
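
If you prefer a script over Postman, the same call could be sketched in Python roughly like this. The helper name `build_training_request` is hypothetical, and the sketch assumes the Authorization header has the form `Bearer <token>`. The request is only built here, not sent.

```python
import json
import urllib.request

TEXT_LINEAR_RETRAIN_API_URL = (
    "https://mlftrial-retrain-text-linear-api.cfapps.eu10.hana.ondemand.com"
    "/api/v2/text/retraining"
)


def build_training_request(token: str, dataset: str, model_name: str) -> urllib.request.Request:
    """Build (but do not send) the POST that starts a retraining job."""
    body = {
        "dataset": dataset,
        "modelName": model_name,
        "preprocessingLanguage": "en",
        "completionTime": 24,
        "memory": 8192,
    }
    return urllib.request.Request(
        TEXT_LINEAR_RETRAIN_API_URL + "/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: the token is passed as "Bearer <token>".
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_training_request("<access token>", "sentiment", "sentiment")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Sending it would be: urllib.request.urlopen(request)
```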

Jobs


The job has finished successfully when its status is SUCCEEDED.
URL: TEXT_LINEAR_RETRAIN_API_URL/jobs
Doc: https://api.sap.com/api/text_linear_retrain_api/resource

GET
https://mlftrial-retrain-text-linear-api.cfapps.eu10.hana.ondemand.com/api/v2/text/retraining/jobs

Results:
{
  "finishTime": "2018-12-01T23:31:02+00:00",
  "message": "",
  "startTime": "2018-12-01T22:35:53+00:00",
  "id": "sentiment-2018-12-01t2235z745432",
  "status": "SUCCEEDED",
  "submissionTime": "2018-12-01T22:35:51+00:00"
}

This took nearly one hour, but I used over 300,000 sentiments (sentiment_5) for the training.
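
The run time can be read directly from the job result. A short Python sketch, using the values from the response above (the helper name `job_duration_seconds` is made up for this example):

```python
from datetime import datetime

# Job result as returned by the GET call above.
job = {
    "finishTime": "2018-12-01T23:31:02+00:00",
    "message": "",
    "startTime": "2018-12-01T22:35:53+00:00",
    "id": "sentiment-2018-12-01t2235z745432",
    "status": "SUCCEEDED",
    "submissionTime": "2018-12-01T22:35:51+00:00",
}


def job_duration_seconds(job: dict) -> float:
    """Return the time between startTime and finishTime in seconds."""
    start = datetime.fromisoformat(job["startTime"])
    finish = datetime.fromisoformat(job["finishTime"])
    return (finish - start).total_seconds()


succeeded = job["status"] == "SUCCEEDED"
minutes, seconds = divmod(int(job_duration_seconds(job)), 60)
print(f"succeeded={succeeded}, duration={minutes} min {seconds} s")
# prints: succeeded=True, duration=55 min 9 s
```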

Logs


Whether the job failed or succeeded, you can download the job logs:
mc cp --recursive saps3/data/<JOB ID>/ logs

example:
mc cp --recursive saps3/data/sentiment-2018-12-01t2235z745432/ logs

Step 6 – Deploy


The model must be deployed after a successful training.

Deploy Model
URL: TEXT_LINEAR_RETRAIN_API_URL/deployments
Doc: https://api.sap.com/api/text_linear_retrain_api/resource

POST
https://mlftrial-retrain-text-linear-api.cfapps.eu10.hana.ondemand.com/api/v2/text/retraining/deploy...

Headers:
Authorization: {{Bearer Token}}
Content-Type: application/json

Body:
{
  "modelName": "sentiment",
  "modelVersion": "1"
}

Result:
{
  "id": "f6b34f68-6bf0-4fe8-98f5-9f9a4310a9b8"
}

After some time the model is available for text classification.

Step 7 – Test


For my first test I've used this tweet from vitaliy.rudnytskiy:


https://twitter.com/Sygyzmundovych/status/1061608300490440704

Text Classification


URL: TEXT_CLASSIFIER_URL/models/{model}/versions/{version}
Doc: Inference Service for Customizable Text Classification

POST
https://mlftrial-text-classifier.cfapps.eu10.hana.ondemand.com/api/v2/text/classification/models/sen...

Headers:
Authorization: {{Bearer Token}}
Content-Type: application/json

Body:
texts=Starting sampling of a next batch of dark beers. This one has nice velvety taste, but way too sweet. ? - Drinking a Świderskie by Cerkom @ Oporów —

Here is the result for this tweet, which is classified as about 88.5% positive:
{
  "id": "6288fb40-1671-4c09-7cec-0baa12950d82",
  "predictions": [
    {
      "results": [
        {
          "label": "positive",
          "score": 0.8846777437444767
        },
        {
          "label": "negative",
          "score": 0.11532225625552328
        }
      ]
    }
  ],
  "processedTime": "2018-12-02T14:24:14.242227+00:00",
  "status": "DONE"
}
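
To pick the winning label out of such a response in a script, something like the following Python sketch could be used. The helper name `top_labels` is made up for this example; the response is the one shown above.

```python
import json

response = json.loads("""
{
  "id": "6288fb40-1671-4c09-7cec-0baa12950d82",
  "predictions": [
    {
      "results": [
        {"label": "positive", "score": 0.8846777437444767},
        {"label": "negative", "score": 0.11532225625552328}
      ]
    }
  ],
  "processedTime": "2018-12-02T14:24:14.242227+00:00",
  "status": "DONE"
}
""")


def top_labels(response: dict) -> list:
    """Return the (label, score) pair with the highest score per prediction."""
    best = []
    for prediction in response["predictions"]:
        winner = max(prediction["results"], key=lambda result: result["score"])
        best.append((winner["label"], winner["score"]))
    return best


label, score = top_labels(response)[0]
print(f"{label}: {score:.1%}")
# prints: positive: 88.5%
```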

I don't want to end this blog post with a negative sentence, but you can find one in my Postman collection.

Have fun 🤪