In this part of our blog series, you'll learn how to call real-time inference with a request payload.
In Step 2, we learned how to train and serve the model using the Swagger API page. In this step, we’ll learn how to call a real-time inference using a clickstream. Here’s a short video to demonstrate how to do this:
Video Walk-Through of This Step
Inference Input and Output
To make the real-time inference call, navigate to the Inference section. There are three different inference endpoints; for this guide, we'll use the next_items endpoint. The details of each endpoint are described in the documentation.
Inference APIs on the Swagger API Page
After you open the next_items dropdown, you must complete some actions similar to those during model training:
- Enter tenant name. You must use the same tenant name that you entered during the training process.
- Insert Payload. Here, you provide all the relevant inference input data in the payload. Each inference endpoint has different requirements: for next_items, the items_ls parameter is required, while the other parameters are optional (the service can impute them). The items_ls parameter is a list of item_id values representing the user's past item interactions (the clickstream) from which the recommendations are generated.
For this parameter to be valid, each item in the list must meet at least one of the following requirements:
- Correspond to an object entry in the item_catalogue training data used to train the model, or
- Be provided as an entry in the metadata parameter as a cold start item, or
- Be provided as a cold start item via the "metadata update" feature
Taking an example from our sample dataset, insert a payload with the following content:
{ "items_ls": ["2858"] }
Request Payload on the Swagger API Page
For more details on the payload input, refer to this documentation.
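If you prefer to call the endpoint programmatically rather than through the Swagger UI, the request is a plain HTTP POST. Below is a minimal Python sketch; the service host, tenant name, bearer token, and URL path are placeholders (take the real values from your service key and the Swagger API page):

```python
import requests

# All values below are placeholders - replace them with your own
# service URL, tenant name, and OAuth bearer token.
BASE_URL = "https://<your-service-host>"
TENANT_NAME = "<your-tenant-name>"  # must match the tenant used during training
TOKEN = "<your-bearer-token>"

# Clickstream payload: item_ids of the user's past interactions.
payload = {"items_ls": ["2858"]}

# The exact path is illustrative; copy the real one from the Swagger API page.
url = f"{BASE_URL}/tenants/{TENANT_NAME}/next_items"

response = requests.post(
    url,
    json=payload,
    headers={"Authorization": f"Bearer {TOKEN}"},
)
print(response.status_code)
print(response.json())
```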
After clicking Execute, you can expect one of the following responses:
- The training process has not finished yet. This returns a 404 code, stating that no model instances were found.
Error Not Found in the API Response
- The user submitted an incorrect payload. This returns a 400 code, stating that the model doesn't understand the request payload.
Error Bad Request in the API Response
- The model understands the request and successfully returns a set of recommendations. This returns a 200 code with the recommended items and their respective confidence scores.
Recommendation Result Sample in the API Response
- The user has exceeded their inference quota for the month. This returns a 403 code with a short message.
Error Forbidden in the API Response
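To tie these status codes together, here is a small continuation of the Python sketch above that branches on each documented response. The "recommendations" field name in the 200 branch is an assumption; inspect the actual response body on the Swagger page to confirm it:

```python
if response.status_code == 200:
    # Success: recommended items with their confidence scores.
    # The "recommendations" key is an assumption - check the real response body.
    for item in response.json().get("recommendations", []):
        print(item)
elif response.status_code == 404:
    print("No model instances found - training may not have finished yet.")
elif response.status_code == 400:
    print("Bad request - the model could not understand the payload.")
elif response.status_code == 403:
    print("Forbidden - the monthly inference quota has been exceeded.")
```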
Cheers
At this point, we have successfully called inferences. If you encounter any issues, feel free to leave a comment below; my team will definitely help you out. Alternatively, check out the Q&A area in the community or visit our community page to browse our use cases and learning materials.
In the next step, we will use your own data to train a model for your own use case. Feel free to follow my profile and stay tuned for the next steps. See you in the next blog!