As the adoption of UUIDs (Universally Unique Identifier) and GUIDs (Globally Unique Identifier) becomes prevalent, APL (Automated Predictive Library) users may face the case where their training dataset contains such IDs. These unique identifiers act as a primary key to differentiate each row in your invoices table for example, or your customers table. This blog shows how APL can work with input data containing a UUID. We will cover the two phases of machine learning: Train and Predict.
If you run the following SQL in SAP HANA:
select SYSUUID as "MY_UUID" from dummy
you obtain a big alpha-numeric value:
Now, create a view from it just to check the datatype:
create view MY_APL_VIEW as
select SYSUUID as "MY_UUID" from dummy;
select view_name, column_name, data_type_name
from view_columns where view_name = 'MY_APL_VIEW';
The UUID is stored as a VARBINARY, but this datatype is not supported in APL, therefore we will convert the UUID into a hexadecimal string. This conversion can be done with the function: bintohex().
We need to train a classification model against this dataset:
Because the table contains a UUID, we must convert that column using the script below:
select
BINTOHEX("UUID") as "Id",
"Days_to_Report",
"Bodily_Injury",
"Property_Damage",
"Age",
"Gender",
"Income",
"Fraud"
from
"APL_SAMPLES"."ACCIDENT_CLAIMS"
Make the conversion either via a SQL view if you are working with APL SQL, or via a dataframe if you are working with APL Python. An example of dataframe based on a select statement is shown here:
The SQL view for training the APL model looks like this:
The inference dataset requires the same conversion before you can run the APL predictions:
select
BINTOHEX("UUID") as "Id",
"Days_to_Report",
"Bodily_Injury",
"Property_Damage",
"Age",
"Gender",
"Income"
from
"APL_SAMPLES"."ACCIDENT_NEW_CLAIMS"
The APL output from predicting claims fraud contains Id values that are in hexadecimal:
We must convert back the Id so that the APL output table can be joined to the input table:
select N.*, "gb_decision_Fraud" as "Prediction"
from APPLY_OUT O, APL_SAMPLES.ACCIDENT_NEW_CLAIMS N
where HEXTOBIN(O."Id") = N."UUID"
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
32 | |
13 | |
13 | |
11 | |
10 | |
9 | |
7 | |
7 | |
6 | |
6 |