2025 Mar 04 10:40 AM
I ran into an issue with anomaly detection when using HANA_ML. I have a table in SAP HANA with numeric columns of type DOUBLE. I created it in two variants:
HC_TABLE1 with just the data
HC_TABLE2 the same data but also having an ID column
Now training an IsolationForest with the data works fine and I can save the model:
iso_forest = IsolationForest(
    max_samples=2048,
    n_estimators=100,   # Number of trees
    max_features=458,   # Use all features from the table
    random_state=1,     # For reproducibility
    thread_ratio=0.9    # Parallel processing
)
hana_df1 = conn.table('HC_TABLE1')
result = iso_forest.fit(hana_df1)
However predicting the outliers does NOT work:
hana_df2 = conn.table('HC_TABLE2')
predictions = iso_forest.predict(hana_df2, key='ID')
ERROR:hana_ml.algorithms.pal.preprocessing:HANA version: 2.00.078.00.1715149848 (fa/hana2sp07). (423, 'AFL error: AFL DESCRIBE for nested call failed - invalid table(s) for ANY-procedure call (Input table 0: column 0 has invalid SQL type.): line 33 col 1 (at pos 8351)')
How can I identify what is going wrong here? I can train a model but never use it? Exactly the same data that was used for fitting is not accepted for predictions. I am close to ditching HANA_ML completely and using the real Python ML routines instead.
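One way to narrow down an "invalid SQL type" message is to look at what hana_ml itself reports for the prediction table and check the key column's position and type. The sketch below is only an illustration: it assumes that `DataFrame.dtypes()` returns tuples beginning with (column name, SQL type name), the helper `check_key_column` is hypothetical, and the set of accepted key types has to be supplied by you from the PAL documentation for your release. The sample tuples stand in for a live connection and their size values are made up.

```python
def check_key_column(dtypes, key, accepted_key_types):
    """Return a list of findings about the key column.

    `dtypes` is assumed to look like the result of hana_ml's
    DataFrame.dtypes(): tuples starting with (column_name, type_name, ...).
    `accepted_key_types` is an assumption supplied by the caller.
    """
    names = [row[0] for row in dtypes]
    if key not in names:
        return [f"key column {key!r} not found"]

    findings = []
    pos = names.index(key)
    sql_type = dtypes[pos][1].upper()
    if pos != 0:
        findings.append(f"key column {key!r} is at position {pos}, not 0")
    if sql_type not in accepted_key_types:
        findings.append(f"key column {key!r} has type {sql_type}")
    return findings


# Stand-in for conn.table('HC_TABLE2').dtypes(); values are illustrative.
example_dtypes = [
    ("ID", "BIGINT", 19, 8, 19, 0),
    ("C1", "DOUBLE", 15, 8, 15, 0),
]
findings = check_key_column(example_dtypes, "ID", {"INTEGER"})
print(findings)
```

With a live connection you would pass `hana_df2.dtypes()` instead of `example_dtypes`. An empty list means the key column is first and has one of the types you listed.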
2025 Mar 05 9:36 AM
In my desperation I have changed the ID column from type BIGINT to INTEGER - et voila - the error disappeared. So contrary to the documentation, BIGINT is not accepted as a datatype for the ID column. At least on my version of HANA_ML.
2025 Mar 05 7:29 AM
Hi @mark_foerster The error "Input table 0: column 0 has invalid SQL type" typically means that one of the columns has a type that is not supported by the algorithm. Here it is the first column of the first input table.
The supported column types of the PAL algorithms are documented here https://help.sap.com/docs/hana-cloud-database/sap-hana-cloud-sap-hana-database-predictive-analysis-l...
Please check that the ID column is the first column and that the column type is supported. In case the column type is different a cast should help https://help.sap.com/doc/cd94b08fe2e041c2ba778374572ddba9/2024_4_QRC/en-US/hana_ml.dataframe.html#ha...
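The two checks above (ID column first, supported type) can be sketched as a small helper that works on the hana_ml DataFrame without changing the underlying table. This is a sketch under assumptions: `prepare_for_predict` is a hypothetical name, the target type `INTEGER` is a guess that should be checked against the PAL documentation, and it relies on the `columns`, `select` and `cast` members of `hana_ml.dataframe.DataFrame`.

```python
def prepare_for_predict(df, key, key_type="INTEGER"):
    """Return a DataFrame with `key` as the first column, cast to `key_type`.

    `df` is expected to behave like a hana_ml DataFrame. The default
    `key_type` is an assumption, not a documented requirement.
    """
    # Move the key column to position 0 and keep the remaining order.
    others = [col for col in df.columns if col != key]
    reordered = df.select(key, *others)
    # Cast only the key column; the feature columns stay untouched.
    return reordered.cast([key], key_type)


# Usage with a live connection (not executed here):
# hana_df2 = prepare_for_predict(conn.table('HC_TABLE2'), key='ID')
# predictions = iso_forest.predict(hana_df2, key='ID')
```

A cast to a narrower integer type only makes sense if all key values fit into it, so it is worth checking the value range of the ID column first.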
2025 Mar 05 8:41 AM
Hello Andreas,
thanks a lot for your reply. I had already asked Claude, and got some very similar information. This is my base table for input:
This is the offending code:
I am really at a loss here. Just 4 data columns, all of type DOUBLE, and the table starts with an integer column ID. And still HANA_ML complains about column 0 having the wrong SQL type:
So I read the documentation and still it won't work. Is there a way to get more detailed information from HANA_ML, like which unsupported SQL type it found?
Regards,
Mark
2025 Mar 11 7:38 AM
2025 Mar 05 8:08 AM
@mark_foerster, as far as I understand your point, you trained with a table without an ID column and you predict with a table without an ID column?
If so
Best regards
2025 Mar 05 10:16 AM
That's great, happy that it's working now. Can you please check whether BIGINT works in the latest hana_ml version (currently 2.23.25021400)? Our Product Group can then check whether this is an issue with the algorithm or the documentation.
2025 Mar 05 1:21 PM
I can reproduce the issue with hana-ml version 2.23.240121700. INTEGER is accepted for column ID, but BIGINT isn't:
ERROR:hana_ml.algorithms.pal.preprocessing:(423, 'AFL error: AFL DESCRIBE for nested call failed - invalid table(s) for ANY-procedure call (Input table 0: column 0 has invalid SQL type.): line 10 col 1 (at pos 399)')