cancel
Showing results for 
Search instead for 
Did you mean: 
Read only

HANA Fuzzy Search

rohit_trivedi
Product and Topic Expert
Product and Topic Expert
0 Kudos
1,704

Experts:

My use case is: I want to perform fuzzy search of Names in one table into another table.

                          Both tables have Thousands of records.

Problem: If i use SQL CONTAINS statment to perform fuzzy search, it takes around 20 ms for one check.

               Therefore to check 100K Names, it takes 100000 * .02 = 2000 seconds!! (which is quite high).

What is most performant way to perform fuzzy check with mass data on both sides ?

Thanks & Regards.               

View Entire Topic
rohit_trivedi
Product and Topic Expert
Product and Topic Expert
0 Kudos

Hi Kyle,

            Problem is that even existing capabilities of hardware, CPU is not exploited with the software approach offered.

You can check that SQL CONTAINS statement used for performing fuzzy search accepts only single string as input!

And this is bottleneck. I need to loop through each entry in one table to find hits in target table.

JOIN on two tables is not possible with SQL CONTAINS. Therefore no possibility exists for parallel processing of workset.

Regards,
Rohit.

kyle_mcadam
Product and Topic Expert
Product and Topic Expert
0 Kudos

Hi Rohit,

Thanks for the details, now I see more clearly, what the problem is.

one option you have here, is to consider a union all query, where you split each section of the union with a seperate sql statement that selects from a portion of table 1,

this way the number of sections in your union all query could potentially be split by the sql optimizer, into parallel queries....(up to the number of available cores)

finally being joined in one result