on 2013 Dec 17 3:18 PM
Experts:
My use case is: I want to perform fuzzy search of Names in one table into another table.
Both tables have Thousands of records.
Problem: If i use SQL CONTAINS statment to perform fuzzy search, it takes around 20 ms for one check.
Therefore to check 100K Names, it takes 100000 * .02 = 2000 seconds!! (which is quite high).
What is most performant way to perform fuzzy check with mass data on both sides ?
Thanks & Regards.
Request clarification before answering.
Hi Kyle,
Problem is that even existing capabilities of hardware, CPU is not exploited with the software approach offered.
You can check that SQL CONTAINS statement used for performing fuzzy search accepts only single string as input!
And this is bottleneck. I need to loop through each entry in one table to find hits in target table.
JOIN on two tables is not possible with SQL CONTAINS. Therefore no possibility exists for parallel processing of workset.
Regards,
Rohit.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Rohit,
Thanks for the details, now I see more clearly, what the problem is.
one option you have here, is to consider a union all query, where you split each section of the union with a seperate sql statement that selects from a portion of table 1,
this way the number of sections in your union all query could potentially be split by the sql optimizer, into parallel queries....(up to the number of available cores)
finally being joined in one result
| User | Count |
|---|---|
| 17 | |
| 8 | |
| 7 | |
| 6 | |
| 4 | |
| 3 | |
| 3 | |
| 2 | |
| 2 | |
| 2 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.