To tackle a real-world problem, where the hiring department was processing seasonal contractors as new hires instead of rehires, I decided to implement the Jaro-Winkler algorithm in SAP. Because policy prohibits storing key information on contractors (such as SSN), the only thing we can match on is a person’s name. Unfortunately, names are not always typed correctly, so a search can miss the previous record and the contractor ends up being hired as new instead of rehired. Using the Jaro-Winkler algorithm, we can now suggest possible matching contractors based on a string comparison of first and last name. Jaro-Winkler calculates the distance (a measure of similarity) between two strings. The score ranges from 0.0 to 1.0, where 0.0 means no similarity and 1.0 is an exact match. For our purposes, anything below 0.8 is not considered useful.
“Jaro–Winkler distance (Winkler, 1990) is a measure of similarity between two strings”.
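To make the scoring concrete, here is a minimal Python sketch of the textbook Jaro-Winkler calculation. It is not the SAP implementation described in this post; the function names, the 0.1 prefix scale, and the four-character prefix cap are the commonly cited defaults and are assumptions here. Some implementations only apply the prefix bonus when the Jaro score exceeds a cutoff; this sketch applies it unconditionally.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity: 0.0 (nothing in common) to 1.0 (identical)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are at most `window` positions apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Walk both strings' matched characters in order; each out-of-order
    # pair contributes half a transposition.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str,
                 prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


if __name__ == "__main__":
    # Classic textbook pair: two transposed letters, three-letter common prefix.
    print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
```

The comparison is case-sensitive, so names would normally be converted to a common case before scoring.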
Deliberately misspelling my last name and shortening my first name:
I only show the top three results. In fact, since I’m creating a combined score from two different measurements, I would likely consider anything below 1.60 as not useful in the real world. So now we have a safeguard against hiring when we should be rehiring. Of course, there are many other uses for this algorithm, such as ZIP code verification for ZIP+4. Another use is as a dictionary check for valid words.
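The ranking step can be sketched roughly as follows, in Python rather than ABAP. This assumes the two measurements are the first-name score and the last-name score, simply added together, which is my reading of the 1.60 cutoff (twice 0.8). The names `suggest_rehires` and `stand_in_similarity` are hypothetical. The similarity function is passed in as a parameter; the demo uses the standard library's `difflib.SequenceMatcher` ratio purely as a stand-in so the example runs on its own. It is not Jaro-Winkler and will give different scores.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple

# Any function that scores two strings from 0.0 to 1.0, e.g. Jaro-Winkler.
Similarity = Callable[[str, str], float]


def stand_in_similarity(a: str, b: str) -> float:
    """Stdlib stand-in for the demo only; NOT Jaro-Winkler."""
    return SequenceMatcher(None, a, b).ratio()


def suggest_rehires(first: str, last: str,
                    contractors: List[Tuple[str, str]],
                    similarity: Similarity,
                    min_combined: float = 1.60,
                    top_n: int = 3) -> List[Tuple[float, str, str]]:
    """Return up to top_n (combined score, first, last) rows, best first."""
    scored = []
    for cand_first, cand_last in contractors:
        # Compare in upper case so typing case does not affect the score.
        combined = (similarity(first.upper(), cand_first.upper())
                    + similarity(last.upper(), cand_last.upper()))
        if combined >= min_combined:  # drop anything below the cutoff
            scored.append((combined, cand_first, cand_last))
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    # Hypothetical contractor records, for illustration only.
    previous = [("John", "Smith"), ("Joan", "Smythe"), ("Maria", "Lopez")]
    for score, cand_first, cand_last in suggest_rehires(
            "Jon", "Smyth", previous, stand_in_similarity):
        print(f"{score:.2f}  {cand_first} {cand_last}")
```

Keeping the similarity function as a parameter means the cutoff and top-three logic can be tested separately from the string metric itself.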