- Numeric Absolute Differences within a Range
- Numeric Percentage Differences within a Range
- Varying Letters when Numbers Remain Constant
- Opposite Amounts when Absolute Values Remain Constant
- Character Differences Due to Transpositions
- Varying Special Characters when Alphanumerics Remain Constant
- Varying Numbers when Letters Remain Constant
- Date Differences within a Range
- Character Differences within a Levenshtein Distance
Comparing values in the same field but in different rows is a challenge, especially if you need to identify not only which data differs but also how it differs. The Duplicate Excluder App uses a suite of nine tests to identify duplicates across rows and to characterize how one field differs between them. This testing process helps filter out false positives that differ significantly in value. For the range-based tests, you can specify the maximum allowable difference (for example, between two numbers or two dates) to pinpoint potential areas of interest, anomalies or errors.
How it Works
The Duplicate Key Exclusion App helps identify records where certain key fields are the same, but the exclusion field is different. Tests included with this App are:
- Character Differences Due to Transpositions:
- Limits the duplicate key exclusion findings to the records where the difference between two values of the exclusion field is a single transposition of two characters
- Character Differences within a Levenshtein Distance:
- Allows the user to set an upper limit for the allowable Levenshtein distance between two character values for the exclusion field of interest
- Levenshtein distance allows for fuzzy matching by counting the minimum number of single-character insertions, deletions or substitutions needed to turn one character value into another (for example, potato and tomato are within a distance of 2)
- Date Differences within a Range:
- Allows the user to set an upper limit for the allowable difference in days between two dates for the exclusion field of interest
- Numeric Absolute Differences within a Range:
- Allows the user to set an upper limit for the allowable absolute difference between two numeric amounts for the exclusion field of interest
- Numeric Percentage Differences within a Range:
- Allows the user to set an upper limit for the allowable percentage difference between two numeric amounts for the exclusion field of interest
- Opposite Amounts when Absolute Values Remain Constant:
- Limits the duplicate key exclusion findings to the records where the two numeric values differ in sign but are equal in absolute value (for example, 100.00 and -100.00)
- Varying Letters when Numbers Remain Constant:
- Limits the duplicate key exclusion findings to the records where the difference between two character values occurs only with the letters, but the values are equal if only the numbers are compared
- Varying Numbers when Letters Remain Constant:
- Limits the duplicate key exclusion findings to the records where the difference between two character values occurs only with the numbers, but the values are equal if only the letters are compared
- Varying Special Characters when Alphanumerics Remain Constant:
- Limits the duplicate key exclusion findings to the records where the difference between two character values occurs only with special characters, but the values are equal if only the non-special characters are compared
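To make the idea concrete, here is a minimal Python sketch of a duplicate key exclusion check and a few of the tests listed above. It is an illustration only, not the App's actual implementation: the function names, the record layout and the sample invoice data are all hypothetical. Two interpretations are assumed: a "transposition" is taken to mean swapping any two characters (not necessarily adjacent ones), and "varying letters" is checked literally by comparing only the digits of each value.

```python
import re
from collections import defaultdict
from itertools import combinations


def exclusion_pairs(records, key_fields, exclusion_field):
    """Yield pairs of records that share the same key but differ in the exclusion field."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[k] for k in key_fields)].append(rec)
    for group in groups.values():
        for r1, r2 in combinations(group, 2):
            if r1[exclusion_field] != r2[exclusion_field]:
                yield r1, r2


def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def is_single_transposition(a, b):
    """True when swapping exactly two characters of a yields b (assumed interpretation)."""
    if len(a) != len(b):
        return False
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    return len(diff) == 2 and a[diff[0]] == b[diff[1]] and a[diff[1]] == b[diff[0]]


def is_opposite_amount(x, y):
    """True when two amounts differ but have the same absolute value."""
    return x != y and abs(x) == abs(y)


def only_letters_vary(a, b):
    """True when the values differ but their digits, in order, are identical."""
    return a != b and re.sub(r"\D", "", a) == re.sub(r"\D", "", b)


# Hypothetical sample data: vendor and amount are the key, invoice_no is the exclusion field.
invoices = [
    {"vendor": "V01", "amount": 250.0, "invoice_no": "INV-1234"},
    {"vendor": "V01", "amount": 250.0, "invoice_no": "INV-1243"},
    {"vendor": "V01", "amount": 250.0, "invoice_no": "INV-9999"},
    {"vendor": "V02", "amount": 80.0, "invoice_no": "A-77"},
]

pairs = list(exclusion_pairs(invoices, ["vendor", "amount"], "invoice_no"))

# Keep only the pairs whose invoice numbers differ by a single transposition.
transposed = [
    (r1["invoice_no"], r2["invoice_no"])
    for r1, r2 in pairs
    if is_single_transposition(r1["invoice_no"], r2["invoice_no"])
]
print(transposed)
```

In this sketch the grouping step produces every same-key pair with a different exclusion value, and each test then acts as a filter on those pairs. A Levenshtein limit would be applied the same way, for example by keeping pairs where `levenshtein(a, b) <= 2`.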