Consecutive Codes #3

chancyk · 2015-05-26T15:54:14Z

Consecutive codes may not be handled correctly, as the test cases Pfister and Tymczak from http://www.archives.gov/research/census/soundex.html show.

The original Russell and census versions of the algorithm seem to apply the consecutive-code rule to adjacent letters only (that is, letters not separated by a vowel or other '0'-coded character).

The archives.gov reference also mentions another special case where a consecutive code is discarded when separated by an 'H' or 'W'.
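For reference, here is a minimal Python sketch of the two rules as I read them from the archives.gov page. It is not this project's code: the function name `soundex` and the `hw_rule` flag are hypothetical, and it assumes only ASCII letters matter.

```python
# Letter groups and their Soundex digits; vowels, Y, H and W have no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name, hw_rule=True):
    """Return a 4-character Soundex code (sketch, hypothetical helper).

    hw_rule=True treats H and W as transparent, so equal codes on either
    side of them collapse into one. hw_rule=False treats H and W like
    vowels, so the second code is kept.
    """
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    # The first letter's own code also takes part in duplicate suppression.
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code:
            if code != prev:
                result += code
            prev = code
        elif hw_rule and ch in "HW":
            continue  # transparent: keep the previous code as-is
        else:
            prev = ""  # a vowel (or Y) separates equal codes
    return (result + "000")[:4]


print(soundex("Pfister"), soundex("Tymczak"), soundex("Ashcraft"))
```

With the H/W rule on, this gives P236, T522 and A261, which matches the archives.gov examples. With `hw_rule=False`, Ashcraft becomes A226 instead.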

EDIT: The 'H' or 'W' rule actually is used in the SQL Server implementation. Removed the comment that it's not.

EDIT2: I was right and wrong before my first edit. MSSQL is case sensitive for its handling of 'H' and 'W'. Consecutive codes are discarded for upper case and not for lower case...
