Phonetic coding: NYSIIS vs. Soundex
Taken directly from Robert L. Taft, "Name Search Techniques",
New York State Identification and Intelligence
| NYSIIS algorithm |
| 1) Translate first characters of name |
| MAC => MCC |
| PH => FF |
| KN => NN |
| K => C |
| SCH => SSS |
| 2) Translate last characters of name |
| EE => Y |
| IE => Y |
| DT, RT, RD, NT, ND => D |
| 3) First character of key = first character of name |
| 4) Translate remaining characters by the following rules, incrementing by one character each time |
| EV => AF else A, E, I, O, U => A |
| Q => G |
| Z => S |
| M => N |
| KN => N else K => C |
| SCH => SSS |
| PH => FF |
| H => If previous or next is non-vowel, previous |
| W => If previous is vowel, previous |
| 5) Translate last characters of key |
| If last character is S, remove it |
| If last characters are AY, replace with Y |
| If last character is A, remove it |
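The NYSIIS rules above can be sketched in Python as follows. This is a minimal illustration, not Taft's reference implementation. It makes three assumptions that the table does not spell out: each translated character is appended to the key only when it differs from the last key character, a missing "next" character counts as a non-vowel for the H rule, and the final trimming rules are applied to the key. The names nysiis, PREFIXES, SUFFIXES and VOWELS are chosen here for illustration.

```python
VOWELS = set("AEIOU")

# Step 1 and step 2 tables. KN is listed before K so the longer
# pattern is tried first.
PREFIXES = (("MAC", "MCC"), ("PH", "FF"), ("KN", "NN"), ("K", "C"), ("SCH", "SSS"))
SUFFIXES = (("EE", "Y"), ("IE", "Y"), ("DT", "D"), ("RT", "D"),
            ("RD", "D"), ("NT", "D"), ("ND", "D"))


def nysiis(name: str) -> str:
    # Keep only the letters A-Z, upper-cased.
    s = "".join(ch for ch in name.upper() if "A" <= ch <= "Z")
    if not s:
        return ""

    # Step 1: translate the first characters of the name.
    for old, new in PREFIXES:
        if s.startswith(old):
            s = new + s[len(old):]
            break

    # Step 2: translate the last characters of the name.
    for old, new in SUFFIXES:
        if s.endswith(old):
            s = s[:-len(old)] + new
            break

    # Step 3: the key starts with the first character of the name.
    key = s[0]

    # Step 4: translate the remaining characters one position at a time.
    chars = list(s)
    i = 1
    while i < len(chars):
        ch = chars[i]
        pair = "".join(chars[i:i + 2])
        triple = "".join(chars[i:i + 3])
        prev = chars[i - 1]
        nxt = chars[i + 1] if i + 1 < len(chars) else ""
        if pair == "EV":
            chars[i:i + 2] = "AF"
        elif ch in VOWELS:
            chars[i] = "A"
        elif ch == "Q":
            chars[i] = "G"
        elif ch == "Z":
            chars[i] = "S"
        elif ch == "M":
            chars[i] = "N"
        elif pair == "KN":
            chars[i:i + 2] = "N"
        elif ch == "K":
            chars[i] = "C"
        elif triple == "SCH":
            chars[i:i + 3] = "SSS"
        elif pair == "PH":
            chars[i:i + 2] = "FF"
        elif ch == "H" and (prev not in VOWELS or nxt not in VOWELS):
            chars[i] = prev
        elif ch == "W" and prev in VOWELS:
            chars[i] = prev
        # Assumption: skip a character that repeats the last key character.
        if chars[i] != key[-1]:
            key += chars[i]
        i += 1

    # Final step: trim the end of the key (never below one character).
    if len(key) > 1 and key.endswith("S"):
        key = key[:-1]
    if key.endswith("AY"):
        key = key[:-2] + "Y"
    if len(key) > 1 and key.endswith("A"):
        key = key[:-1]
    return key


print(nysiis("Knight"))     # NAGT
print(nysiis("Macdonald"))  # MCDANALD
print(nysiis("Brown") == nysiis("Browne"))  # True
```

Some descriptions of NYSIIS also truncate the key to a fixed length; the table above does not mention this, so the sketch returns the full key.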
R. C. Russell developed the Soundex algorithm to process
data collected from the 1890 census. Known as the Russell Soundex algorithm,
it has numerous variants that have been employed in genealogy studies and
retrieval systems.
| Soundex Code Algorithm |
| Use the first letter; drop A, E, I, O, U, H, W and Y |
| Generate three digit code according to the following table |
| 1 | B, F, P, V |
| 2 | C, G, J, K, Q, S, X, Z |
| 3 | D, T |
| 4 | L |
| 5 | M, N |
| 6 | R |
| Drop adjacent letters having the same code |
| If necessary pad with zeros |
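The Soundex steps above can be sketched in Python as follows. This is a simplified illustration that follows the order of the steps as listed: uncoded letters are dropped first, then adjacent repeats of the same code are collapsed. Other Soundex variants let a vowel separate repeated codes, so their output can differ for some names. The code table used here includes T (code 3) and Z (code 2), as in the standard Soundex table. The names soundex and CODES are chosen here for illustration.

```python
# Letter-to-digit table; T and Z are included as in the standard table.
CODES = {}
for digit, group in (("1", "BFPV"), ("2", "CGJKQSXZ"), ("3", "DT"),
                     ("4", "L"), ("5", "MN"), ("6", "R")):
    for letter in group:
        CODES[letter] = digit


def soundex(name: str) -> str:
    # Keep only the letters A-Z, upper-cased.
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""

    # Use the first letter as-is.
    first = letters[0]

    # Drop A, E, I, O, U, H, W and Y (the letters with no code)
    # and translate the rest to digits.
    digits = [CODES[ch] for ch in letters[1:] if ch in CODES]

    # Drop adjacent letters having the same code, including a letter
    # that repeats the code of the first letter.
    out = []
    prev = CODES.get(first)
    for d in digits:
        if d != prev:
            out.append(d)
        prev = d

    # Pad with zeros and keep one letter plus three digits.
    return (first + "".join(out) + "000")[:4]


print(soundex("Robert"))   # R163
print(soundex("Rupert"))   # R163
print(soundex("Pfister"))  # P236
```

Robert and Rupert share a code, which is the intended behaviour: names that sound alike collapse to the same key.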
General Information About Phonetics
Problems caused by phonetic skewed distribution
NameSearch® General Information