Reference: Brad & Kathy Genealogy (http://www.bradandkathy.com/cgi-bin/yasc.cgi)
Soundex, first applied to the US Census in 1880, is an alphanumeric index of a surname
based on the way the surname sounds rather than the way it is spelled.
The intent was to help researchers find a surname quickly even though it
may have been recorded under different spellings. To generate a Soundex
code for a name, follow these basic rules:
- Every Soundex code consists of a letter and three numbers, such as B-536 (which also happens to represent names like 'Bender').
- The letter is always the first letter of the surname, whether it is a vowel or a consonant. Except for the surname's first letter, all remaining vowels (A, E, I, O, and U) as well as the consonants W, Y, and H are disregarded in forming the Soundex code.
- The next three consonants of the surname are assigned values from the lookup table below (and note the exceptions).
- Any remaining consonants in the name are ignored (in other words, a maximum of three are used).
If fewer than three consonants follow the initial letter, zeroes
complete the three-digit code. A name composed only of vowels after
the first letter (such as 'Lee') yields no code numbers, and is thus
represented as L-000. A name with only one additional consonant,
such as COOK, would be C-200. Although most surnames can be coded using
the guide in Table 1, there are exceptions and special
considerations, all of which are outlined below.
Surnames that sound the same but are spelled differently
(like SMITH and SMYTH) will have the same code. And because vowels
are ignored in all but the first character, even SMYTHE will have that
same code. The same goes for BIRCH / BURCH. But the surnames BIRTCH
and BURTCH, which include a T among the first three consonants,
generate a different code.
Conversely, names that sound alike but begin with a different
first letter, such as COOK / KOCH or FAUST / PHAUST, receive
a different Soundex code for each spelling. A comprehensive
search therefore requires checking every Soundex code that a
variation in the surname's first letter could produce.
| Name | Soundex |
|---|---|
| BENDER | B536 |
| BIRCH | B620 |
| BURCH | B620 |
| BIRTCH | B632 |
| BURTCH | B632 |
| COOK | C200 |
| KOCH | K200 |
| FAUST | F230 |
| PHAUST | P230 |
| SMITH | S530 |
| SMYTH | S530 |
| SMYTHE | S530 |
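As a rough illustration of the first-letter problem, the sketch below groups the codes listed above by their three digits alone, ignoring the initial letter. The function name `group_by_digits` is made up for this example, and the data is copied from the list of names and codes above. Sound-alike names will not always share their digits, so this is a search aid under that assumption, not a rule of Soundex.

```python
from collections import defaultdict

# Codes copied from the list of names above.
codes = {
    "COOK": "C200", "KOCH": "K200",
    "FAUST": "F230", "PHAUST": "P230",
    "SMITH": "S530", "SMYTH": "S530", "SMYTHE": "S530",
}


def group_by_digits(name_codes):
    """Hypothetical helper: group names by the digits of their code only."""
    groups = defaultdict(list)
    for name, code in name_codes.items():
        groups[code[1:]].append(name)  # drop the leading letter
    return dict(groups)


groups = group_by_digits(codes)
print(groups["200"])  # ['COOK', 'KOCH']
print(groups["230"])  # ['FAUST', 'PHAUST']
```

Here COOK and KOCH land in the same group even though their full codes, C200 and K200, would be filed separately.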
TABLE 1: SOUNDEX CODING GUIDE
After retaining the first letter of the surname and
disregarding any following letters if they are A, E, I, O, U, W, Y,
or H:

| The number | Represents the letters |
|---|---|
| 1 | B, P, F, V |
| 2 | C, S, K, G, J, Q, X, Z |
| 3 | D, T |
| 4 | L |
| 5 | M, N |
| 6 | R |
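The coding guide can be written down as a small lookup. The sketch below applies only the basic rules (first letter, up to three coded consonants, zero padding) and deliberately ignores the exceptions described in the sections that follow, so it will give wrong answers for names such as Jackson or Pfister. The names `GUIDE`, `LETTER_TO_DIGIT`, and `basic_code` are invented for this illustration.

```python
# Table 1 as a lookup: each digit maps to the letters it represents.
GUIDE = {
    "1": "BPFV",
    "2": "CSKGJQXZ",
    "3": "DT",
    "4": "L",
    "5": "MN",
    "6": "R",
}
# Invert the guide so that each letter maps to its digit.
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in GUIDE.items() for letter in letters
}


def basic_code(surname):
    """Hypothetical helper: basic rules only, none of the exceptions.

    Assumes the surname contains at least one letter.
    """
    letters = [ch for ch in surname.upper() if ch.isalpha()]
    first, rest = letters[0], letters[1:]
    # Vowels and W, Y, H have no entry in the guide, so they drop out here.
    digits = [LETTER_TO_DIGIT[ch] for ch in rest if ch in LETTER_TO_DIGIT]
    # Keep at most three digits and pad with zeroes.
    return first + "-" + "".join(digits[:3]).ljust(3, "0")


print(basic_code("Bender"))  # B-536
print(basic_code("Lee"))     # L-000
print(basic_code("Cook"))    # C-200
```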
Prefixes
If the surname has a prefix, such as D', De, dela, Di, du, Le, van,
or Von, code it both with and without the prefix because it might be
listed under either code. The surname vanDevanter, for example,
could be V-531 or D-153. Mc and Mac are not considered to be
prefixes and should be coded like other surnames.
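A minimal sketch of the prefix advice: code the surname as written, and again with a leading prefix removed. The prefix list is taken from the paragraph above. The helper names `soundex` and `prefix_variants` are hypothetical, and the prefix test is a naive string match that would also strip "De" from a name like Dean, so treat the output as candidates to check rather than a definitive answer. The `soundex` function here is a compact coder that also applies the special rules described elsewhere on this page.

```python
CODES = {letter: digit
         for digit, letters in {"1": "BPFV", "2": "CSKGJQXZ", "3": "DT",
                                "4": "L", "5": "MN", "6": "R"}.items()
         for letter in letters}

# Prefixes named in the text; longer ones first so "dela" wins over "de".
# Mc and Mac are intentionally absent.
PREFIXES = ("dela", "d'", "de", "di", "du", "le", "van", "von")


def soundex(name):
    """Compact coder; assumes the name contains at least one letter."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    digits = []
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != previous:
            digits.append(code)
        previous = code  # a vowel resets this to ""
    return letters[0] + "-" + "".join(digits[:3]).ljust(3, "0")


def prefix_variants(surname):
    """Return the surname as written plus, if a prefix matches, without it."""
    variants = [surname]
    lowered = surname.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix) and len(surname) > len(prefix):
            variants.append(surname[len(prefix):])
            break
    return variants


print([soundex(v) for v in prefix_variants("vanDevanter")])  # ['V-531', 'D-153']
print([soundex(v) for v in prefix_variants("McGee")])        # ['M-200']
```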
Double Letters
If the surname has any double letters, they should be
treated as one letter. Thus, in the surname Lloyd, the second l
should be crossed out. In the surname Gutierrez, the second r should
be disregarded.
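The double-letter rule amounts to keeping the first letter of each run of identical letters. A small sketch, where `collapse_double_letters` is a name made up for this example:

```python
from itertools import groupby


def collapse_double_letters(surname):
    """Keep the first letter of each run of identical letters."""
    # groupby yields one group per run; comparing lowercased letters
    # makes "Ll" count as a double letter.
    return "".join(next(run) for _, run in groupby(surname, key=str.lower))


print(collapse_double_letters("Lloyd"))      # Loyd
print(collapse_double_letters("Gutierrez"))  # Gutierez
```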
Side-by-Side Letters
A surname may have different side-by-side letters that
receive the same number on the Soundex coding guide. For example,
the c, k, and s in Jackson all receive a number 2 code. These
letters with the same code should be treated as only one letter. In
the name Jackson, the k and s should be disregarded. This rule also
applies to the first letter of a surname, even though it is not
coded. For example, Pf in Pfister would receive a number 1 code for
both the P and f. Thus in this name the letter f should be crossed
out, and the code is P-236.
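To see which letters get crossed out under this rule, the sketch below drops any letter whose code matches the code of the letter immediately before it, including the uncoded first letter. The name `collapse_same_codes` is hypothetical. The sketch handles only directly adjacent letters and does not apply the H and W rule described later.

```python
CODES = {letter: digit
         for digit, letters in {"1": "BPFV", "2": "CSKGJQXZ", "3": "DT",
                                "4": "L", "5": "MN", "6": "R"}.items()
         for letter in letters}


def collapse_same_codes(surname):
    """Drop letters whose code repeats the code of the previous letter."""
    kept = []
    previous = None
    for ch in surname.upper():
        if not ch.isalpha():
            continue
        code = CODES.get(ch)  # None for vowels and W, Y, H
        if code is None or code != previous:
            kept.append(ch)
        previous = code
    return "".join(kept)


print(collapse_same_codes("Jackson"))  # JACON
print(collapse_same_codes("Pfister"))  # PISTER
```

With the k and s removed, Jackson is coded from J, c, and n; with the f removed, Pfister is coded from P, s, t, and r, which gives the P-236 stated above.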
American Indian and Asian Names
A phonetically spelled American Indian or Asian name was sometimes
coded as if it were one continuous name. If a distinguishable
surname was given, the name may have been coded in the normal
manner. For example, Dances with Wolves might have been coded as
Dances (D-522) or as Wolves (W-412). The name Shinka-Wa-Sa
may have been coded as Shinka (S-520) or Sa (S-000).
If Soundex cards do not yield expected results, researchers should
consider other surname spellings or variations on coding names.
Female Religious Figures
Nuns or other female religious figures with names such as Sister
Veronica may have been members of households or heads of households
or institutions where a child or children age 10 or under resided.
Because many of these religious figures do not use a surname, the
Soundexes for the post-1880 censuses frequently use the code S-236,
for Sister, whether or not a surname exists. So far as can be
determined, though, the Soundex for the 1880 census does not use the
code S-236 for this purpose.
Because of the limitations of the 1880 Soundex, the number of cards
mentioning a nun or comparable person is likely to be very small. If
this person was the head of a household or institution with
children, indexers may have coded the head's surname. If no surname
existed, the indexers may have used the Not Reported (NR) surname
option discussed later. In either case, if the household or
institution headed by a female religious figure included a child
under 10, the researcher also can code the child's surname and seek
an Individual Card. No Individual Card, though, applies to a nun or
any other person 10 years or older.
Single-Term Names
In 1880 many individuals, especially in Alaska or areas with many
Native Americans, may have used only a single-term name such as
Loksi or Hiawatha. Perhaps not until the 1900s did their descendants
use a surname. Some researchers, therefore, may need to code a
single-term name as though it was a surname. If this rule applies to
the head of a family and other family members have different names,
Individual Cards will also pertain to those members age 10 or
younger.
H and W Rule
The letters H and W do not act as separators between letters having
the same code value. As a result, such letters are treated as
adjacent and are condensed into a single code. For example, the
letter sequence "CHS" would be coded as 2, whereas without this
rule, it would be coded as 22. Note that this rule has often been
omitted in descriptions of Soundex.
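The effect of the rule can be sketched with a coder that has a switch for it. In the code below, `soundex` and its `hw_rule` parameter are names invented for this illustration. "Achs" is a made-up string containing the "CHS" sequence from the paragraph above, and Ashcraft is a commonly cited example that does not come from this page. With the switch off, H and W break up same-coded neighbours the way a vowel does.

```python
CODES = {letter: digit
         for digit, letters in {"1": "BPFV", "2": "CSKGJQXZ", "3": "DT",
                                "4": "L", "5": "MN", "6": "R"}.items()
         for letter in letters}


def soundex(name, hw_rule=True):
    """Sketch of a Soundex coder; assumes the name has at least one letter."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    digits = []
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if hw_rule and ch in "HW":
            continue  # skipped entirely, so neighbours stay adjacent
        code = CODES.get(ch, "")
        if code and code != previous:
            digits.append(code)
        previous = code  # vowels (and H, W when the rule is off) reset this
    return letters[0] + "-" + "".join(digits[:3]).ljust(3, "0")


print(soundex("Achs"))                     # A-200
print(soundex("Achs", hw_rule=False))      # A-220
print(soundex("Ashcraft"))                 # A-261
print(soundex("Ashcraft", hw_rule=False))  # A-226
```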
TABLE 2: EXAMPLES OF SOUNDEX CODING
After retaining the first letter of the surname and
disregarding the next letters if they are A, E, I, O, U, W, Y,
and H, then:

| Name | Coded | Soundex Code |
|---|---|---|
| Allricht | l, r, c | A-462 |
| Eberhard | b, r, r | E-166 |
| Engebrethson | n, g, b | E-521 |
| Heimbach | m, b, c | H-512 |
| Hanselmann | n, s, l | H-524 |
| Henzelmann | n, z, l | H-524 |
| Hildebrand | l, d, b | H-431 |
| Kavanagh | v, n, g | K-152 |
| Lind, Van | n, d | L-530 |
| Lukaschowsky | k, s, s | L-222 |
| McDonnell | c, d, n | M-235 |
| McGee | c | M-200 |
| O'Brien | b, r, n | O-165 |
| Opnian | p, n, n | O-155 |
| Oppenheimer | p, n, m | O-155 |
| Riedemanas | d, m, n | R-355 |
| Zita | t | Z-300 |
| Zitzmeinn | t, z, m | Z-325 |
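The "Coded" column can be reproduced with a short sketch that records which letters actually contribute a digit. The function name `coded_letters` is hypothetical. It applies the double-letter, side-by-side, and H and W rules from this page, and it assumes the input is a single surname with no prefix handling (so "Lind, Van" would need to be passed as "Lind").

```python
CODES = {letter: digit
         for digit, letters in {"1": "BPFV", "2": "CSKGJQXZ", "3": "DT",
                                "4": "L", "5": "MN", "6": "R"}.items()
         for letter in letters}


def coded_letters(name):
    """Return (letters that were coded, Soundex code) for one surname."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    coded = []
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W never separate same-coded letters
        code = CODES.get(ch, "")
        if code and code != previous:
            coded.append(ch.lower())
        previous = code
    coded = coded[:3]  # at most three letters are used
    digits = "".join(CODES[ch.upper()] for ch in coded).ljust(3, "0")
    return coded, letters[0] + "-" + digits


print(coded_letters("Allricht"))      # (['l', 'r', 'c'], 'A-462')
print(coded_letters("Eberhard"))      # (['b', 'r', 'r'], 'E-166')
print(coded_letters("Lukaschowsky"))  # (['k', 's', 's'], 'L-222')
print(coded_letters("O'Brien"))       # (['b', 'r', 'n'], 'O-165')
```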