Question:

Sound coding is an algorithm for indexing words by their pronunciation. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling. From an English word, you generate a letter and three digits that roughly describe how the word sounds. Similar-sounding words will have similar codes. The first character of the sound code is simply the first letter of the word. The remaining digits range from 1 to 6, indicating different categories of sounds created by the consonants following the first letter. If the word is too short to generate three digits, zeros are appended as needed. If the generated code is longer than three digits, the extra digits are discarded.

Code   Letters                    Description
1      B, F, P, V                 Labial
2      C, G, J, K, Q, S, X, Z     Gutturals and sibilants
3      D, T                       Dental
4      L                          Long liquid
5      M, N                       Nasal
6      R                          Short liquid
SKIP   A, E, H, I, O, U, W, Y     Vowels (and H, W, and Y) are skipped
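
One way to represent the code table in a program is a dictionary from letter to digit. The sketch below is only an illustration: the names GROUPS, CODES, and letter_code are assumptions and do not come from the question. Skipped letters simply have no entry, so the lookup returns None for them.

```python
# Hypothetical encoding of the code table: digit -> letters in that category.
GROUPS = {
    "1": "BFPV",      # labial
    "2": "CGJKQSXZ",  # gutturals and sibilants
    "3": "DT",        # dental
    "4": "L",         # long liquid
    "5": "MN",        # nasal
    "6": "R",         # short liquid
}

# Invert the groups into a per-letter lookup, e.g. CODES["B"] == "1".
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}


def letter_code(ch: str):
    """Return the sound digit for a letter, or None if the letter is skipped."""
    return CODES.get(ch.upper())


print(letter_code("k"))  # a letter in category 2
print(letter_code("W"))  # a skipped letter, so no digit
```

Keeping the table as data rather than a chain of if statements makes it easy to compare against the table above line by line.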

There are several special cases when calculating a sound code:

When letters with the same sound number are immediately next to each other, only the first is kept and the rest are discarded. So Pfizer becomes Pizer, Sack becomes Sac, Czar becomes Car, Collins becomes Colins, and Mroczak becomes Mrocak.

If two letters with the same sound number are separated by "H" or "W", only the first letter is used. So Ashcroft is treated as Ashroft.
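
The two special cases can be sketched as a single pass that remembers the sound number of the previous letter. The function name drop_repeats and the GROUPS/CODES tables are assumptions made for illustration. The sketch also assumes that a vowel or Y between two letters ends a run (consistent with Mroczak becoming Mrocak, where the K after the A is kept), while H and W do not.

```python
# Hypothetical encoding of the code table: digit -> letters in that category.
GROUPS = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}


def drop_repeats(word: str) -> str:
    """Return the word with the letters removed that the special cases discard."""
    if not word:
        return word
    kept = [word[0]]
    prev = CODES.get(word[0].upper())  # the first letter takes part in the rule
    for ch in word[1:]:
        upper = ch.upper()
        if upper in "HW":
            kept.append(ch)  # H and W stay in the word but do not end a run
            continue
        code = CODES.get(upper)
        if code is not None and code == prev:
            continue  # same sound number as the previous coded letter: discard
        kept.append(ch)
        prev = code  # vowels and Y set prev to None, which ends the run
    return "".join(kept)


for name in ["Pfizer", "Sack", "Czar", "Collins", "Mroczak", "Ashcroft"]:
    print(name, "->", drop_repeats(name))
```

Tracking only the previous sound number is enough here because a run of equal numbers is always contiguous once H and W are ignored.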

Sample Sound Codes

Word         Sound Code
Washington   W252
Wu           W000
DeSmet       D253
Gutierrez    G362
Pfister      P236
Jackson      J250
Tymczak      T522
Ashcraft     A261
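
Putting the rules together, a minimal sketch of the whole encoding might look like the following. It is one possible reading of the description, not a reference solution: the names sound_code, GROUPS, CODES, and SAMPLES are assumptions, as is the choice to ignore non-letter characters and to reject empty input. The loop at the end checks the sketch against the sample codes listed above.

```python
# Hypothetical encoding of the code table: digit -> letters in that category.
GROUPS = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}


def sound_code(word: str) -> str:
    """Return the first letter plus three digits describing how the word sounds."""
    letters = [c for c in word.upper() if c.isalpha()]  # assumption: ignore non-letters
    if not letters:
        raise ValueError("word must contain at least one letter")
    first = letters[0]
    digits = []
    prev = CODES.get(first)  # the first letter can absorb a following repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W are skipped and do not separate equal numbers
        code = CODES.get(ch)
        if code is not None and code != prev:
            digits.append(code)
        prev = code  # a vowel or Y sets prev to None and ends the run
    # Keep at most three digits and pad short codes with zeros.
    return first + "".join(digits)[:3].ljust(3, "0")


# Sample words and codes copied from the table above.
SAMPLES = {
    "Washington": "W252", "Wu": "W000", "DeSmet": "D253", "Gutierrez": "G362",
    "Pfister": "P236", "Jackson": "J250", "Tymczak": "T522", "Ashcraft": "A261",
}
for name, expected in SAMPLES.items():
    assert sound_code(name) == expected, (name, sound_code(name), expected)
```

Truncating before padding keeps the last line short: slicing caps long codes at three digits, and ljust fills in zeros for words such as Wu that produce none.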
