The Carnegie Mellon University Pronouncing Dictionary is an open-source machine-readable pronunciation dictionary for North American English that contains over 134,000 words and their pronunciations. CMUdict is being actively maintained and expanded. We are open to suggestions, corrections and other input.
The current phoneme set has 39 phonemes, not counting variations due to lexical stress. This phoneme (or more accurately, phone) set is based on the ARPAbet symbol set developed for speech recognition uses. You can find a description of the ARPAbet on Wikipedia, as well as information on how it relates to the standard IPA symbol set. If you check the stress box, you will get a pronunciation in which vowels are annotated for stress. Stress is difficult to get right, and people disagree about it. Some words in the language are distinguished only by stress (e.g. the noun PR'OGRESS versus the verb PROGR'ESS).
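As a rough illustration of how stress annotation separates such word pairs, the sketch below reads the stress pattern off a pronunciation string. It assumes the convention used in the distributed dictionary files, where each vowel symbol carries a trailing digit (0 for no stress, 1 for primary stress, 2 for secondary stress). The function name `stress_pattern` and the two pronunciation strings are illustrative assumptions, not output copied from the lookup form.

```python
def stress_pattern(pronunciation: str) -> str:
    """Return the stress digits of a pronunciation, one digit per vowel.

    Assumes vowels end in 0, 1, or 2 and consonants carry no digit.
    """
    symbols = pronunciation.split()
    return "".join(sym[-1] for sym in symbols if sym[-1].isdigit())


# Illustrative pronunciations (assumed for this example).
noun = "P R AA1 G R EH2 S"   # PR'OGRESS
verb = "P R AH0 G R EH1 S"   # PROGR'ESS

print(stress_pattern(noun))  # 12
print(stress_pattern(verb))  # 01
```

In this sketch the two words differ in where the primary stress digit falls, which is the distinction the stress option exposes.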
Phoneme Example Translation
------- ------- -----------
AA      odd     AA D
AE      at      AE T
AH      hut     HH AH T
AO      ought   AO T
AW      cow     K AW
AY      hide    HH AY D
B       be      B IY
CH      cheese  CH IY Z
D       dee     D IY
DH      thee    DH IY
EH      Ed      EH D
ER      hurt    HH ER T
EY      ate     EY T
F       fee     F IY
G       green   G R IY N
HH      he      HH IY
IH      it      IH T
IY      eat     IY T
JH      gee     JH IY
K       key     K IY
L       lee     L IY
M       me      M IY
N       knee    N IY
NG      ping    P IH NG
OW      oat     OW T
OY      toy     T OY
P       pee     P IY
R       read    R IY D
S       sea     S IY
SH      she     SH IY
T       tea     T IY
TH      theta   TH EY T AH
UH      hood    HH UH D
UW      two     T UW
V       vee     V IY
W       we      W IY
Y       yield   Y IY L D
Z       zee     Z IY
ZH      seizure S IY ZH ER
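The table above can also be used to check pronunciations mechanically. The following is a minimal sketch, assuming that stress is written as a trailing 0, 1, or 2 on vowel symbols. The names `PHONES`, `strip_stress`, and `unknown_symbols` are hypothetical helpers invented for this example and are not part of any CMUdict tool.

```python
# The 39 symbols listed in the phoneme table.
PHONES = frozenset(
    "AA AE AH AO AW AY B CH D DH EH ER EY F G HH IH IY JH K L M "
    "N NG OW OY P R S SH T TH UH UW V W Y Z ZH".split()
)


def strip_stress(symbol: str) -> str:
    """Remove a trailing stress digit (assumed to be 0, 1, or 2)."""
    return symbol.rstrip("012")


def unknown_symbols(pronunciation: str) -> list[str]:
    """List the symbols in a pronunciation that are not in the phoneme table."""
    return [s for s in pronunciation.split() if strip_stress(s) not in PHONES]


print(len(PHONES))                      # 39
print(unknown_symbols("S IY1 ZH ER0"))  # []
print(unknown_symbols("K AX T"))        # ['AX']
```

A check like this catches typos or symbols from other ARPAbet variants before a pronunciation is proposed as a correction.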
This CGI was created by Kevin Lenzo, and the source code is freely available.
For correspondence about this interface, including options you'd like to see, please email air -at- cs.cmu.edu