Speech perception

From Wikipedia, the free encyclopedia

Speech perception refers to the processes by which humans interpret and understand the sounds used in language. The study of speech perception is closely linked to phonetics and phonology in linguistics, and to cognitive psychology and perception in psychology.

Some of the earliest work on how humans perceive speech sounds was conducted by Alvin Liberman and his colleagues at Haskins Laboratories (Liberman et al., 1957). Using a speech synthesizer, they constructed speech sounds that varied in place of articulation along a continuum from /ba/ to /da/ to /ga/. Listeners were asked to identify which sound they heard and to discriminate between two different sounds. The results showed that listeners grouped the sounds into discrete categories, even though the sounds themselves varied continuously. Based on these results, the authors proposed categorical perception as a mechanism by which humans identify speech sounds.
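The pattern described above can be illustrated with a small sketch. It assumes a two-category continuum with a logistic identification function, and it uses the label-based prediction commonly associated with this line of work, in which discrimination accuracy depends only on how differently two stimuli are labelled. The function names, the boundary position, and the slope are illustrative assumptions, not values from the 1957 study.

```python
import math

def identification_prob(step, boundary=4.0, slope=3.0):
    """Probability of labelling a continuum step as the second category.

    A logistic curve is assumed; `boundary` and `slope` are illustrative.
    """
    return 1.0 / (1.0 + math.exp(-slope * (step - boundary)))

def predicted_discrimination(step_a, step_b, boundary=4.0, slope=3.0):
    """Predicted proportion correct if listeners discriminate only by label.

    0.5 is chance; the value rises as the two labelling probabilities diverge.
    """
    p_a = identification_prob(step_a, boundary, slope)
    p_b = identification_prob(step_b, boundary, slope)
    return 0.5 + 0.5 * (p_a - p_b) ** 2

# Two pairs with the same physical spacing (one step apart):
within_category = predicted_discrimination(1.0, 2.0)   # both on one side
across_boundary = predicted_discrimination(3.5, 4.5)   # straddles the boundary
```

Under these assumptions the within-category pair is predicted to be discriminated at close to chance, while the pair straddling the boundary is discriminated well above chance, even though both pairs are equally far apart on the continuum.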

More recent research using different tasks and methodologies suggests that listeners are highly sensitive to acoustic differences within a single phonetic category, contrary to a strict categorical account of speech perception.

The process of perceiving speech begins at the level of the sound signal and the process of audition. (For a complete description of the process of audition see Hearing.) After processing the initial auditory signal, speech sounds are further processed to extract acoustic cues and phonetic information.

The speech sound signal contains a number of acoustic cues that are used in speech perception. The cues differentiate speech sounds belonging to different phonetic categories. For example, one of the most studied cues in speech is voice onset time (VOT), the interval between the release of a stop consonant and the onset of voicing. VOT is a primary cue signaling the difference between voiced and voiceless stop consonants, such as "b" and "p". Other cues differentiate sounds that are produced at different places of articulation or manners of articulation. The speech system must also combine these cues to determine the category of a specific speech sound. This is often thought of in terms of abstract representations of phonemes. These representations can then be combined for use in word recognition and other language processes.
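As a rough illustration of cue combination, the sketch below treats each cue as evidence for one category and sums the weighted evidence before making a decision. It is a toy model, not a description of how the auditory system works. The second cue (fundamental frequency at vowel onset), the boundary values, and the weights are all assumptions chosen for illustration.

```python
import math

def prob_voiceless(vot_ms, f0_onset_hz,
                   vot_boundary=25.0, f0_boundary=120.0,
                   w_vot=0.3, w_f0=0.05):
    """Combine two cues into a probability that a stop is voiceless.

    All parameter values are hypothetical. VOT carries most of the weight,
    so the secondary cue matters mainly when VOT is ambiguous.
    """
    evidence = (w_vot * (vot_ms - vot_boundary)
                + w_f0 * (f0_onset_hz - f0_boundary))
    return 1.0 / (1.0 + math.exp(-evidence))

def classify_stop(vot_ms, f0_onset_hz):
    """Return "p" (voiceless) or "b" (voiced) from the combined evidence."""
    return "p" if prob_voiceless(vot_ms, f0_onset_hz) > 0.5 else "b"
```

With these assumed weights, a clearly short or long VOT decides the category by itself, whereas a VOT near the boundary is resolved by the secondary cue.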

The process of speech perception is not necessarily unidirectional. That is, higher-level language processes may interact with basic speech perception processes to aid in the recognition of speech sounds. For instance, Wong and Diehl (2003) found that Cantonese listeners judged Cantonese tones more accurately when they listened to a single speaker's voice than when they listened to a group of speakers.

One of the basic problems in the study of speech is how to deal with the noise in the speech signal. This is shown by the difficulty that computer speech recognition systems have with recognizing human speech. These programs can do well at recognizing speech when they have been trained on a specific speaker's voice, and under quiet conditions. However, these systems often do poorly in more realistic listening situations where humans are able to understand speech without difficulty.

One of the basic questions in speech perception is how infants learn speech sound categories. Different languages use different sets of speech sounds. For example, English distinguishes two voicing categories of sounds, whereas Hindi has three categories. Infants must learn which sounds their native language uses and which ones it does not. It remains unclear how they are able to do this. Some researchers have suggested that certain sound categories are innate, that is, genetically specified. Others have suggested that infants may be able to learn the sound categories of their native language through passive listening, using a process called statistical learning.
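One way to make the idea of statistical learning concrete is to treat category learning as finding clusters in the distribution of a cue. The sketch below fits a two-component Gaussian mixture to simulated VOT values using expectation-maximization. This is an illustrative model only, not a claim about what infants actually compute. The simulated category means (10 ms and 60 ms), the spread, and the function names are assumptions.

```python
import math
import random

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def fit_two_categories(values, iterations=50):
    """Fit a two-component 1-D Gaussian mixture by EM, with no labels given."""
    means = [min(values), max(values)]   # crude initial guesses
    sds = [10.0, 10.0]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: how strongly each value belongs to each category
        resp = []
        for x in values:
            likes = [weights[k] * gaussian_pdf(x, means[k], sds[k])
                     for k in range(2)]
            total = sum(likes)
            resp.append([like / total for like in likes])
        # M-step: re-estimate each category from its weighted members
        for k in range(2):
            n_k = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / n_k
            var = sum(r[k] * (x - means[k]) ** 2
                      for r, x in zip(resp, values)) / n_k
            sds[k] = max(math.sqrt(var), 1.0)  # floor avoids collapse
            weights[k] = n_k / len(values)
    return means, sds, weights

# Simulated "heard" VOT values in ms from two hypothetical categories
rng = random.Random(0)
heard = ([rng.gauss(10.0, 8.0) for _ in range(200)]
         + [rng.gauss(60.0, 8.0) for _ in range(200)])
learned_means, learned_sds, learned_weights = fit_two_categories(heard)
```

Because the learner sees only unlabelled values, any categories it recovers come from the shape of the input distribution alone, which is the core intuition behind distributional accounts of category learning.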

Studies of infant speech perception have shown that, in general, infants are able to distinguish more categories of speech sounds than adults. Newborns are able to distinguish between many of the sounds of human languages, but by about 12 months of age, they can distinguish only those sounds used in their native language.

Liberman, A. M., Harris, K. S., Hoffman, H. S. & Griffith, B. C. (1957). The discrimination of speech sounds within and across phoneme boundaries. Journal of Experimental Psychology 54: 358-368.

Wong, C. M. & Diehl, R. L. (2003). Perceptual normalization for inter- and intratalker variation in Cantonese level tones. Journal of Speech, Language and Hearing Research 46: 413-421.
