Concept: Haskins Laboratories


Historic theories of speech perception (Motor Theory and Analysis by Synthesis) invoked listeners' knowledge of speech production to explain speech perception. Neuroimaging data show that adult listeners activate motor brain areas during speech perception. In two experiments using magnetoencephalography (MEG), we investigated motor brain activation, as well as auditory brain activation, during discrimination of native and nonnative syllables in infants at two ages that straddle the developmental transition from language-universal to language-specific speech perception. Adults were also tested in Exp. 1. MEG data revealed that 7-mo-old infants activate auditory (superior temporal) as well as motor brain areas (Broca’s area, cerebellum) in response to speech, and equivalently for native and nonnative syllables. However, in 11- and 12-mo-old infants, native speech activates auditory brain areas to a greater degree than nonnative speech, whereas nonnative speech activates motor brain areas to a greater degree than native speech. This double dissociation in 11- to 12-mo-old infants matches the pattern of results obtained in adult listeners. Our infant data are consistent with Analysis by Synthesis: auditory analysis of speech is coupled with synthesis of the motor plans necessary to produce the speech signal. The findings have implications for: (i) perception-action theories of speech perception, (ii) the impact of “motherese” on early language learning, and (iii) the “social-gating” hypothesis and humans' development of social understanding.

Concepts: Scientific method, Brain, Cognition, Language, Developmental psychology, Theory, Haskins Laboratories, Broca's area


This study aimed to clarify how the pure-tone threshold (PTT) on the pure-tone audiogram (PTA) predicts speech perception (SP) in elderly Japanese persons.

Concepts: Cognition, Speech recognition, Haskins Laboratories, Alvin Liberman, Direct realism


Voice imitation consists of estimating a synthesizer's input parameters so that its output mimics a target speech signal. This is a difficult inverse problem because the mapping is time-varying, nonlinear, and many-to-one. Done manually, it typically requires a considerable amount of time. This work presents a system based on a genetic algorithm (GA) to automatically estimate the input parameters of the Klatt and HLSyn formant synthesizers using an analysis-by-synthesis process. Results are presented for natural (human-generated) speech from three male speakers. The results obtained with the GA-based system outperform those obtained with the baseline, Winsnoori, with respect to four objective figures of merit and a subjective test. The GA with the Klatt synthesizer generated voices similar to the target, and the subjective tests indicate an improvement in the quality of the synthetic voices compared with those produced by the baseline.

Concepts: Human voice, Genetic algorithm, Haskins Laboratories, Vocoder, Speech synthesis
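
The analysis-by-synthesis loop described in the genetic-algorithm abstract above can be illustrated with a minimal sketch. It is not the authors' system: `toy_synth` is a hypothetical two-formant stand-in for the Klatt and HLSyn synthesizers, and the error measure, search bounds, and GA settings (tournament selection, blend crossover, Gaussian mutation, elitism) are assumptions chosen only to show the idea of searching for synthesizer parameters whose output matches a target.

```python
import random

FREQS = [100.0 * k for k in range(1, 41)]    # analysis grid, 100 Hz to 4000 Hz
BOUNDS = [(200.0, 1000.0), (800.0, 3000.0)]  # assumed search ranges for F1, F2 (Hz)


def toy_synth(params):
    """Hypothetical stand-in synthesizer: one resonance peak per formant."""
    bandwidth = 100.0
    return [
        sum(1.0 / (1.0 + ((f - fc) / bandwidth) ** 2) for fc in params)
        for f in FREQS
    ]


def spectral_error(params, target):
    """Mean squared difference between the synthesized and target spectra."""
    synth = toy_synth(params)
    return sum((s - t) ** 2 for s, t in zip(synth, target)) / len(FREQS)


def run_ga(target, pop_size=40, generations=60, seed=0):
    """Search for parameters whose synthesized spectrum matches the target."""
    rng = random.Random(seed)

    def error(individual):
        return spectral_error(individual, target)

    population = [[rng.uniform(lo, hi) for lo, hi in BOUNDS]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=error)
        history.append(error(population[0]))
        next_gen = [list(population[0])]  # elitism: keep the best individual
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random candidates per parent.
            mother = min(rng.sample(population, 3), key=error)
            father = min(rng.sample(population, 3), key=error)
            child = []
            for m, f, (lo, hi) in zip(mother, father, BOUNDS):
                w = rng.random()
                gene = w * m + (1.0 - w) * f        # blend crossover
                if rng.random() < 0.2:              # occasional mutation
                    gene += rng.gauss(0.0, 0.05 * (hi - lo))
                child.append(max(lo, min(hi, gene)))
            next_gen.append(child)
        population = next_gen
    best = min(population, key=error)
    history.append(error(best))
    return best, history


# Synthesize a target from known parameters, then try to recover them.
target_spectrum = toy_synth([500.0, 1500.0])
best_params, error_history = run_ga(target_spectrum)
print("estimated formants (Hz):", best_params)
print("error, first vs. last generation:", error_history[0], error_history[-1])
```

Because the best individual is always carried over, the recorded error never increases from one generation to the next. A real formant synthesizer has many more parameters that vary over time, which is what makes the inverse problem in the abstract hard.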


Atypical language lateralization has been identified as one of the factors that may contribute to the development of dyslexia. Indeed, atypical lateralization of linguistic functions such as speech processing in dyslexia has been demonstrated in neuroimaging studies, and also with the behavioral dichotic listening (DL) method. So far, however, DL results have been mixed. The current study assesses lateralization of speech processing by using DL in a sample of children at familial risk (FR) for dyslexia. To determine whether atypical lateralization of speech processing relates to reading ability or is a correlate of being at familial risk, the study compares the laterality index of FR children who did and did not become dyslexic with that of a control group of readers without dyslexia. DL was tested in 3rd grade and in 5th/6th grade. Results indicate that at both time points, all three groups have a right ear advantage, indicative of more pronounced left-hemispheric processing. However, the FR-dyslexic children are poorer at reporting from the left ear than controls and FR-nondyslexic children. This impediment relates to reading fluency.

Concepts: Linguistics, Language, Cultural studies, Speech recognition, Dyslexia, Reading, Haskins Laboratories, Cluttering
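
The dyslexia abstract above compares a laterality index across groups but does not state how the index is computed. As an assumption, the sketch below uses a definition that is common in dichotic listening work, 100 × (R − L) / (R + L), where R and L are the numbers of correct right-ear and left-ear reports; the function name and the example counts are hypothetical.

```python
def laterality_index(right_correct, left_correct):
    """Assumed laterality index: 100 * (R - L) / (R + L).

    Positive values indicate a right ear advantage, negative values a
    left ear advantage, and zero indicates no ear advantage.
    """
    total = right_correct + left_correct
    if total == 0:
        raise ValueError("at least one correct report is required")
    return 100.0 * (right_correct - left_correct) / total


# Hypothetical counts: 30 correct right-ear and 20 correct left-ear reports.
print(laterality_index(30, 20))
```

Under this definition, a child who reports fewer left-ear items while right-ear reports stay constant receives a larger positive index, so the index alone does not show which ear drives a group difference.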


The ability to use cognitive-control functions to regulate speech perception is thought to be crucial in mastering developmental challenges, such as language acquisition during childhood or compensation for sensory decline in older age, enabling interpersonal communication and meaningful social interactions throughout the entire life span. Although previous studies indicate that cognitive control of speech perception is subject to developmental changes, its exact developmental trajectory has not been described. Thus, examining a sample of 2,988 participants (1,119 women) with an age range from 5 to 89 years, the present cross-sectional study aimed to examine the development of cognitive control of speech perception across the life span using age as a continuous predictor. Based on data collected with the forced-attention consonant-vowel dichotic listening paradigm, the data analysis revealed an inverted U-shaped association between age and performance level: a steep increase in performance level was seen throughout childhood and adolescence, reaching the highest performance in the early 20s, and was followed by a monotonic, continuous decline into late adulthood. Thus, cognitive control of speech perception shows a life span developmental trajectory similar to those observed for cognitive-control functions in other domains, for example, as assessed in the visual domain.

Concepts: Psychology, Cognition, Sense, Mind, Developmental psychology, Trajectory, Dichotic listening, Haskins Laboratories


We revisit an article, “Perception of the Speech Code” (PSC), published in this journal 50 years ago (Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967) and address one of its legacies concerning the status of phonetic segments, which persists in theories of speech today. In the perspective of PSC, segments both exist (in language as known) and do not exist (in articulation or the acoustic speech signal). Findings interpreted as showing that speech is not a sound alphabet, but, rather, phonemes are encoded in the signal, coupled with findings that listeners perceive articulation, led to the motor theory of speech perception, a highly controversial legacy of PSC. However, a second legacy, the paradoxical perspective on segments, has been mostly unquestioned. We remove the paradox by offering an alternative supported by converging evidence that segments exist in language both as known and as used. We support the existence of segments in both language knowledge and production by showing that phonetic segments are articulatory and dynamic and that coarticulation does not eliminate them. We show that segments leave an acoustic signature that listeners can track. This suggests that speech is well-adapted to public communication in facilitating, not creating a barrier to, exchange of language forms.

Concepts: Cognition, Phonology, Linguistics, Language, International Phonetic Alphabet, Phonetics, Knowledge, Haskins Laboratories


The motor theory of speech perception has experienced a recent revival due to a number of studies implicating the motor system during speech perception. In a key study, Pulvermüller et al. (2006) showed that premotor/motor cortex differentially responds to the passive auditory perception of lip and tongue speech sounds. However, no study has yet attempted to replicate this important finding from nearly a decade ago. The objective of the current study was to replicate the principal finding of Pulvermüller et al. (2006) and generalize it to a larger set of speech tokens while applying a more powerful statistical approach using multivariate pattern analysis (MVPA). Participants performed an articulatory localizer as well as a speech perception task where they passively listened to a set of eight syllables while undergoing fMRI. Both univariate and multivariate analyses failed to find evidence for somatotopic coding in motor or premotor cortex during speech perception. Positive evidence for the null hypothesis was further confirmed by Bayesian analyses. Results consistently show that while the lip and tongue areas of the motor cortex are sensitive to movements of the articulators, they do not appear to preferentially respond to labial and alveolar speech sounds during passive speech perception.

Concepts: Scientific method, Brain, Type I and type II errors, Multivariate statistics, Premotor cortex, Hearing, Null hypothesis, Haskins Laboratories


A machine that can read printed material to the blind became a priority at the end of World War II with the appointment of a U.S. Government committee to instigate research on sensory aids to improve the lot of blinded veterans. The committee chose Haskins Laboratories to lead a multisite research program. Initially, Haskins researchers overestimated the capacities of users to learn an acoustic code based on the letters of a text, resulting in unsuitable designs. Progress was slow because the researchers clung to a mistaken view that speech is a sound alphabet and because of persisting gaps in man-machine technology. The tortuous route to a practical reading machine transformed the scientific understanding of speech perception and reading at Haskins Labs and elsewhere, leading to novel lines of basic research and new technologies. Research at Haskins Laboratories made valuable contributions in clarifying the physical basis of speech. Researchers recognized that coarticulatory overlap eliminated the possibility of alphabet-like discrete acoustic segments in speech. This work advanced the study of speech perception and contributed to our understanding of the relation of speech perception to production. Basic findings on speech enabled the development of speech synthesis, part science and part technology, essential for development of a reading machine, which has found many applications. Findings on the nature of speech further stimulated a new understanding of word recognition in reading across languages and scripts and contributed to our understanding of reading development and reading disabilities. (PsycINFO Database Record © 2014 APA, all rights reserved).

Concepts: Research, Speech recognition, Haskins Laboratories, Speech synthesis, Pattern playback, Alvin Liberman, Reading machine, Philip Rubin


Text-to-speech options on augmentative and alternative communication (AAC) devices are limited. Often, several individuals in a group setting use the same synthetic voice. This lack of customization may limit technology adoption and social integration. This paper describes our efforts to generate personalized synthesis for users with profoundly limited speech motor control. Existing voice banking and voice conversion techniques rely on recordings of clearly articulated speech from the target talker, which cannot be obtained from this population. Our VocaliD approach extracts prosodic properties from the target talker’s source function and applies these features to a surrogate talker’s database, generating a synthetic voice with the vocal identity of the target talker and the clarity of the surrogate talker. Promising intelligibility results suggest areas of further development for improved personalization.

Concepts: Haskins Laboratories, Vocoder, Speech synthesis


According to the Motor Theory of speech perception, the interaction between the auditory and motor systems plays an essential role in speech perception. Since the Motor Theory was proposed, it has received remarkable attention in the field. However, each of the three hypotheses of the theory still needs further verification. In this review, we focus on how auditory-motor anatomical and functional associations play a role in speech perception, and we discuss why previous studies could not reach an agreement, particularly on whether motor system involvement in speech perception is task-load dependent. Finally, we suggest that the auditory-motor link is particularly useful for speech perception under adverse listening conditions and that a further revised Motor Theory is a potential solution to the “cocktail-party” problem.

Concepts: Motor control, Sense, Anatomy, Falsifiability, Motor system, Haskins Laboratories, Extrapyramidal system, Alvin Liberman