Extensive research shows that inter-talker variability (i.e., changing the talker) affects recognition memory for speech signals. However, relatively little is known about the consequences of intra-talker variability (i.e., changes in speaking style within a talker) on the encoding of speech signals in memory. It is well established that speakers can modulate the characteristics of their own speech and produce a listener-oriented, intelligibility-enhancing speaking style in response to communication demands (e.g., when speaking to listeners with hearing impairment or non-native speakers of the language). Here we conducted two experiments to examine the role of speaking style variation in spoken language processing. First, we examined the extent to which clear speech provided benefits in challenging listening environments (i.e., speech in noise). Second, we compared recognition memory for sentences produced in conversational and clear speaking styles. In both experiments, semantically normal and anomalous sentences were included to investigate the role of higher-level linguistic information in the processing of speaking style variability. The results show that acoustic-phonetic modifications implemented in listener-oriented speech lead to improved speech recognition in challenging listening conditions and, crucially, to a substantial enhancement in recognition memory for sentences.
We analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages. For all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word frequency distributions, with only the more common words obeying the classic Zipf law. Using corpora of unprecedented size, we test the allometric scaling relation between the corpus size and the vocabulary size of growing languages to demonstrate a decreasing marginal need for new words, a feature that is likely related to the underlying correlations between words. We calculate the annual growth fluctuations of word use, which show a decreasing trend as the corpus size increases, indicating a slowdown in linguistic evolution following language expansion. This “cooling pattern” forms the basis of a third statistical regularity, which, unlike the Zipf and Heaps laws, is dynamical in nature.
- Proceedings of the National Academy of Sciences of the United States of America
- Published about 3 years ago
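The two regularities named in the abstract above can be illustrated on a toy corpus: the Zipf law (the frequency of the r-th most common word falls roughly as a power of its rank) and the Heaps law (vocabulary size grows sublinearly with corpus size, i.e., a decreasing marginal need for new words). The following is a minimal sketch; the toy sentence and variable names are illustrative assumptions, not the authors' data or code.

```python
from collections import Counter

# Toy corpus; the study itself used millions of books in seven languages.
corpus = ("the quick brown fox jumps over the lazy dog "
          "the dog barks and the fox runs away the end").split()

# Zipf-style rank-frequency table: sort words by descending frequency.
counts = Counter(corpus)
for rank, (word, freq) in enumerate(counts.most_common(5), start=1):
    print(rank, word, freq)

# Heaps-style vocabulary growth: track distinct words seen as the
# corpus is read; vocabulary grows ever more slowly than corpus size.
seen, growth = set(), []
for i, word in enumerate(corpus, start=1):
    seen.add(word)
    growth.append((i, len(seen)))
print(growth[-1])  # (corpus size, vocabulary size)
```

On real corpora one would fit power laws to these two curves; the abstract's point is that only the high-frequency regime of the rank-frequency curve follows the classic Zipf exponent, and that the vocabulary-growth exponent reflects correlations between words.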
It is widely assumed that one of the fundamental properties of spoken language is the arbitrary relation between sound and meaning. Some exceptions in the form of nonarbitrary associations have been documented in linguistics, cognitive science, and anthropology, but these studies only involved small subsets of the 6,000+ languages spoken in the world today. By analyzing word lists covering nearly two-thirds of the world’s languages, we demonstrate that a considerable proportion of 100 basic vocabulary items carry strong associations with specific kinds of human speech sounds, occurring persistently across continents and linguistic lineages (linguistic families or isolates). Prominently among these relations, we find property words (“small” and i, “full” and p or b) and body part terms (“tongue” and l, “nose” and n). The areal and historical distribution of these associations suggests that they often emerge independently rather than being inherited or borrowed. Our results therefore have important implications for the language sciences, given that nonarbitrary associations have been proposed to play a critical role in the emergence of cross-modal mappings, the acquisition of language, and the evolution of our species' unique communication system.
Neil Armstrong insisted that his quote upon landing on the moon was misheard, and that he had said “one small step for a man,” not “one small step for man.” What he said is unclear in part because function words like “a” can be reduced and spectrally indistinguishable from the preceding context. Therefore, their presence can be ambiguous, and they may disappear perceptually depending on the rate of surrounding speech. Two experiments are presented examining production and perception of reduced tokens of “for” and “for a” in spontaneous speech. Experiment 1 investigates the distributions of several acoustic features of “for” and “for a.” The results suggest that the distributions of “for” and “for a” overlap substantially, both in terms of temporal and spectral characteristics. Experiment 2 examines perception of these same tokens when the context speaking rate differs. The perceptibility of the function word “a” varies as a function of this context speaking rate. These results demonstrate that substantial ambiguity exists in the original quote from Armstrong, and that this ambiguity may be understood through context speaking rate.
The seemingly limitless diversity of proteins in nature arose from only a few thousand domain prototypes, but the origin of these themselves has remained unclear. We are pursuing the hypothesis that they arose by fusion and accretion from an ancestral set of peptides active as co-factors in RNA-dependent replication and catalysis. Should this be true, contemporary domains may still contain vestiges of such peptides, which could be reconstructed by a comparative approach in the same way in which ancient vocabularies have been reconstructed by the comparative study of modern languages. To test this, we compared domains representative of known folds and identified 40 fragments whose similarity is indicative of common descent, yet which occur in domains currently not thought to be homologous. These fragments are widespread in the most ancient folds and enriched for iron-sulfur- and nucleic acid-binding. We propose that they represent the observable remnants of a primordial RNA-peptide world.
During much of the past century, it was widely believed that phonemes (the human speech sounds that constitute words) have no inherent semantic meaning, and that the relationship between a combination of phonemes (a word) and its referent is simply arbitrary. Although recent work has challenged this picture by revealing psychological associations between certain phonemes and particular semantic contents, the precise mechanisms underlying these associations have not been fully elucidated. Here we provide novel evidence that certain phonemes have an inherent, non-arbitrary emotional quality. Moreover, we show that the perceived emotional valence of certain phoneme combinations depends on a specific acoustic feature, namely, the dynamic shift within the phonemes' first two frequency components. These data suggest a phoneme-relevant acoustic property influencing the communication of emotion in humans, and provide further evidence against previously held assumptions regarding the structure of human language. This finding has potential applications for a variety of social, educational, clinical, and marketing contexts.
Mounting physiological and behavioral evidence has shown that the detectability of a visual stimulus can be enhanced by a simultaneously presented sound. The mechanisms underlying these cross-sensory effects, however, remain largely unknown. Using continuous flash suppression (CFS), we rendered a complex, dynamic visual stimulus (i.e., a talking face) consciously invisible to participants. We presented the visual stimulus together with a suprathreshold auditory stimulus (i.e., a voice speaking a sentence) that either matched or mismatched the lip movements of the talking face. We compared how long it took for the talking face to overcome interocular suppression and become visible to participants in the matched and mismatched conditions. Our results showed that the detection of the face was facilitated by the presentation of a matching auditory sentence, in comparison with the presentation of a mismatching sentence. This finding indicates that the registration of audiovisual correspondences occurs at an early stage of processing, even when the visual information is blocked from conscious awareness.
For sixty-seven children with ASD (age 1;6 to 5;11), mean Total Vocabulary score on the Language Development Survey (LDS) was 65.3 words; twenty-two children had no reported words; and twenty-one children had 1-49 words. When matched for vocabulary size, children with ASD and children in the LDS normative sample did not differ in semantic category or word-class scores. Q correlations were large when percentage use scores for the ASD sample were compared with those for samples of typically developing children as well as children with vocabularies of fewer than 50 words. The fifty-seven words with the highest percentage use scores for the ASD children were primarily nouns, represented a variety of semantic categories, and overlapped substantially with the words having the highest percentage use scores in samples of typically developing children as well as children with lexicons of fewer than 50 words. Results indicated that the children with ASD were acquiring essentially the same words as typically developing children, suggesting delayed but not deviant lexical composition.
- Journal of experimental psychology. Learning, memory, and cognition
- Published almost 6 years ago
It is almost a truism that language aids serial-order control through self-cuing of upcoming sequential elements. We measured speech onset latencies as subjects performed hierarchically organized task sequences while “thinking aloud” each task label. Surprisingly, speech onset latencies and response times (RTs) were highly synchronized, a pattern that is not consistent with the hypothesis that speaking aids proactive retrieval of upcoming sequential elements during serial-order control. We also found that when instructed to do so, subjects were able to speak task labels prior to presentation of response-relevant stimuli and that this substantially reduced RT signatures of retrieval, though at the cost of more sequencing errors. Thus, while proactive retrieval is possible in principle, in natural situations it seems to be prevented through a strong “gestalt-like” tendency to synchronize speech and action. We suggest that this tendency may support context updating rather than proactive control.
Everyday experience tells us that it is often possible to identify a familiar speaker solely by his/her voice. Such observations reveal that speakers carry individual features in their voices. The present study examines how suprasegmental temporal features contribute to speaker-individuality. Based on data of a homogeneous group of Zurich German speakers, we conducted an experiment that included speaking style variability (spontaneous vs. read speech) and channel variability (high-quality vs. mobile phone-transmitted speech), both of which are characteristic of forensic casework. Speakers demonstrated high between-speaker variability in both read and spontaneous speech, and low within-speaker variability across the two speaking styles. Results further revealed that distortions of the type introduced by mobile telephony had little effect on suprasegmental temporal characteristics. Given this evidence of speaker-individuality, we discuss suprasegmental temporal features' potential for forensic voice comparison.