How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex.
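The idea of a linear reconstruction model can be sketched as a regularized least-squares map from population activity back to spectrogram channels. This is a toy illustration on simulated data, not the authors' model: the ridge penalty, the array sizes, the absence of time lags, and all function names are assumptions made for the example.

```python
import numpy as np

def fit_linear_decoder(responses, spectrogram, ridge=1.0):
    """Ridge-regression weights mapping neural responses to spectrogram channels.

    responses:   (n_times, n_electrodes) array of population activity
    spectrogram: (n_times, n_freq) array of target spectrogram values
    Returns an (n_electrodes, n_freq) weight matrix.
    """
    gram = responses.T @ responses + ridge * np.eye(responses.shape[1])
    return np.linalg.solve(gram, responses.T @ spectrogram)

def reconstruct(responses, weights):
    """Apply the fitted linear map to new responses."""
    return responses @ weights

# Simulated data (hypothetical sizes): each electrode is a noisy linear
# mixture of the spectrogram channels.
rng = np.random.default_rng(0)
n_times, n_elec, n_freq = 500, 8, 4
true_spec = rng.standard_normal((n_times, n_freq))
mixing = rng.standard_normal((n_freq, n_elec))
responses = true_spec @ mixing + 0.1 * rng.standard_normal((n_times, n_elec))

# Fit on the first 400 time points, evaluate on the held-out 100.
weights = fit_linear_decoder(responses[:400], true_spec[:400], ridge=1e-2)
pred = reconstruct(responses[400:], weights)

# Reconstruction accuracy as the per-channel correlation between the
# reconstructed and the actual spectrogram.
corr = [np.corrcoef(pred[:, f], true_spec[400:, f])[0, 1] for f in range(n_freq)]
print([round(float(c), 3) for c in corr])
```

A decoder of this kind can only recover features that are linearly related to the neural response, which is consistent with the abstract's observation that fast temporal fluctuations needed a nonlinear sound representation.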
Crocodilians are among the most vocal non-avian reptiles. Adults of both sexes produce loud vocalizations known as ‘bellows’ year round, with the highest rate during the mating season. Although the specific function of these vocalizations remains unclear, they may advertise the caller’s body size, because relative size differences strongly affect courtship and territorial behaviour in crocodilians. In mammals and birds, a common mechanism for producing honest acoustic signals of body size is via formant frequencies (vocal tract resonances). To our knowledge, formants have to date never been documented in any non-avian reptile, and formants do not seem to play a role in the vocalizations of anurans. We tested for formants in crocodilian vocalizations by using playbacks to induce a female Chinese alligator (Alligator sinensis) to bellow in an airtight chamber. During vocalizations, the animal inhaled either normal air or a helium/oxygen mixture (heliox) in which the velocity of sound is increased. Although heliox allows normal respiration, it alters the formant distribution of the sound spectrum. An acoustic analysis of the calls showed that the source signal components remained constant under both conditions, but an upward shift of high-energy frequency bands was observed in heliox. We conclude that these frequency bands represent formants. We suggest that crocodilian vocalizations could thus provide an acoustic indication of body size via formants. Because birds and crocodilians share a common ancestor with all dinosaurs, a better understanding of their vocal production systems may also provide insight into the communication of extinct Archosaurians.
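The logic of the heliox test can be illustrated with the textbook resonances of a uniform tube closed at one end, F_n = (2n - 1)c / (4L): the resonances scale with the speed of sound c, while the vibration rate of the source does not depend on c. This is a rough sketch, not a model of the alligator's vocal tract; the tube length and both sound speeds below are illustrative assumptions, not values from the study.

```python
def tube_formants(length_m, sound_speed_m_s, n_formants=3):
    """Resonances (Hz) of an idealized uniform tube closed at one end."""
    return [
        (2 * k - 1) * sound_speed_m_s / (4.0 * length_m)
        for k in range(1, n_formants + 1)
    ]

# Illustrative assumptions (not from the study):
TUBE_LENGTH_M = 0.5    # hypothetical vocal tract length
C_AIR_M_S = 340.0      # approximate speed of sound in air
C_HELIOX_M_S = 578.0   # assumed speed of sound in a helium/oxygen mix

air = tube_formants(TUBE_LENGTH_M, C_AIR_M_S)
heliox = tube_formants(TUBE_LENGTH_M, C_HELIOX_M_S)

# Every resonance shifts upward by the same factor, c_heliox / c_air.
ratios = [h / a for h, a in zip(heliox, air)]
print(air)
print(heliox)
print(ratios)
```

Frequency bands that move upward by a common factor when the gas changes behave like resonances; components that stay put behave like source components, which matches the pattern the abstract reports.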
Recent neuroscience research suggests that tinnitus may reflect synaptic loss in the cochlea that does not express in the audiogram but leads to neural changes in auditory pathways that reduce sound level tolerance (SLT). Adolescents (N = 170) completed a questionnaire addressing their prior experience with tinnitus, potentially risky listening habits, and sensitivity to ordinary sounds, followed by psychoacoustic measurements in a sound booth. Among all adolescents 54.7% reported by questionnaire that they had previously experienced tinnitus, while 28.8% heard tinnitus in the booth. Psychoacoustic properties of tinnitus measured in the sound booth corresponded with those of chronic adult tinnitus sufferers. Neither hearing thresholds (≤15 dB HL to 16 kHz) nor otoacoustic emissions discriminated between adolescents reporting or not reporting tinnitus in the sound booth, but loudness discomfort levels (a psychoacoustic measure of SLT) did so, averaging 11.3 dB lower in adolescents experiencing tinnitus in the acoustic chamber. Although risky listening habits were near universal, the teenagers experiencing tinnitus and reduced SLT tended to be more protective of their hearing. Tinnitus and reduced SLT could be early indications of a vulnerability to hidden synaptic injury that is prevalent among adolescents and expressed following exposure to high level environmental sounds.
- Proceedings of the National Academy of Sciences of the United States of America
- Published about 5 years ago
The perception of the pitch of harmonic complex sounds is a crucial function of human audition, especially in music and speech processing. Whether the underlying mechanisms of pitch perception are unique to humans, however, is unknown. Based on estimates of frequency resolution at the level of the auditory periphery, psychoacoustic studies in humans have revealed several primary features of central pitch mechanisms. It has been shown that (i) pitch strength of a harmonic tone is dominated by resolved harmonics; (ii) pitch of resolved harmonics is sensitive to the quality of spectral harmonicity; and (iii) pitch of unresolved harmonics is sensitive to the salience of temporal envelope cues. Here we show, for a standard musical tuning fundamental frequency of 440 Hz, that the common marmoset (Callithrix jacchus), a New World monkey with a hearing range similar to that of humans, exhibits all of the primary features of central pitch mechanisms demonstrated in humans. Thus, marmosets and humans may share similar pitch perception mechanisms, suggesting that these mechanisms may have emerged early in primate evolution.
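For readers unfamiliar with harmonic complex tones, the following sketch synthesizes one and checks that it repeats at the period of its fundamental even when the fundamental itself is absent. Only the 440 Hz fundamental comes from the abstract; the sample rate, the choice of harmonics 3 to 5, and the function name are assumptions for illustration, and nothing here says which harmonics are resolved by any particular ear.

```python
import math

def harmonic_complex(f0_hz, harmonics, sample_rate_hz, n_samples):
    """Sum of equal-amplitude sine harmonics of f0, sampled at sample_rate_hz."""
    tone = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        tone.append(sum(math.sin(2.0 * math.pi * h * f0_hz * t) for h in harmonics))
    return tone

SAMPLE_RATE_HZ = 44000   # assumed, chosen so that one period is a whole number of samples
F0_HZ = 440.0            # fundamental frequency mentioned in the abstract

# Harmonics 3, 4 and 5 only: the 440 Hz component itself is missing.
tone = harmonic_complex(F0_HZ, [3, 4, 5], SAMPLE_RATE_HZ, n_samples=400)

# The waveform still repeats every 1 / f0 seconds.
period_samples = round(SAMPLE_RATE_HZ / F0_HZ)
max_mismatch = max(
    abs(tone[n] - tone[n + period_samples])
    for n in range(len(tone) - period_samples)
)
print(period_samples, max_mismatch)
```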
- Proceedings of the National Academy of Sciences of the United States of America
- Published over 5 years ago
The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception-production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants' speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants' tongues. With a looking-time procedure, we found that temporarily restraining infants' articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral-motor movements influence speech sound discrimination. Moreover, an experimentally induced “impairment” in articulator movement can compromise speech perception performance, raising the question of whether long-term oral-motor impairments may impact perceptual development.
Language is a distinguishing characteristic of our species, and the course of its evolution is one of the hardest problems in science. It has long been generally considered that human speech requires a low larynx, and that the high larynx of nonhuman primates should preclude their producing the vowel systems universally found in human language. Examining the vocalizations through acoustic analyses, tongue anatomy, and modeling of acoustic potential, we found that baboons (Papio papio) produce sounds sharing the F1/F2 formant structure of the human [ɨ æ ɑ ɔ u] vowels, and that, as in humans, those vocalic qualities are organized as a system on two acoustic-anatomic axes. This confirms that hominoids can produce contrasting vowel qualities despite a high larynx. It suggests that spoken languages evolved from ancient articulatory skills already present in our last common ancestor with Cercopithecoidea, about 25 MYA.
Following a planktonic dispersal period of days to months, the larvae of benthic marine organisms must locate suitable seafloor habitat in which to settle and metamorphose. For animals that are sessile or sedentary as adults, settlement onto substrates that are adequate for survival and reproduction is particularly critical, yet represents a challenge since patchily distributed settlement sites may be difficult to find along a coast or within an estuary. Recent studies have demonstrated that the underwater soundscape, the distinct sounds that emanate from habitats and contain information about their biological and physical characteristics, may serve as a broad-scale environmental cue for marine larvae to find satisfactory settlement sites. Here, we contrast the acoustic characteristics of oyster reef and off-reef soft bottoms, and investigate the effect of habitat-associated estuarine sound on the settlement patterns of an economically and ecologically important reef-building bivalve, the Eastern oyster (Crassostrea virginica). Subtidal oyster reefs in coastal North Carolina, USA show distinct acoustic signatures compared to adjacent off-reef soft bottom habitats, characterized by consistently higher levels of sound in the 1.5-20 kHz range. Manipulative laboratory playback experiments found increased settlement in larval oyster cultures exposed to oyster reef sound compared to unstructured soft bottom sound or no sound treatments. In field experiments, ambient reef sound produced higher levels of oyster settlement in larval cultures than did off-reef sound treatments. The results suggest that oyster larvae have the ability to respond to sounds indicative of optimal settlement sites, and this is the first evidence that habitat-related differences in estuarine sounds influence the settlement of a mollusk. Habitat-specific sound characteristics may represent an important settlement and habitat selection cue for estuarine invertebrates and could play a role in driving settlement and recruitment patterns in marine communities.
Bats are among the most gregarious and vocal mammals, with some species demonstrating a diverse repertoire of syllables under a variety of behavioral contexts. Despite extensive characterization of big brown bat (Eptesicus fuscus) biosonar signals, there have been no detailed studies of adult social vocalizations. We recorded and analyzed social vocalizations and associated behaviors of captive big brown bats under four behavioral contexts: low aggression, medium aggression, high aggression, and appeasement. Even limited to these contexts, big brown bats possess a rich repertoire of social vocalizations, with 18 distinct syllable types automatically classified using a spectrogram cross-correlation procedure. For each behavioral context, we describe vocalizations in terms of syllable acoustics, temporal emission patterns, and typical syllable sequences. Emotion-related acoustic cues are evident within the call structure by context-specific syllable types or variations in the temporal emission pattern. We designed a paradigm that could evoke aggressive vocalizations while monitoring heart rate as an objective measure of internal physiological state. Changes in the magnitude and duration of elevated heart rate scaled to the level of evoked aggression, confirming the behavioral state classifications assessed by vocalizations and behavioral displays. These results reveal a complex acoustic communication system among big brown bats in which acoustic cues and call structure signal the emotional state of a caller.
Older adults frequently complain that while they can hear a person talking, they cannot understand what is being said; this difficulty is exacerbated by background noise. Peripheral hearing loss cannot fully account for this age-related decline in speech-in-noise ability, as declines in central processing also contribute to this problem. Given that musicians have enhanced speech-in-noise perception, we aimed to define the effects of musical experience on subcortical responses to speech and speech-in-noise perception in middle-aged adults. Results reveal that musicians have enhanced neural encoding of speech in quiet and noisy settings. Enhancements include faster neural response timing, higher neural response consistency, more robust encoding of speech harmonics, and greater neural precision. Taken together, we suggest that musical experience provides perceptual benefits in an aging population by strengthening the underlying neural pathways necessary for the accurate representation of important temporal and spectral features of sound.
Tissue level structural and mechanical properties are important determinants of bone strength. As an individual ages, microstructural changes occur in bone, e.g., trabeculae and cortex become thinner and porosity increases. However, it is not known how the elastic properties of bone change during aging. Bone tissue may lose its elasticity and become more brittle and prone to fractures as it ages. In the present study the age-dependent variation in the spatial distributions of microstructural and microelastic properties of the human femoral neck and shaft were evaluated by using acoustic microscopy. Although these properties may not be directly measured in vivo, there is major interest in investigating their relationships with the linear elastic measurements obtained by diagnostic ultrasound at the most severe fracture sites, e.g., the femoral neck. However, before the validity of novel in vivo techniques can be established, it is essential to understand the age-dependent variation in tissue elastic properties and porosity at different skeletal sites. A total of 42 transverse cross-sectional bone samples were obtained from the femoral neck (Fn) and proximal femoral shaft (Ps) of 21 men (mean±SD age 47.1±17.8, range 17-82 years). Samples were quantitatively imaged using a scanning acoustic microscope (SAM) equipped with a 50 MHz ultrasound transducer. Distributions of the elastic coefficient (c(33)) of cortical (Ct) and trabecular (Tr) tissues and microstructure of cortex (cortical thickness Ct.Th and porosity Ct.Po) were determined. Variations in c(33) were observed with respect to tissue type (c(33Tr)