Category:Perception of Communication Sounds

From CNBH Acoustic Scale Wiki

Introduction to the content of the wiki

Auditory perceptions are constructed in the brain from sounds entering the ear canal, in conjunction with current context and information from memory. It is not possible to make direct measurements of perceptions, so all descriptions of perceptions involve explicit or implicit models of how perceptions are constructed. The category Auditory Processing of Communication Sounds focuses on how the auditory system might construct your initial experience of a sound, referred to as the 'auditory image'. It describes a computational model of how the construction might be accomplished -- the Auditory Image Model (AIM). The category Perception of Communication Sounds focuses on the structures that appear in the auditory image and how we perceive them. These categories are intended to work as a pair, with the reader going back and forth as their interest shifts between the perceptions themselves and how the auditory system might construct those perceptions.

Roy Patterson



This Perception category of the wiki focuses on our initial perception of a sound -- the auditory image that the sound produces (Patterson et al., 1992; Patterson, 1994). It is assumed that sensory organs and the neural mechanisms that process sensory data together construct internal, mental models of objects in the world around us. The visual system constructs a visual object from the light the object reflects, and the auditory system constructs an auditory object from the sound the object emits. These objects are combined with any tactile and/or olfactory information (which might also be thought of as tactile and/or olfactory objects) to produce our experience of an external object. Our task as auditory neuroscientists is to characterize the auditory part of this object modelling process.

If the sound arriving at the ears is a noise, the auditory image is filled with activity, but it lacks organization and the details are continually fluctuating. If the sound has a pulse-resonance form, an auditory figure appears in the auditory image with an elaborate structure that reflects the phase-locked neural firing pattern produced by the sound in the cochlea (Patterson et al., 1992). Extended segments of sound, like syllables or musical notes, cause auditory figures to emerge, evolve, and decay in what might be referred to as auditory events (Patterson et al., 1992), and these events characterize the acoustic gestures of the external source. All of the processing up to the level of auditory figures and events can proceed without the need for top-down processing associated with context or attention (Patterson et al., 1995). It is assumed, for example, that auditory figures and events are produced in response to sounds when we are asleep. And, if we are presented with the call of an animal that we have never encountered before, the early stages of auditory processing will still produce an auditory event, even though we (the listeners) might be puzzled by the event.
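The contrast between a noise and a pulse-resonance sound can be illustrated at the level of the waveform with a minimal Python sketch. It is not part of AIM and it does not compute an auditory image; it only synthesizes a train of pulses, each exciting a single damped resonance, and compares its repetition structure with that of Gaussian noise. The function names, the choice of one damped sinusoid as the resonance, and all parameter values are assumptions made for illustration.

```python
import math
import random


def pulse_resonance(pulse_rate_hz, resonance_hz, decay_s, duration_s, sample_rate=16000):
    """Hypothetical helper: a pulse train in which every pulse excites one damped sinusoid."""
    n = round(duration_s * sample_rate)
    period = sample_rate / pulse_rate_hz  # samples between successive pulses
    res_len = round(5 * decay_s * sample_rate)  # truncate the resonance after 5 time constants
    resonance = [
        math.exp(-t / (decay_s * sample_rate))
        * math.sin(2 * math.pi * resonance_hz * t / sample_rate)
        for t in range(res_len)
    ]
    out = [0.0] * n
    k = 0
    while round(k * period) < n:
        start = round(k * period)
        # Add one copy of the resonance at each pulse time, clipped at the end of the signal.
        for i, r in enumerate(resonance[: n - start]):
            out[start + i] += r
        k += 1
    return out


def gaussian_noise(duration_s, sample_rate=16000, seed=0):
    """Hypothetical helper: seeded Gaussian noise of the same duration."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(round(duration_s * sample_rate))]


def correlation_at_lag(signal, lag, start=0):
    """Normalized correlation between the signal and a copy of itself delayed by `lag` samples."""
    a = signal[start : len(signal) - lag]
    b = signal[start + lag :]
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0


if __name__ == "__main__":
    lag = 160  # one pulse period at 100 pulses per second and 16 kHz sampling
    tone = pulse_resonance(100.0, 1000.0, 0.004, 0.1)
    noise = gaussian_noise(0.1)
    # Skip the first 400 samples so the onset transient does not affect the comparison.
    print("pulse-resonance:", correlation_at_lag(tone, lag, start=400))
    print("noise:", correlation_at_lag(noise, lag, start=400))
```

In this sketch the pulse-resonance waveform repeats from one pulse period to the next, so its correlation at that lag is close to 1, whereas the noise shows no such regularity. This is only a waveform-level analogue of the organized figure versus fluctuating activity described above.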

Subsequently, when alert, the brain may interpret the auditory event, in conjunction with events in other sensory systems, and in conjunction with contextual information that gives the event meaning. At this point, the event with its meaning becomes an auditory object, that is, the auditory part of the perceptual model of the external object that was the source of the sound. An introduction to auditory {objects, events, figures, images and scenes} is given in the paper entitled Homage à Magritte. It is a revised transcription of a talk presented at the Auditory Objects Meeting at the Novartis Foundation in London, 1-2 October 2007. It is intended to stimulate discussion of how we use, and should use, terms like auditory {images, figures, events, objects and scenes}.

An introduction to auditory objects, events, figures, images and scenes

Magritte's painting of a pipe with the famous inscription

The perception of acoustic scale in speech sounds

Discrimination of speaker size from syllable phrases (Ives et al., 2005)

The JND for speaker size for five standard speakers and six syllable groups

The interaction of glottal-pulse rate and vocal-tract length in judgements of speaker size, sex and age (Smith and Patterson, 2005)

Perceived size of a speaker as a function of GPR and VTL

The processing and perception of size information in speech sounds (Smith et al., 2005)

The robustness of vowel recognition to variation in GPR and VTL

The robustness of speech communication to changes in acoustic scale

The robustness of bio-acoustic communication and the role of normalization (Patterson et al., 2007)

Scale-shift covariant auditory images

The robustness of human speech recognition to variation in vocal characteristics (Vestergaard et al., in preparation)

Recognition performance: (A) for the training voice (0) and the new voices (1 – 8), (B) for CVs and VCs, and (C) for all voices. The wide bars show average performance across all syllable categories; the thin bars show performance separately for each consonant category.

Effects of voicing in the recognition of concurrent syllables (Vestergaard and Patterson, 2009)

Recognition scores as a function of 1) SNR (top panels) and 2) SII (bottom panels). The left panels (A#) show syllable recognition; the middle panels (B#) show consonant recognition, and the right panels (C#) show vowel recognition. The solid lines show performance for voiced target syllables, and the dashed lines show performance for whispered syllables.

The interaction of the acoustic scale variables in speech perception

The interaction of vocal tract length and glottal pulse rate in the recognition of concurrent syllables (Vestergaard et al., 2009)

Surfaces showing how recognition performance improves as the GPR and VTL of the target speaker and the distracter diverge. The three surfaces show performance for three signal-to-noise ratios: +6, 0 and -6 dB

Comparison of relative and absolute judgements of speaker size (Walters et al., 2008)

Size surface inferred from an experiment on speaker-size discrimination

The perception of acoustic scale in musical tones

The perception of family and register in musical tones (Patterson et al., 2010)

Sixteen common instruments illustrating four registers within each of four instrument families

Reviewing the definition of timbre as it pertains to the perception of speech and musical sound - ISH 2009 (Patterson et al., 2010)

The GPR-VTL plane with musical notation

The Domain of Tonal Melodies: Physiological limits and some new possibilities (van Dinther and Patterson, 2005)

The domain of melodic pitch

Perception of acoustic scale and size in musical instrument sounds (van Dinther and Patterson, 2006)

Size surface inferred from an experiment on instrument-size discrimination

Pitch strength decreases as F0 and harmonic resolution increase in complex tones ... (Ives and Patterson, 2008)

Dual Profile images

Research projects

The effect of phase in the perception of octave height (van Dinther and Patterson, in preparation). Access to this page is currently restricted.

Attenuating the odd harmonics of a complex tone shifts the pitch vertically up the pitch helix

Revising the definition of timbre to make it useful for speech and musical sounds (BSA2008)

Vowel spectra illustrating the two components of acoustic scale in communication sounds

The role of GPR and VTL in the definition of speaker identity

Gaudrain, Li, Ban, Patterson, Interspeech 2009

Estimating the size and sex of a speaker from their speech sounds. Access to this page is currently restricted.

Figure 1. Mechanisms involved in estimating speaker size. Bottom panel: Dual profile of a vowel showing the formant wavelengths and the pitch wavelength. Middle panel: Conversion of formant wavelengths to vowel type and acoustic scale of the vocal-tract filter. Top panel: Conversion of acoustic scale values to a common code for height estimation.

Obligatory streaming based on acoustic scale difference

Size judgement iso-contour

Published papers for the Category:Perception of Communication Sounds

Discrimination of Source Size

Discrimination of speaker size: Smith et al. (2005), Smith and Patterson (2005), Ives et al. (2005), Smith et al. (2007)

Discrimination of musical instrument size: van Dinther and Patterson (2006)

Robustness of Auditory Perception to Changes in Source Size

Robustness of speech recognition: Smith et al. (2005), Smith and Patterson (2005), Ives et al. (2005), Smith et al. (2007), Walters et al. (2008)

Robustness of music perception: van Dinther and Patterson (2006)

