From CNBH Acoustic Scale Wiki
There are perceptual studies in the literature that provide data, either directly or indirectly, on the effects of compressive distortion on complex sounds, although not the sounds of speech and music. This suggests that we might begin the study of compressive distortion in speech and music by seeing if the CAR-FAC system in AIM-MAT-1.5 can explain the spectrum of distortion products observed in one or more of these studies, and in the process, tune the parameters of the CAR-FAC to simulate the compressive distortion produced in the human cochlea.
CAR-FAC and the distortion observed in Pressnitzer and Patterson (2001)
Pressnitzer and Patterson (2001) measured the level of the difference tones produced by a stimulus composed of harmonics 15 to 25 of 100 Hz; the primaries were all in cosine phase. They used a cancellation tone technique described originally by Schouten (1938) and successively refined by Goldstein (1967) and Smoorenburg (1972a), Smoorenburg (1972b) for the measurement of the cubic difference tone, 2f1-f2.
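The stimulus described above can be sketched in a few lines. This is a minimal illustration, not the authors' synthesis code: the sampling rate, the unit component amplitudes, and the function name `harmonic_complex` are assumptions made for the example.

```python
import math

def harmonic_complex(f0, harmonics, phases_deg, fs, duration):
    """Sum of unit-amplitude cosines at the given harmonics of f0."""
    n_samples = int(round(fs * duration))
    signal = []
    for m in range(n_samples):
        t = m / fs
        signal.append(sum(
            math.cos(2 * math.pi * h * f0 * t + math.radians(p))
            for h, p in zip(harmonics, phases_deg)
        ))
    return signal

F0 = 100.0                        # fundamental in Hz (from the text)
HARMONICS = list(range(15, 26))   # harmonics 15 to 25, i.e. 1500 to 2500 Hz
FS = 16000                        # sampling rate in Hz (assumed)

# Cosine phase: every primary starts at phase 0, so all peaks align at t = 0.
cph = harmonic_complex(F0, HARMONICS, [0.0] * len(HARMONICS), FS, 0.01)
print(len(HARMONICS), "primaries from", HARMONICS[0] * F0, "to", HARMONICS[-1] * F0, "Hz")
print("waveform value at t = 0:", cph[0])
```

Because the primaries are in cosine phase, the waveform value at t = 0 equals the number of primaries when each has unit amplitude.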
Specifically, a sinusoidal probe tone was added to the stimulus at the frequency of the distortion tone to be measured plus 3 Hz, and its amplitude was adjusted to maximize the loudness of the beats between the probe tone and the distortion tone. Then, a second tone was added at the frequency of the distortion tone, and its amplitude and phase were adjusted to cancel the beats. It is assumed that this second “cancellation tone” has equal amplitude and opposite phase to the distortion tone.
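The logic of the cancellation technique can be illustrated with phasors. The sketch below is only a schematic of the principle: the amplitude and phase values are hypothetical, and the function names `residual_amplitude` and `modulation_depth` are introduced here for illustration.

```python
import cmath
import math

def residual_amplitude(a_dist, phase_dist_deg, a_cancel, phase_cancel_deg):
    """Amplitude left at the distortion frequency after adding the cancellation tone."""
    d = cmath.rect(a_dist, math.radians(phase_dist_deg))
    c = cmath.rect(a_cancel, math.radians(phase_cancel_deg))
    return abs(d + c)

def modulation_depth(a_probe, a_residual):
    """Relative depth (0 to 1) of the beats between two tones a few Hz apart."""
    hi, lo = max(a_probe, a_residual), min(a_probe, a_residual)
    return lo / hi if hi > 0 else 0.0

# Hypothetical distortion tone: amplitude 0.2, phase 30 degrees.
A_DIST, PHASE_DIST = 0.2, 30.0

# Step 1: beats are deepest when the probe amplitude matches the distortion tone.
print("beat depth, matched probe:", modulation_depth(0.2, A_DIST))

# Step 2: equal amplitude and opposite phase removes the distortion tone,
# so the beats with the probe disappear.
left = residual_amplitude(A_DIST, PHASE_DIST, A_DIST, PHASE_DIST + 180.0)
print("residual after cancellation:", left)
print("beat depth after cancellation:", modulation_depth(0.2, left))
```

The setting of the cancellation tone that abolishes the beats therefore gives the amplitude of the distortion tone directly, and its phase after subtracting 180 degrees.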
The results of the first experiment (Fig. 1a) showed that the distortion tone at 100 Hz is only about 14 dB below the level of the primaries (upper panel), even though the primaries have a moderate level and the nearest primary (1500 Hz) is 1400 Hz, or nearly 4 octaves, higher in frequency. Note also that the first three distortion tones have essentially the same phase as the primaries (lower panel).
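The two numbers in this paragraph can be checked with a short calculation. The only assumption is that the 14 dB figure is read as an amplitude (20 log10) ratio.

```python
import math

f_distortion = 100.0   # lowest distortion tone, Hz
f_nearest = 1500.0     # lowest primary (harmonic 15 of 100 Hz), Hz

# Distance from the distortion tone to the nearest primary, in octaves.
octaves = math.log2(f_nearest / f_distortion)

# Amplitude of a tone 14 dB below a primary, relative to that primary.
amplitude_ratio = 10 ** (-14 / 20)

print(f"separation: {octaves:.2f} octaves")
print(f"distortion amplitude relative to a primary: {amplitude_ratio:.2f}")
```

The separation comes out just under 4 octaves, and a tone 14 dB down has about one fifth of the amplitude of a primary.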
The pattern of peaks in the distortion spectrum is similar to the pattern of peaks that appears in the distortion section of the auditory spectrum produced by the CAR-FAC, insofar as the first three harmonics of the fundamental appear as separate peaks and peak level decreases with harmonic number. The figures appear quite different because one has a linear frequency axis while the other has a quasi-log frequency axis, and because the primaries do not appear in Fig. 1a of Pressnitzer and Patterson (2001).
In a second experiment, they measured the level of the lowest distortion tone as a function of the number of primaries and showed (Fig. 2a) that the level increases about 3.8 dB for each doubling of the number of primaries in the range 2 to 17 primaries, producing 1 to 16 pairs of adjacent components with difference frequency h1.
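To put the 3.8 dB slope in context, the sketch below compares it with two textbook reference cases that are not taken from the source: pair contributions adding in phase (about 6 dB per doubling) and adding in power (about 3 dB per doubling). It assumes that doublings are counted in the number of adjacent pairs, from 1 to 16.

```python
import math

DB_PER_DOUBLING = 3.8   # slope reported in the text

def growth_db(n_pairs, db_per_doubling):
    """Level increase relative to a single pair, for a given slope per doubling."""
    return db_per_doubling * math.log2(n_pairs)

print("primaries  pairs  reported  in-phase  power-sum")
for n_primaries in (2, 3, 5, 9, 17):
    n_pairs = n_primaries - 1
    reported = growth_db(n_pairs, DB_PER_DOUBLING)
    in_phase = 20 * math.log10(n_pairs)    # amplitudes add coherently
    power_sum = 10 * math.log10(n_pairs)   # powers add (random phases)
    print(f"{n_primaries:9d}  {n_pairs:5d}  {reported:8.1f}  {in_phase:8.1f}  {power_sum:9.1f}")
```

On this reading, the reported growth over the full range is about 15 dB, which lies between the two reference cases.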
Figure 2b shows the auditory spectra of a complex tone as the number of components is increased from 2 to 17 primaries, producing 1 to 16 pairs of adjacent components with difference frequency h1. The overall rms level of the tone is held constant as the number of components is increased; as a result, the structure in the auditory spectrum that represents the primaries becomes wider and its peak level decreases. This procedure differs from that in Pressnitzer and Patterson (2001), where the level of the primaries is fixed and the overall level of the complex rises as components are added. The current procedure has the advantage of illustrating that the level of the composite distortion component, h1, is largely invariant when the overall level of the complex is fixed. The distortion region of the auditory spectrum develops as the number of components increases. The distortion spectrum for the complex with 2 tones (red) has a single peak at h1. Thereafter, as the number of distortion components doubles, the spectrum develops more peaks, and the process asymptotes as the number of distortion components increases in the range 8-16.
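Holding the overall rms level fixed means that each component must be attenuated as components are added. The sketch below shows the scaling for equal-amplitude components; the choice of harmonic 15 as the lowest component and 160 samples per period are assumptions for the numerical check, and the function names are illustrative.

```python
import math

def component_amplitude(n_components, rms_total=1.0):
    """Peak amplitude per component so that the whole complex has rms_total.

    For N equal-amplitude sinusoids at distinct frequencies,
    rms_total**2 = N * a**2 / 2.
    """
    return rms_total * math.sqrt(2.0 / n_components)

def measured_rms(n_components, lowest=15, samples=160):
    """Numerical rms over one period of the fundamental."""
    a = component_amplitude(n_components)
    total = 0.0
    for m in range(samples):
        x = sum(a * math.cos(2 * math.pi * (lowest + i) * m / samples)
                for i in range(n_components))
        total += x * x
    return math.sqrt(total / samples)

for n in (2, 4, 8, 17):
    drop_db = 20 * math.log10(component_amplitude(n) / component_amplitude(2))
    print(f"N = {n:2d}: overall rms = {measured_rms(n):.3f}, "
          f"per-component level re N = 2: {drop_db:.1f} dB")
```

With 17 components, each component is a little over 9 dB lower than in the 2-component case, while the overall rms stays the same.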
In the third experiment, Pressnitzer and Patterson (2001) showed that an alternating-phase (APH) wave, with successive primaries having phases of 0 and 90 degrees, produces more distortion at h2 than it does at h1 (see Figure 3a).
When the APH wave is analyzed by the CAR-FAC system, the auditory spectrum is as shown in Figure 3b. The first and third harmonics in the distortion spectrum are reduced in level, while the second harmonic is increased in level. This example shows that the compressive distortion generated by the CAR-FAC system is sensitive to the phases of the primaries, as would be expected, and that it captures the basic form of the difference between the distortion produced by cosine-phase (CPH) and APH waves.
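The phase sensitivity can be reproduced with a toy nonlinearity that is much simpler than the CAR-FAC and is not a model of it. The sketch below passes CPH and APH versions of the stimulus through a memoryless square law and reads off the amplitudes at the first three harmonics of 100 Hz. The square law, the 16-kHz sampling rate, and the unit component amplitudes are all assumptions of the example.

```python
import cmath
import math

F0, FS = 100.0, 16000
HARMONICS = range(15, 26)   # primaries at 1500 to 2500 Hz

def distortion_amplitudes(phases_deg, n_out=3):
    """Amplitudes at h1..h_n_out in the squared waveform (one period analysed)."""
    n = int(FS / F0)   # samples in one period of the fundamental
    squared = []
    for m in range(n):
        t = m / FS
        x = sum(math.cos(2 * math.pi * h * F0 * t + math.radians(p))
                for h, p in zip(HARMONICS, phases_deg))
        squared.append(x * x)   # memoryless square-law distortion
    amps = []
    for k in range(1, n_out + 1):
        # Single-bin discrete Fourier projection onto harmonic k of F0.
        proj = sum(v * cmath.exp(-2j * math.pi * k * m / n)
                   for m, v in enumerate(squared))
        amps.append(2 * abs(proj) / n)
    return amps

cph_phases = [0.0] * 11
aph_phases = [0.0 if i % 2 == 0 else 90.0 for i in range(11)]

cph_amps = distortion_amplitudes(cph_phases)
aph_amps = distortion_amplitudes(aph_phases)
print("CPH h1..h3:", [round(a, 3) for a in cph_amps])
print("APH h1..h3:", [round(a, 3) for a in aph_amps])
```

In this toy case the contributions to h1 and h3 cancel in the APH condition, because the phase differences between the contributing pairs alternate between +90 and -90 degrees, whereas h2 is unchanged rather than increased. The sketch therefore reproduces only the direction of the h1/h2 asymmetry, not the levels in Figure 3.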
Properties of auditory distortion
Pressnitzer and Patterson (2001) suggested that the amplitude and phase of the distortion tones produced by complex tones can be explained in relatively simple terms.
- The distortion tones are the vector sum of the difference tones produced by all possible pairs of primaries. The contribution of cubic distortion tones can be neglected.
- The amplitude of the distortion component produced by a given pair of primaries is solely a function of the frequency difference between the primaries. The absolute frequency of the primaries can be neglected.
- The phase of the distortion tone is the sum of two terms: the difference between the phases of the primaries and a constant that depends solely on the frequency difference between the primaries. The phase shifts associated with propagation along the cochlea can be neglected.
To test the model, they generated a minimum distortion stimulus in which the phases of the hypothesized distortion components were distributed evenly around the phase circle, and showed that, indeed, it reduced the levels of the distortion tones considerably.
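The three rules, and the idea behind the minimum distortion stimulus, can be sketched as a phasor sum. The sketch makes several assumptions that are not in the source: the pair gain and the constant phase term are placeholders (set to 1 and 0), and the quadratic phase rule is just one construction that spreads the h1 contributions evenly around the phase circle; it is not necessarily the stimulus used in the experiment.

```python
import cmath
import math

def predicted_distortion(phases_rad, k,
                         pair_gain=lambda k: 1.0,
                         pair_phase=lambda k: 0.0):
    """Vector sum of difference tones at k times the fundamental.

    Every pair of primaries k harmonics apart contributes a phasor whose
    amplitude depends only on k and whose phase is the phase difference of
    the pair plus a constant that depends only on k (both placeholders here).
    """
    total = 0j
    for lo in range(len(phases_rad) - k):
        hi = lo + k
        total += cmath.rect(pair_gain(k),
                            phases_rad[hi] - phases_rad[lo] + pair_phase(k))
    return total

N = 11   # number of primaries (harmonics 15 to 25)
cph = [0.0] * N
# Quadratic phases: adjacent-pair phase differences step by 2*pi/(N-1),
# so the N-1 contributions to h1 are spread evenly around the circle.
spread = [math.pi * n * (n - 1) / (N - 1) for n in range(N)]

for k in (1, 2, 3):
    print(f"h{k}: CPH {abs(predicted_distortion(cph, k)):.2f}, "
          f"spread phases {abs(predicted_distortion(spread, k)):.2f}")
```

Under these assumptions the predicted h1 vanishes for the spread-phase stimulus, and h2 and h3 fall to a small fraction of their cosine-phase values.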
The simplicity of the effects is seductive, and it seems entirely possible that a CAR-FAC system might produce distortion spectra similar to those revealed in these experiments, and that many potential complications may not arise in practice.
Distortion and the LLMP
The pronounced effect of phase on the distortion spectrum observed in Pressnitzer and Patterson (2001) prompted an extension to the experiment of Pressnitzer, Patterson and Krumbholz (2001) on the Lower Limit of Melodic Pitch (LLMP). In that study, the LLMP was measured using bandpass-filtered, harmonic, complex tones. On each trial, a four-note, random melody was presented in the first interval, and after a brief pause it was repeated with one of the notes changed by a semitone. The listener’s task was to report which note had changed. The repetition rate of the base note of the melody was lowered adaptively to determine the lowest base note that supported reliable detection of the changed note, that is, the lower limit of melodic pitch.
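The structure of a single trial can be sketched as follows. Only the four-note melody and the one-semitone change come from the text; the set of semitone steps from which notes are drawn, the random direction of the change, and the function name `make_trial` are hypothetical, and the adaptive rule for lowering the base note is not specified here and is left out.

```python
import random

SEMITONE = 2 ** (1 / 12)   # frequency ratio of one equal-tempered semitone

def make_trial(base_hz, rng, n_notes=4, scale_steps=(0, 1, 2, 3, 4)):
    """Return (first melody, second melody, index of the changed note).

    Notes are expressed as repetition rates in Hz. The step set and the
    random direction of the change are assumptions for illustration.
    """
    steps = [rng.choice(scale_steps) for _ in range(n_notes)]
    changed = rng.randrange(n_notes)
    altered = list(steps)
    altered[changed] += rng.choice((-1, 1))   # shift one note by a semitone

    def to_hz(step):
        return base_hz * SEMITONE ** step

    return [to_hz(s) for s in steps], [to_hz(s) for s in altered], changed

first, second, changed = make_trial(32.0, random.Random(0))
print("interval 1:", [round(f, 2) for f in first])
print("interval 2:", [round(f, 2) for f in second])
print("changed note (0-based):", changed)
```

A semitone corresponds to a change of just under 6% in repetition rate, which is the size of the difference the listener has to detect.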
The equivalent rectangular bandwidth of the stimulus was fixed at 1.2 kHz and the primaries were in cosine phase (CPH). Threshold for melodic pitch was measured as the lower cut-off frequency of the filter, Fc, was increased from 200 to 3200 Hz. The overall level of the stimuli was around 55 dB SPL. As is traditional in experiments where the primaries are high harmonics of a low fundamental, a band of noise was inserted in the region below the primaries to mask any distortion products. They found that threshold rose from about 32 Hz to 300 Hz as Fc increased from 200 to 3200 Hz, similar to the CPH function (solid line with circles) in Figure 4a.
In the extension to the LLMP experiment, Pressnitzer and Patterson (2001) varied the phase relations between components, using Alternating and Schroeder phase stimuli (APH and SPH, respectively) as well as CPH stimuli. One condition was similar to that in Pressnitzer, Patterson and Krumbholz (2001) inasmuch as the stimulus included a lowpass-filtered, continuous pink noise to mask distortion products; in the other condition the noise was omitted. The results are shown in Figures 4a and 4b, respectively. When the region of the distortion spectrum is masked by noise, the LLMP is observed to increase with Fc as it did in Pressnitzer, Patterson and Krumbholz (2001) (Figure 4a). The results for the CPH and SPH stimuli are very similar; for the APH stimuli, the LLMP is lower in the region where Fc is low and there is better temporal resolution in the internal representation of the sound. The pattern of results is markedly different when the distortion spectrum is not masked by noise (Figure 4b). The LLMP for SPH stimuli is largely unaffected, presumably because these stimuli produce weak distortion tones, so the masking noise had little effect in that case. In contrast, the LLMP for CPH stimuli remains just over 32 Hz as Fc increases to 3200 Hz. Similarly, the LLMP for APH stimuli is largely independent of frequency region, but in this case it is almost an octave lower than the LLMP for CPH stimuli.
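The three phase conditions can be written down compactly. In the sketch below, the Schroeder phases use one common form of the rule, -pi n (n + 1) / N, which is an assumption because the source does not state the formula or sign convention used. The crest factor (peak divided by rms) is included only as a descriptive measure of how peaky each waveform is; the 11-component complex starting at harmonic 15 is illustrative and is not the bandpass-filtered stimulus of the LLMP experiment.

```python
import math

def phases(kind, n):
    """Starting phases in radians for an n-component complex."""
    if kind == "CPH":   # cosine phase
        return [0.0] * n
    if kind == "APH":   # alternating 0 and 90 degrees
        return [0.0 if i % 2 == 0 else math.pi / 2 for i in range(n)]
    if kind == "SPH":   # Schroeder phase, one common form (assumed)
        return [-math.pi * i * (i + 1) / n for i in range(1, n + 1)]
    raise ValueError(kind)

def crest_factor(phase_list, lowest=15, samples=160):
    """Peak / rms of the summed waveform over one period of the fundamental."""
    wave = [sum(math.cos(2 * math.pi * (lowest + i) * m / samples + p)
                for i, p in enumerate(phase_list))
            for m in range(samples)]
    rms = math.sqrt(sum(v * v for v in wave) / samples)
    return max(abs(v) for v in wave) / rms

for kind in ("CPH", "APH", "SPH"):
    print(kind, round(crest_factor(phases(kind, 11)), 2))
```

The cosine-phase wave has the largest possible crest factor for its spectrum, while the Schroeder-phase wave has a much flatter waveform.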
The data show that highpass-filtered complexes generate distortion tones that are integrated into the perception of the tone, and that the distortion tones allow the listener to continue making the semitone distinctions typical of melodic pitch perception when there are no low-frequency primaries.
CAR-FAC and the distortion observed in Wiegrebe and Patterson (1999)
This section will deal with the distortion spectra produced by amplitude modulated noise.
References
- Goldstein, J.L. (1967). “Auditory nonlinearity.” J. Acoust. Soc. Am., 41, p.676-689.
- Pressnitzer, D., Patterson, R.D. and Krumbholz, K. (2001). “The lower limit of melodic pitch.” J. Acoust. Soc. Am., 109, p.2074-2084.
- Pressnitzer, D. and Patterson, R.D. (2001). “Distortion products and the pitch of harmonic complex tones.” In Physiological and Psychophysical Bases of Auditory Function, Breebaart, D.J., Houtsma, A.J.M., Kohlrausch, A., Prijs, V.F. and Schoonhoven, R. editors, p.97-103 (Shaker).
- Smoorenburg, G.F. (1972a). “Audibility region of combination tones.” J. Acoust. Soc. Am., 52, p.603-614.
- Smoorenburg, G.F. (1972b). “Combination tones and their origin.” J. Acoust. Soc. Am., 52, p.615-632.
- Wiegrebe, L. and Patterson, R.D. (1999). “Quantifying the distortion products generated by amplitude-modulated noise.” J. Acoust. Soc. Am., 106, p.2709-2718.