Dominant spectral region


The spectral-frequency band in which the ear can receive acoustic signals extends from about 20 to 16000 Hz - at least for a young and normal-hearing person. When a sound stimulus is reduced to a narrow frequency band, one can be sure that it will be appreciated by the auditory system even when that band occurs near the low or the high end of the auditory range of frequencies (provided, of course, that the signal's intensity exceeds the threshold of hearing). However, when a broad-band signal is presented that covers the entire frequency band of audition, there is an intermediate frequency region through which the majority of information is acquired by the auditory system. There is a dominant spectral-frequency region.

The position and the extent of the dominant region can roughly be inferred from existing systems for audio communication. The frequency band of the telephone channel was confined to 300-3400 Hz for economic reasons, and this choice was based on the observation that the band suffices for high-intelligibility speech communication. The AM radio channel, extending from about 100 Hz to 4000 Hz, was for decades regarded as more or less acceptable even for the transmission of music.

Of course, one condition for those frequency bands to be sufficient is that the sound signals to be transmitted include essential information in these bands. However, as those channels are useless if they don't transmit essential information to the ear, it is the auditory system that determines what is essential. This is why the layout of the telephone and AM-radio channels tells us something about the auditory system's dominant spectral region.

For instance, transmission of the frequency band below 300 Hz obviously is not essential for speech communication. This is not trivial, as the oscillation frequency of the human voice (both of men and women) essentially is below 300 Hz. If perception of the voice pitch were dependent on the presence of the lower harmonics of the speech signal, this could not work (see topic virtual pitch). On the other hand, transmission of the first two or three formants of speech is essential. These formants indeed are included in the telephone frequency band.

More information about the dominant spectral region comes from a number of aspects of audio communication. French & Steinberg (1947a) and Pollack (1948a) measured speech intelligibility of high-pass and low-pass limited channels, respectively. French and Steinberg found that filtering out the frequency band above about 1800 Hz (low-pass) had the same reducing effect on the intelligibility of monosyllables as did filtering out the band below 1800 Hz (high-pass). This can be regarded as an experimental verification of the notion that consonants are about as important as the first two formants of vowels, as the former essentially are represented above about 1800 Hz, the latter below. Moreover, it is interesting to note that the frequency 1800 Hz, in the projection of the frequency scale on the length of the cochlear partition, roughly corresponds to the middle of the latter's extension.

At first sight, these observations suggest that 1800 Hz might be the center of the dominant region - at least where intelligibility of monosyllables is concerned. However, while in a sense this may appear reasonable, in another sense it is not. At least, 1800 Hz does not appear to be the "most important" frequency. From their data on intelligibility of monosyllables French & Steinberg deduced an "importance function", i.e., a function that expresses how drastically the so-called articulation index is affected by a small change of the filter's cut-off frequency. The articulation index is defined as the negative logarithm of the relative number of errors made in syllable recognition, and as such is a measure of the transmitted information. The "importance function" thus obtained by French & Steinberg as a function of frequency has a maximum at about 700 Hz and decays on both sides almost symmetrically on a log-frequency scale.
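The two definitions in the preceding paragraph can be made concrete with a minimal sketch. It is not French & Steinberg's actual procedure: the logarithm base (10), the use of a finite difference per unit of log-frequency, and the function names `articulation_index` and `importance` are all assumptions made here for illustration, and any error rates passed in are placeholders rather than measured data.

```python
import math


def articulation_index(error_rate, base=10.0):
    """Negative logarithm of the relative number of syllable errors.

    The base is an assumption; the text only says "negative logarithm".
    """
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must lie in (0, 1]")
    return -math.log(error_rate) / math.log(base)


def importance(cutoffs_hz, error_rates):
    """Estimate how strongly the articulation index changes with cut-off.

    Hypothetical helper: for each pair of adjacent cut-off frequencies it
    returns (geometric centre in Hz, |change in index| per unit of ln f).
    """
    if len(cutoffs_hz) != len(error_rates):
        raise ValueError("need one error rate per cut-off frequency")
    index = [articulation_index(e) for e in error_rates]
    result = []
    for i in range(len(cutoffs_hz) - 1):
        f_lo, f_hi = cutoffs_hz[i], cutoffs_hz[i + 1]
        centre = math.sqrt(f_lo * f_hi)
        slope = abs(index[i + 1] - index[i]) / math.log(f_hi / f_lo)
        result.append((centre, slope))
    return result
```

Evaluated over a series of measured cut-off frequencies, the frequency at which the second value peaks would correspond to the "most important" frequency in the sense used above.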

So, at least with respect to intelligibility of speech, 700 Hz appears to be the "most important" or "most dominant" frequency. Relating this to the bands used by the telephone and AM-radio channels, one may notice that 700 Hz is not too far off the geometric means of the respective cut-off frequencies, which are about 1000 and 630 Hz, respectively. Thus one can say that on a logarithmic frequency scale the "most important" frequency (700 Hz) lies roughly in the middle between the lower and upper cut-off frequencies - which intuitively appears to make sense.

Another aspect is the perception of virtual pitch. Schouten (1962a), Plomp (1967a), Ritsma (1967a), and Yost (1982a) explored, in independent types of experiments, the question of which spectral region the mechanism that creates virtual pitch (residue pitch) draws its information from when the Fourier spectrum of the stimulus extends from a low fundamental frequency up to several kHz. From their results one can infer that this is, roughly, the frequency region extending from about 400 to 2000 Hz. As the geometric mean of these values is about 890 Hz, which is not too far off 700 Hz, one may conclude that the auditory system's dominant region for speech intelligibility may be more or less identical with that for the formation of virtual pitch, which is a remarkable coincidence [55], [56], [104] p. 349-351.
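The geometric means quoted for the three bands discussed so far can be checked with a few lines of arithmetic. This is only a sketch: the band edges are the ones given in the text, while the dictionary layout, the helper name `geometric_mean`, and the choice to express the distance from 700 Hz in octaves are assumptions made here.

```python
import math


def geometric_mean(f_low_hz, f_high_hz):
    """Centre of a band on a logarithmic frequency scale."""
    return math.sqrt(f_low_hz * f_high_hz)


# Band edges in Hz as stated in the text.
bands = {
    "telephone channel": (300.0, 3400.0),
    "AM radio channel": (100.0, 4000.0),
    "virtual-pitch region": (400.0, 2000.0),
}

for name, (f_low, f_high) in bands.items():
    centre = geometric_mean(f_low, f_high)
    # Signed distance from the 700-Hz importance maximum, in octaves.
    octaves = math.log2(centre / 700.0)
    print(f"{name}: {centre:.0f} Hz ({octaves:+.2f} octaves from 700 Hz)")
```

All three centres fall within about half an octave of 700 Hz, which is the sense in which they are "not too far off".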

When we worked out the algorithm for calculating virtual pitch and its relative prominence, a quantitative specification of an "importance" or "dominance" function was required. We designed that function according to the criterion that the algorithm's predictions of the strike note of bells should be optimal [55], [56]. This appeared appropriate, as finding the strike note of bells from the partials is a challenge to any pitch model in which the relative dominance of partials plays a prominent role [65]. As a result, a weighting function for the spectral pitches was found that agrees well with the "importance" function for the intelligibility of speech found by French & Steinberg (1947a). In particular, the frequency of the maximum of spectral-pitch weighting at 700 Hz was found to be optimal.
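To illustrate what a weighting function with these properties can look like, the sketch below uses one simple form that peaks at 700 Hz and falls off symmetrically on a log-frequency scale. The functional form, the constant `SHARPNESS`, and the name `spectral_weight` are assumptions chosen for illustration; this text does not specify the function actually used in the algorithm.

```python
import math

F_MAX_HZ = 700.0   # frequency of maximum weight, as stated in the text
SHARPNESS = 0.07   # hypothetical constant controlling the width of the peak


def spectral_weight(f_hz, f_max_hz=F_MAX_HZ, sharpness=SHARPNESS):
    """Illustrative dominance weight in (0, 1] for a spectral pitch at f_hz.

    The term (f/f_max - f_max/f) changes sign but not magnitude when f/f_max
    is replaced by its reciprocal, so the weight is symmetric on a
    log-frequency scale around f_max.
    """
    if f_hz <= 0.0:
        raise ValueError("frequency must be positive")
    ratio = f_hz / f_max_hz
    return 1.0 / math.sqrt(1.0 + sharpness * (ratio - 1.0 / ratio) ** 2)


if __name__ == "__main__":
    for f in (100.0, 350.0, 700.0, 1400.0, 5000.0):
        print(f"{f:7.0f} Hz -> weight {spectral_weight(f):.3f}")
```

With such a function, partials one octave below and one octave above 700 Hz receive the same weight, and partials far outside the dominant region contribute correspondingly less.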

It thus appears that the "importance" and "weighting" functions just discussed characterize a universal property of the auditory system. This view is supported by the finding of Bilsen & Raatgever (1973a) that binaural lateralization involves a spectral dominance region resembling the one described above.


Author: Ernst Terhardt terhardt@ei.tum.de - Feb. 20, 2000

