Phase behavior of cochlear response
If the cochlea operated like a bank of harmonic oscillators (see page History,
section Overview), it would perform as a spectrum analyzer. The amplitudes of
the basilar membrane
(BM) responses to a complex and generally time-varying sound would exhibit
peaks wherever the local oscillators meet resonance conditions (characteristic
frequency sites), and each peak would develop and move on the basilar membrane
with a width and a delay depending on oscillator tuning. The sharper the tuning,
the narrower the peak, but also the more delayed the response to a sudden
frequency change. Frequency selectivity and response speed are indeed related
to each other in a way that is reminiscent of the uncertainty relation
established by Heisenberg in quantum mechanics. Were oscillators tuned too
sharply, the cochlea could not track rapid frequency changes, and its capability
of recognizing and discriminating sounds of vital importance for the species
would be very poor. In the oscillator-bank model, the phase of the BM response
to each frequency component varies from -180° (at the base) to
0° (at the apex), passing through -90° at the characteristic frequency
site. Were this model also linear, the BM response amplitude would be the
algebraic sum of the responses to all frequency components, and in general
the total phase change at any site would unpredictably depend on the input-signal
structure. The phase behavior of the real cochlea differs substantially
from this description.
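The trade-off between tuning sharpness and response speed can be made concrete with a single damped, driven harmonic oscillator. The sketch below is only an illustration under textbook assumptions (a linear second-order resonator with quality factor Q, and the high-Q approximation for the bandwidth). The function names and the 1 kHz example frequency are hypothetical and are not taken from the cochlear model itself. It shows that the half-power bandwidth f0/Q and the envelope time constant Q/(pi f0) have a product fixed at 1/pi, so a narrower peak necessarily means a slower response, and that the displacement phase is -90° at resonance.

```python
import math

def bandwidth_hz(f0, q):
    """Half-power bandwidth of a resonator (high-Q approximation)."""
    return f0 / q

def envelope_time_constant_s(f0, q):
    """Time constant tau of the amplitude envelope, which decays as exp(-t / tau)."""
    return q / (math.pi * f0)

def phase_deg(f, f0, q):
    """Phase of the displacement relative to the driving force, in degrees."""
    return -math.degrees(math.atan2(f * f0 / q, f0 ** 2 - f ** 2))

f0 = 1000.0  # assumed characteristic frequency, in Hz
for q in (2.0, 10.0, 50.0):
    bw = bandwidth_hz(f0, q)
    tau = envelope_time_constant_s(f0, q)
    print(f"Q={q:5.1f}  bandwidth={bw:7.1f} Hz  tau={tau * 1e3:6.2f} ms  product={bw * tau:.4f}")

print("phase at resonance:", phase_deg(f0, f0, 10.0), "degrees")
```

Raising Q from 2 to 50 narrows the bandwidth by a factor of 25 and lengthens the time constant by the same factor, which is the compromise the paragraph above describes.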
The above cochleograms illustrate the time courses of the basilar membrane
oscillations elicited by 300 Hz periodic stimulation (20 ms Gaussian-shaped
pulses) for the oscillator-bank model (left) and the hydrodynamic model
(right). Cochleogram amplitude is rendered by color brightness.
Coordinates are: t for time and x for basilar
membrane position, measured as fractional distance from the stapes. The
peaks of the response to the harmonic frequency components that form the
input pulse sequence are shown in the insets at the bottom left of the figure.
Note maximum response at the characteristic frequency sites of the harmonics
(n · 300 Hz, n = 1, 2, ...). Also
note the remarkable difference between the phase behaviors. In the
oscillator-bank model, the phase profile of the BM oscillation shows a marked zigzag,
whose shape depends strongly on the characteristic frequencies of pulse-sequence
harmonics. In the hydrodynamic model, the analogous phase profile declines
regularly in a way that is independent of the characteristic frequency sites
of the harmonics. This is clearly important for subsequent processing of cochlear
output.
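The remark that the response peaks sit at the sites tuned to n · 300 Hz follows from the fact that a periodic pulse train contains energy only at integer multiples of its repetition rate. The following sketch checks this numerically with a direct discrete Fourier transform. Only the 300 Hz repetition rate comes from the text; the sampling density, the window length and the Gaussian width used here are illustrative assumptions and are not the parameters of the figure.

```python
import cmath
import math

F0 = 300.0                 # pulse repetition rate from the text, in Hz
SAMPLES_PER_PERIOD = 64    # assumed sampling density
PERIODS = 8                # assumed analysis window, in periods
SIGMA_S = 0.2e-3           # assumed Gaussian pulse width (standard deviation), in s

fs = F0 * SAMPLES_PER_PERIOD
n_total = SAMPLES_PER_PERIOD * PERIODS

def pulse_train():
    """One Gaussian pulse per period, repeated PERIODS times."""
    out = []
    for i in range(n_total):
        t = ((i % SAMPLES_PER_PERIOD) - SAMPLES_PER_PERIOD / 2) / fs
        out.append(math.exp(-0.5 * (t / SIGMA_S) ** 2))
    return out

def dft_magnitude(x, k):
    """Magnitude of DFT bin k, normalised by the number of samples."""
    n = len(x)
    acc = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))
    return abs(acc) / n

x = pulse_train()
spectrum = {k: dft_magnitude(x, k) for k in range(1, n_total // 2)}
harmonic_bins = [k for k in spectrum if k % PERIODS == 0]
other_bins = [k for k in spectrum if k % PERIODS != 0]

# Bin k corresponds to the frequency k * fs / n_total.
for k in harmonic_bins[:5]:
    print(f"{k * fs / n_total:7.1f} Hz  magnitude={spectrum[k]:.4f}")
print("largest off-harmonic magnitude:", max(spectrum[k] for k in other_bins))
```

The bins between the harmonics vanish to within rounding error, so in either model the only sites that can be driven are those tuned to the harmonics; the models differ in how the phase varies between those sites.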
A seemingly paradoxical filter
If the cochlea operated linearly, it would perform almost like a bank of
linear filters translating the Fourier components of an input sound into a
linear superposition of oscillation modes, i.e. the travelling waves. Therefore,
phase-delay peculiarities apart, it would be functionally equivalent to an
oscillator bank. Mechanical non-linearity breaks this equivalence, producing
effects that are important for acoustic signal processing at normal loudness
levels. As described in the Nonlinear undamping page, in normal hearing
conditions, i.e. for inputs of 40-70 dB SPL (sound pressure level), the cochlea
straddles two different regimes, one highly undamped and the other
heavily damped. Responses to frequency components below 30-40 dB SPL
are enhanced and, because of sharp tuning, their rise is somewhat delayed.
Those above 60-70 dB SPL are depressed and, because of poor tuning, promptly
follow the stimulus. In any case, cochlear response amplitudes to frequency
components of different amplitude tend to be equalized. An important side-effect
of this way of working is tone-to-tone suppression. This mechanism causes
the survival of the responses to the frequency components of larger amplitude
and the dramatic suppression of the responses to components of proximal frequency
but smaller amplitude. Besides noise suppression, this mechanism provides
a sort of frequency selectivity for Fourier components of sufficiently high
intensity, i.e. precisely those eliciting highly damped fast responses.
In this way the cochlea produces what may appear to be an engineering paradox,
combining fast responsiveness and high frequency selectivity.
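Tone-to-tone suppression can be illustrated with a very crude stand-in for the saturating mechanics: a memoryless tanh nonlinearity applied to the sum of a strong and a weak tone. This is a sketch under stated assumptions only. The tanh function, the amplitudes and the two frequencies (expressed as cycles per analysis window) are hypothetical choices, and the real cochlea is neither memoryless nor described by tanh. The code measures the output component at the weak tone's frequency with and without the strong tone present.

```python
import math

N = 4096            # samples in the analysis window
K_STRONG = 50       # cycles per window of the stronger tone (assumed)
K_WEAK = 57         # cycles per window of the nearby weaker tone (assumed)

def saturate(x):
    """Memoryless saturating nonlinearity (hypothetical stand-in)."""
    return math.tanh(x)

def component_amplitude(y, k):
    """Amplitude of the component of y that completes k cycles per window."""
    c = sum(v * math.cos(2 * math.pi * k * i / N) for i, v in enumerate(y))
    s = sum(v * math.sin(2 * math.pi * k * i / N) for i, v in enumerate(y))
    return 2.0 * math.hypot(c, s) / N

def weak_tone_output(a_strong, a_weak):
    """Output amplitude at the weak tone's frequency for a two-tone input."""
    y = [
        saturate(a_strong * math.cos(2 * math.pi * K_STRONG * i / N)
                 + a_weak * math.cos(2 * math.pi * K_WEAK * i / N))
        for i in range(N)
    ]
    return component_amplitude(y, K_WEAK)

alone = weak_tone_output(0.0, 0.1)        # weak tone presented by itself
with_masker = weak_tone_output(3.0, 0.1)  # same weak tone next to a strong one
print(f"weak tone alone:       {alone:.4f}")
print(f"weak tone with masker: {with_masker:.4f}")
print(f"suppression: {20 * math.log10(alone / with_masker):.1f} dB")
```

The strong tone drives the nonlinearity into saturation for most of each cycle, so the incremental gain available to the weak tone drops and its output component shrinks, even though its input amplitude is unchanged.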
Neurogram of a cochlear response in the cat
The sound filtering capabilities of the cochlea described above are strikingly
evident in the time-domain responses of large populations of auditory nerve
fibres to synthesized speech-like sounds. The figure below shows
a cochleogram reconstructed from data recorded from the auditory nerve of
a cat. The patterning suggests that groups of fibres respond similarly, even
though their characteristic frequencies may differ by nearly an octave. This
segregation indicates that the effective bandwidths of cochlear filters are
much wider than the narrow tuning curves found in mechanical and neural measurements
based upon iso-response increments at detection threshold. To account for
the observed neural responses, basilar membrane vibration must segment into
a set of regions, each one coherently oscillating at a prevailing frequency
(although with a progressive phase lag of a few cycles towards the apex).
This figure shows that the cochlea responds primarily to formants,
extracting the defining features of the sound source. To filter the perceptually
relevant features of sound, the cochlea exploits the saturation property
of the outer hair cell amplifier (see page Nonlinear undamping)
producing two main effects: response amplitude equalization and
tone-to-tone suppression. Equalization makes the perception of sufficiently
different sound frequency-components largely independent of their
intensity. Suppression of a tone by an adjacent one produces an
effect equivalent, at a mechanical level, to lateral inhibition
typical of neuronal structures. Tone suppression is putatively the
main cause of what, at the psycho-acoustical level, appears as
the phenomenon of frequency masking. As an added bonus, masking
implies a remarkable degree of noise suppression. Simple linear
amplification of input sound cannot substitute for this non-linear effect,
a fact worth considering in the design of acoustic prosthetic devices.
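The difference between linear amplification and a saturating amplifier can be sketched numerically by comparing how each maps two input levels 40 dB apart. The example is illustrative only: the 20 dB gain, the tanh saturation and the two input amplitudes are assumptions, not a model of the outer hair cell amplifier. A linear device leaves the level difference untouched, whereas a saturating device with the same small-signal gain reduces it, which is the equalization effect described above.

```python
import math

N = 1024  # samples per cycle used for the numerical projection

def fundamental_amplitude(device, a):
    """Amplitude of the fundamental in device(a * cos(theta))."""
    acc = 0.0
    for i in range(N):
        theta = 2 * math.pi * i / N
        acc += device(a * math.cos(theta)) * math.cos(theta)
    return 2.0 * acc / N

def linear_amp(x):
    """Fixed 20 dB gain (assumed value)."""
    return 10.0 * x

def compressive_amp(x):
    """Same small-signal gain, but saturating output (tanh is an assumption)."""
    return math.tanh(10.0 * x)

def output_spread_db(device, soft=0.01, loud=1.0):
    """Output level difference, in dB, for two inputs 40 dB apart."""
    ratio = fundamental_amplitude(device, loud) / fundamental_amplitude(device, soft)
    return 20 * math.log10(ratio)

print(f"linear:      {output_spread_db(linear_amp):.1f} dB")
print(f"compressive: {output_spread_db(compressive_amp):.1f} dB")
```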