Figure 1: Temporal and spectral structure of speech and musical sounds:

a) - e) Waveforms and spectrogram of the utterance ``three'';
f) and g) Waveform and spectrogram of instrumental music
(note the sharp ``attack'' phase of each note)

        Figure (1) shows examples of a speech utterance and of notes from a musical instrument (castanet). Figure (1a) shows the waveform of the utterance ``three'' (its spectrogram is in Figure (1b)), which starts with a stop consonant. The characteristic signature of a stop consonant is an (almost) complete closure of the vocal tract followed by a sharp release of broadband energy called a burst. These events are shown enlarged in Figure (1c). The burst qualifies as a spectrally-diffuse component in our terminology. The second burst is followed by aspiration (a noise-like signal which has energy in several frequency bands and hence is presumed to be a sum of spectrally-compact components), which is in turn followed by the onset of periodic voicing.
      

      The signal component corresponding to the narrow first formant region is shown in Figure (1d). Clearly, we can associate a carrier frequency (the frequency of the dominant harmonic) with this spectrally-compact signal. We model such a signal component (actually its complex, or analytic, version) using a bandpass signal model (see [1]). The signal component in the third formant region is shown in Figure (1e). This signal component originates from a broad formant and hence is a sum of many harmonically related components. Adding many time-varying sinusoids produces reinforcement at some time instants and cancellation at others (so that the envelope reaches, or nearly reaches, zero at some time locations). Thus the waveform in Figure (1e) appears to be composed of a sequence of ``bandpass pulses''. We would classify this signal component as spectrally-diffuse. In this case it would seem that the carrier frequency, although important, is not the dominant feature of the signal; the time locations of the bandpass pulses are also relevant parameters. Therefore we model each ``bandpass pulse'' signal in Figure (1e) in the frequency domain (see [1]). This model is of course applicable to the bursts in Figure (1c) as well.
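
The reinforcement and cancellation argument above can be illustrated numerically. The following is only a minimal sketch and is not the model of [1]: it assumes stationary, equal-amplitude harmonics, and the fundamental (100 Hz), the sampling rate (12 kHz), the harmonic numbers and the helper names are all hypothetical choices made for illustration. It builds the analytic signal directly as a sum of complex exponentials and compares the envelope of a single dominant harmonic (spectrally-compact case) with that of six adjacent harmonics (spectrally-diffuse case).

```python
import cmath
import math

# All numeric values below are illustrative assumptions, not taken from the paper.
F0 = 100.0      # assumed fundamental frequency in Hz
FS = 12000.0    # assumed sampling rate in Hz
N_SAMPLES = 180 # 15 ms of signal, i.e. one and a half pitch periods

times = [n / FS for n in range(N_SAMPLES)]


def analytic_sum(freqs_hz, amps, t):
    """Analytic (complex) signal at time t: a sum of complex exponentials."""
    return sum(a * cmath.exp(2j * math.pi * f * t) for f, a in zip(freqs_hz, amps))


def envelope(freqs_hz, amps, ts):
    """Envelope = magnitude of the analytic signal at each time instant."""
    return [abs(analytic_sum(freqs_hz, amps, t)) for t in ts]


def pulse_locations(env, ts, frac=0.9):
    """Times of local envelope maxima that reach at least frac of the global peak."""
    peak = max(env)
    locs = []
    for i, e in enumerate(env):
        left = env[i - 1] if i > 0 else float("-inf")
        right = env[i + 1] if i + 1 < len(env) else float("-inf")
        if e >= frac * peak and e > left and e >= right:
            locs.append(ts[i])
    return locs


# Spectrally-compact case: a single dominant harmonic (narrow formant).
compact_env = envelope([3 * F0], [1.0], times)

# Spectrally-diffuse case: six adjacent harmonics (broad formant), 2400 to 2900 Hz.
diffuse_freqs = [k * F0 for k in range(24, 30)]
diffuse_env = envelope(diffuse_freqs, [1.0] * len(diffuse_freqs), times)
diffuse_pulses = pulse_locations(diffuse_env, times)

if __name__ == "__main__":
    print("compact envelope range:", min(compact_env), max(compact_env))
    print("diffuse envelope range:", min(diffuse_env), max(diffuse_env))
    print("diffuse pulse times (s):", diffuse_pulses)
```

Under these assumptions the single harmonic has a flat envelope, so its carrier frequency describes it well, whereas the sum of harmonics has an envelope that peaks once per pitch period and falls to (numerically) zero in between. This is the sense in which the time locations of the ``bandpass pulses'' become relevant parameters alongside the carrier frequency.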