In lieu of an abstract, here is a brief excerpt of the content:

  • Frequency-Slope Estimation and Its Application to Parameter Estimation for Non-Stationary Sinusoids
  • Axel Röbel

Sinusoidal models are often used for the representation, analysis, or transformation of music or speech signals (Quatieri and McAulay 1986; Amatriain et al. 2002). A necessary step in obtaining the sinusoidal model is estimating the amplitudes, frequencies, and phases of the sinusoids from the peaks of the Discrete Fourier Transform (DFT). The estimation is rather simple provided the signal is stationary. A standard method for this estimation is the quadratically interpolated Fast Fourier Transform (QIFFT) estimator (Abe and Smith 2005). The QIFFT estimator uses the bin at the maximum of each spectral peak together with its two neighbors to establish a second-order polynomial model of the log amplitude and unwrapped phase of the peak. The amplitude and frequency estimates of the sinusoid that is related to the spectral peak are then derived from the height and frequency position of the maximum of the polynomial. The evaluation of the phase polynomial at the frequency position provides the estimate of the phase of the sinusoid.
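
The QIFFT steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference implementation of Abe and Smith (2005): the function name `qifft_peak`, the Hann window, the odd frame length, the zero-padded FFT size, and the test signal are all choices made here for the example. The frame is rotated so that its center lies at time zero, which makes the estimated phase refer to the frame center.

```python
import numpy as np

def qifft_peak(frame, sr, n_fft):
    """Return (freq_hz, amplitude, phase) for the strongest spectral peak.

    Hypothetical helper: assumes an odd-length frame and a Hann window.
    The phase estimate refers to the center of the frame.
    """
    n = len(frame)
    half = (n - 1) // 2
    win = np.hanning(n)                       # symmetric analysis window
    buf = np.zeros(n_fft)
    buf[:n] = frame * win
    buf = np.roll(buf, -half)                 # move the frame center to index 0
    spec = np.fft.rfft(buf)
    log_mag = np.log(np.abs(spec) + 1e-300)   # tiny offset avoids log(0)

    # Peak bin, excluding the first and last bins so both neighbors exist.
    k = int(np.argmax(log_mag[1:-1])) + 1
    a, b, c = log_mag[k - 1], log_mag[k], log_mag[k + 1]

    # Vertex of the parabola through the three log-amplitude values.
    p = 0.5 * (a - c) / (a - 2.0 * b + c)     # offset from bin k, in bins
    peak_log_mag = b - 0.25 * (a - c) * p

    # Parabola through the three unwrapped phase values, evaluated at p.
    pa, pb, pc = np.unwrap(np.angle(spec[k - 1:k + 2]))
    phase = pb + 0.5 * (pc - pa) * p + 0.5 * (pa - 2.0 * pb + pc) * p * p

    freq_hz = (k + p) * sr / n_fft
    amplitude = 2.0 * np.exp(peak_log_mag) / np.sum(win)
    return freq_hz, amplitude, phase

# Stationary test sinusoid (illustrative values only).
sr, n, n_fft = 16000, 1023, 4096
t = (np.arange(n) - (n - 1) // 2) / sr
frame = 0.8 * np.cos(2 * np.pi * 1000.3 * t + 0.5)
freq_est, amp_est, phase_est = qifft_peak(frame, sr, n_fft)
print(freq_est, amp_est, phase_est)
```

For this stationary, noise-free input the three estimates should land close to the true values of 1000.3 Hz, 0.8, and 0.5 rad. The zero-padding factor of about four is chosen so that the parabola fits the main lobe of the Hann window well.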

For non-stationary sinusoids, the parameter estimation becomes more difficult, because the QIFFT algorithm is severely biased whenever the frequency is not constant. The term bias refers to the systematic estimation error, that is, the error of the estimator that exists even if no measurement noise is present. For the partials in natural vibrato signals, the estimation bias of the QIFFT estimator accounts for a significant amount of residual energy (i.e., the energy remaining after subtracting the sinusoidal model from the original signal). This is the major reason for the perceived voiced energy in the residual of vibrato signals.
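
The bias can be made visible with a noise-free experiment. The sketch below is an illustration under assumptions made here (Hann window, frame length, and a frequency slope of 4000 Hz/s chosen for demonstration); it compares the QIFFT amplitude estimate for a stationary sinusoid with the estimate for a linear chirp of the same amplitude and the same center frequency. The helper `qifft_amplitude` is hypothetical and keeps only the amplitude part of the estimator.

```python
import numpy as np

def qifft_amplitude(frame, n_fft):
    """Hypothetical helper: QIFFT amplitude estimate of the strongest peak."""
    win = np.hanning(len(frame))
    spec = np.fft.rfft(frame * win, n_fft)    # zero-padded to n_fft samples
    log_mag = np.log(np.abs(spec) + 1e-300)
    k = int(np.argmax(log_mag[1:-1])) + 1
    a, b, c = log_mag[k - 1], log_mag[k], log_mag[k + 1]
    p = 0.5 * (a - c) / (a - 2.0 * b + c)     # parabola vertex offset in bins
    return 2.0 * np.exp(b - 0.25 * (a - c) * p) / np.sum(win)

sr, n, n_fft = 16000, 1023, 4096
t = (np.arange(n) - (n - 1) // 2) / sr        # time axis centered on the frame
amp, f0, slope = 0.8, 1000.0, 4000.0          # slope in Hz/s (assumed value)

stationary = amp * np.cos(2 * np.pi * f0 * t)
chirp = amp * np.cos(2 * np.pi * (f0 * t + 0.5 * slope * t ** 2))

amp_stationary = qifft_amplitude(stationary, n_fft)
amp_chirp = qifft_amplitude(chirp, n_fft)
print("stationary:", amp_stationary, "chirp:", amp_chirp)
```

No noise is added, so the shortfall in the chirp estimate is purely systematic: the frequency sweep spreads the spectral peak and lowers its maximum. Resynthesizing the partial with this underestimated amplitude leaves part of the sinusoid in the residual.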

A number of algorithms with low estimation bias for non-stationary sinusoids have been proposed. Algorithms that try to implement a maximum likelihood estimate (MLE) generally assume that the amplitude of the sinusoids is constant. As an example, we refer to an algorithm that is based on signal demodulation employing an initial search over a grid of frequencies and frequency slopes and a final fine-tuning of the parameters using an iterative maximization of the amplitude of the demodulated signal (Abatzoglou 1986). Similar to multi-component signals with stationary sinusoids, the MLE of sinusoidal parameters for multi-component signals with frequency-modulated (FM) sinusoids is rather costly, because a highly nonlinear and high-dimensional cost function must be maximized (Saha and Kay 2002). Owing to the computational savings and despite the fact that windowing reduces the estimator efficiency (Offelli and Petri 1992), the windowing technique is generally preferred if the signal contains more than a single sinusoid.
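
The demodulation idea can be illustrated with a coarse grid search. The sketch below covers only the initial search over frequencies and frequency slopes for a single constant-amplitude sinusoid; it omits the iterative fine-tuning and is not the algorithm of Abatzoglou (1986) itself. The function name `chirp_grid_search`, the grid ranges, and the test signal are assumptions made for this example, and the negative-frequency image of the real signal is neglected.

```python
import numpy as np

def chirp_grid_search(frame, sr, freqs_hz, slopes_hz_per_s):
    """Hypothetical helper: pick the (frequency, slope) pair whose
    demodulated signal has the largest summed magnitude."""
    n = len(frame)
    t = (np.arange(n) - (n - 1) / 2.0) / sr   # time axis centered on the frame
    # Candidate phase tracks, shape (num_freqs, num_slopes, n).
    phase = 2.0 * np.pi * (freqs_hz[:, None, None] * t
                           + 0.5 * slopes_hz_per_s[None, :, None] * t ** 2)
    # Demodulate with each candidate and sum over time; a matching candidate
    # turns the sinusoid into a constant, which maximizes the sum.
    score = np.abs(np.sum(frame * np.exp(-1j * phase), axis=-1))
    i, j = np.unravel_index(np.argmax(score), score.shape)
    amplitude = 2.0 * score[i, j] / n         # factor 2: real-valued input
    return freqs_hz[i], slopes_hz_per_s[j], amplitude

# Noise-free linear chirp (illustrative values only).
sr, n = 16000, 1023
t = (np.arange(n) - (n - 1) / 2.0) / sr
frame = 0.8 * np.cos(2 * np.pi * (1000.0 * t + 0.5 * 2000.0 * t ** 2) + 0.3)

freqs = np.arange(950.0, 1051.0, 5.0)         # Hz
slopes = np.arange(-4000.0, 4001.0, 500.0)    # Hz/s
f_est, slope_est, a_est = chirp_grid_search(frame, sr, freqs, slopes)
print(f_est, slope_est, a_est)
```

Even this small grid evaluates several hundred candidate demodulations for one sinusoid in one frame, and the search space grows with every additional component, which gives a sense of why the joint estimate for multi-component signals is expensive.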

Most of the algorithms that employ analysis windows for the parameter analysis of amplitude-modulated (AM) and/or FM sinusoids rely on the assumption that the analysis window is approximately Gaussian, such that a mathematical investigation becomes tractable. Marques and Almeida (1986) developed this approach for sinusoids with linear FM and constant amplitude, and Peeters and Rodet (1999) extended it to sinusoids with linear FM and AM. Abe and Smith (2005) presented a version for sinusoids with linear FM and exponential AM. The method of Abe and Smith (2005) is special in that it tries to extend its range to other analysis windows by means of a set of linear bias-correction functions. The resulting estimator is computationally efficient and achieves small bias for standard windows as long as the zero-padding factor is sufficiently large (i.e., greater than three) and the modulation rates are relatively small.

In this article, we present a bias-correction scheme for parameter estimation of sinusoids with linear AM/FM modulation. As a first step, we provide a mathematical foundation for the conjecture that linear amplitude modulation does not create any additional bias for the QIFFT estimator. With respect to bias reduction, we can therefore ignore the amplitude modulation of the signal. Then we extend an initial version of our bias-reduction method that was originally proposed in Röbel (2006). The basic ideas of the algorithm are similar to those in Abatzoglou (1986) in that the algorithm is based on signal demodulation and maximization of...
