A Percussive Sound Synthesizer Based on Physical and Perceptual Attributes

Mitsuko Aramaki; Richard Kronland-Martinet; Thierry Voinier; Sølvi Ystad

In lieu of an abstract, here is a brief excerpt of the content:


Synthesis of impact sounds is far from a trivial task, owing to the high density of modes generally contained in such signals. Several authors have addressed this problem and proposed different approaches to modeling such sounds. The majority of these models are based on the physics of vibrating structures, as in modal synthesis, for instance (Adrien 1991; Pai et al. 2001; van den Doel, Kry, and Pai 2001; Cook 2002; Rocchesso, Bresin, and Fernström 2003). Nevertheless, modal synthesis is not always suitable for complex sounds, such as those with a high density of mixed modes. Other approaches based on digital signal processing techniques have also been proposed. Cook (2002), for example, proposed a granular-synthesis approach based on a wavelet decomposition of sounds.
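As a point of reference for the modal approach cited above, the following is a minimal sketch in which each mode is rendered as an exponentially damped sinusoid and the modes are summed. It is not the model proposed in this article; the function name `modal_synthesis` and the three example modes are hypothetical and chosen only for illustration.

```python
import math

def modal_synthesis(modes, duration=1.0, sample_rate=44100):
    """Sum exponentially damped sinusoids, one per mode.

    modes: list of (frequency_hz, amplitude, damping_per_second) tuples.
    """
    n_samples = int(duration * sample_rate)
    signal = [0.0] * n_samples
    for freq, amp, damping in modes:
        for n in range(n_samples):
            t = n / sample_rate
            signal[n] += amp * math.exp(-damping * t) * math.sin(2 * math.pi * freq * t)
    return signal

# Hypothetical three-mode object (values are illustrative, not measured).
example_modes = [(220.0, 1.0, 3.0), (563.0, 0.6, 8.0), (1291.0, 0.3, 20.0)]
impact = modal_synthesis(example_modes, duration=0.5)
```

The cost of this sketch grows linearly with the number of modes, which hints at why a very high modal density makes the approach less practical.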

The sound-synthesis model proposed in this article takes into account both physical and perceptual aspects of sounds. Many subjective tests have shown the existence of perceptual cues that allow the source of an impact sound (its material, size, etc.) to be identified merely by listening (Klatzky, Pai, and Krotkov 2000; Tucker and Brown 2002). Moreover, these tests have revealed correlations between physical attributes (the nature of the material and the dimensions of the structure) and perceptual attributes (perceived material and perceived dimensions). In particular, it has been shown that the perception of the material mainly correlates with the damping coefficient of the spectral components contained in the sound. This damping is frequency-dependent, and high-frequency modes are generally more heavily damped than low-frequency modes. Indeed, the dissipation of vibrating energy owing to the coupling between the structure and the air increases with frequency (see, for example, Caracciolo and Valette 1995).
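To make the frequency dependence of the damping concrete, the sketch below computes how long a mode takes to decay by 60 dB when its damping coefficient grows with frequency. The linear law and its coefficients `a0` and `a1` are assumptions made for illustration; the excerpt does not state which damping law the authors use.

```python
import math

def damping(freq_hz, a0=2.0, a1=0.004):
    """Hypothetical damping law (in 1/s) that grows linearly with frequency."""
    return a0 + a1 * freq_hz

def decay_time_60db(freq_hz, **law):
    """Time in seconds for a mode's amplitude to fall by 60 dB (a factor of 1000)."""
    return math.log(1000.0) / damping(freq_hz, **law)

for f in (200.0, 2000.0, 8000.0):
    print(f"{f:7.0f} Hz decays by 60 dB in {decay_time_60db(f):.2f} s")
```

Under this assumed law, raising `a1` shortens the decay of the upper modes in particular, which is one simple way to vary the damping behavior that the text links to perceived material.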

To take this fundamental sound behavior into account from a synthesis point of view, a time-varying filtering technique has been chosen. It is well known that an object's size and shape are mainly perceived through the pitch of the generated sound and its spectral richness. The perception of the pitch primarily correlates with the vibrating modes (Carello, Anderson, and Kunkler-Peck 1998). For complex structures, the modal density generally increases with frequency, so that high-frequency modes overlap and become indiscernible. This phenomenon is well known and is described, for example, in previous works on room acoustics (Kuttruff 1991).
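The time-varying filtering idea mentioned at the start of this paragraph can be illustrated with a very simple stand-in: a one-pole low-pass filter whose cutoff frequency falls over time, so that high-frequency content dies away faster than low-frequency content. The excerpt does not describe the authors' actual filter design, so the filter structure, the function name `decaying_lowpass`, and the parameters `start_cutoff_hz` and `tau` are all assumptions.

```python
import math

def decaying_lowpass(signal, sample_rate=44100, start_cutoff_hz=8000.0, tau=0.1):
    """One-pole low-pass whose cutoff falls exponentially with time constant tau (s)."""
    out = []
    prev = 0.0
    for n, x in enumerate(signal):
        t = n / sample_rate
        cutoff = start_cutoff_hz * math.exp(-t / tau)
        a = math.exp(-2.0 * math.pi * cutoff / sample_rate)  # feedback coefficient
        prev = (1.0 - a) * x + a * prev
        out.append(prev)
    return out

# Probe signal alternating at the Nyquist frequency (the highest representable one).
probe = [1.0 if n % 2 == 0 else -1.0 for n in range(22050)]
filtered = decaying_lowpass(probe)
```

Feeding the filter a broadband excitation instead of the probe gives a sound whose brightness fades over time, which is the qualitative behavior the damping discussion calls for.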

Under such conditions, the human ear determines the pitch of the sound from emergent spectral components with consistent frequency ratios. When a complex percussive sound contains several harmonic or inharmonic series (the latter being series whose spectral components are not exact multiples of the fundamental frequency), different pitches can generally be heard. The dominant pitch then mainly depends on the frequencies and the amplitudes of the spectral components belonging to a so-called dominant frequency region (Terhardt, Stoll, and Seewann 1982) in which the ear is pitch sensitive. (We will discuss this further in the Tuning section of this article.) With all these aspects in mind, and wishing to propose an easy and intuitive control of the model, we have divided it into three parts represented by an excitation element, a material element, and an object element.

The large number of parameters available through such a model necessitates a control strategy. This strategy (generally called a mapping) is of great importance for the expressive capabilities of the instrument, and it inevitably influences the way it can be used in a musical context (Gobin et al. 2004). In this article, we mention some examples of possible strategies, such as an original tuning approach based on the theory of harmony. This approach makes it possible to construct complex sounds like musical chords, in which the root of the chord, its type (major or minor), and its inversions can be chosen by the performer. However, owing to the strong interdependence between mapping and composition, the choice of the strategy should, as far as possible, be available to the composer.
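The kind of chord-level control described above (root, type, and inversion) can be sketched as a small mapping from those three choices to a set of target frequencies. This is only an illustration: it assumes triads in 12-tone equal temperament, which the excerpt does not specify, and the names `CHORD_INTERVALS` and `chord_frequencies` are hypothetical.

```python
CHORD_INTERVALS = {"major": (0, 4, 7), "minor": (0, 3, 7)}  # semitones above the root

def chord_frequencies(root_hz, chord_type="major", inversion=0):
    """Return triad frequencies in 12-tone equal temperament (an assumption)."""
    semitones = list(CHORD_INTERVALS[chord_type])
    # Each inversion moves the current lowest note up one octave.
    for _ in range(inversion):
        semitones.append(semitones.pop(0) + 12)
    return [root_hz * 2.0 ** (s / 12.0) for s in semitones]

# First inversion of a minor triad built on 220 Hz.
a_minor_first_inversion = chord_frequencies(220.0, "minor", inversion=1)
```

In a synthesizer, frequencies like these could serve as tuning targets for the emergent spectral components, leaving the performer with three high-level controls rather than one parameter per component.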

Figure 1. Impact sound synthesis model. This model is divided into three parts representing the object, the excitation, and...

Computer Music Journal
