Machine Learning of Jazz Grammars

Jon Gillick; Kevin Tang; Robert M. Keller

In the context of an educational software tool that can generate novel jazz solos using a probabilistic grammar (Keller 2007), this article describes the automated learning of such grammars. Learning takes place from a corpus of transcriptions, typically from a single performer, and our methods attempt to improvise solos representative of that performer's style. To capture idiomatic gestures of a specific soloist, we extend an earlier grammar representation (Keller and Morrison 2007) with a technique for representing melodic contour. Representative contours are extracted from a corpus using clustering, and sequencing among contours is done using Markov chains that are encoded into the grammar.

This article first defines the basic building blocks for contours of typical jazz solos, which we call slopes, then shows how these slopes may be incorporated into a grammar wherein the notes are chosen according to tonal categories relevant to jazz playing. We show that melodic contours can be accurately portrayed using slopes learned from a corpus. Experimental results, including blind comparisons of solos generated from grammars based on several corpora, are reported.
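As a rough illustration of contour building blocks, the following Python sketch splits a melody into runs of intervals that share a direction. The excerpt does not give the article's formal definition of a slope, so the definition used here (a maximal run of consecutive intervals moving the same way) is an assumption, as are the function names and the use of MIDI note numbers.

```python
from typing import List, Tuple


def sign(x: int) -> int:
    """Return 1 for a rising interval, -1 for a falling one, 0 for a repeat."""
    return (x > 0) - (x < 0)


def split_into_slopes(pitches: List[int]) -> List[Tuple[int, List[int]]]:
    """Group consecutive intervals into maximal runs with the same direction.

    Assumed definition, not taken from the article: a "slope" is a run of
    intervals (in semitones) that all rise, all fall, or all repeat.
    Returns a list of (direction, intervals) pairs.
    """
    slopes: List[Tuple[int, List[int]]] = []
    for prev, cur in zip(pitches, pitches[1:]):
        interval = cur - prev
        direction = sign(interval)
        if slopes and slopes[-1][0] == direction:
            # Same direction as the current run: extend it.
            slopes[-1][1].append(interval)
        else:
            # Direction changed: start a new run.
            slopes.append((direction, [interval]))
    return slopes


# Hypothetical melody given as MIDI note numbers.
melody = [60, 62, 64, 67, 65, 62, 62, 69]
slopes = split_into_slopes(melody)
```

For this melody the sketch yields a rising run, a falling run, a repeated note, and a final leap upward. Summarizing each run by its smallest and largest interval would be one plausible way to make such runs comparable across a corpus.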

Related Work

Grammars form the basis of our melody generation technique (Keller and Morrison 2007). Use of grammars for creating musical structures has also been suggested or investigated by numerous researchers (Winograd 1968; Roads 1979; Bell and Kippen 1992; Cope 1992; McCormack 1996; Pachet 1999; Papadopoulos and Wiggins 1999; and others). Dubnov et al. (2003) used probabilistic and statistical machine learning methods for musical style recognition. Eck and Lapalme (2008) investigated automatic composition and improvisation with neural networks, and Cruz-Alcazar and Vidal-Ruiz (1998) developed a method for learning grammars to model musical style. The latter applied three grammatical inference algorithms to automatic composition of melodies in Gregorian, Bach, and Joplin styles, achieving the best results with the Gregorian melodies. They classified 20 percent of composed melodies as very good, which they defined as able to be “taken as an original piece from the current style without being a copy or containing evident fragments from samples.” We strove toward the same definition of “very good” solos.

An important part of our representation involves a formalization of melodic contour. Kim et al. (2000) used contours for musical classification and querying, and Chang and Jiau (2003) investigated musical contour with applications to extracting repeating figures and themes from music. In addition, De Roure and Blackburn (2000) proposed melodic pitch contours for content-based navigation of music.

We use clustering as a means of organizing and abstracting a large variety of similar melodic fragments, and Markov chains to represent likely transitions between abstract fragments. Kang, Ku, and Kim (2001) used a graphical clustering algorithm for extraction of melodic themes. Jones (1981) described uses of both Markov chains and grammars in music composition. Verbeurgt, Dinolfo, and Fayer (2004), among others, used Markov models as a means for composition by learning transition probabilities between patterns. Ames (1989) dealt with different-sized Markov chains of notes.
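The idea of a Markov chain over abstract fragments can be sketched as follows. This is a minimal first-order example, assuming each melodic fragment has already been assigned a cluster label; the labels, the function names, and the choice of a first-order chain are illustrative assumptions rather than details from the article.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, List, Sequence


def transition_probabilities(
    sequences: Sequence[Sequence[str]],
) -> Dict[str, Dict[str, float]]:
    """Estimate first-order transition probabilities between cluster labels."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for current, c in counts.items()
    }


def generate(
    probs: Dict[str, Dict[str, float]],
    start: str,
    length: int,
    rng: random.Random,
) -> List[str]:
    """Walk the chain from `start`, stopping early at a label with no successor."""
    chain = [start]
    while len(chain) < length and chain[-1] in probs:
        successors = probs[chain[-1]]
        choice = rng.choices(list(successors), weights=list(successors.values()))[0]
        chain.append(choice)
    return chain


# Hypothetical cluster labels for the fragments of two transcribed solos.
corpus = [["A", "B", "A", "C"], ["A", "B", "B", "C"]]
probs = transition_probabilities(corpus)
sample = generate(probs, start="A", length=5, rng=random.Random(0))
```

In this toy corpus, cluster A is followed by B two times out of three and by C once, so `probs["A"]` assigns those successors probabilities of 2/3 and 1/3.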

Jazz Improvisation

Ideally, jazz improvisation involves the creation of new melodies while the melodies themselves are being performed. This process is often informed by the creation and practice of vocabulary ideas prior to performance. One of the purposes of our work is to construct software tools that facilitate the construction and recording of such ideas. Such a tool can also be used to transcribe and analyze existing ideas. The present work shows that, once transcribed, a solo can be put to use in the creation of a grammar, which can then be used to provide improvisations over any chord progression, not just the ones in the corpus.

Although a given jazz performer might not be aware of how he or she improvises, it seems reasonable to say that ideas of what one is able and willing to play can be captured in the form of patterns or, more generally, some form of grammar. A finite set of patterns can always be described by an ad hoc grammar. A grammar that is too ad hoc, however, would tend to generate only very predictable, and thus eventually uninteresting, melodies...
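The point about ad hoc grammars can be made concrete with a small sketch. Assuming a grammar is a mapping from a nonterminal to weighted productions, a finite pattern set becomes a grammar with one production per pattern, and expanding it can only ever reproduce those patterns. The representation, the function names, and the note names below are hypothetical and are not the notation used by the authors' tool.

```python
import random
from typing import Dict, List, Sequence, Tuple

# A production is (weight, right-hand side); a grammar maps nonterminals to productions.
Rule = Tuple[float, List[str]]
Grammar = Dict[str, List[Rule]]


def ad_hoc_grammar(patterns: Sequence[Sequence[str]]) -> Grammar:
    """Build a grammar with one equally weighted production per pattern."""
    weight = 1.0 / len(patterns)
    return {"S": [(weight, list(p)) for p in patterns]}


def expand(grammar: Grammar, symbol: str, rng: random.Random) -> List[str]:
    """Recursively expand `symbol`; symbols without productions are terminals."""
    if symbol not in grammar:
        return [symbol]
    rules = grammar[symbol]
    _, rhs = rng.choices(rules, weights=[w for w, _ in rules])[0]
    result: List[str] = []
    for s in rhs:
        result.extend(expand(grammar, s, rng))
    return result


# Hypothetical patterns written as note names.
patterns = [["C4", "E4", "G4"], ["G4", "F4", "E4", "D4"]]
grammar = ad_hoc_grammar(patterns)
melody = expand(grammar, "S", random.Random(1))
```

Every expansion of `S` is one of the stored patterns, which illustrates why such a grammar is predictable: it has no intermediate nonterminals through which material could be recombined.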

Computer Music Journal