
  • An Evaluation of Musical Score Characteristics for Automatic Classification of Composers
  • Ofer Dor and Yoram Reich

Although humans can distinguish between different types of music, automated music classification is a great challenge. Within the last decade, numerous studies have been conducted on the subject, using both audio and score analysis (Kranenburg and Baker 2004; Manaris et al. 2005; Laurier and Herrera 2007; Weihs et al. 2007; Raphael 2008; Laurier et al. 2009). The classifications in these studies were done mostly by inference methods and/or machine-learning methods. The results have been quite modest (Kranenburg and Baker 2004; Laurier et al. 2009).

As music can be classified in many ways, studies have focused on diverse classification targets. Kranenburg and Baker (2004) have shown that it is possible to automatically recognize musical style from compositions of five well known 18th-century composers. Numerous algorithms have been proposed to detect important musical features (melodic, rhythmic, and harmonic) with data mining and machine-learning techniques in large corpora of scores (Hartmann et al. 2007). Geertzen and Zaanen (2008) presented an approach to automatic composer recognition based on learning recurring patterns in music by grammatical inference.

Music can be represented in audio or as notation. Existing classification studies encode features with different representations. Clearly, the nature of the representation is a major determinant of the success of the classification. Therefore, the difficulty that present approaches have in classifying composers by their compositions stems from using features that do not sufficiently capture the differences between the composers.

Manaris et al. (2005), using a new set of features (metrics), achieved a classification accuracy of 94 percent for five composers. Their experiments, however, seem questionable: They performed only one holdout test instead of multiple cross-validation tests with statistically significant results, as is customary in data-mining or machine-learning studies (Reich and Barai 1999; Demsar 2006). Therefore, it is difficult to assess the benefit of their proposed features. Furthermore, the distribution of the composers in the training and testing instances is not clear. (A later paper [Manaris et al. 2008] did use cross-validation, but with respect to genre classification rather than composer classification.)
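The difference between a single holdout test and repeated cross-validation can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration: the toy feature vectors, the composer labels "A" and "B", and the nearest-centroid classifier are stand-ins and are not taken from any of the cited studies. The folds are stratified, so each test fold contains a known, equal share of each composer, which addresses the concern about unclear class distributions.

```python
import random
from statistics import mean, stdev


def stratified_folds(labels, k, seed=0):
    """Split instance indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def fit_centroids(X, y):
    """Hypothetical stand-in classifier: one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        total = sums.setdefault(label, [0.0] * len(row))
        for d, value in enumerate(row):
            total[d] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(centroids, row):
    """Return the label of the centroid closest to row (squared distance)."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def cross_validate(X, y, k=5, seed=0):
    """Return one accuracy per fold; each instance is tested exactly once."""
    accuracies = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit_centroids([X[i] for i in train_idx],
                              [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


if __name__ == "__main__":
    # Toy, well-separated data: ten instances per hypothetical composer.
    X = ([[i * 0.1, i * 0.1] for i in range(10)]
         + [[10 + i * 0.1, 10 + i * 0.1] for i in range(10)])
    y = ["A"] * 10 + ["B"] * 10
    scores = cross_validate(X, y, k=5)
    print("per-fold accuracy:", scores)
    print("mean:", mean(scores), "std dev:", stdev(scores))
```

Reporting the per-fold spread alongside the mean is what allows a statistical comparison between feature sets; a single holdout split yields only one number and no estimate of its variability.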

We propose an approach that classifies composers of classical music based on certain low-level characteristics of their compositions. The proposed composition characteristics are descriptive features derived from the time-ordered sequence of pitches in a composition. These are simple features appropriate for a data-mining application; they are not high-level music-theoretical features, such as the results of traditional harmonic, contrapuntal, motivic, or metrical analysis. Nevertheless, our results indicate that, using these features, classification by composer in a two-composer data set can usually be done with greater than 90 percent accuracy. Our results also show that classifying composers with works in the same genre and/or the same instrumentation achieves higher accuracy than does classifying works without regard to genre or instrumentation. (Although the term “genre” has multiple meanings in music scholarship, we use it here to refer to a major period of classical music or, in the case of ragtime, to a particular historical style.) We show, for example, that distinguishing between keyboard music of Mozart and Haydn can be done with 75 percent accuracy; because these composers have very similar styles, this is considered a challenging task for most humans. Further, we demonstrate the contribution of individual features to the classification accuracy in data sets containing multiple composers, as well as in data sets containing only two composers.
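To make the notion of descriptive features derived from a time-ordered pitch sequence concrete, here is a minimal sketch. The specific features below (pitch range, mean pitch, and interval statistics) are hypothetical examples chosen for illustration and are not the features evaluated in this study; the representation of pitches as MIDI note numbers is likewise an assumption.

```python
def pitch_sequence_features(pitches):
    """Compute illustrative low-level features from a pitch sequence.

    pitches: time-ordered list of integers, assumed here to be MIDI note
    numbers (60 = middle C). The feature names are hypothetical.
    """
    if len(pitches) < 2:
        raise ValueError("need at least two pitches to form an interval")

    # Signed melodic intervals, in semitones, between consecutive pitches.
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    n = len(intervals)

    return {
        "pitch_range": max(pitches) - min(pitches),
        "mean_pitch": sum(pitches) / len(pitches),
        "mean_abs_interval": sum(abs(i) for i in intervals) / n,
        # Share of moves by one or two semitones (stepwise motion).
        "frac_steps": sum(1 for i in intervals if 1 <= abs(i) <= 2) / n,
        # Share of moves larger than two semitones.
        "frac_leaps": sum(1 for i in intervals if abs(i) > 2) / n,
        # Share of immediately repeated pitches.
        "frac_repeats": sum(1 for i in intervals if i == 0) / n,
    }


if __name__ == "__main__":
    melody = [60, 62, 64, 65, 67, 67, 72]  # C D E F G G C'
    for name, value in pitch_sequence_features(melody).items():
        print(f"{name}: {value:.3f}")
```

Each composition would yield one such fixed-length feature vector, which is the kind of input a standard data-mining classifier expects.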

In what follows, we describe the data structure used in this study, including the new features discovered by a machine-learning program called CHECKUP, and we discuss the experiments and their results. The data sets and scores are freely available at

Data Structure

The notation of a musical score needs to be converted to different syntaxes that are suitable for machine-learning classifiers.

For nine composers, all available musical score files in **kern format (Huron 1997) were downloaded from the Humdrum project’s library at The scores are for keyboard (e.g., piano or organ) or for string instruments. The **kern format is...


Additional Information

pp. 86-97