
 


 

Learning Musical Expressions

By Gerhard Widmer, Department of Medical Cybernetics and Artificial Intelligence, University of Vienna,
and Austrian Research Institute for Artificial Intelligence, Vienna, Austria

The goal of this research is to gain a deeper understanding of musical expression and to contribute to the branch of musicology that tries to develop quantitative models and theories of musical expression. Expressive music performance refers to the variations in tempo, timing, dynamics (loudness), articulation, etc. that performers apply when playing and "interpreting" a piece. Our goal is to study real expressive performances with machine learning methods, in order to discover some fundamental patterns or principles that characterize "sensible" musical performances, and to elucidate the relation between structural aspects of the music and typical or musically "sensible" performance patterns. The ultimate result would be a formal model that explains or predicts those aspects of expressive variation that seem to be common to most typical performances and can thus be regarded as fundamental principles.

Here are some results of an early experiment with waltzes by Frederic Chopin. The training pieces were (only!) five rather short excerpts (about 20 bars on average) from the three waltzes Op.64 no.2, Op.69 no.2, and Op.70 no.3, played by the author on an electronic piano and recorded via MIDI. These examples were processed and presented to a hybrid learning algorithm that induces rules that predict both a categorical class (e.g., "shorten the note" vs. "lengthen the note") and a precise numeric value ("shorten the note by a factor of 0.896"). The algorithm was first described in (Widmer, 1993). The results of learning were then tested by having the system play other excerpts from Chopin waltzes. The only expression dimensions considered were tempo and dynamics; other aspects like articulation (e.g., staccato vs. legato) and the use of the piano pedals were ignored.
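To make the form of the learned knowledge more concrete, here is a minimal sketch in Python of how rules that couple a categorical decision with a precise numeric factor could be represented and applied note by note. The names (Note, Rule, apply_rules), the chosen note features, and the example rule are illustrative assumptions, not the actual system described in (Widmer, 1993).

    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class Note:
        pitch: int                 # MIDI pitch number
        duration: float            # nominal (score) duration in beats
        velocity: int              # MIDI velocity (loudness), 0-127
        metrical_strength: float   # hypothetical structural feature, e.g. 1.0 on downbeats

    @dataclass
    class Rule:
        condition: Callable[[Note], bool]  # structural precondition on the note
        category: str                      # e.g. "shorten the note" vs. "lengthen the note"
        factor: float                      # precise numeric prediction, e.g. 0.896

    def apply_rules(notes: List[Note], rules: List[Rule]) -> List[Note]:
        """Scale each note's duration by the factor of the first matching rule."""
        performed = []
        for n in notes:
            factor = 1.0                   # default: play the note as notated
            for r in rules:
                if r.condition(n):
                    factor = r.factor
                    break
            performed.append(Note(n.pitch, n.duration * factor,
                                  n.velocity, n.metrical_strength))
        return performed

    # Hypothetical example rule: shorten notes on weak beats by a factor of 0.896.
    rules = [Rule(lambda n: n.metrical_strength < 0.5, "shorten the note", 0.896)]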

The enclosed sound examples demonstrate the effect of learning. For each of three test pieces, you can listen

  1. to the piece as the system plays it *before learning*; this is a perfectly "mechanical" performance, with constant ('flat') tempo and no variations in dynamics, as the system knows absolutely nothing about expressive performance;
  2. to the piece as played by the system *after learning*: the learner applies the tempo and dynamics rules learned from the training pieces. You will hear a clear difference from the "non-expressive" performance in terms of timing and loudness. You will also hear a number of mistakes, of course; remember that this was learned from very little training information. Also, note that the learner is not aware of expression marks given in the musical score; it sees only the notes. (A rough sketch of the contrast between the two renderings follows this list.)
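The following sketch (hypothetical function and parameter names; not the system's actual playback code) shows what distinguishes the two renderings: with all factors left at 1.0 it produces the flat, pre-learning performance, while supplying learned per-note tempo and dynamics factors produces the expressive one.

    def render(onsets_beats, base_seconds_per_beat=0.5,
               tempo_factors=None, dyn_factors=None, base_velocity=64):
        """Map score onsets (in beats) to performance times and MIDI velocities.

        With tempo_factors and dyn_factors left as None, every note gets the
        factor 1.0, i.e. the "mechanical" pre-learning rendering; passing
        learned per-note factors yields the expressive rendering.
        """
        n = len(onsets_beats)
        tempo_factors = tempo_factors if tempo_factors is not None else [1.0] * n
        dyn_factors = dyn_factors if dyn_factors is not None else [1.0] * n
        times, velocities = [], []
        t = 0.0
        for i in range(n):
            times.append(t)
            velocities.append(min(127, int(round(base_velocity * dyn_factors[i]))))
            if i + 1 < n:
                gap_beats = onsets_beats[i + 1] - onsets_beats[i]
                # a local tempo factor > 1.0 means "faster here", so it shrinks
                # the time until the next onset
                t += gap_beats * base_seconds_per_beat / tempo_factors[i]
        return times, velocities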

Some aspects of these performances are also illustrated graphically. The attached expression curves are to be interpreted as follows:

  • The tempo curve plots the relative tempo applied to each note – the higher the curve, the faster the local tempo. A purely mechanical performance with constant tempo would correspond to a straight line at y=1.0.
  • The dynamics curves plot the relative loudness with which each note was played - the higher, the louder. 1.0 is the average loudness of the entire performance.

Arrows and other marks were added by the author to highlight structural regularities inherent in the performances.
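As a rough guide to how such curves can be derived from a recorded performance (the data layout and names below are assumptions, not the analysis code used for the attached figures), relative tempo can be read off the inter-onset intervals and relative loudness off the MIDI velocities:

    def expression_curves(score_onsets_beats, perf_onsets_sec, velocities,
                          nominal_seconds_per_beat):
        """Return (tempo_curve, dynamics_curve), each normalized around 1.0.

        The tempo curve has one value per inter-onset interval (one fewer entry
        than there are notes); the dynamics curve has one value per note.
        """
        tempo_curve = []
        for i in range(len(score_onsets_beats) - 1):
            score_gap = score_onsets_beats[i + 1] - score_onsets_beats[i]
            perf_gap = perf_onsets_sec[i + 1] - perf_onsets_sec[i]
            # nominal duration / actual duration: > 1.0 means faster than notated;
            # a constant-tempo performance gives a flat line at 1.0
            tempo_curve.append(score_gap * nominal_seconds_per_beat / perf_gap)
        mean_velocity = sum(velocities) / len(velocities)
        dynamics_curve = [v / mean_velocity for v in velocities]  # 1.0 = average loudness
        return tempo_curve, dynamics_curve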

 

Materials

The following examples are available:

chopin1a_scr.wav (3,393 KB) / .mid (2 KB): Chopin Waltz op. 18, Eb major (beginning), before learning

chopin1a.wav (3,449 KB) / .mid (2 KB): Chopin Waltz op. 18, Eb major (beginning), after learning

chopin1a_dynamics.gif (11 KB): Dynamics (= loudness) curve corresponding to chopin1a.wav/mid

chopin1a_tempo.gif (11 KB): Tempo curve corresponding to chopin1a.wav/mid

chopin7b_scr.wav (3,348 KB) / .mid (2 KB): Chopin Waltz op. 64, no. 2, C# minor (beginning of 2nd part), before learning

chopin7b.wav (3,298 KB) / .mid (2 KB): Chopin Waltz op. 64, no. 2, C# minor (beginning of 2nd part), after learning

chopin7b_tempo.gif (10 KB): Tempo curve corresponding to chopin7b.wav/mid

chopin10_scr.wav (3,679 KB) / .mid (2 KB): Chopin Waltz op. 69, no. 2, B minor (beginning), before learning

chopin10.wav (1,870 KB) / .mid (5 KB): Chopin Waltz op. 69, no. 2, B minor (beginning, longer passage), after learning

References

Widmer, G. (1993). Combining Knowledge-Based and Instance-Based Learning to Exploit Qualitative Knowledge. Informatica 17, Special Issue on Multistrategy Learning, pp. 371-385.

Widmer, G. (1997). Applications of Machine Learning to Music Research: Empirical Investigations into the Phenomenon of Musical Expression. In R.S. Michalski, I. Bratko, and M. Kubat (eds.), Machine Learning and Data Mining: Methods and Applications. Wiley, Chichester, UK.

This research is currently being continued in the form of a large basic research project, financed by a generous grant from the Austrian Federal Government (see Acknowledgments). An overview of the project and an up-to-date list of publications can be found on the project's web page.

 

Acknowledgments

The continuation of this research is supported by the Austrian Federal Government via a very generous START Research Prize (START programme Y99-INF). The Austrian Research Institute for Artificial Intelligence acknowledges basic financial support from the Austrian Federal Ministry for Education, Science, and Culture.

 

 
