SaxEx: A Case-Based Reasoning approach to expressive music synthesis

by J. L. Arcos & R. López de Mántaras, IIIA - Artificial Intelligence Research Institute, CSIC - Spanish Scientific Research Council, Barcelona, Spain

One of the major difficulties in the automatic generation of music is to endow the resulting piece with the expressiveness that characterizes human performers. Following musical rules, no matter how sophisticated and complete they are, is not enough to achieve expression, and indeed computer music usually sounds monotonous and mechanical. The main problem is to grasp the performer's personal touch, that is, the knowledge brought to bear when performing a score. A large part of this knowledge is implicit and very difficult to verbalize. Although AI approaches based on declarative knowledge representations are very useful for modelling musical knowledge, and indeed we represent such knowledge declaratively in our system, they have serious limitations when it comes to grasping performance knowledge. An alternative approach, much closer to the observation/imitation/experimentation process observed in human performers, is to use directly the performance knowledge implicit in examples of human performances and let the system imitate them. To this end we have developed SaxEx [1,2], a case-based reasoning (CBR) system capable of generating expressive performances of melodies based on examples of human performances. CBR is indeed an appropriate methodology for solving problems by means of examples of similar, already solved problems.

System description

The problem-solving task of the system is to infer, via imitation and using its case-based reasoning capability, a set of expressive transformations to be applied to every note of an inexpressive musical phrase given as input. To achieve this, it uses a case memory containing human performances and background musical knowledge, namely Narmour's theory of musical perception [3] and Lerdahl & Jackendoff's Generative Theory of Tonal Music (GTTM) [4]. The score, containing both melodic and harmonic information, is also given. The expressive transformations to be decided and applied by the system affect the following expressive parameters: dynamics, rubato, vibrato, articulation, and attack. Except for the attack, each parameter of the notes in the human-performed musical phrases is qualified, using the SMS (Spectral Modeling and Synthesis) system [5], by means of five ordered values. For dynamics, for example, the values are very low, low, medium, high, and very high, and they are computed automatically relative to the average loudness of the inexpressive input phrase. The same idea is used for rubato, vibrato (very little vibrato to very high vibrato) and articulation (very legato to very staccato). The meanings of these values are modelled by means of fuzzy sets, as sketched below. For the attack there are just two situations: reaching the pitch from a lower pitch, or increasing the noise component of the sound.
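
As a rough illustration of this qualitative encoding (not the actual SaxEx/Noos implementation), the sketch below shows one way to map a parameter value normalized to [0, 1], such as loudness relative to the phrase average, onto the five ordered labels via overlapping triangular fuzzy sets. The names, the normalization and the shape of the membership functions are assumptions made for the example.

# Illustrative sketch only: a possible fuzzy encoding of the ordered
# qualitative values described above; not the SaxEx/Noos representation.
from dataclasses import dataclass

DYNAMICS_LABELS = ["very low", "low", "medium", "high", "very high"]

def fuzzify(value: float, labels: list[str]) -> dict[str, float]:
    """Map a value normalized to [0, 1] (e.g. loudness relative to the
    average of the inexpressive input phrase) onto overlapping triangular
    fuzzy sets, one per ordered label."""
    width = 1.0 / (len(labels) - 1)          # spacing between label centres
    return {
        label: max(0.0, 1.0 - abs(value - i * width) / width)
        for i, label in enumerate(labels)
    }

@dataclass
class ExpressiveNote:
    """Qualified expressive parameters of one performed note."""
    dynamics: dict[str, float]
    rubato: dict[str, float]
    vibrato: dict[str, float]
    articulation: dict[str, float]
    attack: str                               # "from lower pitch" or "noisy"

# A note played noticeably louder than the phrase average:
print(fuzzify(0.7, DYNAMICS_LABELS))
# roughly {'medium': 0.2, 'high': 0.8, all others 0.0}, up to float rounding

With evenly spaced, overlapping triangles the membership degrees of any value in [0, 1] sum to one, so a note is typically described by at most two neighbouring labels.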

The interactive possibilities of SaxEx allow the user to choose among a variety of alternatives that affect the resulting expression. In particular, the user can express which aspects matter most when searching for similar notes at the retrieval step by ranking the following musical concepts: metrical strength, harmonic stability, melodic direction, the implication/realization structure, the time-span reduction, the prolongation reduction, note duration, etc. The user can also specify preferences along three affective dimensions: "tender-aggressive", "sad-joyful", and "calm-restless" (a possible encoding is sketched below). This interactivity makes SaxEx a very interesting environment for experimenting with and learning about expression in music.
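
A minimal sketch of how such preferences might be represented, assuming a hypothetical UserPreferences structure: the criterion names come from the list above, while the numeric ranges and the rank-based weighting are assumptions for illustration, not the SaxEx interface.

# Hypothetical sketch of the user preferences described above; not the
# actual SaxEx interface. Ranges and weighting scheme are assumptions.
from dataclasses import dataclass, field

@dataclass
class UserPreferences:
    # Musical concepts ranked by decreasing importance for the retrieval step.
    criteria_order: list[str] = field(default_factory=lambda: [
        "harmonic stability",
        "metrical strength",
        "implication/realization structure",
        "melodic direction",
        "note duration",
    ])
    # Affective dimensions, from -1.0 (first pole) to +1.0 (second pole).
    tender_aggressive: float = -0.5   # leaning tender
    sad_joyful: float = 0.8           # clearly joyful
    calm_restless: float = 0.0        # neutral

    def criterion_weight(self, criterion: str) -> float:
        """Earlier-ranked criteria get larger weights (simple 1/rank rule)."""
        return 1.0 / (self.criteria_order.index(criterion) + 1)

prefs = UserPreferences()
print(prefs.criterion_weight("metrical strength"))   # 0.5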

The similarity-reasoning capabilities of the CBR system, guided by the background musical knowledge, allow it to retrieve those notes in the case base of expressive examples (human performances) that are, musically speaking, similar to the current inexpressive note of the input. The expressive way in which the current inexpressive note will be played (decided in the so-called Reuse step) depends either on one of several criteria, such as a majority rule, a minority rule, or random choice, or, better still in the new version of the system, on a fuzzy combination of the fuzzy values of the expressive parameters of the retrieved similar notes.
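
The sketch below illustrates two of the reuse strategies just mentioned: a simple majority rule over the retrieved labels, and a fuzzy combination that averages membership degrees before choosing the strongest label. The function names and the averaging operator are assumptions for the example, not necessarily the combination method used by SaxEx.

# Illustrative reuse strategies over the retrieved similar notes; the
# averaging operator is an assumption, not necessarily the one SaxEx uses.
from collections import Counter

def majority_rule(retrieved_labels: list[str]) -> str:
    """Choose the label most frequently attached to the retrieved notes."""
    return Counter(retrieved_labels).most_common(1)[0][0]

def fuzzy_combination(retrieved_degrees: list[dict[str, float]]) -> str:
    """Average the fuzzy membership degrees of the retrieved notes and
    return the label with the highest combined degree."""
    combined: dict[str, float] = {}
    for degrees in retrieved_degrees:
        for label, mu in degrees.items():
            combined[label] = combined.get(label, 0.0) + mu / len(retrieved_degrees)
    return max(combined, key=combined.get)

# Three retrieved notes "vote" on the dynamics of the current note:
print(majority_rule(["high", "medium", "high"]))                # high
print(fuzzy_combination([{"high": 0.8, "medium": 0.2},
                         {"high": 0.4, "medium": 0.6},
                         {"high": 0.9, "very high": 0.1}]))     # high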

The following URLs contain information and sound examples of this system:

http://www.iiia.csic.es/~arcos/noos/Demos/Example.html

http://www.iiia.csic.es/~arcos/noos/Demos/Aff-Example.html

References

[1] J.L. Arcos and R. López de Mántaras, An interactive CBR approach for generating expressive music, Applied Intelligence, Vol. 14, No. 1, 2001, pp. 115-129.

[2] J.L. Arcos, R. López de Mántaras, and X. Serra, SaxEx: A case-based reasoning system for generating expressive musical performances, Journal of New Music Research, Vol. 27, No. 3, 1998, pp. 194-210.

[3] E. Narmour, The Analysis and Cognition of Basic Melodic Structures: The Implication-Realization Model, University of Chicago Press, 1990.

[4] F. Lerdahl and R. Jackendoff, An overview of hierarchical structure in music, in: Machine Models of Music (S.M. Schwanauer and D.A. Levitt, eds.), The MIT Press, 1993, pp. 289-312.

[5] X. Serra, J. Bonada, P. Herrera, and R. Loureiro, Integrating complementary spectral methods in the design of a musical synthesizer, in: Proceedings of the International Computer Music Conference (ICMC'97), 1997, pp. 152-159.

 

 

 
