I was invited to participate in the Mapping Social Interaction through Sound symposium on 27-28 November 2020. The symposium is organised by Humboldt University, Berlin and – as it is customary these days – will take place on Zoom.
This is the abstract of my talk.
Building and exploring multimodal musical corpora:
from data collection to interaction design using machine learning
Musical performance is a multimodal experience, for performers and listeners alike. A multimodal representation of a piece of music can contain several synchronized layers, such as audio, symbolic representations (e.g. a score), videos of the performance, physiological and motion data describing the performers movements, as well as semantic labelling and annotations describing expressivity and other high-level qualities of the music. This delineates a scenario where computational music analysis can harness cross-modal processing and multimodal fusion methods to shift the focus toward the relationships that tie together different modalities, thereby revealing the links between low-level features and high-level expressive qualities.
I will present two concurrent projects focussed on harnessing musical corpora for analysing expressive instrumental music performance and design musical interactions. The first project is centered on a data collection method – currently being developed by the GEMM research cluster at the School of Music in Piteå – aimed at bridging the gap between qualitative and quantitative approaches. The purpose of this method is to build a data corpus containing multimodal measurements linked to high-level subjective observations. By applying stimulated recall (a common qualitative research method in education, medicine, and psychotherapy) the embodied knowledge of music professionals is systematically included in the analytic framework. Qualitative analysis through stimulated recall is an efficient method for generating higher-level understandings of musical performance. Initial results suggest that this process is pivotal in building our multimodal corpus, providing insights that would be unattainable using quantitative data alone.
The second project – a joint effort with the Computing Department at Goldsmiths, University of London – consists in a sonic interaction design approach that makes use of deep reinforcement learning to explore many mapping possibilities between large sound corpora and motion sensor data. The design approach adopted is inspired by the ideas established by the interactive machine learning paradigm, as well as by the use of artificial agents in computer music for exploring complex parameter spaces. We refer to this interaction design approach as Assisted Interactive Machine Learning (AIML). While playing with a large corpus of sounds through gestural interaction by means of a motion sensor, the user can give feedback to an artificial agent about the gesture-sound mappings proposed by the latter. This iterative process results in an interactive exploration of the corpus, as well as in a way of creating and refining gesture-sound mappings.
These projects are representative of how the development of methods for combining qualitative and quantitative data, in conjunction with the use of computational techniques such as machine learning, can be instrumental in the design of complex mappings between body movement and musical sound, and contribute to the study of the multiple facets of embodied music performance.
Visi, F. G., Östersjö, S., Ek, R., & Röijezon, U. (2020). Method development for multimodal data corpus analysis of expressive instrumental music performance. Frontiers in Psychology, 11(576751), doi: 10.3389/fpsyg.2020.576751
Download PDF (pre-print)
Visi, F. G., & Tanaka, A. (2021). Interactive Machine Learning of Musical Gesture. In E. R. Miranda (Ed.), Handbook of Artificial Intelligence for Music: Foundations, Advanced Approaches, and Developments for Creativity. Springer Nature, forthcoming.
View on arXiv.org
Download PDF (pre-print)
Visi, F. G., & Tanaka, A. (2020). Towards Assisted Interactive Machine Learning: Exploring Gesture-Sound Mappings Using Reinforcement Learning. In ICLI 2020 – the Fifth International Conference on Live Interfaces.