
Matteo Lionello

Research and Development in Machine Learning & Deep Learning Technologies for Audio and Sound

Curriculum


Education and Job Experience


IMT Lucca 2022 - ongoing
PhD student in Cognitive, Computational and Social Neuroscience at the Social and Affective Neuroscience (SANE) research group
Machine learning research consultancy for audio, UK 2021 - 2022
The Bartlett Institute, University College London (UCL), London October 2018 - February 2021
MPhil Student at the Bartlett Institute
Thesis title: A new methodology for modelling urban soundscapes: a psychometric revisitation of the current standard and a Bayesian approach for individual response prediction
Universitat Pompeu Fabra, Barcelona September - December 2017
Visiting student at Music Technology Group, MTG - TELMI ERC project
Project title: A Machine Learning Approach to Violin Vibrato Modelling in Audio Performances and a Didactic Application for Mobile Devices
Aalborg University, Copenhagen October 2016 - June 2018
MSc in Sound and Music Computing, SMC
Thesis title: Deep Learning for Sounds Representation and Generation
University of Padova, Padova September 2011 - November 2015
BSc in Information Engineering
Thesis title: Interactive Soundscapes: Design of a physical space augmented by dynamic sound rendering.

Reviewer activity for Journals:


  • European Physical Journal, Springer ...
  • Neurocomputing, Elsevier
  • Quality and Quantity, Springer Science+Business Media
  • Building Simulation Journal, Springer

Conferences and Workshops


  • Organization for Human Brain Mapping 2024 (OHBM24) July 23 to 27, 2024 Seoul
  • November 9 to 11, 2023 Siena
  • International Conference on Acoustics, Speech, and Signal Processing (ICASSP) May 4 to 8, 2020 Barcelona
  • International Congress on Acoustics (ICA2019) September 9 to 13, 2019 Aachen
  • Machine Learning for Acoustics Summer School (UKANSS19) August 5 to 9, 2019 Gregynog Hall, Tregynon
  • Soundscape Workshop (IOA) June 25, 2019 London
  • Sound and Music Computing Conference (SMC Conference) July 4 to 7, 2018 Limassol
  • International Workshop on Machine Learning and Music (MML) October 6, 2017 Barcelona
  • New Interfaces for Musical Expression (NIME) May 15 to 19, 2017 Copenhagen
  • Sound and Music Computing Summer School August 26 to 30, 2016 Hamburg

Prizes and Awards


  • UKAN Acoustics Network Summer School Grant
  • £500.00 Young Scientist Conference Attendance Award
  • Studentship in urban sound environment, EU ERC.
  • Awarded grant of DKK 6.963,00 by OTTO MØNSTEDS FOND.

Technical Knowledge


Programming Languages:
Java, Python, Matlab, Unix shell; basic knowledge of C++ and PHP
Libraries:
Keras, TensorFlow, PyTorch, PyTorch Lightning
Protocols:
Open Sound Control (OSC) and TCP/IP
Databases:
basic knowledge of SQL and MySQL
Tools:
AFNI, Pure Data, Android, HTML, CSS, and basic knowledge of PHP

Language Skills


Italian: native language; English: C1 (IELTS 7.5, October 2018); Spanish: basic

Projects


Current work @Social and Affective Neuroscience (SANE)


Segmentation and analysis of facial expressions during naturalistic affective stimuli

currently ongoing...

NaPuCco: a non-parametric combination framework for group-level inference on non-negative statistics in one-sample fMRI data

currently ongoing...

Audio Enhancement - real time


Audio quality enhancement

This work was part of a project developed for a private company

Real time audio denoising

This work was part of a project developed for a private company

Urban Soundscape index modelling


Bayesian neural network for individual soundscape assessment prediction, and the design, development, and evaluation of a perceptual index

A Bayesian modelling approach was implemented to represent and analyse the uncertainty highlighted in the psychometric revisitation of the current ISO soundscape standard.
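As a rough illustration of how a Bayesian treatment yields per-response uncertainty, the sketch below uses Monte Carlo dropout, a common cheap approximation to a Bayesian neural network. This is not the project's actual model; the network size and input dimensions are arbitrary placeholders.

```python
import torch
import torch.nn as nn

# A small regression network with dropout kept active at inference time
# (Monte Carlo dropout): repeated stochastic forward passes approximate
# a predictive distribution, giving a mean and an uncertainty estimate.
class MCDropoutNet(nn.Module):
    def __init__(self, n_in=8, n_hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_in, n_hidden), nn.ReLU(), nn.Dropout(p=0.2),
            nn.Linear(n_hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

def predict_with_uncertainty(model, x, n_samples=100):
    """Summarise repeated stochastic forward passes."""
    model.train()  # keep dropout layers active
    with torch.no_grad():
        draws = torch.stack([model(x) for _ in range(n_samples)])
    return draws.mean(dim=0), draws.std(dim=0)

model = MCDropoutNet()
x = torch.randn(5, 8)          # e.g. 5 assessments, 8 descriptors each
mean, std = predict_with_uncertainty(model, x)
print(mean.shape, std.shape)   # torch.Size([5, 1]) torch.Size([5, 1])
```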


Psychometric re-visitation of the current ISO standard for soundscape measurement, and data collection protocol design

address: University College London, London
link: Psychometric revisitation of ISO protocol
link: Data collection protocol design

Likert scales are useful for collecting data on attitudes and perceptions from large samples of people. In particular, they have become a well-established tool in soundscape studies for conducting in situ surveys to determine how people experience urban public spaces. However, it is still unclear whether the metrics of the scales are consistently interpreted during a typical assessment task. The current work aims at identifying general trends in the interpretation of Likert scale metrics and at introducing a procedure for deriving metric corrections, by analyzing a case-study dataset of 984 soundscape assessments across 11 urban locations in London.

According to ISO/TS 12913-2:2018, soundscapes can be assessed through the scaling of 8 dimensions: pleasant, annoying, vibrant, monotonous, eventful, uneventful, calm, and chaotic. The hypothesis underlying this study is that a link exists between correlations across the percentage of assessments falling in each Likert scale category and a dilation/compression factor affecting the interpretation of the scales' metric. The outcome of this metric-correction derivation is introduced for soundscape, and a new projection of the London soundscapes according to the corrected circumplex space is compared with the initial ISO circumplex space.

The overall results show a generally non-equidistant interpretation of the scales, particularly along the vibrant-monotonous direction. The implications of this correction have been demonstrated through a Linear Ridge Classifier task for predicting the London soundscape responses from objective acoustic parameters, which shows significant improvement when applied to the corrected data. The results suggest that the corrected values account for the non-equidistant interpretation of the Likert metrics, thereby making mathematical operations viable when applied to the data. (From the abstract.)
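The correction-and-projection idea can be sketched as follows. This is illustrative only, not the published method: the corrected category values below are placeholders (the paper derives them from data), and the projection follows the standard circumplex combination of the eight 5-point scales into pleasantness and eventfulness coordinates.

```python
import math

# Hypothetical corrected numeric values for the five Likert categories.
# The published work derives such values from the response data; these
# placeholders merely mimic a non-equidistant interpretation.
CORRECTED = {1: 1.0, 2: 2.3, 3: 3.0, 4: 3.6, 5: 5.0}

def iso_circumplex(r):
    """Project 8 perceptual-attribute ratings onto the
    (pleasantness, eventfulness) circumplex plane."""
    c = math.cos(math.radians(45))
    scale = 4 + math.sqrt(32)   # normalises coordinates to [-1, 1]
    p = (r["pleasant"] - r["annoying"]
         + c * (r["calm"] - r["chaotic"])
         + c * (r["vibrant"] - r["monotonous"])) / scale
    e = (r["eventful"] - r["uneventful"]
         + c * (r["chaotic"] - r["calm"])
         + c * (r["vibrant"] - r["monotonous"])) / scale
    return p, e

raw = {"pleasant": 5, "annoying": 1, "vibrant": 4, "monotonous": 2,
       "eventful": 3, "uneventful": 3, "calm": 4, "chaotic": 2}
# Re-map each raw category through the corrected interval values:
corrected = {k: CORRECTED[v] for k, v in raw.items()}
print(iso_circumplex(raw), iso_circumplex(corrected))
```

Comparing the two printed coordinate pairs shows how a non-equidistant reading of the categories shifts a response within the circumplex space.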

Music recommendation system


Parametric t-SNE (ANN) for a music recommendation system, playlist generation, and a browser GUI for online music streaming providers

address: Aalborg University, Copenhagen
link: VIMEO

This project presented the development of a model and a user interface for music-space exploration based on the t-SNE dimension-reduction technique, aiming at preserving the shape and structure of a high-dimensional dataset of songs, described by N-dimensional feature vectors, in its projection onto a plane. We investigated different models obtained by varying the structure of the hidden layers, pre-training techniques, feature selection, and data pre-processing. The resulting model was used to build a music space of 20,000 songs, visually rendered for browser interaction, giving the user freedom to explore it by changing the features to highlight and offering an immersive experience for music exploration and playlist generation.
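A minimal two-stage sketch of the idea (not the thesis implementation, which trains the network on the t-SNE objective directly): fit ordinary t-SNE on a feature matrix, then train a network to imitate the mapping, making it parametric so unseen songs can be projected without re-fitting. Feature dimensions and layer sizes are arbitrary.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))   # placeholder song feature vectors

# 1) Ordinary (non-parametric) t-SNE: a 2-D layout of the catalogue.
emb = TSNE(n_components=2, perplexity=20, random_state=0).fit_transform(X)

# 2) Train a network to reproduce the mapping features -> 2-D layout,
#    so new feature vectors can be projected into the same music space.
net = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000,
                   random_state=0).fit(X, emb)

new_songs = rng.normal(size=(3, 16))
print(net.predict(new_songs).shape)   # (3, 2)
```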

Deep Learning for Music


Violin Vibrato modelling (SVM) and GUI on android rendering with real time pitch detection

address: Music Technology Group (MTG), Universitat Pompeu Fabra, Barcelona
link: A Machine Learning Approach to Violin Vibrato Modelling in Audio Performances and a Didactic Application for Mobile Devices
link: VIMEO

We present a machine learning approach to model vibrato in classical violin audio performances. A set of descriptors was extracted from the music scores of the performed pieces and used to train a model for classifying notes as vibrato or non-vibrato, as well as for predicting the performed vibrato amplitude and frequency. In addition to the score features, we included a feature describing the fingering used in the performance. The results show that the fingering feature consistently affects the prediction of vibrato amplitude. Finally, an implementation of the resulting models is proposed as a didactic real-time feedback system to assist violin students in performing pieces using vibrato as an expressive resource.
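A toy sketch of the classification stage with an SVM (the classifier named in the project title). The features and labelling rule here are synthetic stand-ins; the real feature set, including the fingering descriptor, comes from the music scores.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Toy note descriptors: [duration, pitch, metrical strength, fingering]
X = rng.normal(size=(300, 4))
# Purely illustrative labelling rule: longer notes played with higher
# finger numbers are marked as vibrato notes.
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)

# Standardise features, then fit an RBF-kernel SVM vibrato classifier.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, y)
print(clf.score(X, y))
```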

LSTM modelling for melody and rhythmic structure extraction and generation

address: Aalborg University, Copenhagen

One of the most suggestive concepts in artificial intelligence applied to music, shown in several recent studies, is that of style. Even if style is hard to define without also considering historical and social contexts, at a basic level it can be thought of as the sequence of patterns composing the structure of a music piece: a particular pattern hidden in a sequence of symbols describing a musical work. Once that structure is captured, the information it carries can be used to manipulate the style itself or to replicate it.

The main goal can be separated into two problems. The first is, given an audio recording from which to extract the style information, to find an automatic way to reduce the original audio to a stream of symbols in which the style is encoded. The second concerns using this stream of symbols to decode the style they carry; to do so, a support structure is introduced that decodes and memorizes the piece's structure, which is then re-used to compose new music sequences in the style of the original.

The purpose of this study is the evaluation of a Long Short-Term Memory network for music generation, using a percussive sequence as an example. The sequence is segmented and symbolized through a single-linkage clustering algorithm based on MFCC analysis and then fed into the network. The network is trained on the analyzed data and gains the ability to generate new percussive sequences following the example. The results are compared with a previously implemented Variable-Length Markov Chain model for music generation.
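A minimal PyTorch sketch of the symbolic modelling step: an LSTM that predicts the next symbol in a percussive-event stream (here randomly generated, standing in for the MFCC-derived symbols). Vocabulary size and layer sizes are arbitrary placeholders.

```python
import torch
import torch.nn as nn

VOCAB = 8  # hypothetical number of symbols from the MFCC clustering step

class SymbolLSTM(nn.Module):
    """Next-symbol predictor over a symbolized percussive sequence."""
    def __init__(self, vocab=VOCAB, emb=16, hidden=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, x):
        h, _ = self.lstm(self.emb(x))
        return self.out(h)   # per-timestep logits for the next symbol

torch.manual_seed(0)
model = SymbolLSTM()
seq = torch.randint(0, VOCAB, (1, 20))   # one symbolized sequence
logits = model(seq)
# Greedy one-step continuation from the final timestep:
generated = [logits[0, -1].argmax().item()]
print(logits.shape, generated)
```

Training would minimise cross-entropy between each timestep's logits and the following symbol; sampling the output distribution repeatedly then generates new sequences in the style of the example.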

Variational Autoencoder for sounds morphing

address: Aalborg University
link: End-To-End Dilated Variational Autoencoder with Bottleneck Discriminative Loss for Sound Morphing

This project was developed with TensorFlow 1.4, before probabilistic inference tools were available in the framework. Two strategies for end-to-end variational autoencoders (VAEs) for sound morphing are compared: a VAE with dilation layers (DC-VAE) and a VAE with only regular convolutional layers (CC-VAE). The training strategy used a combination of the following loss functions: 1) the time-domain mean-squared error for reconstructing the input signal, 2) the Kullback-Leibler divergence to the standard normal distribution in the bottleneck layer, and 3) a classification loss calculated from the bottleneck representation. On a database of spoken digits, 1-nearest-neighbor classification was used to show that the sound classes separate in the bottleneck layer. We introduce the Mel-frequency cepstral coefficient dynamic time warping (MFCC-DTW) deviation as a measure of how well the VAE decoder projects the class center in the latent (bottleneck) layer to the center of the sounds of that class in the audio domain. In terms of MFCC-DTW deviation and 1-NN classification, DC-VAE outperforms CC-VAE. These results, limited to the current parametrization and dataset, indicate that DC-VAE is more suitable for sound morphing than CC-VAE, since the DC-VAE decoder better preserves the topology when mapping from the audio domain to the latent space.
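The three-term training objective can be written compactly as below. This is a generic PyTorch sketch, not the original TensorFlow 1.4 code, and the loss weights `beta` and `gamma` are assumptions; the tensor shapes are dummy placeholders.

```python
import torch
import torch.nn.functional as F

def vae_total_loss(x_hat, x, mu, logvar, class_logits, labels,
                   beta=1.0, gamma=1.0):
    """Combined objective: time-domain reconstruction MSE, KL divergence
    of the bottleneck distribution to N(0, I), and a classification loss
    computed on the bottleneck representation."""
    recon = F.mse_loss(x_hat, x)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    clf = F.cross_entropy(class_logits, labels)
    return recon + beta * kl + gamma * clf

# Dummy shapes: batch of 4 waveforms of 1024 samples, 16-D bottleneck,
# 10 sound classes (e.g. spoken digits).
x = torch.randn(4, 1024); x_hat = torch.randn(4, 1024)
mu = torch.zeros(4, 16); logvar = torch.zeros(4, 16)
logits = torch.randn(4, 10); labels = torch.randint(0, 10, (4,))
print(vae_total_loss(x_hat, x, mu, logvar, logits, labels))
```

Note that with `mu = 0` and `logvar = 0` the KL term vanishes, as the bottleneck already matches the standard normal prior.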

Artistic augmented space


An interactive soundscape-augmented space

address: University of Padua
link: VIMEO

A sonically augmented physical space based on granular synthesis: this installation was inspired by Truax's Entrance to the Harbour, from The Vancouver Soundscape 1973, and Pacific Fanfare, from The Vancouver Soundscape 1996. It reproduced a collage of sounds referring to a real soundscape. The sounds were spatialized in a room according to the user's position, detected by a camera mounted above the room. The user could explore both the spatial structure of the soundscape and the acoustic structure of its sounds by entering a central area of the room, where a granular decomposition of the sounds was applied.

Launched in the late '70s at Simon Fraser University, soundscape composition is a set of compositional strategies working on the sound environment. Traditionally linked with granular synthesis, it uses electronic music tools to elaborate environmental sounds while preserving their original contexts. Current technologies allowed the design and development of an interactive environment inspired by soundscape composition, in which a user can explore a sound-augmented reality referring to a real soundscape. The exploration occurs on two layers: on the first, the user explores the soundscape spatially; on the second, an exploration of the structural composition of the sounds that build the soundscape takes place through granular analysis. The user, moving through the installation, modifies the synthesis and sound-dynamics parameters, building a cognitive structure of the augmented environment. The sound feedback of the environment modifies the user's awareness and, consequently, their decisions on how to move within it.
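A minimal sketch of the granular decomposition idea: slice a signal into Hann-windowed grains and overlap-add them at a stretched hop, a basic granular time-stretch. Grain size, hop, and stretch factor are arbitrary; the installation's actual processing was far richer.

```python
import numpy as np

def granulate(signal, grain=256, hop=64, stretch=2.0):
    """Decompose a signal into Hann-windowed grains and overlap-add
    them at a stretched hop (illustrative granular time-stretch)."""
    win = np.hanning(grain)
    out_hop = int(hop * stretch)
    n_grains = (len(signal) - grain) // hop + 1
    out = np.zeros(out_hop * (n_grains - 1) + grain)
    for i in range(n_grains):
        g = signal[i * hop: i * hop + grain] * win   # extract one grain
        out[i * out_hop: i * out_hop + grain] += g   # re-place it, stretched
    return out

x = np.sin(2 * np.pi * 440 * np.arange(8192) / 44100)  # a 440 Hz test tone
y = granulate(x)
print(len(x), len(y))   # output is roughly `stretch` times longer
```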

Environmental Sonification

address: Aalborg University, Copenhagen
link: YOUTUBE

Competitive workshop for developing solutions for the new lighting-sound system at the AAU University bridge (winning project)

Publications


2021:

M. Lionello, F. Aletta, J. Kang, (2021) "Introducing a Method for Intervals Correction on Multiple Likert Scales: A Case Study on an Urban Soundscape Data Collection Instrument" Frontiers in Psychology

A. Mitchell, T. Oberman, F. Aletta, M. Kachlicka, M. Lionello, M. Erfanian, J. Kang, (2021). "Investigating urban soundscapes of the COVID-19 lockdown: A predictive soundscape modeling approach" The Journal of the Acoustical Society of America

2020:

M. Lionello, F. Aletta, J. Kang, (2020) "A systematic review of prediction models for the experience of urban soundscapes" in Applied Acoustics

A. Mitchell, T. Oberman, F. Aletta, M. Erfanian, M. Kachlicka, M. Lionello, J. Kang, (2020) "The Soundscape Indices (SSID) Protocol: A Method for Urban Soundscape Surveys—Questionnaires with Acoustical and Contextual Information" in Applied Sciences

2019:

F. Aletta, T. Oberman, J. Kang, M. Erfanian, M. Kachlicka, M. Lionello, A. Mitchell, (2019) "Associations between soundscape experience and self-reported wellbeing in open public urban spaces: a field study", The Lancet.

M. Lionello, F. Aletta, J. Kang, (2019) "On the dimension and scaling analysis of soundscape assessment tools: a case study about the “Method A” of ISO/TS 12913-2:2018"

M. Lionello, H. Purwins (2019) "End-To-End Dilated Variational Autoencoder with Bottleneck Discriminative Loss for Sound Morphing - A Preliminary Study" DOI: 10.13140/RG.2.2.21572.58240/1.

2018:

M. Lionello, L. Pietrogrande, H. Purwins, M. Abou-Zleikha (2018) "Exploration of Musical Space with Parametric t-SNE in a Browser Interface" Proceedings to the 15th Sound and Music Computing Conference, Limassol, Cyprus.

M. Lionello, R. Ramirez, (2018) "A Machine Learning Approach to Violin Vibrato Modelling in Audio Performances and a Didactic Application for Mobile Devices" Proceedings to the 15th Sound and Music Computing Conference, Limassol, Cyprus.

M. Lionello, H. Purwins (2018) "Deep Learning for Sounds Representation and Generation", Master thesis, Aalborg University, Copenhagen. Available here

2017:

M. Lionello, M. Mandanici, S. Canazza, E. Micheloni, (2017) "Interactive Soundscapes: Developing a Physical Space Augmented through Dynamic Sound Rendering and Granular Synthesis" Proceedings of the 14th Sound and Music Computing Conference, Espoo, Finland.

A complete list of publications is available on my personal ResearchGate page

Summary


Professional Profile:

My academic journey is characterized by a profound passion for the integration of technology, particularly in the fields of machine learning and deep learning, with a specific focus on music, sound, digital signal processing, and urban environments.

I am currently engaged in a Ph.D. program in neuroscience, with a research emphasis on unraveling the neural processes involved in the simultaneous processing of music and emotions in the human brain. This research aligns seamlessly with my broader academic interests exploring the intricate relationship between technology and human experiences.

Educational Background:

Commencing my studies at the University of Padua, I obtained a Bachelor's degree in Information Engineering. Subsequently, I pursued specialized studies in Sound and Music Computing at Aalborg University, where my commitment to machine learning and deep learning projects included collaborations with industry partners and a visiting research period at the Music Technology Group in Barcelona.

Continuing my academic trajectory, I earned an MPhil degree at the Institute for Environmental Design and Engineering, The Bartlett, University College London. As part of the ERC Advanced Grant in Urban Soundscape Indices (SSID), under the guidance of Prof. Jian Kang, I developed sophisticated machine learning methods and psychometric analysis tools to predict urban soundscapes.

Beyond technical pursuits, my scholarly motivation extends to ecological preservation and the confluence of engineering with artistic and humanistic domains. I aspire to leverage engineering tools for the promotion and preservation of cultural heritage.
