Current work @Social and Affective Neuroscience (SANE)
Segmentation and analysis of facial expressions during naturalistic affective stimuli
currently ongoing...
NaPuCco: a non-parametric combination framework for group-level inference on non-negative statistics in one-sample fMRI data
currently ongoing...
Audio Enhancement - real time
Audio quality enhancement
This work was part of a project developed for a private company
Real-time audio denoising
This work was part of a project developed for a private company
Urban Soundscape index modelling
Bayesian neural network for individual soundscape-assessment prediction, plus the design, development, and evaluation of a perceptual index
A Bayesian modelling approach has been implemented to represent and analyse the uncertainty highlighted in the psychometric revisitation of the current ISO soundscape standard.
Psychometric re-visitation of the current standard ISO for soundscape measuring and data collection protocol design
address: University College London, London
link: Psychometric revisitation of ISO protocol
link: Data collection protocol design
Likert scales are useful for collecting data on attitudes and perceptions from large samples of people. In particular, they have become a well-established tool in soundscape studies for conducting in situ surveys to determine how people experience urban public spaces. However, it is still unclear whether the metrics of the scales are consistently interpreted during a typical assessment task. The current work aims at identifying general trends in the interpretation of Likert scale metrics and at introducing a procedure for deriving metric corrections, by analyzing a case-study dataset of 984 soundscape assessments across 11 urban locations in London. According to ISO/TS 12913-2:2018, soundscapes can be assessed through the scaling of 8 dimensions: pleasant, annoying, vibrant, monotonous, eventful, uneventful, calm, and chaotic. The hypothesis underlying this study is that a link exists between correlations across the percentage of assessments falling in each Likert scale category and a dilation/compression factor affecting the interpretation of the scales' metric. The derivation of these metric correction values is introduced for soundscape, and a new projection of the London soundscapes according to the corrected circumplex space is compared with the initial ISO circumplex space. The overall results show a generally non-equidistant interpretation of the scales, particularly along the vibrant-monotonous direction. The implications of this correction have been demonstrated through a Linear Ridge Classifier task for predicting the London soundscape responses from objective acoustic parameters, which shows significant improvement when applied to the corrected data. The results suggest that the corrected values account for the non-equidistant interpretation of the Likert metrics, thereby making mathematical operations viable when applied to the data. (From the abstract)
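The correction described above can be sketched as a remapping of the equidistant Likert codes to non-equidistant values before any downstream modelling. The specific corrected values, labels, and data below are hypothetical placeholders, not the ones derived in the study:

```python
import numpy as np
from sklearn.linear_model import RidgeClassifier

# Hypothetical corrected values for a 5-point Likert scale: the equidistant
# codes 1..5 are replaced by non-equidistant positions reflecting how
# respondents actually interpret the scale metric (illustrative only).
CORRECTED = {1: 1.0, 2: 2.3, 3: 3.0, 4: 3.6, 5: 5.0}

def apply_correction(responses):
    """Map raw Likert codes (1-5) to corrected, non-equidistant values."""
    return np.vectorize(CORRECTED.get)(responses).astype(float)

# Synthetic stand-in for soundscape assessments: rows are responses,
# columns are the 8 ISO dimensions, values are Likert codes 1..5.
rng = np.random.default_rng(0)
X_raw = rng.integers(1, 6, size=(100, 8))
y = (X_raw[:, 0] > X_raw[:, 1]).astype(int)  # toy "pleasant-dominated" label

# Classification is run on the corrected values, as in the study's
# Linear Ridge Classifier comparison.
X_corr = apply_correction(X_raw)
clf = RidgeClassifier().fit(X_corr, y)
print(clf.score(X_corr, y))
```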
Music recommendation system
Parametric t-SNE (ANN) for a music recommendation system, playlist generation, and a browser GUI for online music-streaming providers
address: Aalborg University, Copenhagen
link: VIMEO
This project presented the development of a model and a user interface for music-space exploration based on the t-SNE dimensionality reduction technique, aiming to preserve the shapes and structure of a high-dimensional dataset of songs, described by N-dimensional feature vectors, in its projection onto a plane. We investigated models obtained with different hidden-layer structures, pre-training techniques, feature selections, and data pre-processing. The resulting model was used to build a music space of 20,000 songs, visually rendered for browser interaction, giving the user a degree of freedom to explore it by changing the highlighted features and offering an immersive experience for music exploration and playlist generation.
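One simple way to approximate the parametric mapping described above is to regress a small network onto a precomputed t-SNE embedding, so unseen songs can be placed in the same 2-D music space without rerunning t-SNE. This is a sketch under assumed data and network sizes, not the architecture used in the project:

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in for a song feature matrix (n_songs x n_features).
rng = np.random.default_rng(42)
features = rng.normal(size=(200, 16))

# Non-parametric t-SNE gives target 2-D coordinates for the training songs.
coords = TSNE(n_components=2, perplexity=30, random_state=42).fit_transform(features)

# A small network learns the features -> coordinates mapping, making the
# projection parametric: new songs get coordinates from a forward pass.
net = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=42)
net.fit(features, coords)

new_song = rng.normal(size=(1, 16))
position = net.predict(new_song)  # 2-D position in the music space
print(position)
```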
Deep Learning for Music
Violin vibrato modelling (SVM) and an Android GUI with real-time pitch detection and rendering
address: MTG, Universitat Pompeu Fabra, Barcelona
link: A Machine Learning Approach to Violin Vibrato Modelling in Audio Performances and a Didactic Application for Mobile Devices
link: VIMEO
We present a machine learning approach to model vibrato in classical-music violin audio performances. A set of descriptors was extracted from the music scores of the performed pieces and used to train a model for classifying notes into vibrato or non-vibrato, as well as for predicting the performed vibrato amplitude and frequency. In addition to score features, we included a feature describing the fingering used in the performance. The results show that the fingering feature consistently affects the prediction of the vibrato amplitude. Finally, an implementation of the resulting models is proposed as a didactic real-time feedback system to assist violin students in performing pieces using vibrato as an expressive resource.
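The classification step above can be sketched with an SVM over score descriptors plus a fingering feature. The descriptors, labelling rule, and data here are invented placeholders, not the features or dataset from the paper:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for score descriptors: note duration, pitch, and the
# fingering used (the feature found to matter for vibrato-amplitude
# prediction). All values are illustrative only.
rng = np.random.default_rng(7)
n_notes = 300
duration = rng.uniform(0.1, 2.0, n_notes)     # seconds
pitch = rng.integers(55, 100, n_notes)        # MIDI note numbers
fingering = rng.integers(0, 5, n_notes)       # 0 = open string, 1-4 = fingers
X = np.column_stack([duration, pitch, fingering])

# Toy labelling rule: long, stopped (non-open-string) notes are vibrato.
y = ((duration > 0.5) & (fingering > 0)).astype(int)

# Standardize features, then fit an RBF-kernel SVM note classifier.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, y)
print(clf.score(X, y))
```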
LSTM modelling for melody and rhythmic structure extraction and generation
address: Aalborg University, Copenhagen
One of the most suggestive concepts in artificial intelligence applied to music, discussed in several recent studies, is style. Although style is hard to define without also considering historical and social contexts, at a basic level it can be thought of as the sequence of patterns composing the structure of a music piece: a particular pattern hidden in a sequence of symbols describing a musical work. Once the information is represented by a structure of symbols, it becomes easy to manipulate the style itself or to replicate it.
Going deeper into this subject, the main goal can be separated into two problems. The first is, given an audio recording from which to extract the style information, finding an automatic way to reduce the original audio to a stream of symbols in which the style is encoded. The second concerns using this stream of symbols to decode the style it carries; to that end, a support structure is introduced that decodes and memorizes the piece's structure, which is then re-used to compose new music sequences in the same style as the original.
The purpose of this study is the evaluation of a Long Short-Term Memory (LSTM) network for music generation, using a percussive sequence as an example. The sequence is segmented and symbolized through a single-linkage clustering algorithm based on MFCC analysis, and then fed into the network. The network is trained on the analyzed data and gains the ability to generate new percussive sequences according to the example. The results are compared with a previously implemented method based on Variable-Length Markov Chain models for music generation.
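The symbolization step above can be sketched as single-linkage clustering over MFCC vectors: acoustically similar segments receive the same symbol, and the resulting symbol stream is what the sequence model is trained on. The MFCC vectors here are synthetic stand-ins (two artificial timbre classes), not real audio analysis:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Synthetic stand-in for MFCC vectors of segmented percussive onsets
# (n_segments x n_mfcc); in the project these came from MFCC analysis
# of the audio around each detected onset.
rng = np.random.default_rng(1)
kick = rng.normal(loc=0.0, scale=0.3, size=(20, 13))    # one timbre class
snare = rng.normal(loc=3.0, scale=0.3, size=(20, 13))   # another timbre class
segments = np.vstack([kick, snare])

# Single-linkage clustering assigns each segment a symbol; the symbol
# stream (in onset order) is the input sequence for the LSTM.
clustering = AgglomerativeClustering(n_clusters=2, linkage="single")
symbols = clustering.fit_predict(segments)
print(symbols)
```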
Variational Autoencoder for sound morphing
address: Aalborg University
link: End-To-End Dilated Variational Autoencoder with Bottleneck Discriminative Loss for Sound Morphing
This project was developed with TensorFlow 1.4, before probabilistic inference tooling was available. Two strategies for end-to-end variational autoencoders (VAEs) for sound morphing are compared: a VAE with dilation layers (DC-VAE) and a VAE with only regular convolutional layers (CC-VAE). The training strategy used a combination of the following loss functions: 1) the time-domain mean-squared error for reconstructing the input signal, 2) the Kullback-Leibler divergence to the standard normal distribution in the bottleneck layer, and 3) the classification loss calculated from the bottleneck representation. On a database of spoken digits, 1-nearest-neighbor classification was used to show that the sound classes separate in the bottleneck layer. We introduce the Mel-frequency cepstral coefficient dynamic time warping (MFCC-DTW) deviation as a measure of how well the VAE decoder projects the class center in the latent (bottleneck) layer to the center of the sounds of that class in the audio domain. In terms of MFCC-DTW deviation and 1-NN classification, DC-VAE outperforms CC-VAE. These results, limited to the current parametrization and dataset, indicate that DC-VAE is more suitable for sound morphing than CC-VAE, since the DC-VAE decoder better preserves the topology when mapping from the audio domain to the latent space.
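The three-term training objective can be written out directly. This is a plain NumPy sketch of the loss combination; the equal weighting and the softmax cross-entropy formulation of the classification term are assumptions, not necessarily the scaling used in the project:

```python
import numpy as np

def combined_vae_loss(x, x_hat, mu, log_var, class_logits, labels,
                      w_rec=1.0, w_kl=1.0, w_cls=1.0):
    """Weighted sum of the three VAE training losses (weights assumed)."""
    # 1) time-domain mean-squared error for reconstructing the input signal
    rec = np.mean((x - x_hat) ** 2)
    # 2) KL divergence of N(mu, exp(log_var)) to the standard normal,
    #    in the usual closed form for a diagonal Gaussian
    kl = -0.5 * np.mean(1 + log_var - mu ** 2 - np.exp(log_var))
    # 3) classification loss (softmax cross-entropy) on the bottleneck
    shifted = class_logits - class_logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    cls = -np.mean(log_probs[np.arange(len(labels)), labels])
    return w_rec * rec + w_kl * kl + w_cls * cls

# Tiny example: batch of 4 signals of 8 samples, 2-D bottleneck, 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
loss = combined_vae_loss(x, x, np.zeros((4, 2)), np.zeros((4, 2)),
                         np.zeros((4, 3)), np.array([0, 1, 2, 0]))
print(loss)  # perfect reconstruction and standard-normal posterior, so
             # only the classification term is non-zero here
```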
Artistic augmented space
An interactive soundscape-augmented space
address: University of Padua
link: VIMEO
A sonic augmentation of a physical space through granular synthesis: this installation was inspired by Truax's Entrance to the Harbour, from The Vancouver Soundscape 1973, and Pacific Fanfare, from The Vancouver Soundscape 1996. It reproduced a collage of sounds referring to a real soundscape. The sounds were spatialized in a room according to the user's position, detected by a camera mounted at the top of the room. The user could explore both the spatial structure of the soundscape and the acoustic structure of its sounds by entering a central area of the room, where a granular decomposition of the sounds was applied. Launched in the late '70s at Simon Fraser University, soundscape composition is a set of compositional strategies working on the sound environment. Traditionally linked with granular synthesis, it uses electronic-music tools to elaborate environmental sounds while preserving their original contexts. Current technologies allowed the design and development of an interactive environment inspired by soundscape composition, in which a user can explore a sound-augmented reality referring to a real soundscape. The soundscape exploration occurs on two layers: on the first, the user explores the soundscape spatially; on the second, the structural composition of the sounds that build the soundscape is explored through a granular analysis of the sounds. As the user moves through the installation, the synthesis and sound-dynamics parameters change, building a cognitive structure of the augmented environment. The sound feedback of the environment modifies the user's awareness and, consequently, their decisions on how to move within it.
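The granular decomposition applied in the central area can be sketched as windowed grains overlap-added at a different hop, which time-stretches a sound without changing its pitch. Grain size, hop, and stretch factor here are illustrative, not the installation's parameters:

```python
import numpy as np

def granulate(signal, grain_size=1024, hop=256, stretch=2.0):
    """Minimal granular time-stretch: Hann-windowed grains are read at
    `hop` samples and overlap-added at `hop * stretch` samples."""
    window = np.hanning(grain_size)
    out_hop = int(hop * stretch)
    n_grains = max(1, (len(signal) - grain_size) // hop + 1)
    out = np.zeros(n_grains * out_hop + grain_size)
    for i in range(n_grains):
        grain = signal[i * hop:i * hop + grain_size] * window
        out[i * out_hop:i * out_hop + grain_size] += grain
    return out

# A 440 Hz test tone stands in for a recorded soundscape fragment.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
stretched = granulate(tone, stretch=2.0)
print(len(tone), len(stretched))  # the output is roughly twice as long
```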
Environmental Sonification
address: Aalborg University, Copenhagen
link: YOUTUBE
Competitive workshop to develop solutions for the new lighting-and-sound system at the AAU University bridge (winning project)