Estefanía Cano

SD
4 papers
2 citations
Novelty 38%
AI Score 17

4 Papers

SD · Dec 9, 2021
Personalized musically induced emotions of not-so-popular Colombian music

Juan Sebastián Gómez-Cañón, Perfecto Herrera, Estefanía Cano et al.

This work presents an initial proof of concept of how Music Emotion Recognition (MER) systems could be intentionally biased with respect to annotations of musically induced emotions in a political context. Specifically, we analyze traditional Colombian music containing politically charged lyrics of two types: (1) vallenatos and social songs from the "left-wing" guerrilla Fuerzas Armadas Revolucionarias de Colombia (FARC) and (2) corridos from the "right-wing" paramilitaries Autodefensas Unidas de Colombia (AUC). We train personalized machine learning models to predict induced emotions for three users with diverse political views, aiming to identify the songs that may induce negative emotions, such as anger and fear, in a particular user. In this respect, a user's emotion judgments could be interpreted as problematizing data: subjective emotional judgments could in turn be used to influence the user in a human-centered machine learning environment. In short, highly desired "emotion regulation" applications could potentially drift into "emotion manipulation"; the recent discrediting of emotion recognition technologies might transcend ethical issues of diversity and inclusion.

AS · Sep 2, 2020
Degradation effects of water immersion on earbud audio quality

Scott Beveridge, Steffen A. Herff, Estefanía Cano

Earbuds are subjected to constant use and to scenarios that may degrade sound quality. Indeed, a common fate of earbuds is being forgotten in pockets and put through a laundry cycle (LC). Manufacturers' accounts of the extent to which LCs affect earbud sound quality are vague at best, leaving users to their own devices in assessing the damage caused. This paper offers a systematic, empirical approach to measuring the effects of laundering earbuds on sound quality. Three earbud pairs were subjected to LCs spaced 24 hours apart. After each LC, a professional microphone and a mid-market smartphone were used to record (i) a test tone, (ii) a frequency sweep, and (iii) a music signal played through the earbuds. We deployed mixed effects models and found significant degradation in terms of RMS noise loudness, Total Harmonic Distortion (THD), and measures of change in the frequency responses of the earbuds. All transducers already showed degradation after the first cycle, and no transducer produced a measurable signal after the sixth LC. The degradation effects were detectable in both the professional microphone and the smartphone recordings. We hope that the present work is a first step in establishing a practical and ecologically valid method for everyday users to assess the degree of degradation of their personal earbuds.
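The abstract names RMS level and Total Harmonic Distortion (THD) as degradation measures but does not say how they were computed. As a rough illustration only, the sketch below uses the textbook definitions on a synthetic test tone: THD is taken as the root-sum-square of the harmonic amplitudes divided by the fundamental amplitude. The function names, the 100 Hz tone, the 8 kHz sample rate, and the harmonic levels are all assumptions made for this example, not values from the paper.

```python
import math


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def tone_amplitude(signal, freq_hz, sample_rate):
    """Amplitude of one sinusoidal component, estimated with a single-frequency DFT."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n


def thd(signal, fundamental_hz, sample_rate, num_harmonics=5):
    """Textbook THD: sqrt(sum of squared harmonic amplitudes) / fundamental amplitude."""
    fundamental = tone_amplitude(signal, fundamental_hz, sample_rate)
    harmonics = [tone_amplitude(signal, k * fundamental_hz, sample_rate)
                 for k in range(2, num_harmonics + 2)]
    return math.sqrt(sum(a * a for a in harmonics)) / fundamental


# Hypothetical "recorded" test tone: a 100 Hz fundamental plus two weak harmonics.
SAMPLE_RATE = 8000   # assumed sample rate in Hz
FUNDAMENTAL = 100.0  # assumed test-tone frequency in Hz
test_tone = [
    math.sin(2 * math.pi * FUNDAMENTAL * i / SAMPLE_RATE)
    + 0.10 * math.sin(2 * math.pi * 2 * FUNDAMENTAL * i / SAMPLE_RATE)
    + 0.05 * math.sin(2 * math.pi * 3 * FUNDAMENTAL * i / SAMPLE_RATE)
    for i in range(SAMPLE_RATE)  # one second, a whole number of periods
]

print("RMS level:", rms(test_tone))
print("THD:", thd(test_tone, FUNDAMENTAL, SAMPLE_RATE))
```

Because the synthetic tone spans a whole number of periods, the single-frequency estimates are exact up to rounding. A real recording would usually need windowing and a noise-floor estimate, which this sketch leaves out.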

SD · Sep 12, 2019
The emotions that we perceive in music: the influence of language and lyrics comprehension on agreement

Juan Sebastián Gómez Cañón, Perfecto Herrera, Emilia Gómez et al.

In the present study, we address the relationship between the emotions perceived in pop and rock music (mainly in Euro-American styles with English lyrics) and the language spoken by the listener. Our goal is to understand the influence of lyrics comprehension on the perception of emotions and to use this information to improve Music Emotion Recognition (MER) models. Two main research questions are addressed: 1. Are there differences and similarities between the emotions perceived in pop/rock music by listeners raised with different mother tongues? 2. Do personal characteristics have an influence on the perceived emotions for listeners of a given language? Personal characteristics include the listeners' general demographics, familiarity with and preference for the fragments, and music sophistication. Our hypothesis is that inter-rater agreement among subjects (as measured by Krippendorff's alpha coefficient) is directly influenced by the comprehension of lyrics.
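For readers unfamiliar with Krippendorff's alpha, the following is a minimal sketch of the coefficient for nominal labels, computed from a coincidence matrix as alpha = 1 - D_o / D_e (observed over expected disagreement). The abstract does not state which distance metric the study uses, so the nominal variant, the function name, and the emotion labels are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of lists: each inner list holds the labels that the
    raters assigned to one item (missing ratings are simply left out).
    """
    # Coincidence matrix: every ordered pair of labels from different raters
    # within a unit contributes 1 / (m - 1), where m is the number of ratings.
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # items with fewer than two ratings carry no pairing information
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per label and overall number of pairable values.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b]
                   for a in totals for b in totals if a != b) / (n - 1)
    if expected == 0:
        return float("nan")  # only one label was ever used; alpha is undefined
    return 1.0 - observed / expected


# Hypothetical annotations: four music excerpts, two listeners each.
ratings = [
    ["happy", "happy"],
    ["happy", "sad"],
    ["sad", "sad"],
    ["sad", "sad"],
]
print(krippendorff_alpha_nominal(ratings))
```

An alpha of 1 indicates perfect agreement and values near 0 indicate agreement no better than chance, which is why the coefficient suits a hypothesis about agreement rising with lyrics comprehension.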

AS · Apr 12, 2019
Examining the Mapping Functions of Denoising Autoencoders in Singing Voice Separation

Stylianos Ioannis Mimilakis, Konstantinos Drossos, Estefanía Cano et al.

The goal of this work is to investigate what singing voice separation approaches based on neural networks learn from the data. We examine the mapping functions of neural networks based on the denoising autoencoder (DAE) model that are conditioned on the mixture magnitude spectra. To approximate the mapping functions, we propose an algorithm inspired by knowledge distillation, denoted the neural couplings algorithm (NCA). The NCA yields a matrix that expresses the mapping of the mixture to the target source magnitude information. Using the NCA, we examine the mapping functions of three fundamental DAE-based models in music source separation: one with a single-layer encoder and decoder, one with a multi-layer encoder and a single-layer decoder, and one using skip-filtering connections (SF) with a single-layer encoder and decoder. We first train these models with realistic data to estimate the singing voice magnitude spectra from the corresponding mixture. We then use the optimized models and test spectral data as input to the NCA. Our experimental findings show that approaches based on the DAE model learn scalar filtering operators, exhibiting a predominantly diagonal structure in their corresponding mapping functions, which limits the exploitation of the inter-frequency structure of music data. In contrast, skip-filtering connections are shown to assist the DAE model in learning filtering operators that exploit richer inter-frequency structures.
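To make the "scalar filtering" observation concrete, the sketch below contrasts a purely diagonal mapping matrix, which scales each frequency bin independently, with one that has off-diagonal entries and therefore mixes neighbouring bins. It does not implement the NCA, whose details are not given in the abstract. The helper names, the toy 3-bin matrices, and the diagonal energy ratio are hypothetical illustrations, not quantities from the paper.

```python
def apply_mapping(matrix, spectrum):
    """Map a mixture magnitude spectrum to a source estimate: y = M x."""
    return [sum(w * x for w, x in zip(row, spectrum)) for row in matrix]


def diagonal_energy_ratio(matrix):
    """Fraction of the squared matrix entries that lie on the main diagonal.

    A value of 1.0 means the mapping is a per-bin (scalar) filter;
    lower values mean the mapping uses inter-frequency structure.
    """
    total = sum(w * w for row in matrix for w in row)
    diagonal = sum(matrix[i][i] ** 2 for i in range(len(matrix)))
    return diagonal / total


# Toy 3-bin magnitude spectrum of a mixture (hypothetical values).
mixture = [2.0, 4.0, 8.0]

# Scalar filtering: each output bin depends only on the same input bin.
scalar_filter = [
    [0.5, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.25],
]

# Inter-frequency mapping: output bins also draw on neighbouring input bins.
mixing_filter = [
    [0.5, 0.5, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.5, 0.5],
]

for name, matrix in [("scalar", scalar_filter), ("mixing", mixing_filter)]:
    print(name, apply_mapping(matrix, mixture), diagonal_energy_ratio(matrix))
```

Under this toy measure, a mapping with a predominantly diagonal structure would score close to 1.0, while a mapping that exploits richer inter-frequency structure would score lower.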