Vassilis Lyberatos

h-index2

7papers

39citations

Novelty29%

AI Score47

Ranked #31,746 of 194,257 authors (top 16%)#205 in SD (top 11%)

7 Papers

5.5HCJun 12, 2023Code

Employing Crowdsourcing for Enriching a Music Knowledge Base in Higher Education

Vassilis Lyberatos, Spyridon Kantarelis, Eirini Kaldeli et al.

This paper describes the methodology followed and the lessons learned from employing crowdsourcing techniques as part of a homework assignment involving higher education students of computer science. Making use of a platform that supports crowdsourcing in the cultural heritage domain students were solicited to enrich the metadata associated with a selection of music tracks. The results of the campaign were further analyzed and exploited by students through the use of semantic web technologies. In total, 98 students participated in the campaign, contributing more than 6400 annotations concerning 854 tracks. The process also led to the creation of an openly available annotated dataset, which can be useful for machine learning models for music tagging. The campaign's results and the comments gathered through an online survey enable us to draw some useful insights about the benefits and challenges of integrating crowdsourcing into computer science curricula and how this can enhance students' engagement in the learning process.

1.5CVApr 28, 2023Code

Synergy of Machine and Deep Learning Models for Multi-Painter Recognition

Vassilis Lyberatos, Paraskevi-Antonia Theofilou, Jason Liartis et al.

The growing availability of digitized art collections has created the need to manage, analyze and categorize large amounts of data related to abstract concepts, highlighting a demanding problem of computer science and leading to new research perspectives. Advances in artificial intelligence and neural networks provide the right tools for this challenge. The analysis of artworks to extract features useful in certain works is at the heart of the era. In the present work, we approach the problem of painter recognition in a set of digitized paintings, derived from the WikiArt repository, using transfer learning to extract the appropriate features and classical machine learning methods to evaluate the result. Through the testing of various models and their fine tuning we came to the conclusion that RegNet performs better in exporting features, while SVM makes the best classification of images based on the painter with a performance of up to 85%. Also, we introduced a new large dataset for painting recognition task including 62 artists achieving good results.

6.8HCMay 3

Music Interpretation and Emotion Perception: A Computational and Neurophysiological Investigation

Vassilis Lyberatos, Spyridon Kantarelis, Ioanna Zioga et al.

This study investigates emotional expression and perception in music performance using computational and neurophysiological methods. The influence of different performance settings, such as repertoire, diatonic modal etudes, and improvisation, as well as levels of expressiveness, on performers' emotional communication and listeners' reactions is explored. Professional musicians performed various tasks, and emotional annotations were provided by both performers and the audience. Audio analysis revealed that expressive and improvisational performances exhibited unique acoustic features, while emotion analysis showed stronger emotional responses. Neurophysiological measurements indicated greater relaxation in improvisational performances. This multimodal study highlights the significance of expressivity in enhancing emotional communication and audience engagement.

4.0SDOct 2, 2025Code

Go witheFlow: Real-time Emotion Driven Audio Effects Modulation

Edmund Dervakos, Spyridon Kantarelis, Vassilis Lyberatos et al.

Music performance is a distinctly human activity, intrinsically linked to the performer's ability to convey, evoke, or express emotion. Machines cannot perform music in the human sense; they can produce, reproduce, execute, or synthesize music, but they lack the capacity for affective or emotional experience. As such, music performance is an ideal candidate through which to explore aspects of collaboration between humans and machines. In this paper, we introduce the witheFlow system, designed to enhance real-time music performance by automatically modulating audio effects based on features extracted from both biosignals and the audio itself. The system, currently in a proof-of-concept phase, is designed to be lightweight, able to run locally on a laptop, and is open-source given the availability of a compatible Digital Audio Workstation and sensors.

11.5SDDec 18, 2023Code

Perceptual Musical Features for Interpretable Audio Tagging

Vassilis Lyberatos, Spyridon Kantarelis, Edmund Dervakos et al.

In the age of music streaming platforms, the task of automatically tagging music audio has garnered significant attention, driving researchers to devise methods aimed at enhancing performance metrics on standard datasets. Most recent approaches rely on deep neural networks, which, despite their impressive performance, possess opacity, making it challenging to elucidate their output for a given input. While the issue of interpretability has been emphasized in other fields like medicine, it has not received attention in music-related tasks. In this study, we explored the relevance of interpretability in the context of automatic music tagging. We constructed a workflow that incorporates three different information extraction techniques: a) leveraging symbolic knowledge, b) utilizing auxiliary deep neural networks, and c) employing signal processing to extract perceptual features from audio files. These features were subsequently used to train an interpretable machine-learning model for tag prediction. We conducted experiments on two datasets, namely the MTG-Jamendo dataset and the GTZAN dataset. Our method surpassed the performance of baseline models in both tasks and, in certain instances, demonstrated competitiveness with the current state-of-the-art. We conclude that there are use cases where the deterioration in performance is outweighed by the value of interpretability.

10.4CLOct 20, 2024Code

BERTtime Stories: Investigating the Role of Synthetic Story Data in Language Pre-training

Nikitas Theodoropoulos, Giorgos Filandrianos, Vassilis Lyberatos et al.

We describe our contribution to the Strict and Strict-Small tracks of the 2nd iteration of the BabyLM Challenge. The shared task is centered around efficient pre-training given data constraints motivated by human development. In response, we study the effect of synthetic story data in language pre-training using TinyStories: a recently introduced dataset of short stories. Initially, we train GPT-Neo models on subsets of TinyStories, while varying the amount of available data. We find that, even with access to less than 100M words, the models are able to generate high-quality, original completions to a given story, and acquire substantial linguistic knowledge. To measure the effect of synthetic story data, we train LTG-BERT encoder models on a combined dataset of: a subset of TinyStories, story completions generated by GPT-Neo, and a subset of the BabyLM dataset. Our experimentation reveals that synthetic data can occasionally offer modest gains, but overall have a negative influence on linguistic understanding. Our work offers an initial study on synthesizing story data in low resource settings and underscores their potential for augmentation in data-constrained language modeling. We publicly release our models and implementation on our GitHub.

6.7SDOct 29, 2024Code

CHORDONOMICON: A Dataset of 666,000 Songs and their Chord Progressions

Spyridon Kantarelis, Konstantinos Thomas, Vassilis Lyberatos et al.

Chord progressions encapsulate important information about music, pertaining to its structure and conveyed emotions. They serve as the backbone of musical composition, and in many cases, they are the sole information required for a musician to play along and follow the music. Despite their importance, chord progressions as a data domain remain underexplored. There is a lack of large-scale datasets suitable for deep learning applications, and limited research exploring chord progressions as an input modality. In this work, we present Chordonomicon, a dataset of over 666,000 songs and their chord progressions, annotated with structural parts, genre, and release date - created by scraping various sources of user-generated progressions and associated metadata. We demonstrate the practical utility of the Chordonomicon dataset for classification and generation tasks, and discuss its potential to provide valuable insights to the research community. Chord progressions are unique in their ability to be represented in multiple formats (e.g. text, graph) and the wealth of information chords convey in given contexts, such as their harmonic function . These characteristics make the Chordonomicon an ideal testbed for exploring advanced machine learning techniques, including transformers, graph machine learning, and hybrid systems that combine knowledge representation and machine learning.