SD AS
Jul 20, 2021

PERSA+: A Deep Learning Front-End for Context-Agnostic Audio Classification

Lazaros Vrysis, Iordanis Thoidis, Charalampos Dimoulas, George Papanikolaou

arXiv:2107.09311v1

Originality: Synthesis-oriented

AI Analysis

This work addresses the issue of overfitting in audio classification for real-world applications, but it appears incremental as it builds on existing deep learning methods.

The authors address the problem that deep learning audio classifiers often perform well only on benchmarks. They propose a deep learning front-end that discards detrimental information before the modeling stage, with the aim of producing robust, context-agnostic classification algorithms.

Deep learning has been applied to diverse audio semantics tasks, enabling the construction of models that learn hierarchical levels of features from high-dimensional raw data and deliver state-of-the-art performance. But do these algorithms perform similarly in real-world conditions, or only on benchmarks, where their high learning capacity lets them memorize the employed datasets completely? This work presents a deep learning front-end that aims to discard detrimental information before it enters the modeling stage, focusing the learning process on the relevant content and paving the way for robust, context-agnostic classification algorithms.
