CL, Jun 11, 2024

Sustainable self-supervised learning for speech representations

arXiv:2406.07696v1, 3 citations
Originality: Incremental advance
AI Analysis

This work addresses environmental concerns for AI practitioners by making speech representation models more computationally efficient. The advance is incremental, since it builds on an existing resource-efficient baseline.

The paper tackles the high energy consumption of speech representation models by proposing a sustainable self-supervised learning approach. Compared with large speech representation models, it reduces memory usage by an order of magnitude and estimated computing costs by almost three orders of magnitude. Compared with its resource-efficient baseline, it also lowers error rates in downstream tasks.

Sustainable artificial intelligence focuses on data, hardware, and algorithms to make machine learning models more environmentally responsible. In particular, machine learning models for speech representations are computationally expensive, generating environmental concerns because of their high energy consumption. Thus, we propose a sustainable self-supervised model to learn speech representation, combining optimizations in neural layers and training to reduce computing costs. The proposed model improves over a resource-efficient baseline, reducing both memory usage and computing cost estimations. It pretrains using a single GPU in less than a day. On top of that, it improves the error rate performance of the baseline in downstream task evaluations. When comparing it to large speech representation approaches, there is an order of magnitude reduction in memory usage, while computing cost reductions represent almost three orders of magnitude improvement.

