Machine learning in acoustics: theory and applications
This survey paper reviews existing machine learning methods and their applications in acoustics, for researchers and practitioners in fields such as biology, communications, and Earth science.
The paper surveys recent advances in applying machine learning (ML), particularly deep learning, to acoustics. It highlights ML's data-driven approach to discovering complex relationships in acoustic phenomena such as speech and reverberation, and notes compelling results across several fields.
Acoustic data provide scientific and engineering insights in fields ranging from biology and communications to ocean and Earth science. We survey the recent advances and transformative potential of machine learning (ML), including deep learning, in the field of acoustics. ML is a broad family of techniques, which are often based in statistics, for automatically detecting and utilizing patterns in data. Relative to conventional acoustics and signal processing, ML is data-driven. Given sufficient training data, ML can discover complex relationships between features and desired labels or actions, or between features themselves. With large volumes of training data, ML can discover models describing complex acoustic phenomena such as human speech and reverberation. ML in acoustics is rapidly developing with compelling results and significant future promise. We first introduce ML, then highlight ML developments in four acoustics research areas: source localization in speech processing, source localization in ocean acoustics, bioacoustics, and environmental sounds in everyday scenes.