N-HANS: Introducing the Augsburg Neuro-Holistic Audio-eNhancement System
N-HANS provides a practical tool for researchers and developers working on audio enhancement, but the contribution is incremental because it builds on existing neural network architectures.
The authors tackled the problem of in-the-wild audio enhancement for speech, music, and general audio by introducing N-HANS, a Python toolkit that achieves outstanding performance with very high audio and speech quality in real-life tasks.
N-HANS is a Python toolkit for in-the-wild audio enhancement, covering speech, music, and general audio denoising, separation, and selective noise or source suppression. These functionalities are realised by two neural network models that share the same architecture but are trained separately. Each model consists of stacks of residual blocks, and every block is conditioned on additional speech or environmental noise recordings so that the model can adapt to unseen speakers or environments in real life. In addition to a Python API, a command line interface is provided to researchers and developers; both are documented at https://github.com/N-HANS/N-HANS. Experimental results indicate that N-HANS achieves outstanding performance, reaching very high audio and speech quality and supporting its reliable use in real-life audio and speech-related tasks.
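To make the idea of a residual block conditioned on a reference recording more concrete, the following is a minimal sketch in plain Python. It is not the N-HANS implementation: the function names, the dense (rather than convolutional) layers, the way the conditioning embedding enters as a shared bias, and all dimensions are assumptions chosen for illustration only.

```python
import random


def matvec(vec, weights):
    """Multiply a row vector of length n by an n x m weight matrix."""
    return [sum(v * row[j] for v, row in zip(vec, weights))
            for j in range(len(weights[0]))]


def conditioned_residual_block(frames, cond, w_in, w_cond, w_out):
    """Apply one conditioned residual block to a list of feature frames.

    frames: list of feature vectors, one per time frame
    cond:   embedding of a reference recording (hypothetical; e.g. a noise
            or speaker embedding produced elsewhere)
    """
    # The conditioning embedding is projected to a bias shared by all frames.
    bias = matvec(cond, w_cond)
    out = []
    for frame in frames:
        pre = matvec(frame, w_in)
        hidden = [max(0.0, p + b) for p, b in zip(pre, bias)]  # ReLU
        update = matvec(hidden, w_out)
        # Skip connection: the block only learns a correction to its input.
        out.append([f + u for f, u in zip(frame, update)])
    return out


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights standing in for trained parameters."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
channels, hidden, embed_dim = 4, 8, 3  # illustrative sizes only
frames = [[rng.gauss(0.0, 1.0) for _ in range(channels)] for _ in range(5)]
cond = [rng.gauss(0.0, 1.0) for _ in range(embed_dim)]
w_in = random_matrix(channels, hidden, rng)
w_cond = random_matrix(embed_dim, hidden, rng)
w_out = random_matrix(hidden, channels, rng)

enhanced = conditioned_residual_block(frames, cond, w_in, w_cond, w_out)
```

Because of the skip connection, the output keeps the shape of the input, so such blocks can be stacked. Swapping the conditioning embedding changes the correction applied to every frame without retraining the weights, which is the general mechanism that lets a conditioned model adapt to a new speaker or environment.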