auDeep: Unsupervised Learning of Representations from Audio with Deep Recurrent Neural Networks
auDeep gives researchers and developers in audio processing a tool for unsupervised representation learning. The contribution is incremental, since it builds on existing autoencoder approaches.
The authors address the problem of learning representations from audio data without supervision. They developed auDeep, a toolkit based on recurrent sequence-to-sequence autoencoders that accounts for temporal dynamics, and found that its features are competitive with state-of-the-art audio classification methods.
auDeep is a Python toolkit for deep unsupervised representation learning from acoustic data. It is based on a recurrent sequence-to-sequence autoencoder approach, which learns representations of time series data by taking into account their temporal dynamics. We provide an extensive command line interface in addition to a Python API for users and developers, both of which are comprehensively documented and publicly available at https://github.com/auDeep/auDeep. Experimental results indicate that auDeep features are competitive with state-of-the-art audio classification methods.
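
To make the idea of a recurrent sequence-to-sequence autoencoder concrete, the following is a minimal, untrained forward-pass sketch in plain Python. It is not auDeep's implementation or API: the class `TinySeq2SeqAutoencoder`, the helper `toy_frames`, the plain tanh recurrence, and all sizes are assumptions chosen for illustration. It shows only the core mechanism: an encoder reads a variable-length sequence of feature frames, its final hidden state serves as a fixed-length representation, and a decoder tries to reconstruct the frames from that state.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product for list-based matrices."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_step(w_in, w_rec, x, h):
    """One step of a plain tanh recurrence (an assumption; not a gated cell)."""
    a = matvec(w_in, x)
    b = matvec(w_rec, h)
    return [math.tanh(p + q) for p, q in zip(a, b)]


class TinySeq2SeqAutoencoder:
    """Hypothetical illustration only; weights are random and never trained."""

    def __init__(self, n_features, n_hidden, seed=0):
        rng = random.Random(seed)
        self.n_features = n_features
        self.n_hidden = n_hidden
        self.enc_in = make_matrix(n_hidden, n_features, rng)
        self.enc_rec = make_matrix(n_hidden, n_hidden, rng)
        self.dec_in = make_matrix(n_hidden, n_features, rng)
        self.dec_rec = make_matrix(n_hidden, n_hidden, rng)
        self.dec_out = make_matrix(n_features, n_hidden, rng)

    def encode(self, frames):
        """Read all frames in order; the final hidden state is the representation."""
        h = [0.0] * self.n_hidden
        for x in frames:
            h = rnn_step(self.enc_in, self.enc_rec, x, h)
        return h

    def decode(self, code, n_frames):
        """Unroll the decoder from the code, feeding back its own previous output."""
        h = list(code)
        prev = [0.0] * self.n_features
        outputs = []
        for _ in range(n_frames):
            h = rnn_step(self.dec_in, self.dec_rec, prev, h)
            prev = matvec(self.dec_out, h)
            outputs.append(prev)
        return outputs

    def reconstruction_error(self, frames):
        """Mean squared error between input frames and their reconstruction."""
        recon = self.decode(self.encode(frames), len(frames))
        total = sum(
            (a - b) ** 2 for x, y in zip(frames, recon) for a, b in zip(x, y)
        )
        return total / (len(frames) * len(frames[0]))


def toy_frames(n_frames, n_features):
    """Synthetic stand-in for spectrogram frames (not real audio)."""
    return [
        [math.sin(0.3 * t + 0.5 * f) for f in range(n_features)]
        for t in range(n_frames)
    ]


if __name__ == "__main__":
    model = TinySeq2SeqAutoencoder(n_features=4, n_hidden=8)
    for length in (5, 12):
        frames = toy_frames(length, 4)
        code = model.encode(frames)
        print(length, len(code), model.reconstruction_error(frames))
```

In this sketch, sequences of different lengths map to representation vectors of the same size, which is what allows the learned features to be passed to a standard classifier. A trained system would minimize the reconstruction error over a corpus; that training loop is omitted here.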