Introducing Latent Timbre Synthesis
This work addresses timbre manipulation for composers and sound designers and offers an accessible tool, although the contribution appears incremental because it builds on existing variational autoencoder architectures.
The paper introduces Latent Timbre Synthesis (LTS), a deep learning method for audio synthesis that enables interpolation and extrapolation between sound timbres using latent spaces, and provides an open-source application with a graphical interface for practical use.
We present Latent Timbre Synthesis (LTS), a new audio synthesis method based on deep learning. The method allows composers and sound designers to interpolate and extrapolate between the timbres of multiple sounds using the latent space of audio frames. We detail two Variational Autoencoder architectures for LTS and compare their advantages and drawbacks. The implementation includes a fully working application with a graphical user interface, called \textit{interpolate\_two}, which lets practitioners explore the timbre between two audio excerpts of their choice through interpolation and extrapolation in the latent space of audio frames. Our implementation is open-source, and we aim to improve the accessibility of this technology by providing a guide for users of any technical background.
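To make the idea of interpolating and extrapolating in the latent space of audio frames concrete, the following minimal sketch blends two per-frame latent vectors. It assumes a simple linear blend controlled by a coefficient alpha, which the abstract does not specify. The function names mix_latents and mix_frame_sequences are hypothetical and are not taken from the LTS code base, and the encoder and decoder that would map audio frames to and from latent vectors are omitted.

```python
from typing import List, Sequence


def mix_latents(z_a: Sequence[float], z_b: Sequence[float], alpha: float) -> List[float]:
    """Linearly blend two latent vectors (assumed scheme, not confirmed by the paper).

    alpha in [0, 1] interpolates between z_a (alpha = 0) and z_b (alpha = 1);
    alpha outside that range extrapolates beyond either endpoint.
    """
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same dimensionality")
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(z_a, z_b)]


def mix_frame_sequences(
    frames_a: Sequence[Sequence[float]],
    frames_b: Sequence[Sequence[float]],
    alpha: float,
) -> List[List[float]]:
    """Blend two sequences of per-frame latent vectors, frame by frame.

    The output is truncated to the shorter of the two sequences.
    """
    n_frames = min(len(frames_a), len(frames_b))
    return [mix_latents(frames_a[i], frames_b[i], alpha) for i in range(n_frames)]


if __name__ == "__main__":
    # Toy 2-dimensional latent vectors standing in for encoded audio frames.
    z_sound_a = [0.0, 2.0]
    z_sound_b = [1.0, 4.0]
    print(mix_latents(z_sound_a, z_sound_b, 0.5))  # midpoint between the two timbres
    print(mix_latents(z_sound_a, z_sound_b, 1.5))  # extrapolation past sound B
```

In a full pipeline, each blended latent vector would be passed through the decoder of a trained Variational Autoencoder to obtain an audio frame. That step depends on the specific architecture and is left out here.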