Weakly Supervised Multi-Embeddings Learning of Acoustic Models
This work enables efficient multi-task learning for speech processing, but it appears incremental: it builds on existing Siamese network methods without introducing a new paradigm.
The paper trains acoustic models with weakly supervised multi-embedding learning, using a Siamese network on a speech dataset, and finds that sharing one network between the word and talker discrimination tasks causes no loss in performance.
We trained a Siamese network with multi-task same/different information on a speech dataset and found that a single network could be shared between both tasks without a loss in performance. The first task was to decide whether two words were the same or different, and the second was to decide whether two talkers were the same or different.
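The shared-network idea can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the layer sizes, the split into one shared layer plus two heads (`word_head`, `talker_head`), and the cosine-based same/different loss are all assumptions made for illustration, since the text above does not specify the architecture or the loss.

```python
import math
import random


def linear(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def relu(v):
    return [max(0.0, a) for a in v]


def cosine(u, v):
    """Cosine similarity, with a small constant to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def init_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def make_params(rng, n_in=8, n_hidden=16, n_emb=4):
    # Hypothetical sizes: one shared layer feeding two embedding heads.
    return {
        "shared": init_matrix(rng, n_hidden, n_in),
        "word_head": init_matrix(rng, n_emb, n_hidden),
        "talker_head": init_matrix(rng, n_emb, n_hidden),
    }


def embed(params, x):
    """Return (word embedding, talker embedding) computed from a shared layer."""
    hidden = relu(linear(params["shared"], x))
    return linear(params["word_head"], hidden), linear(params["talker_head"], hidden)


def same_different_loss(emb_a, emb_b, same):
    """Assumed loss: pull 'same' pairs together, push 'different' pairs apart."""
    c = cosine(emb_a, emb_b)
    return 1.0 - c if same else max(0.0, c)


def multitask_loss(params, x_a, x_b, same_word, same_talker):
    """Siamese pass: both inputs go through the same weights; the losses are summed."""
    word_a, talker_a = embed(params, x_a)
    word_b, talker_b = embed(params, x_b)
    return (same_different_loss(word_a, word_b, same_word)
            + same_different_loss(talker_a, talker_b, same_talker))
```

Each training pair carries two weak labels, `same_word` and `same_talker`, so one pair of inputs supervises both embeddings at once. In practice the parameters would be updated by gradient descent in an automatic-differentiation framework; that step is omitted here.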