AS LG SD ML

Nov 29, 2018

Tuplemax Loss for Language Identification

Li Wan, Prashant Sridhar, Yang Yu, Quan Wang, Ignacio Lopez Moreno

arXiv:1811.12290v26.619 citations

Originality: Incremental advance

AI Analysis

This work addresses language identification for users with limited language sets, offering a significant but incremental improvement over existing methods.

The paper tackled the problem of language identification by incorporating prior knowledge that users typically speak only a few languages, introducing a novel tuplemax loss function to replace softmax loss. This resulted in a 2.33% error rate, a 39.4% relative improvement over the 3.85% error rate of the standard method.

In many language identification scenarios, the user specifies a small set of languages they can speak rather than the full set of all possible languages. We want to incorporate this prior knowledge into the way we train our neural networks by replacing the commonly used softmax loss function with a novel loss function named tuplemax loss. In fact, about 95% of the users of a typical language identification system launched in North America speak no more than two languages. Using the tuplemax loss, our system achieved a 2.33% error rate, a 39.4% relative improvement over the 3.85% error rate of the standard softmax loss method.
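The abstract does not state the formula for the tuplemax loss, so the following is only a minimal sketch under an assumption: that the loss averages the two-class softmax loss between the target language and each other language, which is one natural way to encode the "small candidate set" prior. The function names `softmax_loss` and `tuplemax_loss` are illustrative, not taken from the paper, and the exact definition should be checked against the paper itself.

```python
import math


def _log_add_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def softmax_loss(logits, target):
    """Standard softmax cross-entropy: -log(exp(z_y) / sum_k exp(z_k))."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def tuplemax_loss(logits, target):
    """Assumed pairwise form: mean over k != y of
    -log(exp(z_y) / (exp(z_y) + exp(z_k))).

    Requires at least two classes; with exactly two it equals softmax_loss.
    """
    z_y = logits[target]
    pairwise = [
        _log_add_exp(z_y, z_k) - z_y
        for k, z_k in enumerate(logits)
        if k != target
    ]
    return sum(pairwise) / len(pairwise)
```

Under this assumed form, each term compares the target against a single competitor, so the loss is not dominated by the total number of languages in the model, only by how well the target separates from each alternative individually.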
