Real Additive Margin Softmax for Speaker Verification
This work addresses a specific issue in speaker verification by correcting the additive margin softmax (AM-Softmax) loss function, an incremental but targeted improvement for the field.
The authors show that the additive margin softmax (AM-Softmax) loss does not achieve true max-margin training in speaker verification, and propose a corrected Real AM-Softmax loss that consistently outperforms the original on VoxCeleb1, SITW, and CNCeleb.
The additive margin softmax (AM-Softmax) loss has delivered remarkable performance in speaker verification. AM-Softmax is commonly assumed to shrink within-class variation by putting emphasis on target logits, which in turn improves the margin between target and non-target classes. In this paper, we conduct a careful analysis of the behavior of the AM-Softmax loss, and show that this loss does not implement real max-margin training. Based on this observation, we present a Real AM-Softmax loss, which involves a true margin function in the softmax training. Experiments conducted on VoxCeleb1, SITW, and CNCeleb demonstrate that the corrected AM-Softmax loss consistently outperforms the original one. The code has been released at https://gitlab.com/csltstu/sunine.
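
To make the abstract's central claim concrete, the following is a minimal NumPy sketch of the original AM-Softmax loss as it is commonly formulated in the literature (subtract a margin m from the target cosine, scale by s, apply softmax cross-entropy). It is not taken from the released sunine code. The function names, the values s=30 and m=0.2, and the example cosines are illustrative assumptions. The Real AM-Softmax loss itself is not reproduced here because the abstract does not state its formula. The second function rewrites the same loss in a pairwise form, which suggests why it behaves like a soft penalty rather than a hard margin: the loss stays positive even when the target cosine exceeds every non-target cosine by more than m.

```python
import numpy as np

def am_softmax_loss(cosines, target, s=30.0, m=0.2):
    """Standard AM-Softmax loss for a single sample (illustrative sketch).

    cosines: cos(theta_j) between the embedding and each class weight.
    target:  index of the true class.
    s, m:    scale and additive margin (assumed values, not from the paper).
    """
    logits = s * np.array(cosines, dtype=float)  # copy so the input is untouched
    logits[target] -= s * m                      # margin applies to the target logit only
    logits -= logits.max()                       # stabilize the exponentials
    return -(logits[target] - np.log(np.exp(logits).sum()))

def am_softmax_pairwise(cosines, target, s=30.0, m=0.2):
    """The same loss written as log(1 + sum_j exp(-s * (cos_y - m - cos_j)))."""
    cosines = np.array(cosines, dtype=float)
    others = np.delete(cosines, target)          # non-target cosines
    gaps = cosines[target] - m - others          # how far each pair clears the margin
    return np.log1p(np.exp(-s * gaps).sum())

# Hypothetical sample: the target (class 0) beats both non-targets by more than m.
cosines = [0.7, 0.3, 0.2]
print(am_softmax_loss(cosines, 0), am_softmax_pairwise(cosines, 0))
```

Both functions return the same small positive value for this sample. A loss with a hard margin would return zero here, since every pair already clears the margin; this gap between soft and hard behavior is one reading of what the abstract means by "real max-margin training."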