CL · Feb 19, 2020

The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding

arXiv:2002.07972v2 · 1020 citations · Has Code
Originality: Synthesis-oriented
AI Analysis

This toolkit addresses the need of researchers and developers to build and deploy NLU models easily. The contribution is incremental, since it builds on existing frameworks such as PyTorch and Transformers.

The authors tackle the challenge of training customized deep learning models for natural language understanding with MT-DNN, an open-source toolkit that supports rapid customization across tasks and domains. It provides adversarial multi-task learning for robust, transferable models and knowledge distillation for efficient deployment.

We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models. Built upon PyTorch and Transformers, MT-DNN is designed to facilitate rapid customization for a broad spectrum of NLU tasks, using a variety of objectives (classification, regression, structured prediction) and text encoders (e.g., RNNs, BERT, RoBERTa, UniLM). A unique feature of MT-DNN is its built-in support for robust and transferable learning using the adversarial multi-task learning paradigm. To enable efficient production deployment, MT-DNN supports multi-task knowledge distillation, which can substantially compress a deep neural model without significant performance drop. We demonstrate the effectiveness of MT-DNN on a wide range of NLU applications across general and biomedical domains. The software and pre-trained models will be publicly available at https://github.com/namisan/mt-dnn.
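
The abstract mentions multi-task knowledge distillation without spelling out the objective. As a rough illustration only, the sketch below uses one common formulation: a temperature-scaled soft-target cross-entropy between teacher and student outputs, averaged over examples drawn from several tasks. This choice of loss is an assumption, not something the abstract states. The function names (`softmax`, `distillation_loss`, `multi_task_distillation_loss`) and the default temperature are hypothetical and are not part of the MT-DNN API. Plain Python is used to keep the example dependency-free, although the toolkit itself is built on PyTorch.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, stabilised by subtracting the max logit."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student's softened distribution against the
    teacher's softened distribution (the "soft targets")."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))


def multi_task_distillation_loss(batches, temperature=2.0):
    """Average the soft-target loss over examples from several tasks.

    `batches` maps a (hypothetical) task name to a list of
    (student_logits, teacher_logits) pairs; tasks may differ in label count.
    """
    losses = [
        distillation_loss(student, teacher, temperature)
        for pairs in batches.values()
        for student, teacher in pairs
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Illustrative logits only; these are not results from the paper.
    batches = {
        "task_a": [([2.0, 0.5], [2.5, 0.1])],
        "task_b": [([0.1, 0.2, 3.0], [0.0, 0.3, 2.8])],
    }
    print(multi_task_distillation_loss(batches))
```

For a fixed teacher distribution, this loss is smallest when the student's softened distribution matches the teacher's, which is why minimising it pushes a compact student toward the behaviour of a larger teacher.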

Code Implementations: 3 repos
