CL | Feb 19, 2020

The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding

Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao

arXiv:2002.07972v231.41020 citations | Has Code

Originality: Synthesis-oriented

AI Analysis

This toolkit addresses the need for researchers and developers to build and deploy NLU models easily. The contribution is incremental, since it builds on existing frameworks such as PyTorch and Transformers.

The authors address the challenge of training customized deep learning models for natural language understanding with MT-DNN, an open-source toolkit that supports rapid customization across tasks and domains. Its features include adversarial multi-task learning and, for efficient deployment, multi-task knowledge distillation.

We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models. Built upon PyTorch and Transformers, MT-DNN is designed to facilitate rapid customization for a broad spectrum of NLU tasks, using a variety of objectives (classification, regression, structured prediction) and text encoders (e.g., RNNs, BERT, RoBERTa, UniLM). A unique feature of MT-DNN is its built-in support for robust and transferable learning using the adversarial multi-task learning paradigm. To enable efficient production deployment, MT-DNN supports multi-task knowledge distillation, which can substantially compress a deep neural model without significant performance drop. We demonstrate the effectiveness of MT-DNN on a wide range of NLU applications across general and biomedical domains. The software and pre-trained models will be publicly available at https://github.com/namisan/mt-dnn.
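To make the knowledge distillation idea in the abstract concrete, the following is a minimal sketch of a generic soft-target distillation loss, in which a small student model is trained to match the class probabilities of a larger teacher. It is not MT-DNN's API or the authors' implementation: the function names (`log_softmax`, `distillation_loss`) and the `temperature` parameter are assumptions chosen for illustration, and the code uses only the Python standard library rather than PyTorch.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of logits (hypothetical helper)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_total for x in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Cross-entropy between the teacher's soft targets and the student's predictions.

    Generic soft-target loss, assumed here for illustration only.
    """
    teacher_probs = [math.exp(lp) for lp in log_softmax(teacher_logits, temperature)]
    student_log_probs = log_softmax(student_logits, temperature)
    return -sum(t * s for t, s in zip(teacher_probs, student_log_probs))


if __name__ == "__main__":
    teacher = [0.0, 3.0]
    # A student that agrees with the teacher incurs a lower loss than one that disagrees.
    print(distillation_loss([0.0, 3.0], teacher))
    print(distillation_loss([3.0, 0.0], teacher))
```

In a multi-task setting, a loss of this shape would typically be computed per task, with soft targets taken from one or more teacher models for that task; how MT-DNN combines them is described in the paper and repository, not here.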
