CL, Jan 22, 2021

Multilingual Pre-Trained Transformers and Convolutional NN Classification Models for Technical Domain Identification

arXiv:2101.09012v1712 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses domain identification for technical text in multiple languages, but it is incremental as it applies existing models to a specific dataset.

The paper tackles technical domain identification on multilingual text using transfer learning, with one system based on BERT and another combining XLM-RoBERTa with a CNN classifier. It ranked best on subtasks 1d and 1g of the ICON 2020 shared task.

In this paper, we present a transfer learning system for technical domain identification on multilingual text data. We submitted two runs: one uses the transformer model BERT, and the other uses XLM-RoBERTa with a CNN model for text classification. These models allowed us to identify the domain of the given sentences for the ICON 2020 shared task, TechDOfication: Technical Domain Identification. Our system ranked best on subtasks 1d and 1g of the TechDOfication dataset.
