Efficient Multilingual Name Type Classification Using Convolutional Networks
This work offers an efficient solution for multilingual NLP applications where speed and energy consumption matter, though the contribution is incremental, chiefly optimizing existing CNN approaches.
The paper tackles multilingual name classification by language and entity type, achieving 92.1% accuracy and processing 2,813 names per second on a single CPU core, which is 46 times faster than a fine-tuned XLM-RoBERTa baseline of comparable accuracy.
We present a convolutional neural network approach for classifying proper names by language and entity type. Our model, Onomas-CNN X, combines parallel convolution branches with depthwise-separable operations and hierarchical classification to process names efficiently on CPU hardware. We evaluate the architecture on a large multilingual dataset covering 104 languages and four entity types (person, organization, location, other). Onomas-CNN X achieves 92.1% accuracy while processing 2,813 names per second on a single CPU core, which is 46 times faster than a fine-tuned XLM-RoBERTa model of comparable accuracy. The model also reduces energy consumption by a factor of 46 relative to transformer baselines. Our experiments demonstrate that specialized CNN architectures remain competitive with large pre-trained models on focused NLP tasks when sufficient training data exists.
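To give a rough sense of why depthwise-separable operations in parallel branches keep a model small, the sketch below compares parameter counts for standard and depthwise-separable 1D convolutions. The text above does not state kernel widths or channel counts, so `KERNEL_SIZES`, `C_IN`, and `C_OUT` are hypothetical values chosen for illustration only. The helper functions are likewise not taken from the Onomas-CNN X implementation.

```python
# Illustrative parameter-count comparison. All sizes below are assumptions,
# not values reported for Onomas-CNN X.

def standard_conv1d_params(c_in: int, c_out: int, kernel_size: int) -> int:
    """Weights plus biases of a standard 1D convolution."""
    return c_in * c_out * kernel_size + c_out


def depthwise_separable_conv1d_params(c_in: int, c_out: int, kernel_size: int) -> int:
    """Depthwise conv (one filter per input channel) followed by a 1x1 pointwise conv."""
    depthwise = c_in * kernel_size + c_in  # per-channel filters + biases
    pointwise = c_in * c_out + c_out       # 1x1 channel mixing + biases
    return depthwise + pointwise


KERNEL_SIZES = (2, 3, 5)  # assumed: one parallel branch per kernel width
C_IN, C_OUT = 64, 128     # assumed channel widths

# Sum over the parallel branches, one convolution per kernel width.
standard_total = sum(standard_conv1d_params(C_IN, C_OUT, k) for k in KERNEL_SIZES)
separable_total = sum(
    depthwise_separable_conv1d_params(C_IN, C_OUT, k) for k in KERNEL_SIZES
)

print(f"standard branches:  {standard_total} parameters")
print(f"separable branches: {separable_total} parameters")
print(f"reduction factor:   {standard_total / separable_total:.2f}x")
```

Under these assumed sizes the separable branches need roughly a third of the parameters of the standard ones. This figure illustrates the general technique only and says nothing about the reported 46-fold speed or energy gains, which depend on the full architecture and the baseline.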