CL · Apr 21, 2024

How to Encode Domain Information in Relation Classification

Elisa Bassignana, Viggo Unmack Gascou, Frida Nøhr Laustsen, Gustav Kristensen, Marie Haahr Petersen, Rob van der Goot, Barbara Plank

arXiv:2404.13760v123.781 citations · h-index: 19 · Has Code · LREC

Originality: Incremental advance

AI Analysis

This work addresses the challenge of combining domain-specific datasets for relation classification. The advance is incremental: it builds on existing methods by adding domain encoding.

The paper tackles the problem of improving relation classification performance across domain-specific datasets by encoding domain information in a multi-domain training setup, improving on the baseline by more than 2 Macro-F1 points.

Current language models require a lot of training data to obtain high performance. For Relation Classification (RC), many datasets are domain-specific, so combining datasets to obtain better performance is non-trivial. We explore a multi-domain training setup for RC, and attempt to improve performance by encoding domain information. Our proposed models improve by more than 2 Macro-F1 points over the baseline setup, and our analysis reveals that not all labels benefit equally: the classes which occupy a similar space across domains (i.e., their interpretation is close across them, for example "physical") benefit the least, while domain-dependent relations (e.g., "part-of") improve the most when encoding domain information.
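
Because the abstract reports both an overall Macro-F1 gain and per-label differences, it may help to recall how the metric is built: Macro-F1 is the unweighted mean of the per-class F1 scores, so every relation label counts equally regardless of frequency. The following is a minimal sketch in plain Python. The function names and the toy gold/predicted labels are hypothetical and are not taken from the paper or its released code.

```python
from collections import Counter


def per_class_f1(gold, pred):
    """Return {label: F1} for every label that occurs in gold or pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[g] += 1  # gold label g was missed
    scores = {}
    for label in set(gold) | set(pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 (0.0 if there are no labels)."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Toy data (hypothetical), using two label names mentioned in the abstract.
gold = ["physical", "physical", "part-of", "part-of"]
pred = ["physical", "physical", "part-of", "physical"]

scores = per_class_f1(gold, pred)
overall = macro_f1(gold, pred)
```

Comparing `per_class_f1` between a baseline and a domain-aware model is one way to see which labels account for a change in the overall score, which is the kind of per-label analysis the abstract describes.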
