Human and Automatic Interpretation of Romanian Noun Compounds
This work addresses the context-dependent interpretation of Romanian noun compounds in NLP. The contribution is incremental: it builds on existing relation inventories and focuses on a single language.
The study addresses the interpretation of Romanian noun compounds by proposing a new set of semantic relations and testing it with human annotators and a neural net classifier. The network's predictions align with human judgments even in low-agreement cases. However, the most frequently selected relation was none of the labeled relations, which points to limitations in the inventory.
Determining the intended, context-dependent meanings of noun compounds like "shoe sale" and "fire sale" remains a challenge for NLP. Previous work has relied on inventories of semantic relations that capture the different meanings between compound members. Focusing on Romanian compounds, whose morphosyntax differs from that of their English counterparts, we propose a new set of relations and test it with human annotators and a neural net classifier. Results show an alignment of the network's predictions and human judgments, even where the human agreement rate is low. Agreement tracks with the frequency of the selected relations, regardless of structural differences. However, the most frequently selected relation was none of the sixteen labeled semantic relations, indicating the need for a better relation inventory.
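As an illustration of the kind of comparison described above, the following minimal sketch computes a per-compound majority relation, an agreement rate, and whether a classifier prediction matches the majority. It is not the study's method or data: the compound identifiers, the relation labels (`PURPOSE`, `CAUSE`, and `NONE` standing in for "none of the labeled relations"), the predictions, and the definition of agreement as the share of annotators choosing the majority label are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical annotations: compound id -> relation labels chosen by annotators.
# Identifiers and labels are illustrative only, not taken from the study.
annotations = {
    "compound_a": ["PURPOSE", "PURPOSE", "PURPOSE", "NONE"],
    "compound_b": ["NONE", "NONE", "CAUSE", "PURPOSE"],
    "compound_c": ["NONE", "NONE", "NONE", "NONE"],
}

# Hypothetical classifier outputs for the same compounds.
predictions = {
    "compound_a": "PURPOSE",
    "compound_b": "NONE",
    "compound_c": "NONE",
}


def majority_and_agreement(labels):
    """Return the most frequent label and the share of annotators who chose it.

    Assumption: agreement is measured as the majority share; the study may
    use a different agreement statistic.
    """
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def summarize(annotations, predictions):
    """For each compound: (id, majority label, agreement rate, prediction matches)."""
    rows = []
    for item, labels in annotations.items():
        label, rate = majority_and_agreement(labels)
        rows.append((item, label, rate, predictions[item] == label))
    return rows


# How often each relation was selected across all annotators and compounds.
relation_frequency = Counter(
    label for labels in annotations.values() for label in labels
)

if __name__ == "__main__":
    for row in summarize(annotations, predictions):
        print(row)
    print(relation_frequency.most_common())
```

In this toy data the `NONE` label is selected most often, which mirrors the pattern the abstract reports; with real annotations, the same tabulation would show how agreement varies with relation frequency.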