MASALA: Modelling and Analysing the Semantics of Adpositions in Linguistic Annotation of Hindi
The corpus provides a resource for Hindi NLP tasks such as semantic role labeling, but the contribution is incremental, since it applies an existing annotation scheme to a new language.
The researchers created a publicly available corpus of annotated semantic relations for adpositions and case markers in Hindi using the SNACS scheme and, using language models, achieved automatic labeling results competitive with prior work on English.
We present a completed, publicly available corpus of annotated semantic relations of adpositions and case markers in Hindi. We use the multilingual SNACS annotation scheme, which has been applied to a variety of typologically diverse languages. Building on past work examining linguistic problems in SNACS annotation, we use language models to attempt automatic labelling of SNACS supersenses in Hindi and achieve results competitive with past work on English. We look towards upstream applications in semantic role labelling and extension to related languages such as Gujarati.