Linguistic Frameworks Go Toe-to-Toe at Neuro-Symbolic Language Modeling
This work addresses the integration of symbolic and neural approaches to language modeling, a problem of interest to NLP researchers. Its contribution is incremental, since it builds on existing neuro-symbolic methods.
The study investigated whether linguistic graph representations can enhance neural language modeling. It found that semantic constituency structures improve performance the most, outperforming syntactic constituency structures as well as syntactic and semantic dependency structures, and that the effects vary by part-of-speech class.
We examine the extent to which, in principle, linguistic graph representations can complement and improve neural language modeling. With an ensemble setup consisting of a pretrained Transformer and ground-truth graphs from one of 7 different formalisms, we find that, overall, semantic constituency structures are most useful to language modeling performance -- outpacing syntactic constituency structures as well as syntactic and semantic dependency structures. Further, effects vary greatly depending on part-of-speech class. In sum, our findings point to promising tendencies in neuro-symbolic language modeling and invite future research quantifying the design choices made by different formalisms.
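To make the idea of an ensemble concrete, the following is a minimal sketch of one way two next-token predictors could be combined and scored. It is not the authors' implementation: the abstract does not say how the Transformer and the graph-based component are combined, so the additive (product-of-experts style) combination, the function names, and the toy scores are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def log_softmax(scores: Sequence[float]) -> List[float]:
    """Turn unnormalized scores into log-probabilities (numerically stable)."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def ensemble_log_probs(lm_scores: Sequence[float],
                       graph_scores: Sequence[float]) -> List[float]:
    """Combine two score vectors over the same vocabulary.

    ASSUMPTION: scores are added and renormalized, i.e. a product of
    experts. The paper's actual combination rule may differ.
    """
    if len(lm_scores) != len(graph_scores):
        raise ValueError("both models must score the same vocabulary")
    return log_softmax([a + b for a, b in zip(lm_scores, graph_scores)])


def perplexity(target_log_probs: Sequence[float]) -> float:
    """Perplexity from the log-probabilities assigned to the true tokens."""
    return math.exp(-sum(target_log_probs) / len(target_log_probs))


# Toy example (hypothetical numbers): a 3-token vocabulary, true token at index 0.
lm_scores = [1.0, 1.0, 0.0]      # stand-in for pretrained Transformer scores
graph_scores = [0.5, -0.5, 0.0]  # stand-in for scores derived from a graph

baseline = log_softmax(lm_scores)
combined = ensemble_log_probs(lm_scores, graph_scores)

# In this toy case the graph scores favor the true token, so the
# ensemble assigns it a higher log-probability than the baseline does.
print(combined[0] > baseline[0])
```

Under these assumptions, comparing the perplexity of the baseline with that of the ensemble, overall or per part-of-speech class, gives the kind of measurement the abstract refers to when it speaks of language modeling performance.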