Parsing with CYK over Distributed Representations
This work addresses the integration of symbolic parsing methods with neural networks for natural language processing, and is an incremental step toward bridging the two approaches.
The authors tackle syntactic parsing by reformulating the symbolic Cocke-Younger-Kasami (CYK) algorithm entirely over distributed representations. The resulting algorithm, D-CYK, approximates the original CYK using matrix multiplication on real-valued matrices whose size is independent of the length of the input string.
Syntactic parsing is a key task in natural language processing. This task has been dominated by symbolic, grammar-based parsers. Neural networks, with their distributed representations, are challenging these methods. In this article we show that existing symbolic parsing algorithms can cross the border and be entirely formulated over distributed representations. To this end we introduce a version of the traditional Cocke-Younger-Kasami (CYK) algorithm, called D-CYK, which is entirely defined over distributed representations. Our D-CYK uses matrix multiplication on real number matrices of size independent of the length of the input string. These operations are compatible with traditional neural networks. Experiments show that our D-CYK approximates the original CYK algorithm. By showing that CYK can be entirely performed on distributed representations, we open the way to the definition of recurrent layers of CYK-informed neural networks.
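For reference, the symbolic algorithm that D-CYK approximates can be sketched as follows. This is a minimal version of the classic CYK recognizer for a grammar in Chomsky normal form, not the distributed formulation described in the abstract. The function name `cyk_recognize`, the dictionary encoding of the rules, and the toy grammar are assumptions made only for illustration.

```python
from itertools import product


def cyk_recognize(words, lexical_rules, binary_rules, start="S"):
    """Classic symbolic CYK recognition for a grammar in Chomsky normal form.

    lexical_rules: dict mapping a terminal to the set of nonterminals A
                   such that A -> terminal (hypothetical encoding).
    binary_rules:  dict mapping a pair (B, C) to the set of nonterminals A
                   such that A -> B C (hypothetical encoding).
    """
    n = len(words)
    if n == 0:
        return False

    # table[i][j] holds every nonterminal that derives words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]

    # Spans of length 1: apply the lexical rules.
    for i, word in enumerate(words):
        table[i][i] = set(lexical_rules.get(word, ()))

    # Longer spans: combine two adjacent sub-spans with a binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):  # split point: words[i..k] + words[k+1..j]
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= binary_rules.get((left, right), set())

    return start in table[0][n - 1]


# Toy grammar, invented for this example only.
LEXICAL = {"she": {"NP"}, "eats": {"V"}, "a": {"Det"}, "fish": {"N"}}
BINARY = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}, ("Det", "N"): {"NP"}}

print(cyk_recognize("she eats a fish".split(), LEXICAL, BINARY))  # True
print(cyk_recognize("eats she a fish".split(), LEXICAL, BINARY))  # False
```

Note that the chart here has one cell per span, so it grows quadratically with the length of the input. According to the abstract, D-CYK instead works with real-valued matrices whose size does not depend on the input length.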