Categorical Tools for Natural Language Processing
This thesis offers natural language processing researchers a foundational approach, though its contribution appears incremental, since it builds on existing concepts from category theory and computational linguistics.
This thesis develops a categorical framework that relates category theory to computational linguistics. It provides a foundation for natural language processing by unifying syntax, semantics and pragmatics through string diagrams and functors, and is implemented in the DisCoPy Python library.
This thesis develops the translation between category theory and computational linguistics as a foundation for natural language processing. The three chapters deal with syntax, semantics and pragmatics. First, string diagrams provide a unified model of syntactic structures in formal grammars. Second, functors compute semantics by turning diagrams into logical, tensor, neural or quantum computation. Third, the resulting functorial models can be composed to form games where equilibria are the solutions of language processing tasks. This framework is implemented as part of DisCoPy, the Python library for computing with string diagrams. We describe the correspondence between categorical, linguistic and computational structures, and demonstrate their applications in compositional natural language processing.
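To make the first two steps of the abstract concrete, the following is a minimal sketch in plain Python of a syntactic diagram and a functor that interprets it as a logical computation. It deliberately does not use DisCoPy: the names `Diagram`, `box`, `identity`, `apply_functor` and `parse`, the toy grammar rules `VP` and `S`, and the one-fact model `LOVES` are all hypothetical and chosen only for illustration, not taken from the thesis or from the library's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Diagram:
    """Toy string diagram (hypothetical class, not DisCoPy's API)."""
    dom: Tuple[str, ...]   # types of the input wires
    cod: Tuple[str, ...]   # types of the output wires
    term: tuple            # ("id",), ("box", name), ("then", f, g) or ("tensor", f, g)

    def __rshift__(self, other: "Diagram") -> "Diagram":
        # Sequential composition: output wires must match input wires.
        if self.cod != other.dom:
            raise TypeError(f"cannot compose {self.cod} with {other.dom}")
        return Diagram(self.dom, other.cod, ("then", self, other))

    def __matmul__(self, other: "Diagram") -> "Diagram":
        # Parallel composition: place two diagrams side by side.
        return Diagram(self.dom + other.dom, self.cod + other.cod,
                       ("tensor", self, other))


def box(name: str, dom: Tuple[str, ...], cod: Tuple[str, ...]) -> Diagram:
    """A generator: a word or a grammar rule."""
    return Diagram(tuple(dom), tuple(cod), ("box", name))


def identity(*types: str) -> Diagram:
    """Wires that pass through unchanged."""
    return Diagram(types, types, ("id",))


def apply_functor(diagram: Diagram,
                  box_map: Dict[str, Callable[..., tuple]]) -> Callable[..., tuple]:
    """Interpret a diagram as a Python function from input values to a tuple of outputs."""
    kind = diagram.term[0]
    if kind == "id":
        return lambda *xs: xs
    if kind == "box":
        return box_map[diagram.term[1]]
    left, right = diagram.term[1], diagram.term[2]
    f, g = apply_functor(left, box_map), apply_functor(right, box_map)
    if kind == "then":
        # Composition of diagrams becomes composition of functions.
        return lambda *xs: g(*f(*xs))
    # Tensor of diagrams: split the inputs and concatenate the outputs.
    n = len(left.dom)
    return lambda *xs: f(*xs[:n]) + g(*xs[n:])


# Syntax: words are boxes with no inputs, grammar rules combine typed wires.
alice, bob = box("Alice", (), ("N",)), box("Bob", (), ("N",))
loves = box("loves", (), ("V",))
vp_rule = box("VP", ("V", "N"), ("VP",))
s_rule = box("S", ("N", "VP"), ("S",))


def parse(subj: Diagram, verb: Diagram, obj: Diagram) -> Diagram:
    # In Python, @ binds tighter than >>, so each layer is tensored before composing.
    return subj @ verb @ obj >> identity("N") @ vp_rule >> s_rule


# Semantics: a logical model with a single assumed fact, "Alice loves Bob".
LOVES = frozenset({("Alice", "Bob")})
box_map = {
    "Alice": lambda: ("Alice",),
    "Bob": lambda: ("Bob",),
    "loves": lambda: (LOVES,),
    "VP": lambda verb, obj: (frozenset(s for s, o in verb if o == obj),),
    "S": lambda subj, pred: (subj in pred,),
}

print(apply_functor(parse(alice, loves, bob), box_map)())  # prints (True,)
```

Swapping `box_map` for a different interpretation (for example, arrays instead of sets) would change the semantics without touching the syntactic diagram, which is the separation of concerns the abstract attributes to functors.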