Architectures of Meaning, A Systematic Corpus Analysis of NLP Systems
The framework gives researchers and practitioners a systematic way to interpret the dynamic and growing field of NLP, though the contribution is incremental because it builds on existing statistical methods.
The paper tackles the problem of interpreting NLP architectural patterns at scale by proposing a statistical corpus analysis framework. The framework was validated on the full corpus of SemEval tasks, where it yielded coherent patterns that support data-driven architectural insights.
This paper proposes a novel statistical corpus analysis framework for interpreting Natural Language Processing (NLP) architectural patterns at scale. The proposed approach combines saturation-based lexicon construction, statistical corpus analysis methods, and graph collocations to induce a synthesized representation of NLP architectural patterns from corpora. The framework is validated on the full corpus of SemEval tasks, where it yields coherent architectural patterns that can be used to answer architectural questions in a data-driven fashion, providing a systematic mechanism for interpreting a highly dynamic and exponentially growing field.
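To make the idea of graph collocations over a lexicon concrete, the following is a minimal sketch and not the authors' implementation. It assumes that collocation strength is scored with document-level pointwise mutual information (PMI), that the lexicon is already given, and that whitespace tokenization is adequate; the function name `collocation_graph`, the toy documents, and the toy lexicon are all hypothetical and invented for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def collocation_graph(documents, lexicon, min_count=1):
    """Build a weighted graph of lexicon terms that co-occur in documents.

    Returns a dict mapping (term_a, term_b) -> PMI score, where the pair is
    in alphabetical order. PMI is an assumed scoring choice, not one taken
    from the paper.
    """
    n_docs = len(documents)
    term_counts = Counter()  # number of documents containing each term
    pair_counts = Counter()  # number of documents containing both terms

    for doc in documents:
        tokens = set(doc.lower().split())
        present = sorted(tokens & lexicon)  # lexicon terms found in this doc
        term_counts.update(present)
        pair_counts.update(combinations(present, 2))

    edges = {}
    for (a, b), n_ab in pair_counts.items():
        if n_ab < min_count:
            continue
        # PMI = log2( P(a, b) / (P(a) * P(b)) ) with document-level counts
        edges[(a, b)] = math.log2((n_ab * n_docs) / (term_counts[a] * term_counts[b]))
    return edges


# Toy, invented system descriptions (not SemEval data).
docs = [
    "we use an lstm with attention",
    "an lstm encoder with crf decoder",
    "a transformer with attention",
    "svm baseline with ngram features",
]
lexicon = {"lstm", "attention", "crf", "transformer", "svm"}

graph = collocation_graph(docs, lexicon)
for (a, b), score in sorted(graph.items()):
    print(f"{a} -- {b}: PMI = {score:.2f}")
```

In this sketch, edges with higher PMI link architectural terms that appear together more often than their individual frequencies would suggest; a real analysis would need a larger corpus, a frequency threshold via `min_count`, and a more careful tokenizer.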