Structure and Destructure: Dual Forces in the Making of Knowledge Engines
Aimed at AI and NLP researchers, this work addresses the challenge of building transparent, controllable, and adaptable intelligent systems. It appears incremental, however, because it builds on existing paradigms without introducing a fundamentally new method.
The paper tackles the integration of structured and unstructured paradigms in natural language processing for building knowledge engines. It proposes a recipe that combines structure, which organizes symbolic interactions, with destructure, which improves model plasticity and generalization.
The making of knowledge engines in natural language processing has been shaped by two seemingly distinct paradigms: one grounded in structure, the other driven by massively available unstructured data. The structured paradigm leverages predefined symbolic interactions, such as knowledge graphs, as priors and designs models to capture them. In contrast, the unstructured paradigm centers on scaling transformer architectures with increasingly vast data and model sizes, as seen in modern large language models. Despite their divergence, this thesis seeks to establish conceptual connections bridging these paradigms. Two complementary forces, structure and destructure, emerge across both paradigms: structure organizes seen symbolic interactions, while destructure, through periodic embedding resets, improves model plasticity and generalization to unseen scenarios. These connections form a new recipe for developing general knowledge engines that can support transparent, controllable, and adaptable intelligent systems.
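The abstract names periodic embedding resets as the mechanism behind destructure but gives no schedule or implementation details. The following is a minimal sketch of the general idea, not the author's method. It assumes that only the embedding table is re-initialized at a fixed interval, while the remaining parameters (here a single scalar, `body_weight`) keep their learned state. The function names, the `reset_every` parameter, the initialization scale, and the placeholder update rule are all hypothetical.

```python
import random


def init_embeddings(vocab_size, dim, rng):
    """Return a fresh embedding table filled with small random values."""
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(vocab_size)]


def train_with_periodic_resets(num_steps, reset_every, vocab_size=8, dim=4, seed=0):
    """Toy training loop that re-initializes the embeddings every `reset_every` steps.

    Assumption: only the embeddings are reset; `body_weight` stands in for all
    other parameters and is never reset.
    """
    rng = random.Random(seed)
    embeddings = init_embeddings(vocab_size, dim, rng)
    body_weight = 0.0
    reset_steps = []

    for step in range(1, num_steps + 1):
        # Placeholder update standing in for a real gradient step.
        token = rng.randrange(vocab_size)
        embeddings[token] = [w + 0.1 for w in embeddings[token]]
        body_weight += 0.1

        # Destructure: discard the learned embeddings on a fixed schedule.
        # The final step is skipped so training does not end on a fresh table.
        if step % reset_every == 0 and step < num_steps:
            embeddings = init_embeddings(vocab_size, dim, rng)
            reset_steps.append(step)

    return embeddings, body_weight, reset_steps


if __name__ == "__main__":
    _, body, resets = train_with_periodic_resets(num_steps=10, reset_every=3)
    print("reset at steps:", resets)
    print("body weight after training:", round(body, 2))
```

In this sketch the body accumulates updates across every reset while the embeddings repeatedly start over. That is one way to read the abstract's claim that destructure keeps the model adaptable to unseen scenarios. A real training setup would replace the placeholder update with an optimizer step and would also need to decide how to handle the optimizer state for the reset parameters.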