Johnny: Structuring Representation Space to Enhance Machine Abstract Reasoning Ability
This paper addresses the limited abstract reasoning ability of AI on tasks such as Raven's Progressive Matrices (RPM), offering incremental improvements through novel architectures.
The paper tackles the challenge of improving AI's abstract reasoning on Raven's Progressive Matrices (RPM) tasks. It proposes the Johnny architecture, which uses a representation space framework to reduce dependency on option pool configurations, and the Spin-Transformer network, which captures positional relationships among local features. Both are reported to achieve superior performance in experiments.
This paper investigates the challenges of enhancing AI's abstract reasoning capabilities, with a particular focus on Raven's Progressive Matrices (RPM) tasks involving complex human-like concepts. First, it examines the empirical observation that traditional end-to-end RPM-solving models rely heavily on option pool configurations, and highlights that this dependency constrains the models' reasoning capabilities. To address this limitation, the paper proposes the Johnny architecture, a novel representation space-based framework for RPM-solving. Through the joint operation of its Representation Extraction Module and Reasoning Module, Johnny significantly enhances reasoning performance by supplementing primitive negative option configurations with a learned representation space. Furthermore, to strengthen the model's capacity for capturing positional relationships among local features, the paper introduces the Spin-Transformer network architecture, accompanied by a lightweight Straw Spin-Transformer variant that reduces computational overhead through parameter sharing and attention mechanism optimization. Experimental evaluations demonstrate that both Johnny and Spin-Transformer achieve superior performance on RPM tasks, offering new methodologies for advancing AI's abstract reasoning capabilities.
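To make the idea of supplementing option-pool negatives more concrete, here is a minimal sketch under stated assumptions; it is not the paper's implementation. The abstract does not specify how the representation space enters training, so the sketch assumes one plausible reading: a contrastive cross-entropy in which the predicted answer vector is scored against the correct option, the distractors from the option pool, and extra vectors drawn from a representation space. The function `contrastive_loss`, the dot-product scoring, and all toy vectors are hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def contrastive_loss(predicted, correct, negatives):
    """Cross-entropy of selecting `correct` among `correct` + `negatives`.

    Candidates are scored by dot product with `predicted` (an assumption,
    not taken from the paper). Uses a numerically stable log-sum-exp.
    """
    logits = [dot(predicted, correct)] + [dot(predicted, n) for n in negatives]
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


# Toy 3-d vectors (hypothetical values for illustration only).
predicted = [1.0, 0.0, 0.5]        # vector the reasoning step predicts
correct = [0.9, 0.1, 0.4]          # embedding of the correct option
pool_negatives = [                 # distractors from the option pool
    [0.0, 1.0, 0.0],
    [-1.0, 0.0, 0.2],
]
space_negatives = [                # assumed extra negatives from a representation space
    [0.8, 0.0, 0.0],
    [0.5, 0.5, 0.5],
]

loss_pool_only = contrastive_loss(predicted, correct, pool_negatives)
loss_with_space = contrastive_loss(
    predicted, correct, pool_negatives + space_negatives
)
```

Under this reading, adding negatives can only enlarge the softmax denominator, so `loss_with_space` is strictly greater than `loss_pool_only`. The training signal then no longer depends solely on which distractors happen to appear in the option pool.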