Beiming Yuan

h-index: 2

8 papers

20 citations

Novelty: 57%

AI Score: 39

Ranked #102,936 of 205,806 authors (top 50%); #33,537 in CV (top 57%)

8 Papers

LG | Oct 26, 2022 | Code

Multi-Viewpoint and Multi-Evaluation with Felicitous Inductive Bias Boost Machine Abstract Reasoning Ability

Qinglai Wei, Diancheng Chen, Beiming Yuan

Great efforts have been made to study AI's ability in abstract reasoning, and different versions of Raven's Progressive Matrices (RPM) have been proposed as benchmarks. Previous work suggests that, without sophisticated design or extra metadata containing semantic information, neural networks may still fail to decide reliably on RPM problems even after extensive training. Through thorough experiments and ablation studies, we show that end-to-end neural networks equipped with a felicitous inductive bias, whether intentionally designed or serendipitously matched, can solve RPM problems elegantly, without the aid of any extra metadata or a preference for any specific backbone. Our work also reveals that multi-viewpoint with multi-evaluation is a key learning strategy for successful reasoning. Finally, potential explanations for the failure of connectionist models in generalization are provided. We hope that these results will serve as an examination of AI's ability beyond perception and toward abstract reasoning. Source code can be found at https://github.com/QinglaiWeiCASIA/RavenSolver.

CV | Jul 2, 2024 | Code

Funny-Valen-Tine: Planning Solution Distribution Enhances Machine Abstract Reasoning Ability

Ruizhuo Song, Beiming Yuan

Visual abstract reasoning is core to image processing. We present Valen, a unified probability-highlighting baseline that excels on both RPM (progression) and Bongard-Logo (clustering) tasks. Analysing its internals, we find that solvers implicitly treat each task as a distribution that primary samples fit and auxiliary samples do not; hence the learning target is jointly shaped by both sets, not by correct solutions alone. To close the gap, we first introduce Tine, an adversarial adapter that nudges Valen toward the correct-solution density, but adversarial training is unstable. We therefore replace it with Funny, a fast Gaussian-mixture model that directly estimates the correct-solution density without adversarial games, and extend the same paradigm to SBR for progressive-pattern planning. Extensive experiments show that explicit distribution planning is the key to stronger, interpretable abstract reasoning. Code is available at: https://github.com/Yuanbeiming/Funny-Valen-Tine-Planning-Solution-Distribution-Enhances-Machine-Abstract-Reasoning-Ability
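
The density-estimation idea in this abstract can be illustrated with a small sketch. It is not the authors' Funny implementation: it only assumes that candidate answers are embedded as vectors and scored under a diagonal-covariance Gaussian mixture. The function name `gmm_log_density` and all parameters are hypothetical.

```python
import numpy as np

def gmm_log_density(x, weights, means, variances):
    """Log-density of points under a diagonal-covariance Gaussian mixture.

    Illustrative only; shapes are x: (n, d), weights: (k,),
    means: (k, d), variances: (k, d).
    """
    x = np.asarray(x, dtype=float)[:, None, :]            # (n, 1, d)
    means = np.asarray(means, dtype=float)[None]          # (1, k, d)
    variances = np.asarray(variances, dtype=float)[None]  # (1, k, d)
    # Per-component Gaussian log-likelihood, summed over dimensions.
    log_comp = -0.5 * np.sum(
        np.log(2.0 * np.pi * variances) + (x - means) ** 2 / variances,
        axis=-1,
    )                                                     # (n, k)
    log_joint = log_comp + np.log(np.asarray(weights, dtype=float))[None]
    # Numerically stable log-sum-exp over the mixture components.
    m = log_joint.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_joint - m).sum(axis=1, keepdims=True)))[:, 0]
```

Under these assumptions, a candidate embedding that lies close to a mixture component receives a higher log-density than one far from every component, so the highest-scoring candidate could be chosen as the answer.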

CV | Sep 29, 2022

EiHi Net: Out-of-Distribution Generalization Paradigm

Qinglai Wei, Beiming Yuan, Diancheng Chen

This paper develops a new EiHi net to solve the out-of-distribution (OoD) generalization problem in deep learning. EiHi net is a model learning paradigm that can be applied to any visual backbone. This paradigm changes the previous learning method of deep models, namely finding correlations between inductive sample features and corresponding categories, which suffers from pseudo correlations between indecisive features and labels. We fuse SimCLR and VIC-Reg by explicitly and dynamically establishing the original-positive-negative sample pair as a minimal learning element, so that the deep model iteratively establishes a relationship close to the causal one between features and labels while suppressing pseudo correlations. To further validate the proposed model and strengthen the established causal relationships, we develop a human-in-the-loop strategy that uses a few guidance samples to prune the representation space directly. Finally, we show that EiHi net makes significant improvements on NICO, the most difficult and typical OoD dataset, compared with the current SOTA results, without any domain information (e.g., background, irrelevant features).

CV | Mar 5, 2024 | Code

Triple-CFN: Separating Concepts and Features Enhances Machine Abstract Reasoning Ability

Ruizhuo Song, Beiming Yuan

This paper introduces innovative frameworks for visual abstract reasoning, aiming to boost deep learning model performance. It emphasizes the importance of separating abstract concept and reasoning feature extraction processes. The effectiveness of the Cross-Feature Network (CFN) and its enhanced version, Triple-CFN, validates this approach. Challenges in visual abstract reasoning arise from complex pattern induction and conflicts in low-dimensional representations. To address these, a dual Expectation-Maximization (EM) process is introduced during CFN training, optimizing module parameters to synthesize non-conflicting concepts. However, the dual EM process may overfit, so mutual and decorrelation supervisions are designed to assist feature extraction, with decorrelation supervision proving effective. Leveraging metadata in Raven's Progressive Matrices (RPM), the paper proposes Meta Triple-CFN, improving reasoning accuracy and interpretability. Additionally, a Re-space layer is designed for feature space construction, further enhancing Triple-CFN's reasoning accuracy. These innovative designs provide effective solutions for abstract reasoning problem solvers, benefiting multiple deep learning domains. Code is available at: https://github.com/Yuanbeiming/Triple-CFN-Separating-Concepts-and-Features-Enhances-Machine-Abstract-Reasoning-Ability.
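
The abstract does not specify the form of the decorrelation supervision. As a rough illustration only, one common way to penalize correlated feature dimensions is to shrink the off-diagonal entries of the batch correlation matrix. The sketch below assumes that form; `decorrelation_penalty` is a hypothetical name and is not taken from the Triple-CFN code.

```python
import numpy as np

def decorrelation_penalty(features, eps=1e-8):
    """Mean squared off-diagonal entry of the feature correlation matrix.

    Illustrative only; features has shape (batch, d) with d >= 2.
    """
    z = np.asarray(features, dtype=float)
    # Standardize every feature dimension across the batch.
    z = (z - z.mean(axis=0)) / (z.std(axis=0) + eps)
    corr = z.T @ z / z.shape[0]                  # (d, d) correlation matrix
    off_diag = corr - np.diag(np.diag(corr))     # zero out the diagonal
    d = corr.shape[0]
    return float((off_diag ** 2).sum() / (d * (d - 1)))
```

With this definition the penalty is close to 0 when feature dimensions are uncorrelated across the batch and approaches 1 when they are perfectly correlated, so adding it to a training loss would push the extractor toward non-redundant features.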

CV | Mar 5, 2024

Solving the Clustering Reasoning Problems by Modeling a Deep-Learning-Based Probabilistic Model

Ruizhuo Song, Beiming Yuan

Visual abstract reasoning problems pose significant challenges to the perception and cognition abilities of artificial intelligence algorithms, demanding deeper pattern recognition and inductive reasoning beyond mere identification of explicit image features. Research advancements in this field often provide insights and technical support for other similar domains. In this study, we introduce PMoC, a deep-learning-based probabilistic model that achieves high reasoning accuracy on Bongard-Logo, one of the most challenging clustering reasoning tasks. PMoC is a novel approach for constructing probabilistic models based on deep learning, which is distinctly different from previous techniques. PMoC revitalizes the probabilistic approach, which has been relatively weak in visual abstract reasoning.

CV | Aug 21, 2025

DIO: Refining Mutual Information and Causal Chain to Enhance Machine Abstract Reasoning Ability

Ruizhuo Song, Beiming Yuan

Despite deep learning's broad success, its abstract-reasoning bottleneck persists. We tackle Raven's Progressive Matrices (RPM), the benchmark for pattern, reasoning and problem-solving intelligence. We model the full causal chain image $\rightarrow$ attributes $\rightarrow$ progressive patterns $\rightarrow$ consistency $\rightarrow$ answer and build the baseline DIO. Yet DIO's mutual-information lower-bound objective does not embed human logic: the bound is loose and statistic-based, ignoring causal subject-object links. We therefore present three refinements. 1) Brando introduces trainable negative options to tighten the variational bound. 2) WORLD replaces generation with a Gaussian-mixture feature model that supplies infinite, weighted negatives, further tightening the bound. 3) DIEGO adds metadata supervision to rectify the "attributes $\rightarrow$ patterns" semantic gap, aligning representations with human rules. These upgrades substantially boost discriminative RPM accuracy and, for the first time, let DIO generate valid answers in open-ended RPM. The work provides causal-driven design guidelines, objective-refinement strategies and cross-modal insights for abstract-reasoning research.
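
The abstract does not state which mutual-information lower bound DIO optimizes. For orientation only, a widely used contrastive bound of this kind is InfoNCE, written here with an assumed score function $f$, a context $x$ (the problem matrix), one correct option $y^{+}$ and $N-1$ negative options $y_j^{-}$:

```latex
\mathcal{L}_{\mathrm{NCE}}
  = -\,\mathbb{E}\left[
      \log \frac{\exp f(x, y^{+})}
                {\exp f(x, y^{+}) + \sum_{j=1}^{N-1} \exp f(x, y_j^{-})}
    \right],
\qquad
I(x; y) \;\ge\; \log N - \mathcal{L}_{\mathrm{NCE}} .
```

Because the right-hand side can never exceed $\log N$, such a bound stays loose when only a few negatives are available. This is consistent with the abstract's motivation for supplying additional or higher-quality negatives, although the paper's exact objective may differ.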

LG | May 13, 2025

Johnny: Structuring Representation Space to Enhance Machine Abstract Reasoning Ability

Ruizhuo Song, Beiming Yuan

This paper thoroughly investigates the challenges of enhancing AI's abstract reasoning capabilities, with a particular focus on Raven's Progressive Matrices (RPM) tasks involving complex human-like concepts. Firstly, it dissects the empirical reality that traditional end-to-end RPM-solving models heavily rely on option pool configurations, highlighting that this dependency constrains the model's reasoning capabilities. To address this limitation, the paper proposes the Johnny architecture - a novel representation space-based framework for RPM-solving. Through the synergistic operation of its Representation Extraction Module and Reasoning Module, Johnny significantly enhances reasoning performance by supplementing primitive negative option configurations with a learned representation space. Furthermore, to strengthen the model's capacity for capturing positional relationships among local features, the paper introduces the Spin-Transformer network architecture, accompanied by a lightweight Straw Spin-Transformer variant that reduces computational overhead through parameter sharing and attention mechanism optimization. Experimental evaluations demonstrate that both Johnny and Spin-Transformer achieve superior performance on RPM tasks, offering innovative methodologies for advancing AI's abstract reasoning capabilities.

CV | Mar 6, 2024

D4C: Improving Negative Example Quality to Enhance Machine Abstract Reasoning Ability

Ruizhuo Song, Beiming Yuan

This paper is dedicated to addressing the challenge of enhancing the abstract reasoning capabilities of AI, particularly for tasks involving complex human concepts. We introduce Lico-Net, a novel reasoning engine grounded in deep learning theory, which encodes the logical structure of Raven's Progressive Matrices (RPM) problems into probabilistic representations. Lico-Net excels in solving RPM tasks. Furthermore, we propose Lico-Net-Bongard, a tailored version of Lico-Net for the Bongard-Logo problem, which also achieves high reasoning accuracy through probabilistic representations. However, we observe a mismatch between the way deep learning algorithms and humans induce reasoning concepts, primarily attributed to the inadequate quality of negative samples. Improper configuration of negative samples can convey erroneous conceptual information to deep learning algorithms, thereby distorting their learning objectives. To address this issue, we propose two novel approaches: first, treating different sample points within reasoning problems as mutual negative samples to alter the existing negative sample structure in the data; second, designing a negative sample generator based on a step-wise linear attention mechanism to produce high-quality negative samples. Experimental results demonstrate that these methods significantly improve the performance of Lico-Net (-Bongard) and other baseline models on the RPM and Bongard-Logo datasets, as well as in the domain of foundational vision model processing, particularly when addressing the NICO dataset's distribution shift problem. Our findings emphasize the importance of improving negative sample quality for enhancing the abstract reasoning capabilities of deep learning algorithms and suggest that systems represent a promising direction for future research in this field.