AI
Jul 16, 2024

COMET: "Cone of experience" enhanced large multimodal model for mathematical problem generation

arXiv:2407.11315v1
14 citations
h-index: 15
Originality: Incremental advance
AI Analysis

This work addresses automatic mathematical problem generation for educational applications. It appears incremental: it builds on existing large multimodal models and adds a novel fine-tuning approach.

The paper tackles the challenge of generating high-quality mathematical problems with large multimodal models. It proposes COMET, which unifies stem generation and problem solving and uses a three-stage fine-tuning framework based on the 'Cone of Experience'. Experiments on multiple datasets are reported to verify its effectiveness.

The automatic generation of high-quality mathematical problems is practically valuable in many educational scenarios. Large multimodal models offer a new technical approach to mathematical problem generation because of their wide success in cross-modal data scenarios. However, two things limit their application to this task: the traditional practice of separating problem solving from problem generation, and the mainstream fine-tuning framework, which relies on monotonous data structures and homogeneous training objectives. To address these challenges, this paper proposes COMET, a "Cone of Experience" enhanced large multimodal model for mathematical problem generation. First, from the perspective of mutual ability promotion and application logic, we unify stem generation and problem solving into mathematical problem generation. Second, we propose a three-stage fine-tuning framework guided by the "Cone of Experience". The framework divides the fine-tuning data into symbolic experience, iconic experience, and direct experience, drawing parallels with the experiences in a teacher's career growth. Several fine-grained data construction and injection methods are designed within this framework. Finally, we construct a Chinese multimodal mathematical problem dataset to fill the gap in Chinese multimodal data in this field. Experiments on multiple datasets, evaluated with both objective and subjective indicators, verify the effectiveness of the proposed framework and model.
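As a rough illustration of the three-stage idea described in the abstract, the sketch below groups fine-tuning samples by experience type and walks through them stage by stage. It is not the authors' implementation: the `Sample` record, its field names, the helper functions, and the example prompts are all hypothetical, and the stage order (symbolic, then iconic, then direct) is an assumption that simply follows the order in which the abstract lists the three types.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple

# Stage names come from the abstract; the ordering is an assumption
# (it follows the order in which the abstract lists them).
STAGES: Tuple[str, ...] = ("symbolic", "iconic", "direct")


@dataclass(frozen=True)
class Sample:
    """Hypothetical fine-tuning record; field names are illustrative only."""
    stage: str   # one of STAGES
    prompt: str
    target: str


def split_by_stage(samples: Iterable[Sample]) -> Dict[str, List[Sample]]:
    """Group samples into one bucket per experience type."""
    buckets: Dict[str, List[Sample]] = {name: [] for name in STAGES}
    for sample in samples:
        if sample.stage not in buckets:
            raise ValueError(f"unknown stage: {sample.stage!r}")
        buckets[sample.stage].append(sample)
    return buckets


def staged_schedule(samples: Iterable[Sample]) -> Iterator[Tuple[str, List[Sample]]]:
    """Yield (stage, bucket) pairs in the assumed order, skipping empty buckets."""
    buckets = split_by_stage(samples)
    for name in STAGES:
        if buckets[name]:
            yield name, buckets[name]


# Made-up examples, deliberately given out of order.
data = [
    Sample("direct", "Write a problem about similar triangles.", "problem text"),
    Sample("symbolic", "State the Pythagorean theorem.", "a^2 + b^2 = c^2"),
    Sample("iconic", "<image> Describe the figure.", "figure description"),
]

for stage, bucket in staged_schedule(data):
    # A real pipeline would run one fine-tuning pass per stage here.
    print(stage, len(bucket))
```

Keeping the stage order in a single tuple makes the curriculum explicit and easy to change if the actual framework orders or weights the stages differently.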
