4OPS: Structural Difficulty Modeling in Integer Arithmetic Puzzles
This work addresses difficulty modeling for adaptive arithmetic learning systems and offers a transparent, computationally grounded approach. The contribution is incremental, since it builds on existing puzzle-solving methods.
The authors formalize the problem of predicting difficulty in integer arithmetic puzzles and use an exact solver to label over 3.4 million instances by the minimum number of operations required to reach a target. They find that difficulty is fully determined by a small set of interpretable structural attributes, such as the number of input values used in a minimal construction, whereas baseline machine learning models built on bag- and target-level statistics fail to reliably distinguish easy instances.
Arithmetic puzzle games provide a controlled setting for studying difficulty in mathematical reasoning tasks, a core challenge in adaptive learning systems. We investigate the structural determinants of difficulty in a class of integer arithmetic puzzles inspired by number games. We formalize the problem and develop an exact dynamic-programming solver that enumerates reachable targets, extracts minimal-operation witnesses, and enables large-scale labeling. Using this solver, we construct a dataset of over 3.4 million instances and define difficulty via the minimum number of operations required to reach a target. We analyze the relationship between difficulty and solver-derived features. While baseline machine learning models based on bag- and target-level statistics can partially predict solvability, they fail to reliably distinguish easy instances. In contrast, we show that difficulty is fully determined by a small set of interpretable structural attributes derived from exact witnesses. In particular, the number of input values used in a minimal construction serves as a minimal sufficient statistic for difficulty under this labeling. These results provide a transparent, computationally grounded account of puzzle difficulty that bridges symbolic reasoning and data-driven modeling. The framework supports explainable difficulty estimation and principled task sequencing, with direct implications for adaptive arithmetic learning and intelligent practice systems.
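To make the labeling procedure concrete, here is a minimal sketch of an exact subset dynamic program that enumerates reachable values and returns a minimal-operation witness. It is not the authors' solver. The abstract does not state the puzzle rules, so the rules are assumptions: inputs are positive integers, each input is used at most once, the operations are +, -, *, and exact division, and intermediate results must stay positive. The function names `reachable_by_subset` and `min_operations` are hypothetical.

```python
def reachable_by_subset(bag):
    """Map each non-empty bitmask over `bag` to {value: witness expression},
    where the value is built using exactly the inputs selected by the mask.

    Assumed rules (not taken from the paper): positive integer inputs, each
    used at most once, operations + - * /, division must be exact, and
    intermediate results must stay positive.
    """
    n = len(bag)
    # Base case: a single input reaches only itself, with zero operations.
    reach = {1 << i: {bag[i]: str(bag[i])} for i in range(n)}
    for mask in range(1, 1 << n):
        if mask & (mask - 1) == 0:
            continue  # singletons were handled in the base case
        found = {}
        sub = (mask - 1) & mask  # iterate over proper non-empty submasks
        while sub:
            rest = mask ^ sub
            if sub < rest:  # visit each unordered split of the mask once
                for a, ea in reach[sub].items():
                    for b, eb in reach[rest].items():
                        # Order the operands so subtraction stays positive.
                        if a >= b:
                            hi, ehi, lo, elo = a, ea, b, eb
                        else:
                            hi, ehi, lo, elo = b, eb, a, ea
                        found.setdefault(hi + lo, f"({ehi} + {elo})")
                        found.setdefault(hi * lo, f"({ehi} * {elo})")
                        if hi > lo:
                            found.setdefault(hi - lo, f"({ehi} - {elo})")
                        if lo > 0 and hi % lo == 0:
                            found.setdefault(hi // lo, f"({ehi} / {elo})")
            sub = (sub - 1) & mask
        reach[mask] = found
    return reach


def min_operations(bag, target):
    """Return (minimum operation count, witness expression), or None if the
    target is unreachable under the assumed rules."""
    best = None
    for mask, found in reachable_by_subset(bag).items():
        if target in found:
            ops = bin(mask).count("1") - 1  # k inputs combine in k - 1 binary ops
            if best is None or ops < best[0]:
                best = (ops, found[target])
    return best
```

Under these assumed rules, a construction that uses k inputs takes exactly k - 1 binary operations, which is one way to see why the number of inputs used in a minimal construction can track the operation-count label. A call such as `min_operations([1, 2, 3, 4], 24)` returns the operation count together with one witness expression, and returns `None` when the target cannot be reached.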