AILGJun 30, 2025

BlackBoxToBlueprint: Extracting Interpretable Logic from Legacy Systems using Reinforcement Learning and Counterfactual Analysis

arXiv:2507.00180v1h-index: 1
Originality Incremental advance
AI Analysis

This addresses the challenge of understanding undocumented legacy systems for software engineers, though it is incremental as it builds on existing RL and clustering methods.

The paper tackled the problem of modernizing legacy software systems by automatically extracting interpretable decision logic from black-box systems using a pipeline with reinforcement learning and counterfactual analysis, achieving accurate rule extraction that reflects the core logic of dummy systems.

Modernizing legacy software systems is a critical but challenging task, often hampered by a lack of documentation and understanding of the original system's intricate decision logic. Traditional approaches like behavioral cloning merely replicate input-output behavior without capturing the underlying intent. This paper proposes a novel pipeline to automatically extract interpretable decision logic from legacy systems treated as black boxes. The approach uses a Reinforcement Learning (RL) agent to explore the input space and identify critical decision boundaries by rewarding actions that cause meaningful changes in the system's output. These counterfactual state transitions, where the output changes, are collected and clustered using K-Means. Decision trees are then trained on these clusters to extract human-readable rules that approximate the system's decision logic near the identified boundaries. I demonstrated the pipeline's effectiveness on three dummy legacy systems with varying complexity, including threshold-based, combined-conditional, and non-linear range logic. Results show that the RL agent successfully focuses exploration on relevant boundary regions, and the extracted rules accurately reflect the core logic of the underlying dummy systems, providing a promising foundation for generating specifications and test cases during legacy migration.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes