SEAIDec 17, 2025

Embedding Software Intent: Lightweight Java Module Recovery

arXiv:2512.15980v1h-index: 5
Originality Highly original
AI Analysis

This addresses the problem of ineffective module recovery for Java developers transitioning to JPMS, though it is incremental as it builds on existing architecture recovery techniques with a novel method.

The paper tackles the challenge of modularizing existing monolithic Java projects into JPMS modules by introducing ClassLAR, a lightweight approach that recovers modules using class names and language models, outperforming state-of-the-art techniques in similarity metrics and achieving 3.99 to 10.50 times faster execution times in evaluations across 20 Java projects.

As an increasing number of software systems reach unprecedented scale, relying solely on code-level abstractions is becoming impractical. While architectural abstractions offer a means to manage these systems, maintaining their consistency with the actual code has been problematic. The Java Platform Module System (JPMS), introduced in Java 9, addresses this limitation by enabling explicit module specification at the language level. JPMS enhances architectural implementation through improved encapsulation and direct specification of ground-truth architectures within Java projects. Although many projects are written in Java, modularizing existing monolithic projects to JPMS modules is an open challenge due to ineffective module recovery by existing architecture recovery techniques. To address this challenge, this paper presents ClassLAR (Class-and Language model-based Architectural Recovery), a novel, lightweight, and efficient approach that recovers Java modules from monolithic Java systems using fully-qualified class names. ClassLAR leverages language models to extract semantic information from package and class names, capturing both structural and functional intent. In evaluations across 20 popular Java projects, ClassLAR outperformed all state-of-the-art techniques in architectural-level similarity metrics while achieving execution times that were 3.99 to 10.50 times faster.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes