Joshua Garcia

SE
h-index: 5
Papers: 5
Citations: 63
Novelty: 42%
AI Score: 37

5 Papers

SE · Dec 17, 2025
Embedding Software Intent: Lightweight Java Module Recovery

Yirui He, Yuqi Huai, Xingyu Chen et al.

As an increasing number of software systems reach unprecedented scale, relying solely on code-level abstractions is becoming impractical. While architectural abstractions offer a means to manage these systems, keeping them consistent with the actual code has been problematic. The Java Platform Module System (JPMS), introduced in Java 9, addresses this limitation by enabling explicit module specification at the language level. JPMS enhances architectural implementation through improved encapsulation and direct specification of ground-truth architectures within Java projects. Although many projects are written in Java, modularizing existing monolithic projects into JPMS modules remains an open challenge because existing architecture recovery techniques recover modules ineffectively. To address this challenge, this paper presents ClassLAR (Class- and Language model-based Architectural Recovery), a novel, lightweight, and efficient approach that recovers Java modules from monolithic Java systems using fully-qualified class names. ClassLAR leverages language models to extract semantic information from package and class names, capturing both structural and functional intent. In evaluations across 20 popular Java projects, ClassLAR outperformed all state-of-the-art techniques in architectural-level similarity metrics while achieving execution times that were 3.99 to 10.50 times faster.
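As a minimal sketch of the input side of such an approach, the Python below splits a fully-qualified class name into lowercase package and class-name tokens, the kind of text a language model could then embed. The helper name `tokenize_fqcn` and the camel-case splitting rule are assumptions made for illustration; the abstract does not describe ClassLAR's actual preprocessing.

```python
import re

# Hypothetical helper, not taken from ClassLAR.
# Matches acronyms ("XML" in "XMLParser"), capitalized or lowercase words,
# trailing all-caps runs, and digit runs.
_CAMEL = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def tokenize_fqcn(fqcn: str) -> list[str]:
    """Split a fully-qualified class name into lowercase word tokens."""
    tokens: list[str] = []
    for part in fqcn.split("."):  # package segments, then the class name
        tokens.extend(match.lower() for match in _CAMEL.findall(part))
    return tokens


if __name__ == "__main__":
    print(tokenize_fqcn("org.apache.commons.io.FileUtils"))
```

The resulting token list keeps both the package path (structural hints) and the class-name words (functional hints) in a form that any text-embedding model could consume.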

CR · Jan 12, 2022
Too Afraid to Drive: Systematic Discovery of Semantic DoS Vulnerability in Autonomous Driving Planning under Physical-World Attacks

Ziwen Wan, Junjie Shen, Jalen Chuang et al.

In high-level Autonomous Driving (AD) systems, behavioral planning is in charge of making high-level driving decisions such as cruising and stopping, and is thus highly security-critical. In this work, we perform the first systematic study of semantic security vulnerabilities specific to overly-conservative AD behavioral planning behaviors, i.e., those that can cause failed or significantly-degraded mission performance, which can be critical for AD services such as robo-taxi/delivery. We call them semantic Denial-of-Service (DoS) vulnerabilities, which we envision to be most generally exposed in practical AD systems due to the tendency for conservativeness to avoid safety incidents. To achieve high practicality and realism, we assume that the attacker can only introduce seemingly-benign external physical objects to the driving environment, e.g., off-road dumped cardboard boxes. To systematically discover such vulnerabilities, we design PlanFuzz, a novel dynamic testing approach that addresses various problem-specific design challenges. Specifically, we propose and identify planning invariants as novel testing oracles, and design new input generation to systematically enforce problem-specific constraints for attacker-introduced physical objects. We also design a novel behavioral planning vulnerability distance metric to effectively guide the discovery. We evaluate PlanFuzz on 3 planning implementations from practical open-source AD systems, and find that it can effectively discover 9 previously unknown semantic DoS vulnerabilities without false positives. We find all our new designs necessary, as without each design, statistically significant performance drops are generally observed. We further perform exploitation case studies using simulation and real-vehicle traces. We discuss root causes and potential fixes.
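To make the idea of a planning invariant used as a testing oracle concrete, here is a small Python sketch. The invariant it checks (a static object lying entirely outside the ego lane should not trigger a stop decision) is an assumed example in the spirit of the abstract, and every name, field, and threshold below is hypothetical rather than taken from PlanFuzz.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticObject:
    lateral_offset_m: float  # signed distance of the object's center from the lane center (assumed field)
    half_width_m: float      # half of the object's width (assumed field)


@dataclass(frozen=True)
class PlanningResult:
    decision: str  # e.g. "cruise" or "stop" (assumed labels)


def violates_invariant(obj: StaticObject, lane_half_width_m: float,
                       result: PlanningResult) -> bool:
    """Return True when the planner stops for an object fully outside the lane."""
    # Distance from the lane center to the object's nearest edge.
    nearest_edge_m = abs(obj.lateral_offset_m) - obj.half_width_m
    off_road = nearest_edge_m > lane_half_width_m
    return off_road and result.decision == "stop"


if __name__ == "__main__":
    box = StaticObject(lateral_offset_m=4.0, half_width_m=0.5)
    print(violates_invariant(box, 1.75, PlanningResult("stop")))
```

An oracle of this shape flags overly conservative behavior rather than crashes, which is why it can report a semantic DoS without needing a collision to occur.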

SE · Dec 17, 2021
scenoRITA: Generating Less-Redundant, Safety-Critical and Motion Sickness-Inducing Scenarios for Autonomous Vehicles

Sumaya Almanee, Xiafa Wu, Yuqi Huai et al.

There is tremendous global enthusiasm for research, development, and deployment of autonomous vehicles (AVs), e.g., self-driving taxis and trucks from Waymo and Baidu. The current practice for testing AVs uses virtual tests, in which AVs are tested in software simulations, since they offer a more efficient and safer alternative to field operational tests. Specifically, search-based approaches are used to find particularly critical situations. These approaches provide an opportunity to automatically generate tests; however, systematically creating valid and effective tests for AV software remains a major challenge. To address this challenge, we introduce scenoRITA, a test generation approach for AVs that uses evolutionary algorithms with (1) a novel gene representation that allows obstacles to be fully mutable, resulting in more reported violations, (2) 5 test oracles to determine both safety and motion sickness-inducing violations, and (3) a novel technique to identify and eliminate duplicate tests. Our extensive evaluation shows that scenoRITA can produce effective driving scenarios that expose an ego car to safety-critical situations. scenoRITA generated tests that resulted in a total of 1,026 unique violations, increasing the number of reported violations by 23.47% and 24.21% compared to random test generation and state-of-the-art partially-mutable test generation, respectively.
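One way to picture a fully mutable obstacle gene is sketched below in Python: every attribute of the obstacle can be resampled during mutation, rather than only a fixed subset. The attribute names, value ranges, and mutation rate are assumptions for illustration and are not scenoRITA's actual representation.

```python
import random
from dataclasses import dataclass, replace

KINDS = ("vehicle", "pedestrian", "bicycle")  # assumed obstacle types


@dataclass(frozen=True)
class ObstacleGene:
    kind: str
    speed_mps: float
    start_lane: int
    end_lane: int


def mutate(gene: ObstacleGene, rng: random.Random, num_lanes: int = 4,
           max_speed_mps: float = 20.0, rate: float = 0.5) -> ObstacleGene:
    """Return a copy in which each attribute is independently resampled with probability `rate`."""
    changes = {}
    if rng.random() < rate:
        changes["kind"] = rng.choice(KINDS)
    if rng.random() < rate:
        changes["speed_mps"] = rng.uniform(0.0, max_speed_mps)
    if rng.random() < rate:
        changes["start_lane"] = rng.randrange(num_lanes)
    if rng.random() < rate:
        changes["end_lane"] = rng.randrange(num_lanes)
    return replace(gene, **changes)


if __name__ == "__main__":
    parent = ObstacleGene("vehicle", 8.0, 0, 1)
    print(mutate(parent, random.Random(0)))
```

A partially mutable representation would correspond to dropping some of the `if` branches, so that those attributes stay fixed for the whole search.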

SE · Apr 17, 2021
Architectural Archipelagos: Technical Debt in Long-Lived Software Research Platforms

Marcelo Schmitt Laser, Duc Minh Le, Joshua Garcia et al.

This paper identifies a model of software evolution that is prevalent in large, long-lived academic research tool suites (3L-ARTS). This model results in an "archipelago" of related but haphazardly organized architectural "islands", and inherently induces technical debt. We illustrate the archipelago model with examples from two 3L-ARTS archipelagos identified in the literature.

CR · Nov 21, 2019
Too Quiet in the Library: An Empirical Study of Security Updates in Android Apps' Native Code

Sumaya Almanee, Arda Unal, Mathias Payer et al.

Android apps include third-party native libraries to increase performance and to reuse functionality. Native code is directly executed from apps through the Java Native Interface or the Android Native Development Kit. Android developers add precompiled native libraries to their projects, enabling their use. Unfortunately, developers often struggle to update these libraries in a timely manner, or simply neglect to do so. This results in the continued use of outdated native libraries with unpatched security vulnerabilities years after patches became available. To further understand this phenomenon, we study the security updates in native libraries in the 200 most popular free apps on Google Play from Sept. 2013 to May 2020. A core difficulty we face in this study is the identification of libraries and their versions. Developers often rename or modify libraries, making their identification challenging. We create an approach called LibRARIAN (LibRAry veRsion IdentificAtioN) that accurately identifies native libraries and their versions as found in Android apps based on our novel similarity metric bin2sim. LibRARIAN leverages different features extracted from libraries based on their metadata and identifying strings in read-only sections. We discovered that 53 of the 200 popular apps (26.5%) contained vulnerable library versions with known CVEs between Sept. 2013 and May 2020, with 14 of those apps remaining vulnerable. We find that app developers took, on average, 528.71 days to apply security patches, while library developers released a security patch after 54.59 days, making app developers roughly 10 times slower to update.
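The general idea of matching an unknown binary against known library versions by comparing extracted features can be sketched in Python as follows. This uses plain Jaccard similarity over sets of strings as a stand-in; it is not the definition of bin2sim, which the abstract does not give, and the function names and sample feature sets are hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two feature sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def best_match(unknown: set[str], known: dict[str, set[str]]) -> tuple[str, float]:
    """Return the known version whose feature set is most similar to `unknown`."""
    scored = ((version, jaccard(unknown, features))
              for version, features in known.items())
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Made-up feature strings standing in for metadata and read-only strings.
    known_versions = {
        "1.0.1": {"sym_a", "sym_b", "str_c"},
        "1.0.2": {"sym_a", "sym_b", "str_d", "str_e"},
    }
    print(best_match({"sym_a", "sym_b", "str_d"}, known_versions))
```

Because the comparison works on extracted features rather than file names, a renamed library would still score highly against its true version under this kind of scheme.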