Shihong Huang

LG
h-index2
4papers
10citations
Novelty51%
AI Score41

4 Papers

LGApr 6
Vehicle-as-Prompt: A Unified Deep Reinforcement Learning Framework for Heterogeneous Fleet Vehicle Routing Problem

Shihong Huang, Shengjie Wang, Lei Gao et al.

Unlike traditional homogeneous routing problems, the Heterogeneous Fleet Vehicle Routing Problem (HFVRP) involves heterogeneous fixed costs, variable travel costs, and capacity constraints, rendering solution quality highly sensitive to vehicle selection. Furthermore, real-world logistics applications often impose additional complex constraints, markedly increasing computational complexity. However, most existing Deep Reinforcement Learning (DRL)-based methods are restricted to homogeneous scenarios, leading to suboptimal performance when applied to HFVRP and its complex variants. To bridge this gap, we investigate HFVRP under complex constraints and develop a unified DRL framework capable of solving the problem across various variant settings. We introduce the Vehicle-as-Prompt (VaP) mechanism, which formulates the problem as a single-stage autoregressive decision process. Building on this, we propose VaP-CSMV, a framework featuring a cross-semantic encoder and a multi-view decoder that effectively addresses various problem variants and captures the complex mapping relationships between vehicle heterogeneity and customer node attributes. Extensive experimental results demonstrate that VaP-CSMV significantly outperforms existing state-of-the-art DRL-based neural solvers and achieves competitive solution quality compared to traditional heuristic solvers, while reducing inference time to mere seconds. Furthermore, the framework exhibits strong zero-shot generalization capabilities on large-scale and previously unseen problem variants, while ablation studies validate the vital contribution of each component.

LGOct 11, 2025
One4Many-StablePacker: An Efficient Deep Reinforcement Learning Framework for the 3D Bin Packing Problem

Lei Gao, Shihong Huang, Shengjie Wang et al.

The three-dimensional bin packing problem (3D-BPP) is widely applied in logistics and warehousing. Existing learning-based approaches often neglect practical stability-related constraints and exhibit limitations in generalizing across diverse bin dimensions. To address these limitations, we propose a novel deep reinforcement learning framework, One4Many-StablePacker (O4M-SP). The primary advantage of O4M-SP is its ability to handle various bin dimensions in a single training process while incorporating support and weight constraints common in practice. Our training method introduces two innovative mechanisms. First, it employs a weighted reward function that integrates loading rate and a new height difference metric for packing layouts, promoting improved bin utilization through flatter packing configurations. Second, it combines clipped policy gradient optimization with a tailored policy drifting method to mitigate policy entropy collapse, encouraging exploration at critical decision nodes during packing to avoid suboptimal solutions. Extensive experiments demonstrate that O4M-SP generalizes successfully across diverse bin dimensions and significantly outperforms baseline methods. Furthermore, O4M-SP exhibits strong practical applicability by effectively addressing packing scenarios with stability constraints.

SEOct 4, 2021
Emotionally-Informed Decisions: Bringing Gut's Feelings into Self-adaptive and Co-adaptive Software Systems

Emmanuelle Tognoli, Shihong Huang

Software systems now complement an incredibly vast number of human activities, and much effort has been deployed to make them quasi-autonomous with the build-up of increasingly performant self-adaptive capabilities, so that the burden of failure, interruption and functional loss requiring expert intervention is fewer and far in between. Even as software systems are rapidly gaining skills that beat humans', humans retain greatly superior adaptability, especially in the context of emotionally-informed decisions and decisions under uncertainty; that is to say, self-adaptive and co-adaptive software systems have yet to acquire a "gut's feeling". This provides the double opportunity to conceptualize human-inspired processes of decision-making under uncertainty in the self-adaptive part of a software, as well as to source human unique emotional competences in co-adaptive architectures. In this paper, some algorithms are discussed that can provide software systems with realistic decision-making, and some architectures are conceptualized that resort to human emotions to quantify uncertainty and to contribute in the software's adaptation process.

SEApr 24, 2020
Towards Bridging the Gap between Control and Self-Adaptive System Properties

Javier Cámara, Alessandro V. Papadopoulos, Thomas Vogel et al.

Two of the main paradigms used to build adaptive software employ different types of properties to capture relevant aspects of the system's run-time behavior. On the one hand, control systems consider properties that concern static aspects like stability, as well as dynamic properties that capture the transient evolution of variables such as settling time. On the other hand, self-adaptive systems consider mostly non-functional properties that capture concerns such as performance, reliability, and cost. In general, it is not easy to reconcile these two types of properties or identify under which conditions they constitute a good fit to provide run-time guarantees. There is a need of identifying the key properties in the areas of control and self-adaptation, as well as of characterizing and mapping them to better understand how they relate and possibly complement each other. In this paper, we take a first step to tackle this problem by: (1) identifying a set of key properties in control theory, (2) illustrating the formalization of some of these properties employing temporal logic languages commonly used to engineer self-adaptive software systems, and (3) illustrating how to map key properties that characterize self-adaptive software systems into control properties, leveraging their formalization in temporal logics. We illustrate the different steps of the mapping on an exemplar case in the cloud computing domain and conclude with identifying open challenges in the area.