Bowen Huang

CV
h-index: 16
8 papers
37 citations
Novelty: 56%
AI Score: 35

8 Papers

SP · Mar 15, 2019
Data-driven Identification and Prediction of Power System Dynamics Using Linear Operators

Pranav Sharma, Bowen Huang, Umesh Vaidya et al.

In this paper, we propose a linear operator theoretic framework involving the Koopman operator for the data-driven identification of power system dynamics. We explicitly account for noise in the time-series measurement data and propose a robust approach for data-driven approximation of the Koopman operator for the identification of nonlinear power system dynamics. The identified model is used to predict state trajectories in the power system. The application of the framework is illustrated using an IEEE nine-bus test system.
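As a rough illustration of data-driven Koopman approximation in general, the sketch below fits a linear predictor to snapshot pairs by ordinary least squares (the standard dynamic mode decomposition estimate). It is not the robust, noise-aware method the abstract describes. The toy system `A_true` and the function names are hypothetical and chosen only for this example.

```python
import numpy as np

def fit_koopman_matrix(X, Y):
    """Least-squares estimate of K such that Y ≈ K @ X.

    X and Y hold snapshots as columns, with Y one time step ahead of X.
    This is the plain (non-robust) DMD estimate, used here as an assumption.
    """
    return Y @ np.linalg.pinv(X)

def predict(K, x0, steps):
    """Roll the identified linear model forward from the initial state x0."""
    traj = [x0]
    for _ in range(steps):
        traj.append(K @ traj[-1])
    return np.stack(traj, axis=1)

# Hypothetical toy system standing in for measured time-series data.
A_true = np.array([[0.9, 0.1], [-0.1, 0.9]])
rng = np.random.default_rng(0)
snapshots = [rng.standard_normal(2)]
for _ in range(50):
    snapshots.append(A_true @ snapshots[-1])
data = np.stack(snapshots, axis=1)  # shape (2, 51)

X, Y = data[:, :-1], data[:, 1:]
K = fit_koopman_matrix(X, Y)
trajectory = predict(K, data[:, 0], 50)
```

With noise-free data from a linear system, the estimate recovers the true dynamics; with noisy measurements this plain least-squares fit is biased, which is the kind of issue a robust formulation is meant to address.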

SY · Jun 10, 2018
Data-Driven Optimal Control Using Perron-Frobenius Operator

Apurba Kumar Das, Bowen Huang, Umesh Vaidya

In this paper, we propose a data-driven approach for control of nonlinear dynamical systems. The proposed approach relies on the transfer Koopman and Perron-Frobenius (P-F) operators for linear representation and control of such systems. Systematic model-based frameworks involving the linear transfer P-F operator were proposed in previous works [1-3] for almost everywhere stability analysis and control design of a nonlinear dynamical system. The Lyapunov measure can be used as a tool to provide a linear programming-based computational framework for stability analysis and almost everywhere stabilizing control design of a nonlinear system. In this paper, we show that those frameworks can be extended to a data-driven setting, where the finite-dimensional approximation of the linear transfer P-F operator and the stabilizing feedback controller can be obtained from time-series data. We exploit the positivity and Markov property of these operators and their finite-dimensional approximations to provide a linear programming-based approach for designing an optimally stabilizing feedback controller.
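The positivity and Markov property mentioned above can be seen in a simple finite-dimensional approximation built from data. The sketch below uses an Ulam-style construction: partition the state space into bins, count observed transitions between bins, and normalize each row. The logistic map, the bin count, and the function name are assumptions for illustration, and the control design step from the abstract is not covered.

```python
import numpy as np

def ulam_matrix(x, y, n_bins):
    """Row-stochastic transition matrix on [0, 1] estimated from pairs (x, y).

    Entry (i, j) is the fraction of samples starting in bin i that land in bin j.
    """
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # digitize returns 1-based bin indices; clip so x == 1.0 falls in the last bin.
    i = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    j = np.clip(np.digitize(y, edges) - 1, 0, n_bins - 1)
    counts = np.zeros((n_bins, n_bins))
    np.add.at(counts, (i, j), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no samples are left as zeros rather than dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# Hypothetical data: one-step pairs from the logistic map x -> 4x(1 - x).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 20000)
y = 4.0 * x * (1.0 - x)
P = ulam_matrix(x, y, 10)
```

Every entry of `P` is nonnegative and every populated row sums to one, which is what makes linear programming formulations over such matrices natural.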

SY · Jul 26, 2025
Deep Koopman Learning of Nonlinear Time-Varying Systems

Wenjian Hao, Bowen Huang, Wei Pan et al.

This paper presents a data-driven approach to approximate the dynamics of a nonlinear time-varying system (NTVS) by a linear time-varying system (LTVS), which is obtained from the Koopman operator and deep neural networks. An analysis of the approximation error between the states of the NTVS and those of the resulting LTVS is presented. Simulations on a representative NTVS show that the proposed method achieves small approximation errors, even when the system changes rapidly. Furthermore, simulations on a quadcopter example demonstrate the computational efficiency of the proposed approach.

AI · Jun 20, 2024
Efficient Strategy Learning by Decoupling Searching and Pathfinding for Object Navigation

Yanwei Zheng, Shaopu Feng, Bowen Huang et al.

Human-like navigation has two stages: first searching, to explore unknown areas until the target is discovered, and then pathfinding, to move towards the discovered target. Inspired by this behavior, recent studies design parallel submodules that serve different functions in the searching and pathfinding stages, but they ignore the differences in reward signals between the two stages. As a result, these models often cannot be fully trained or overfit to the training scenes. Another bottleneck that restricts agents from learning two-stage strategies is spatial perception ability, since these studies use generic visual encoders without considering the depth information of navigation scenes. To release the potential of the model for strategy learning, we propose the Two-Stage Reward Mechanism (TSRM) for object navigation, which decouples the searching and pathfinding behaviours in an episode, enabling the agent to explore a larger area in the searching stage and seek the optimal path in the pathfinding stage. We also propose a pretraining method, Depth Enhanced Masked Autoencoders (DE-MAE), which enables the agent to determine explored and unexplored areas during the searching stage, and to locate the target object and plan paths more accurately during the pathfinding stage. In addition, we propose a new metric, Searching Success weighted by Searching Path Length (SSSPL), which assesses the agent's searching ability and exploring efficiency. Finally, we evaluate our method extensively on AI2-Thor and RoboTHOR and demonstrate that it outperforms state-of-the-art (SOTA) methods in both success rate and navigation efficiency.

CV · Mar 23, 2024
Temporal-Spatial Object Relations Modeling for Vision-and-Language Navigation

Bowen Huang, Yanwei Zheng, Chuanlin Lan et al.

Vision-and-Language Navigation (VLN) is a challenging task in which an agent is required to navigate to a location described in natural language using visual observations. The navigation abilities of the agent can be enhanced by the relations between objects, which are usually learned using internal objects or external datasets. Traditional studies model the relationships between internal objects with a graph convolutional network (GCN). However, GCNs tend to be shallow, which limits their modeling ability. To address this issue, we use a cross-attention mechanism to learn the connections between objects over a trajectory, which takes temporal continuity into account; we term this Temporal Object Relations (TOR). External datasets have a gap with the navigation environment, leading to inaccurate modeling of relations. To avoid this problem, we construct object connections based on observations from all viewpoints in the navigational environment, which ensures complete spatial coverage and eliminates the gap; we call this Spatial Object Relations (SOR). Additionally, we observe that agents may repeatedly visit the same location during navigation, significantly hindering their performance. To resolve this, we introduce the Turning Back Penalty (TBP) loss function, which penalizes the agent's repetitive visiting behavior and substantially reduces the navigational distance. Experimental results on the REVERIE, SOON, and R2R datasets demonstrate the effectiveness of the proposed method.

CV · Jan 18, 2024
CPCL: Cross-Modal Prototypical Contrastive Learning for Weakly Supervised Text-based Person Retrieval

Xinpeng Zhao, Yanwei Zheng, Chuanlin Lan et al.

Weakly supervised text-based person retrieval seeks to retrieve images of a target person using textual descriptions without relying on identity annotations, which makes it both more challenging and more practical. The primary challenge is intra-class differences, encompassing intra-modal feature variations and cross-modal semantic gaps. Prior works have focused on instance-level samples and ignored the prototypical features of each person, which are intrinsic and invariant. To this end, we propose a Cross-Modal Prototypical Contrastive Learning (CPCL) method. In practice, CPCL introduces the CLIP model to weakly supervised text-based person retrieval to map visual and textual instances into a shared latent space. Subsequently, the proposed Prototypical Multi-modal Memory (PMM) module captures associations between heterogeneous modalities of image-text pairs belonging to the same person through the Hybrid Cross-modal Matching (HCM) module in a many-to-many mapping fashion. Moreover, the Outlier Pseudo Label Mining (OPLM) module further distinguishes valuable outlier samples from each modality, enhancing the creation of more reliable clusters by mining implicit relationships between image-text pairs. We conduct extensive experiments on popular benchmarks of weakly supervised text-based person retrieval, which validate the effectiveness and generalizability of CPCL.

CR · May 25, 2021
Leaky Frontends: Security Vulnerabilities in Processor Frontends

Shuwen Deng, Bowen Huang, Jakub Szefer

This paper evaluates new security threats due to the processor frontend in modern Intel processors. The root causes of the security threats are the multiple paths in the processor frontend that the micro-operations can take: through the Micro-Instruction Translation Engine (MITE), through the Decode Stream Buffer (DSB), also called the Micro-operation Cache, or through the Loop Stream Detector (LSD). Each path has its own unique timing and power signatures, which lead to the side- and covert-channel attacks presented in this work. In particular, switching between the different paths leads to observable timing or power differences which, as this work demonstrates, could be exploited by attackers. Because of the different paths, the switching, and the way the components in the frontend are shared between hardware threads, two separate threads can influence each other, and timing or power can reveal activity on the other thread. The security threats are not limited to multi-threading, and this work further demonstrates new ways of leaking execution information about SGX enclaves or a new in-domain Spectre variant in a single-thread setting. Finally, this work demonstrates a new method for fingerprinting the microcode patches of the processor by analyzing the behavior of different paths in the frontend. The findings of this work highlight the security threats associated with the processor frontend and the need to deploy defenses for the modern processor frontend.

MM · May 4, 2021
A Power and Area Efficient Lepton Hardware Encoder with Hash-based Memory Optimization

Xiao Yan, Zhixiong Di, Bowen Huang et al.

Although it has been surpassed by many subsequent coding standards, JPEG occupies a large share of the storage load of current data hosting services. To reduce storage costs, Dropbox proposed a lossless secondary compression algorithm, Lepton, to further improve the compression rate of JPEG images. However, the bloated probability models defined by Lepton severely restrict its throughput and energy efficiency. To solve this problem, we construct an efficient access probability-based hash function for the probability models, and then propose a hardware-friendly memory optimization method that combines the proposed hash function with an N-way set-associative unit. After that, we design a highly parameterized hardware structure for the probability models and finally implement a power- and area-efficient Lepton hardware encoder. To the best of our knowledge, this is the first hardware implementation of Lepton. The synthesis result shows that the proposed hardware structure reduces the total area of the probability models by 70.97%. Compared with Dropbox's software solution, the throughput and the energy efficiency of the proposed Lepton hardware encoder are increased by 55.25 and 4899 times, respectively. The manufacturing cost of the proposed Lepton hardware encoder is also significantly lower than that of the general-purpose CPU used by Dropbox.