Chi Zhang

h-index 15

3 papers

17 citations

Novelty 43%

AI Score 36

Ranked #100,961 of 194,257 authors (top 52%); #33,920 in CV (top 57%)

3 Papers

2.3 | SY | Nov 27, 2022 | Code

BEAR: Physics-Principled Building Environment for Control and Reinforcement Learning

Chi Zhang, Yuanyuan Shi, Yize Chen

Recent advancements in reinforcement learning algorithms have opened doors for researchers to operate and optimize building energy management systems autonomously. However, the lack of an easily configurable building dynamical model and energy management task simulation and evaluation platform has arguably slowed the progress in developing advanced and dedicated reinforcement learning (RL) and control algorithms for building operation tasks. Here we propose "BEAR", a physics-principled Building Environment for Control And Reinforcement Learning. The platform allows researchers to benchmark both model-based and model-free controllers using a broad collection of standard building models in Python without co-simulation using external building simulators. In this paper, we discuss the design of this platform and compare it with other existing building simulation frameworks. We demonstrate the compatibility and performance of BEAR with different controllers, including both model predictive control (MPC) and several state-of-the-art RL methods with two case studies.
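The kind of interface such a platform exposes to a controller can be illustrated with a minimal sketch. This is not the BEAR API: the class `ToyZoneEnv`, its parameters, the first-order thermal update, and the reward are all hypothetical stand-ins chosen only to show a pure-Python `reset`/`step` loop that either a rule-based controller or an RL agent could drive.

```python
class ToyZoneEnv:
    """Hypothetical single-zone thermal environment (NOT the BEAR API)."""

    def __init__(self, outdoor_temp=30.0, target_temp=22.0,
                 leak=0.1, gain=0.5, horizon=24):
        self.outdoor_temp = outdoor_temp  # constant outdoor temperature (deg C)
        self.target_temp = target_temp    # comfort setpoint (deg C)
        self.leak = leak    # fraction of indoor/outdoor gap closed per step
        self.gain = gain    # deg C change per unit of HVAC action
        self.horizon = horizon            # steps per episode
        self.reset()

    def reset(self):
        """Start an episode with the zone at outdoor temperature."""
        self.t = 0
        self.temp = self.outdoor_temp
        return self.temp

    def step(self, action):
        """Apply an HVAC action in [-1, 1]; negative values cool the zone."""
        action = max(-1.0, min(1.0, action))
        # First-order update: heat leak from outdoors plus HVAC effect.
        self.temp += self.leak * (self.outdoor_temp - self.temp) + self.gain * action
        self.t += 1
        # Assumed reward: penalize comfort deviation and energy use.
        reward = -abs(self.temp - self.target_temp) - 0.1 * abs(action)
        done = self.t >= self.horizon
        return self.temp, reward, done


def proportional_controller(temp, target, k=1.0):
    """Simple rule-based baseline: act in proportion to the temperature error."""
    return k * (target - temp)


def run_episode(env, controller):
    """Roll out one episode; return the final temperature and total reward."""
    obs = env.reset()
    total = 0.0
    done = False
    while not done:
        obs, reward, done = env.step(controller(obs, env.target_temp))
        total += reward
    return obs, total
```

Calling `run_episode(ToyZoneEnv(), proportional_controller)` rolls out one 24-step episode. Any function mapping an observation and a setpoint to an action can be swapped in for the controller.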

6.2CVJan 3, 2025

Training-Free Defense Against Adversarial Attacks in Deep Learning MRI Reconstruction

Mahdi Saberi, Chi Zhang, Mehmet Akçakaya

Deep learning (DL) methods have become the state of the art for reconstructing sub-sampled magnetic resonance imaging (MRI) data. However, studies have shown that these methods are susceptible to small adversarial input perturbations, or attacks, resulting in major distortions in the output images. Various strategies have been proposed to reduce the effects of these attacks, but they require retraining and may lower reconstruction quality for non-perturbed/clean inputs. In this work, we propose a novel approach for mitigating adversarial attacks on MRI reconstruction models without any retraining. Based on the idea of cyclic measurement consistency, we devise a novel mitigation objective that is minimized in a small ball around the attack input. Results show that our method substantially reduces the impact of adversarial perturbations across different datasets, attack types/strengths and physics-driven deep learning (PD-DL) networks, and qualitatively and quantitatively outperforms conventional mitigation methods that involve retraining. We also introduce a practically relevant scenario for small adversarial perturbations that models impulse noise in raw data, which relates to herringbone artifacts, and show the applicability of our approach in this setting. Finally, we show our mitigation approach remains effective in two realistic extension scenarios: a blind setup, where the attack strength or algorithm is not known to the user; and an adaptive attack setup, where the attacker has full knowledge of the defense strategy.
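The general pattern of minimizing an objective inside a small ball around a given input can be sketched with projected gradient descent. This is only an illustration of that optimization pattern, not the paper's method: the cyclic measurement consistency objective is not reproduced here, the choice of an L-infinity ball is an assumption, and the function names `project_linf` and `minimize_in_ball` are hypothetical. The caller supplies the gradient of whatever objective is being minimized.

```python
def project_linf(x, center, eps):
    """Clip each coordinate of x into [center - eps, center + eps]."""
    return [min(max(xi, ci - eps), ci + eps) for xi, ci in zip(x, center)]


def minimize_in_ball(grad_fn, center, eps, step=0.1, iters=100):
    """Projected gradient descent within an L-infinity ball around `center`.

    grad_fn: returns the gradient (list of floats) of the objective at x.
    center:  the possibly attacked input; the search never leaves its eps-ball.
    """
    x = list(center)
    for _ in range(iters):
        g = grad_fn(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]  # gradient step
        x = project_linf(x, center, eps)              # stay inside the ball
    return x


# Stand-in objective for demonstration only: squared distance to a known
# point. A real defense has no access to the clean input; this is a toy.
reference = [1.0, 2.0]
toy_grad = lambda x: [2.0 * (xi - ri) for xi, ri in zip(x, reference)]
attacked = [1.05, 1.9]

wide = minimize_in_ball(toy_grad, attacked, eps=0.2)     # ball contains the minimizer
narrow = minimize_in_ball(toy_grad, attacked, eps=0.05)  # ball does not contain it
```

With the wider ball the iterate converges to the unconstrained minimizer of the toy objective. With the narrower ball it stops at the ball's boundary in the second coordinate, which shows how the radius bounds how far the correction can move from the input.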

13.0LGOct 1, 2025

Eliciting Chain-of-Thought Reasoning for Time Series Analysis using Reinforcement Learning

Felix Parker, Nimeesha Chan, Chi Zhang et al.

Complex numerical time series analysis often demands multi-step reasoning capabilities beyond current models' reach. Tasks like medical diagnosis and weather forecasting require sequential reasoning processes -- including counterfactual analysis, logical deduction, knowledge application, and multi-modal contextual integration -- that existing time series models cannot explicitly perform. While recent research has shown large language models (LLMs) can achieve sophisticated Chain-of-Thought (CoT) reasoning through reinforcement learning (RL), these advances have primarily focused on mathematical and coding domains, with LLMs still demonstrating poor performance on time series tasks. We introduce Chain Of thought for Understanding Numerical Time Series (COUNTS), the first framework that trains LLMs to perform CoT reasoning across diverse time series tasks using RL with verifiable rewards. Our approach employs a Residual Vector-Quantized VAE to create high-fidelity discrete tokens that seamlessly integrate into a pre-trained LLM's vocabulary. COUNTS undergoes a two-stage training process: first, supervised fine-tuning on time series analysis tasks to master our novel representations, followed by Group Relative Policy Optimization training on verifiable problems using prompting strategies that encourage explicit reasoning steps before producing final answers. Our experiments demonstrate that this RL-driven approach with intermediate CoT reasoning significantly enhances LLM performance across various time series analysis tasks, opening new possibilities for complex temporal data reasoning.
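The group-relative scoring step at the core of Group Relative Policy Optimization can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the COUNTS training code: the exact-match `verifiable_reward` is a hypothetical example of a verifiable reward, and the use of the population standard deviation and the `eps` stabilizer are assumptions. Each sampled answer to the same prompt is scored, and its advantage is its reward normalized against the group's mean and spread.

```python
from statistics import mean, pstdev


def verifiable_reward(answer, reference):
    """Hypothetical verifiable reward: 1.0 for an exact match, else 0.0."""
    return 1.0 if answer.strip() == reference.strip() else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    rewards: rewards for several sampled responses to the same prompt.
    Returns (r - mean) / (std + eps) for each reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one question whose reference answer is "rising".
samples = ["rising", "falling", " flat", "rising "]
rewards = [verifiable_reward(s, "rising") for s in samples]
advantages = group_relative_advantages(rewards)
```

Responses that beat the group average receive positive advantages and the rest receive negative ones. If every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.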