Haohui Chen

LG
h-index: 3
7 papers
353 citations
Novelty: 39%
AI Score: 39

7 Papers

LG · Sep 28, 2024
Double Actor-Critic with TD Error-Driven Regularization in Reinforcement Learning

Haohui Chen, Zhiyong Chen, Aoxiang Liu et al.

To obtain better value estimation in reinforcement learning, we propose a novel algorithm based on the double actor-critic framework with temporal difference error-driven regularization, abbreviated as TDDR. TDDR employs double actors, with each actor paired with a critic, thereby fully leveraging the advantages of double critics. Additionally, TDDR introduces an innovative critic regularization architecture. Compared to classical deterministic policy gradient-based algorithms that lack a double actor-critic structure, TDDR provides superior estimation. Moreover, unlike existing algorithms with double actor-critic frameworks, TDDR does not introduce any additional hyperparameters, significantly simplifying the design and implementation process. Experiments demonstrate that TDDR exhibits strong competitiveness compared to benchmark algorithms in challenging continuous control tasks.

LG · Nov 20, 2025
Mitigating Estimation Bias with Representation Learning in TD Error-Driven Regularization

Haohui Chen, Zhiyong Chen, Aoxiang Liu et al.

Deterministic policy gradient algorithms for continuous control suffer from value estimation biases that degrade performance. While double critics reduce such biases, the exploration potential of double actors remains underexplored. Building on temporal-difference error-driven regularization (TDDR), a double actor-critic framework, this work introduces enhanced methods to achieve flexible bias control and stronger representation learning. We propose three convex combination strategies, symmetric and asymmetric, that balance pessimistic estimates to mitigate overestimation and optimistic exploration via double actors to alleviate underestimation. A single hyperparameter governs this mechanism, enabling tunable control across the bias spectrum. To further improve performance, we integrate augmented state and action representations into the actor and critic networks. Extensive experiments show that our approach consistently outperforms benchmarks, demonstrating the value of tunable bias and revealing that both overestimation and underestimation can be exploited differently depending on the environment.
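The abstract does not spell out the three combination strategies, so the following is only a minimal sketch of the general idea: a convex combination of a pessimistic and an optimistic critic estimate, controlled by one weight. The function name `blended_target`, the weight `beta`, and the use of min/max over two critic values are illustrative assumptions, not the paper's actual update rule.

```python
def blended_target(q1, q2, reward, gamma, beta, done=False):
    """Hypothetical TD target mixing pessimistic and optimistic estimates.

    q1, q2 : next-state value estimates from two critics (assumed inputs)
    beta   : weight in [0, 1]; 1.0 is fully pessimistic, 0.0 fully optimistic
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1] for a convex combination")
    pessimistic = min(q1, q2)  # counters overestimation
    optimistic = max(q1, q2)   # counters underestimation
    blended = beta * pessimistic + (1.0 - beta) * optimistic
    # No bootstrapping from the next state once the episode has terminated.
    return reward + (0.0 if done else gamma * blended)
```

With `beta = 1.0` this reduces to the familiar clipped double-critic target; lowering `beta` moves the target toward the optimistic estimate.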

LG · Aug 8, 2025
Mildly Conservative Regularized Evaluation for Offline Reinforcement Learning

Haohui Chen, Zhiyong Chen

Offline reinforcement learning (RL) seeks to learn optimal policies from static datasets without further environment interaction. A key challenge is the distribution shift between the learned and behavior policies, leading to out-of-distribution (OOD) actions and overestimation. To prevent gross overestimation, the value function must remain conservative; however, excessive conservatism may hinder performance improvement. To address this, we propose the mildly conservative regularized evaluation (MCRE) framework, which balances conservatism and performance by combining temporal difference (TD) error with a behavior cloning term in the Bellman backup. Building on this, we develop the mildly conservative regularized Q-learning (MCRQ) algorithm, which integrates MCRE into an off-policy actor-critic framework. Experiments show that MCRQ outperforms strong baselines and state-of-the-art offline RL algorithms on benchmark datasets.

CV · Feb 8, 2021
A Histogram Thresholding Improvement to Mask R-CNN for Scalable Segmentation of New and Old Rural Buildings

Ying Li, Weipan Xu, Haohui Chen et al.

Mapping new and old buildings is of great significance for understanding socio-economic development in rural areas. In recent years, deep neural networks have achieved remarkable building segmentation results in high-resolution remote sensing images. However, scarce training data and varying geographical environments have posed challenges for scalable building segmentation. This study proposes a novel framework based on Mask R-CNN, named HTMask R-CNN, to extract new and old rural buildings even when labels are scarce. The framework adopts the result of single-object instance segmentation from the orthodox Mask R-CNN. Further, it classifies the rural buildings into new and old ones based on a dynamic grayscale threshold inferred from the result of a two-object instance segmentation task where training data is scarce. We found that the framework can extract more buildings and achieve a much higher mean Average Precision (mAP) than the orthodox Mask R-CNN model. We tested the novel framework's performance with increasing training data and found that it converged even when the training samples were limited. This framework's main contribution is to allow scalable segmentation by using significantly fewer training samples than traditional machine learning practices. That makes mapping China's new and old rural buildings viable.
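The two-stage idea described above (detect all buildings, then split them by a grayscale threshold inferred from a small two-class result) might be sketched as follows. The abstract does not say how the threshold is computed, so the midpoint-of-class-means rule here is a stand-in assumption, and the function names and inputs (one mean grayscale value per building) are hypothetical.

```python
from statistics import mean

def infer_threshold(new_gray, old_gray):
    """Assumed rule: midpoint between the two classes' mean grayscale."""
    return (mean(new_gray) + mean(old_gray)) / 2

def classify(buildings_gray, new_gray, old_gray):
    """Label each detected building 'new' or 'old' by the threshold.

    buildings_gray : mean grayscale per building from the single-class detector
    new_gray, old_gray : grayscale samples from the two-class result
    """
    threshold = infer_threshold(new_gray, old_gray)
    # Let the samples decide which class is brighter; no direction is assumed.
    new_is_brighter = mean(new_gray) > mean(old_gray)
    labels = []
    for g in buildings_gray:
        is_new = (g > threshold) == new_is_brighter
        labels.append("new" if is_new else "old")
    return labels
```

For example, `classify([200, 90, 150], new_gray=[180, 220], old_gray=[80, 100])` uses a threshold of 145 and labels the buildings new, old, new.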

HC · Aug 6, 2019
Origin-Destination Flow Maps in Immersive Environments

Yalong Yang, Tim Dwyer, Bernhard Jenny et al.

Immersive virtual- and augmented-reality headsets can overlay a flat image against any surface or hang virtual objects in the space around the user. The technology is rapidly improving and may, in the long term, replace traditional flat panel displays in many situations. When displays are no longer intrinsically flat, how should we use the space around the user for abstract data visualisation? In this paper, we ask this question with respect to origin-destination flow data in a global geographic context. We report on the findings of three studies exploring different spatial encodings for flow maps. The first experiment focuses on different 2D and 3D encodings for flows on flat maps. We find that participants are significantly more accurate with raised flow paths whose height is proportional to flow distance but fastest with traditional straight line 2D flows. In our second and third experiments, we compared flat maps, 3D globes and a novel interactive design we call MapsLink, involving a pair of linked flat maps. We find that participants took significantly more time with MapsLink than other flow maps while the 3D globe with raised flows was the fastest, most accurate, and most preferred method. Our work suggests that careful use of the third spatial dimension can resolve visual clutter in complex flow maps.

HC · Aug 6, 2019
Maps and Globes in Virtual Reality

Yalong Yang, Bernhard Jenny, Tim Dwyer et al.

This paper explores different ways to render world-wide geographic maps in virtual reality (VR). We compare: (a) a 3D exocentric globe, where the user's viewpoint is outside the globe; (b) a flat map (rendered to a plane in VR); (c) an egocentric 3D globe, with the viewpoint inside the globe; and (d) a curved map, created by projecting the map onto a section of a sphere which curves around the user. In all four visualisations the geographic centre can be smoothly adjusted with a standard handheld VR controller and the user, through a head-tracked headset, can physically move around the visualisation. For distance comparison, the exocentric globe is more accurate than the egocentric globe and the flat map. For area comparison, more time is required with exocentric and egocentric globes than with flat and curved maps. For direction estimation, the exocentric globe is more accurate and faster than the other visual presentations. Our study participants had a weak preference for the exocentric globe. Generally, the curved map had benefits over the flat map. In almost all cases the egocentric globe was found to be the least effective visualisation. Overall, our results provide support for the use of exocentric globes for geographic visualisation in mixed-reality.

AP · Aug 31, 2017
Weather impacts expressed sentiment

Patrick Baylis, Nick Obradovich, Yury Kryvasheyeu et al.

We conduct the largest ever investigation into the relationship between meteorological conditions and the sentiment of human expressions. To do this, we employ over three and a half billion social media posts from tens of millions of individuals from both Facebook and Twitter between 2009 and 2016. We find that cold temperatures, hot temperatures, precipitation, narrower daily temperature ranges, humidity, and cloud cover are all associated with worsened expressions of sentiment, even when excluding weather-related posts. We compare the magnitude of our estimates with the effect sizes associated with notable historical events occurring within our data.