Hanxiao Chen

CV
h-index10
7papers
15citations
Novelty41%
AI Score36

7 Papers

CVAug 2, 2023
SkateboardAI: The Coolest Video Action Recognition for Skateboarding

Hanxiao Chen

Impressed by the coolest skateboarding sports program from 2021 Tokyo Olympic Games, we are the first to curate the original real-world video datasets "SkateboardAI" in the wild, even self-design and implement diverse uni-modal and multi-modal video action recognition approaches to recognize different tricks accurately. For uni-modal methods, we separately apply (1) CNN and LSTM; (2) CNN and BiLSTM; (3) CNN and BiLSTM with effective attention mechanisms; (4) Transformer-based action recognition pipeline. Transferred to the multi-modal conditions, we investigated the two-stream Inflated-3D architecture on "SkateboardAI" datasets to compare its performance with uni-modal cases. In sum, our objective is developing an excellent AI sport referee for the coolest skateboarding competitions.

MLMay 3
Adaptive Estimation and Inference in Semi-parametric Heterogeneous Clustered Multitask Learning via Neyman Orthogonality

Hanxiao Chen, Debarghya Mukherjee

We study clustered multitask learning in a semiparametric setting where tasks share a latent cluster structure in their target parameters but exhibit heterogeneous, potentially infinite-dimensional nuisance components. Such heterogeneity poses a major challenge for existing multitask learning methods, which typically rely on aligned feature spaces or homogeneous task structures. To address this challenge, we propose an adaptive fused orthogonal estimator that integrates Neyman-orthogonal losses with data-driven pairwise fusion penalties. Our framework leverages task-specific pilot estimates to calibrate the fusion penalties and combines adaptive aggregation with orthogonalization to mitigate the impact of nuisance-parameter estimation error. Theoretically, we show that the proposed estimator achieves exact recovery of the latent clustering with high probability and attains pooled parametric convergence rates proportional to cluster size. Moreover, we establish asymptotic normality and show that, asymptotically, our estimator matches the performance of an oracle procedure that knows the true clustering in advance. Empirically, we show that the proposed method consistently outperforms strong baselines in various simulation setups. A real-world application to U.S. residential energy consumption demonstrates the effectiveness of our approach in uncovering meaningful regional clustering in electricity price elasticity, showcasing the efficacy of our method.

MAJul 12, 2024
Graph Neural Networks with Model-based Reinforcement Learning for Multi-agent Systems

Hanxiao Chen

Multi-agent systems (MAS) constitute a significant role in exploring machine intelligence and advanced applications. In order to deeply investigate complicated interactions within MAS scenarios, we originally propose "GNN for MBRL" model, which utilizes a state-spaced Graph Neural Networks with Model-based Reinforcement Learning to address specific MAS missions (e.g., Billiard-Avoidance, Autonomous Driving Cars). In detail, we firstly used GNN model to predict future states and trajectories of multiple agents, then applied the Cross-Entropy Method (CEM) optimized Model Predictive Control to assist the ego-agent planning actions and successfully accomplish certain MAS tasks.

CVJan 8, 2024
Color-$S^{4}L$: Self-supervised Semi-supervised Learning with Image Colorization

Hanxiao Chen

This work addresses the problem of semi-supervised image classification tasks with the integration of several effective self-supervised pretext tasks. Different from widely-used consistency regularization within semi-supervised learning, we explored a novel self-supervised semi-supervised learning framework (Color-$S^{4}L$) especially with image colorization proxy task and deeply evaluate performances of various network architectures in such special pipeline. Also, we demonstrated its effectiveness and optimal performance on CIFAR-10, SVHN and CIFAR-100 datasets in comparison to previous supervised and semi-supervised optimal methods.

CVJan 11, 2025
DivTrackee versus DynTracker: Promoting Diversity in Anti-Facial Recognition against Dynamic FR Strategy

Wenshu Fan, Minxing Zhang, Hongwei Li et al.

The widespread adoption of facial recognition (FR) models raises serious concerns about their potential misuse, motivating the development of anti-facial recognition (AFR) to protect user facial privacy. In this paper, we argue that the static FR strategy, predominantly adopted in prior literature for evaluating AFR efficacy, cannot faithfully characterize the actual capabilities of determined trackers who aim to track a specific target identity. In particular, we introduce DynTracker, a dynamic FR strategy where the model's gallery database is iteratively updated with newly recognized target identity images. Surprisingly, such a simple approach renders all the existing AFR protections ineffective. To mitigate the privacy threats posed by DynTracker, we advocate for explicitly promoting diversity in the AFR-protected images. We hypothesize that the lack of diversity is the primary cause of the failure of existing AFR methods. Specifically, we develop DivTrackee, a novel method for crafting diverse AFR protections that builds upon a text-guided image generation framework and diversity-promoting adversarial losses. Through comprehensive experiments on various image benchmarks and feature extractors, we demonstrate DynTracker's strength in breaking existing AFR methods and the superiority of DivTrackee in preventing user facial images from being identified by dynamic FR strategies. We believe our work can act as an important initial step towards developing more effective AFR methods for protecting user facial privacy against determined trackers.

CVMar 11, 2024
Car Damage Detection and Patch-to-Patch Self-supervised Image Alignment

Hanxiao Chen

Most computer vision applications aim to identify pixels in a scene and use them for diverse purposes. One intriguing application is car damage detection for insurance carriers which tends to detect all car damages by comparing both pre-trip and post-trip images, even requiring two components: (i) car damage detection; (ii) image alignment. Firstly, we implemented a Mask R-CNN model to detect car damages on custom images. Whereas for the image alignment section, we especially propose a novel self-supervised Patch-to-Patch SimCLR inspired alignment approach to find perspective transformations between custom pre/post car rental images except for traditional computer vision methods.

LGDec 2, 2019
ExperienceThinking: Constrained Hyperparameter Optimization based on Knowledge and Pruning

Chunnan Wang, Hongzhi Wang, Chang Zhou et al.

Machine learning algorithms are very sensitive to the hyperparameters, and their evaluations are generally expensive. Users desperately need intelligent methods to quickly optimize hyperparameter settings according to known evaluation information, and thus reduce computational cost and promote optimization efficiency. Motivated by this, we propose ExperienceThinking algorithm to quickly find the best possible hyperparameter configuration of machine learning algorithms within a few configuration evaluations. ExperienceThinking design two novel methods, which intelligently infer optimal configurations from two aspects: search space pruning and knowledge utilization respectively. Two methods complement each other and solve the constrained hyperparameter optimization problems effectively. To demonstrate the benefit of ExperienceThinking, we compare it with 3 classical hyperparameter optimization algorithms with a small number of configuration evaluations. The experimental results present that our proposed algorithm provides superior results and achieve better performance.