Xiaowei Chen

CV
h-index: 2
13 papers
87 citations
Novelty: 44%
AI Score: 42

13 Papers

CV | Oct 18, 2023 | Code
HB-net: Holistic bursting cell cluster integrated network for occluded multi-objects recognition

Xudong Gao, Xiao Guang Gao, Jia Rong et al.

Within the realm of image recognition, a specific category of multi-label classification (MLC) challenges arises when objects within the visual field may occlude one another, demanding simultaneous identification of both occluded and occluding objects. Traditional convolutional neural networks (CNNs) can tackle these challenges; however, those models tend to be bulky and can only attain modest levels of accuracy. Leveraging insights from cutting-edge neural science research, specifically the Holistic Bursting (HB) cell, this paper introduces a pioneering integrated network framework named HB-net. Built upon the foundation of HB cell clusters, HB-net is designed to address the intricate task of simultaneously recognizing multiple occluded objects within images. Various Bursting cell cluster structures are introduced, complemented by an evidence accumulation mechanism. Testing is conducted on multiple datasets comprising digits and letters. The results demonstrate that models incorporating the HB framework exhibit a significant $2.98\%$ enhancement in recognition accuracy compared to models without the HB framework ($1.0298$ times, $p=0.0499$). Although in high-noise settings, standard CNNs exhibit slightly greater robustness when compared to HB-net models, the models that combine the HB framework and EA mechanism achieve a comparable level of accuracy and resilience to ResNet50, despite having only three convolutional layers and approximately $1/30$ of the parameters. The findings of this study offer valuable insights for improving computer vision algorithms. The essential code is provided at https://github.com/d-lab438/hb-net.git.

AI | May 19
Generative Auto-Bidding with Unified Modeling and Exploration

Mingming Zhang, Feiqing Zhuang, Na Li et al.

Automated bidding is central to modern digital advertising. Early rule-based methods lacked adaptability, while subsequent Reinforcement Learning approaches modeled bidding as a Markov Decision Process but struggled with long-term dependencies. Recent generative models show promise, yet they lack explicit mechanisms to balance exploration and safety, relying solely on action perturbations or trajectory guidance without a safety fallback. This results in inefficient exploration and elevated financial risk for advertising platforms. To address this gap, we propose GUIDE (Generative Auto-Bidding with Unified Modeling and Exploration), a framework that synergistically integrates directed exploration with a safe fallback mechanism. GUIDE employs a Decision Transformer (DT) to jointly model historical bidding actions and environmental state transitions. A Q-value module guides the DT's exploration via regularization constraints, while an Inverse Dynamics Module (IDM) leverages DT-predicted future states to infer robust, behaviorally consistent actions as a safe policy fallback. The Q-value module then adaptively selects the final action between these two options, balancing exploration and safety. Together, these components form an integrated "explore-safeguard-select" pipeline that unifies efficiency and safety. We conduct extensive experiments on public datasets, in simulated auction environments, and through large-scale online deployment on Taobao, a leading Chinese advertising platform. Results show GUIDE consistently outperforms state-of-the-art baselines across all scenarios. In real-world deployment, GUIDE achieves notable gains: +4.10% ad GMV, +1.40% ad clicks, +1.66% ad cost, and +3.52% ad ROI, demonstrating its effectiveness and strong industrial applicability.

LG | Dec 6, 2021 | Code
Distance and Hop-wise Structures Encoding Enhanced Graph Attention Networks

Zhiguo Huang, Xiaowei Chen, Bojuan Wang

Numerous works have shown that existing neighbor-averaging Graph Neural Networks (GNNs) cannot efficiently capture structural features, and many works show that injecting structure, distance, position, or spatial features can significantly improve the performance of GNNs. However, injecting overall structure and distance into GNNs is an intuitive idea that remains unexplored. In this work, we shed light on this direction. We first extract hop-wise structure information and compute distance distribution information, gather these with each node's intrinsic features, embed them into the same vector space, and then add them up. The derived embedding vectors are then fed into GATs (such as GAT and AGDN) followed by Correct and Smooth. Experiments show that DHSEGATs achieve competitive results. The code is available at https://github.com/hzg0601/DHSEGATs.
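As an illustration of the hop-wise and distance information this abstract refers to, the following minimal sketch computes hop counts from a node by breadth-first search and turns them into a per-hop distribution. It is not taken from the DHSEGATs repository: the function names, the `max_hops` cutoff, and the toy graph are assumptions made for the example.

```python
from collections import Counter, deque


def hop_distances(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_distribution(adj, source, max_hops):
    """Fraction of the other nodes found at each hop 1..max_hops (assumed encoding)."""
    dist = hop_distances(adj, source)
    counts = Counter(d for d in dist.values() if 1 <= d <= max_hops)
    n_others = max(len(adj) - 1, 1)  # avoid division by zero on a single-node graph
    return [counts.get(h, 0) / n_others for h in range(1, max_hops + 1)]


# Toy path graph 0 - 1 - 2 - 3 (hypothetical data).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
dist_from_0 = hop_distances(adj, 0)
dist_vec_1 = distance_distribution(adj, 1, max_hops=2)
```

A vector such as `dist_vec_1` could then be embedded and summed with the node's own features, which is the general idea the abstract describes. The actual encoding used by the authors may differ.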

CV | Jul 27, 2024
Few-Shot Medical Image Segmentation with Large Kernel Attention

Xiaoxiao Wu, Xiaowei Chen, Zhenguo Gao et al.

Medical image segmentation has witnessed significant advancements with the emergence of deep learning. However, the reliance of most neural network models on a substantial amount of annotated data remains a challenge for medical image segmentation. To address this issue, few-shot segmentation methods based on meta-learning have been employed. Present methods primarily focus on aligning the support set and query set to enhance performance, but this approach hinders further improvement of the model's effectiveness. In this paper, our objective is to propose a few-shot medical segmentation model that acquires comprehensive feature representation capabilities, which will boost segmentation accuracy by capturing both local and long-range features. To achieve this, we introduce a plug-and-play attention module that dynamically enhances both query and support features, thereby improving the representativeness of the extracted features. Our model comprises four key modules: a dual-path feature extractor, an attention module, an adaptive prototype prediction module, and a multi-scale prediction fusion module. Specifically, the dual-path feature extractor acquires multi-scale features by obtaining features of size $32\times32$ and $64\times64$. The attention module follows the feature extractor and captures local and long-range information. The adaptive prototype prediction module automatically adjusts the anomaly score threshold to predict prototypes, while the multi-scale prediction fusion module integrates prediction masks of various scales to produce the final segmentation result. We conducted experiments on publicly available MRI datasets, namely CHAOS and CMR, and compared our method with other advanced techniques. The results demonstrate that our method achieves state-of-the-art performance.

CV | May 13, 2024
Support-Query Prototype Fusion Network for Few-shot Medical Image Segmentation

Xiaoxiao Wu, Zhenguo Gao, Xiaowei Chen et al.

In recent years, deep learning based on Convolutional Neural Networks (CNNs) has achieved remarkable success in many applications. However, their heavy reliance on extensive labeled data and limited generalization ability to unseen classes pose challenges to their suitability for medical image processing tasks. Few-shot learning, which utilizes a small amount of labeled data to generalize to unseen classes, has emerged as a critical research area, attracting substantial attention. Currently, most studies employ a prototype-based approach, in which prototypical networks are used to construct prototypes from the support set, guiding the processing of the query set to obtain the final results. While effective, this approach heavily relies on the support set while neglecting the query set, resulting in notable disparities within the model classes. To mitigate this drawback, we propose a novel Support-Query Prototype Fusion Network (SQPFNet). SQPFNet initially generates several support prototypes for the foreground areas of the support images, thus producing a coarse segmentation mask. Subsequently, a query prototype is constructed based on the coarse segmentation mask, additionally exploiting pattern information in the query set. Thus, SQPFNet constructs high-quality support-query fused prototypes, upon which the query image is segmented to obtain the final refined query mask. Evaluation results on two public datasets, SABS and CMR, show that SQPFNet achieves state-of-the-art performance.

CL | Dec 13, 2024
MPPO: Multi Pair-wise Preference Optimization for LLMs with Arbitrary Negative Samples

Shuo Xie, Fangzhi Zhu, Jiahui Wang et al.

Aligning Large Language Models (LLMs) with human feedback is crucial for their development. Existing preference optimization methods such as DPO and KTO, while improved based on Reinforcement Learning from Human Feedback (RLHF), are inherently derived from PPO, requiring a reference model that adds GPU memory resources and relies heavily on abundant preference data. Meanwhile, current preference optimization research mainly targets single-question scenarios with two replies, neglecting optimization with multiple replies, which leads to a waste of data in the application. This study introduces the MPPO algorithm, which leverages the average likelihood of model responses to fit the reward function and maximizes the utilization of preference data. Through a comparison of Point-wise, Pair-wise, and List-wise implementations, we found that the Pair-wise approach achieves the best performance, significantly enhancing the quality of model responses. Experimental results demonstrate MPPO's outstanding performance across various benchmarks. On MT-Bench, MPPO outperforms DPO, ORPO, and SimPO. Notably, on Arena-Hard, MPPO surpasses DPO and ORPO by substantial margins. These achievements underscore the remarkable advantages of MPPO in preference optimization tasks.

CV | May 13, 2024
Multi-Task Learning for Fatigue Detection and Face Recognition of Drivers via Tree-Style Space-Channel Attention Fusion Network

Shulei Qu, Zhenguo Gao, Xiaowei Chen et al.

In driving scenarios, automobile active safety systems are increasingly incorporating deep learning technology. These systems typically need to handle multiple tasks simultaneously, such as detecting fatigue driving and recognizing the driver's identity. However, the traditional parallel-style approach of combining multiple single-task models tends to waste resources when dealing with similar tasks. Therefore, we propose a novel tree-style multi-task modeling approach in which, rooted at a shared backbone, more dedicated separate module branches are appended as the model pipeline goes deeper. Following the tree-style approach, we propose a multi-task learning model that simultaneously performs driver fatigue detection and face recognition for identifying a driver. This model shares a common feature extraction backbone module, with further separated feature extraction and classification module branches. The dedicated branches exploit and combine spatial and channel attention mechanisms to generate space-channel fused-attention enhanced features, leading to improved detection performance. As only single-task datasets are available, we introduce techniques including alternating updates and gradient accumulation for training our multi-task model using only the single-task datasets. The effectiveness of our tree-style multi-task learning model is verified through extensive validations.

CP | Apr 10, 2024
Unveiling Nonlinear Dynamics in Catastrophe Bond Pricing: A Machine Learning Perspective

Xiaowei Chen, Hong Li, Yufan Lu et al.

This paper explores the implications of using machine learning models in the pricing of catastrophe (CAT) bonds. By integrating advanced machine learning techniques, our approach uncovers nonlinear relationships and complex interactions between key risk factors and CAT bond spreads -- dynamics that are often overlooked by traditional linear regression models. Using primary market CAT bond transaction records between January 1999 and March 2021, our findings demonstrate that machine learning models not only enhance the accuracy of CAT bond pricing but also provide a deeper understanding of how various risk factors interact and influence bond prices in a nonlinear way. These findings suggest that investors and issuers can benefit from incorporating machine learning to better capture the intricate interplay between risk factors when pricing CAT bonds. The results also highlight the potential for machine learning models to refine our understanding of asset pricing in markets characterized by complex risk structures.

LG | Jun 9, 2021
Multi-layered Network Exploration via Random Walks: From Offline Optimization to Online Learning

Xutong Liu, Jinhang Zuo, Xiaowei Chen et al.

The multi-layered network exploration (MuLaNE) problem is an important problem abstracted from many applications. In MuLaNE, there are multiple network layers where each node has an importance weight and each layer is explored by a random walk. The MuLaNE task is to allocate a total random walk budget $B$ across the network layers so that the total weight of the unique nodes visited by the random walks is maximized. We systematically study this problem from offline optimization to online learning. For the offline optimization setting, where the network structure and node weights are known, we provide greedy-based constant-ratio approximation algorithms for overlapping networks, and greedy or dynamic-programming based optimal solutions for non-overlapping networks. For the online learning setting, neither the network structure nor the node weights are known initially. We adapt the combinatorial multi-armed bandit framework and design algorithms to learn random walk related parameters and node weights while optimizing the budget allocation over multiple rounds, and prove that they achieve logarithmic regret bounds. Finally, we conduct experiments on a real-world social network dataset to validate our theoretical results.

HC | Jun 4, 2021
Do Persuasive Designs Make Smartphones More Addictive? -- A Mixed-Methods Study on Chinese University Students

Xiaowei Chen, Anders Hedman, Verena Distler et al.

Persuasive designs have become prevalent on smartphones, and an increasing number of users report having problematic smartphone use behaviours. Persuasive designs in smartphones might be accountable for the development and reinforcement of such problematic use. This paper uses a mixed-methods approach to study the relationship between persuasive designs and problematic smartphone use: (1) questionnaires (N=183) to investigate the proportion of participants having multiple problematic smartphone use behaviours and the smartphone designs and applications (apps) that they perceived as affecting their attitudes and behaviours, and (2) interviews (N=10) to deepen our understanding of users' observations and evaluations of persuasive designs. 25% of the participants self-reported having multiple problematic smartphone use behaviours, with short video, social networking, game, and learning apps perceived as the most attitude- and behaviour-affecting. Interviewees identified multiple persuasive designs in most of these apps and stated that persuasive designs prolonged their screen time, reinforced phone-checking habits, and caused distractions. Overall, this study provides evidence that persuasive designs contribute to problematic smartphone use, potentially making smartphones more addictive. We end our study by discussing the ethical implications of persuasive designs that became salient in our study.

LG | Jan 30, 2021
Semantic Borrowing for Generalized Zero-Shot Learning

Xiaowei Chen

Generalized zero-shot learning (GZSL) is one of the most realistic but challenging problems due to the partiality of the classifier to supervised classes, especially under the class-inductive instance-inductive (CIII) training setting, where testing data are not available. Instance-borrowing methods and synthesizing methods solve it to some extent with the help of testing semantics, but for that reason neither can be used under CIII. Besides, the latter require training a classifier after generating examples. In contrast, this paper proposes a novel non-transductive regularization under CIII called Semantic Borrowing (SB) for improving GZSL methods with compatibility metric learning, which can be used for training not only linear models but also nonlinear ones such as artificial neural networks. This regularization term in the loss function borrows similar semantics in the training set, so that the classifier can model the relationship between the semantics of zero-shot and supervised classes more accurately during training. In practice, the semantic information of unknown classes would not be available for training, and this approach does not need it. Extensive experiments on GZSL benchmark datasets show that SB can reduce the partiality of the classifier to supervised classes and improve the performance of generalized zero-shot classification, surpassing state-of-the-art inductive GZSL methods.

SI | Jan 15, 2020
Evolution of Ethereum: A Temporal Graph Perspective

Qianlan Bai, Chao Zhang, Yuedong Xu et al.

Ethereum is one of the most popular blockchain systems; it supports more than half a million transactions every day and fosters miscellaneous decentralized applications with its Turing-complete smart contract machine. However, it remains unclear what the transaction pattern of Ethereum is and how it evolves over time. In this paper, we study the evolutionary behavior of Ethereum transactions from a temporal graph point of view. We first develop a data analytics platform to collect external transactions associated with users as well as internal transactions initiated by smart contracts. Three types of temporal graphs (user-to-user, contract-to-contract, and user-contract graphs) are constructed according to trading relationships and are segmented with an appropriate time window. We observe a strong correlation between the size of the user-to-user transaction graph and the average Ether price in a time window, while no evidence of such a linkage is found for the average degree, average edge weight, or average triplet closure duration. The macroscopic and microscopic burstiness of Ethereum transactions is validated. We analyze the Gini indexes of the transaction graphs and of user wealth, and find that Ethereum has been very unfair since the very beginning: in a sense, "the rich is already very rich".
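For readers unfamiliar with the Gini index mentioned in this abstract, here is a minimal sketch of the standard sorted-rank formula, $G = \frac{2\sum_{i=1}^{n} i\,x_{(i)}}{n\sum_{i=1}^{n} x_{(i)}} - \frac{n+1}{n}$, where $x_{(i)}$ are the values in ascending order. The function name and the example balances are hypothetical, and the paper may compute the index differently.

```python
def gini(values):
    """Gini index of non-negative values: 0 means perfectly equal."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # convention chosen here for empty or all-zero input
    # Rank-weighted sum over the ascending values, ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical account balances, not data from the paper.
equal_balances = [5, 5, 5, 5]
concentrated_balances = [0, 0, 0, 10]
g_equal = gini(equal_balances)
g_concentrated = gini(concentrated_balances)
```

With this finite-sample formula, the value for one holder owning everything is $(n-1)/n$ rather than exactly 1, so it approaches 1 only as the number of accounts grows.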

LG | Nov 13, 2018
Community Exploration: From Offline Optimization to Online Learning

Xiaowei Chen, Weiran Huang, Wei Chen et al.

We introduce the community exploration problem, which has many real-world applications such as online advertising. In this problem, an explorer allocates a limited budget to explore communities so as to maximize the number of members he could meet. We provide a systematic study of the community exploration problem, from offline optimization to online learning. For the offline setting, where the sizes of communities are known, we prove that the greedy methods for both non-adaptive exploration and adaptive exploration are optimal. For the online setting, where the sizes of communities are not known and need to be learned from multi-round explorations, we propose an "upper confidence"-like algorithm that achieves logarithmic regret bounds. By combining the feedback from different rounds, we can achieve a constant regret bound.
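The offline non-adaptive greedy idea can be sketched as follows. This is an illustration under a stated assumption, not the paper's algorithm: each exploration of a community is assumed to meet one member chosen uniformly at random, independently of earlier visits. Under that assumption, $k$ explorations of a community of size $d$ meet $d\,(1 - (1 - 1/d)^k)$ distinct members in expectation, and the marginal gain of the next exploration is $(1 - 1/d)^k$. All function names and the example sizes are hypothetical.

```python
import heapq


def expected_distinct(size, k):
    """Expected distinct members met after k uniform random explorations."""
    return size * (1.0 - (1.0 - 1.0 / size) ** k)


def greedy_allocation(sizes, budget):
    """Spend the budget one exploration at a time on the largest marginal gain."""
    alloc = [0] * len(sizes)
    # Max-heap via negated gains; the first visit to any community gains 1.0.
    heap = [(-1.0, i) for i in range(len(sizes))]
    heapq.heapify(heap)
    for _ in range(budget):
        _, i = heapq.heappop(heap)
        alloc[i] += 1
        next_gain = (1.0 - 1.0 / sizes[i]) ** alloc[i]
        heapq.heappush(heap, (-next_gain, i))
    return alloc


# Hypothetical community sizes and budget.
sizes = [2, 10]
alloc = greedy_allocation(sizes, budget=4)
total = sum(expected_distinct(d, k) for d, k in zip(sizes, alloc))
```

Because the marginal gain of each community shrinks with every additional exploration, repeatedly taking the largest available gain is a natural strategy. The abstract states that the greedy methods are optimal in the offline setting.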