Qingxin Xia

AI
h-index14
3papers
21citations
Novelty57%
AI Score39

3 Papers

HCJun 3, 2023
Unsupervised Human Activity Recognition through Two-stage Prompting with ChatGPT

Qingxin Xia, Takuya Maekawa, Takahiro Hara

Wearable sensor devices, which offer the advantage of recording daily objects used by a person while performing an activity, enable the feasibility of unsupervised Human Activity Recognition (HAR). Unfortunately, previous unsupervised approaches using the usage sequence of objects usually require a proper description of activities manually prepared by humans. Instead, we leverage the knowledge embedded in a Large Language Model (LLM) of ChatGPT. Because the sequence of objects robustly characterizes the activity identity, it is possible that ChatGPT already learned the association between activities and objects from existing contexts. However, previous prompt engineering for ChatGPT exhibits limited generalization ability when dealing with a list of words (i.e., sequence of objects) due to the similar weighting assigned to each word in the list. In this study, we propose a two-stage prompt engineering, which first guides ChatGPT to generate activity descriptions associated with objects while emphasizing important objects for distinguishing similar activities; then outputs activity classes and explanations for enhancing the contexts that are helpful for HAR. To the best of our knowledge, this is the first study that utilizes ChatGPT to recognize activities using objects in an unsupervised manner. We conducted our approach on three datasets and demonstrated the state-of-the-art performance.

AIFeb 20
Diffusing to Coordinate: Efficient Online Multi-Agent Diffusion Policies

Zhuoran Li, Hai Zhong, Xun Wang et al.

Online Multi-Agent Reinforcement Learning (MARL) is a prominent framework for efficient agent coordination. Crucially, enhancing policy expressiveness is pivotal for achieving superior performance. Diffusion-based generative models are well-positioned to meet this demand, having demonstrated remarkable expressiveness and multimodal representation in image generation and offline settings. Yet, their potential in online MARL remains largely under-explored. A major obstacle is that the intractable likelihoods of diffusion models impede entropy-based exploration and coordination. To tackle this challenge, we propose among the first \underline{O}nline off-policy \underline{MA}RL framework using \underline{D}iffusion policies (\textbf{OMAD}) to orchestrate coordination. Our key innovation is a relaxed policy objective that maximizes scaled joint entropy, facilitating effective exploration without relying on tractable likelihood. Complementing this, within the centralized training with decentralized execution (CTDE) paradigm, we employ a joint distributional value function to optimize decentralized diffusion policies. It leverages tractable entropy-augmented targets to guide the simultaneous updates of diffusion policies, thereby ensuring stable coordination. Extensive evaluations on MPE and MAMuJoCo establish our method as the new state-of-the-art across $10$ diverse tasks, demonstrating a remarkable $2.5\times$ to $5\times$ improvement in sample efficiency.

IVDec 29, 2024
Unlocking adaptive digital pathology through dynamic feature learning

Jiawen Li, Tian Guan, Qingxin Xia et al.

Foundation models have revolutionized the paradigm of digital pathology, as they leverage general-purpose features to emulate real-world pathological practices, enabling the quantitative analysis of critical histological patterns and the dissection of cancer-specific signals. However, these static general features constrain the flexibility and pathological relevance in the ever-evolving needs of clinical applications, hindering the broad use of the current models. Here we introduce PathFiT, a dynamic feature learning method that can be effortlessly plugged into various pathology foundation models to unlock their adaptability. Meanwhile, PathFiT performs seamless implementation across diverse pathology applications regardless of downstream specificity. To validate PathFiT, we construct a digital pathology benchmark with over 20 terabytes of Internet and real-world data comprising 28 H\&E-stained tasks and 7 specialized imaging tasks including Masson's Trichrome staining and immunofluorescence images. By applying PathFiT to the representative pathology foundation models, we demonstrate state-of-the-art performance on 34 out of 35 tasks, with significant improvements on 23 tasks and outperforming by 10.20% on specialized imaging tasks. The superior performance and versatility of PathFiT open up new avenues in computational pathology.