Jian Zhao

h-index: 67

Papers: 4

Citations: 23

Novelty: 55%

AI Score: 30

Ranked #140,176 of 194,257 authors (top 72%)
#46,115 in CV (top 78%)

4 Papers

1.8 · LG · Nov 21, 2022

Neural Dependencies Emerging from Learning Massive Categories

Ruili Feng, Kecheng Zheng, Kai Zhu et al.

This work presents two astonishing findings on neural networks learned for large-scale image classification. 1) Given a well-trained model, the logits predicted for some category can be directly obtained by linearly combining the predictions of a few other categories, which we call neural dependency. 2) Neural dependencies exist not only within a single model, but even between two independently learned models, regardless of their architectures. Towards a theoretical analysis of such phenomena, we demonstrate that identifying neural dependencies is equivalent to solving the Covariance Lasso (CovLasso) regression problem proposed in this paper. Through investigating the properties of the problem solution, we confirm that neural dependency is guaranteed by a redundant logit covariance matrix, a condition that is easily met given massive categories, and that neural dependency is highly sparse, implying that one category correlates to only a few others. We further empirically show the potential of neural dependencies in understanding internal data correlations, generalizing models to unseen categories, and improving model robustness with a dependency-derived regularizer. Code for this work will be made publicly available.
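The first finding in the abstract above, one category's logit being a sparse linear combination of a few others, can be illustrated with an ordinary Lasso fit. The sketch below is not the paper's CovLasso formulation; it assumes plain Lasso solved by cyclic coordinate descent, and the data, the helper names (`soft_threshold`, `lasso_cd`), and the penalty `lam=0.1` are all hypothetical choices made for illustration.

```python
import random

def soft_threshold(x, t):
    """Soft-thresholding operator used by Lasso coordinate descent."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def lasso_cd(X, y, lam, n_iters=200):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent.

    X is a list of n rows with d features each; y is a list of n targets.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    resid = list(y)  # residual y - Xw, starting from w = 0
    for _ in range(n_iters):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w

# Synthetic stand-in for model outputs (an assumption, not real logits):
# six "other category" logits per sample, and a target logit that depends
# on only two of them plus a little noise.
rng = random.Random(0)
n_samples, n_others = 200, 6
other_logits = [[rng.gauss(0, 1) for _ in range(n_others)] for _ in range(n_samples)]
target_logit = [0.8 * row[0] - 0.5 * row[3] + rng.gauss(0, 0.01) for row in other_logits]

weights = lasso_cd(other_logits, target_logit, lam=0.1)
support = [j for j, w in enumerate(weights) if abs(w) > 1e-6]
print("nonzero coefficients at indices:", support)
```

The L1 penalty shrinks the recovered coefficients toward zero, so they come out slightly smaller in magnitude than the generating values; the point of the sketch is only that the nonzero set is small.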

7.3 · CV · May 18, 2022

RandoMix: A mixed sample data augmentation method with multiple mixed modes

Xiaoliang Liu, Furao Shen, Jian Zhao et al.

Data augmentation plays a crucial role in enhancing the robustness and performance of machine learning models across various domains. In this study, we introduce a novel mixed-sample data augmentation method called RandoMix. RandoMix is specifically designed to simultaneously address robustness and diversity challenges. It leverages a combination of linear and mask-mixed modes, introducing flexibility in candidate selection and weight adjustments. We evaluate the effectiveness of RandoMix on diverse datasets, including CIFAR-10/100, Tiny-ImageNet, ImageNet, and Google Speech Commands. Our results demonstrate its superior performance compared to existing techniques such as Mixup, CutMix, FMix, and ResizeMix. Notably, RandoMix excels in enhancing model robustness against adversarial noise, natural noise, and sample occlusion. The comprehensive experimental results and insights into parameter tuning underscore the potential of RandoMix as a versatile and effective data augmentation method. Moreover, it integrates seamlessly into the training pipeline.
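The idea of combining linear and mask-mixed modes can be sketched as a random choice between a Mixup-style blend and a CutMix-style paste. This is not the RandoMix algorithm itself, whose candidate selection and weighting are not specified in the abstract; the function names, the two-mode candidate set, the `mode_weights` parameter, and the Beta(1, 1) mixing ratio are assumptions for illustration. Unlike the original CutMix, the pasted rectangle here is always placed fully inside the image.

```python
import random

def linear_mix(img_a, img_b, lam):
    """Mixup-style blend: lam * a + (1 - lam) * b, pixel by pixel."""
    return [[lam * pa + (1 - lam) * pb for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]

def mask_mix(img_a, img_b, lam, rng):
    """CutMix-style paste: copy one rectangle of img_b into a copy of img_a.

    Returns the mixed image and the fraction of pixels kept from img_a,
    which can differ from lam because the rectangle has integer sides.
    """
    h, w = len(img_a), len(img_a[0])
    cut_h = int(h * (1 - lam) ** 0.5)
    cut_w = int(w * (1 - lam) ** 0.5)
    top = rng.randint(0, h - cut_h)
    left = rng.randint(0, w - cut_w)
    mixed = [row[:] for row in img_a]
    for r in range(top, top + cut_h):
        mixed[r][left:left + cut_w] = img_b[r][left:left + cut_w]
    kept = 1 - (cut_h * cut_w) / (h * w)
    return mixed, kept

def random_mode_mix(img_a, label_a, img_b, label_b, rng, mode_weights=(0.5, 0.5)):
    """Pick one mixing mode at random, then mix the images and one-hot labels."""
    lam = rng.betavariate(1.0, 1.0)
    mode = rng.choices(["linear", "mask"], weights=mode_weights)[0]
    if mode == "linear":
        mixed = linear_mix(img_a, img_b, lam)
    else:
        # Use the actual kept-area fraction as the label weight.
        mixed, lam = mask_mix(img_a, img_b, lam, rng)
    label = [lam * a + (1 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, label, mode

# Toy usage on 4x4 grayscale "images" stored as nested lists.
rng = random.Random(0)
ones = [[1.0] * 4 for _ in range(4)]
zeros = [[0.0] * 4 for _ in range(4)]
mixed, label, mode = random_mode_mix(ones, [1.0, 0.0], zeros, [0.0, 1.0], rng)
print(mode, label)
```

In both modes the label weight matches the share of the first image that survives in the mixed sample, which keeps the soft label consistent with the pixels.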

4.0 · SD · May 21, 2025

AsynFusion: Towards Asynchronous Latent Consistency Models for Decoupled Whole-Body Audio-Driven Avatars

Tianbao Zhang, Jian Zhao, Yuer Li et al.

Whole-body audio-driven avatar pose and expression generation is a critical task for creating lifelike digital humans and enhancing the capabilities of interactive virtual agents, with wide-ranging applications in virtual reality, digital entertainment, and remote communication. Existing approaches often generate audio-driven facial expressions and gestures independently, which introduces a significant limitation: the lack of seamless coordination between facial and gestural elements, resulting in less natural and cohesive animations. To address this limitation, we propose AsynFusion, a novel framework that leverages diffusion transformers to achieve harmonious expression and gesture synthesis. The proposed method is built upon a dual-branch DiT architecture, which enables the parallel generation of facial expressions and gestures. Within the model, we introduce a Cooperative Synchronization Module to facilitate bidirectional feature interaction between the two modalities, and an Asynchronous LCM Sampling strategy to reduce computational overhead while maintaining high-quality outputs. Extensive experiments demonstrate that AsynFusion achieves state-of-the-art performance in generating real-time, synchronized whole-body animations, consistently outperforming existing methods in both quantitative and qualitative evaluations.

21.4 · HC · Dec 23, 2021

Human-AI Collaboration for UX Evaluation: Effects of Explanation and Synchronization

Mingming Fan, Xianyou Yang, Tsz Tung Yu et al.

Analyzing usability test videos is arduous. Although recent research showed the promise of AI in assisting with such tasks, it remains largely unknown how AI should be designed to facilitate effective collaboration between user experience (UX) evaluators and AI. Inspired by the concepts of agency and work context in human and AI collaboration literature, we studied two corresponding design factors for AI-assisted UX evaluation: explanations and synchronization. Explanations allow AI to further inform humans how it identifies UX problems from a usability test session; synchronization refers to the two ways humans and AI collaborate: synchronously and asynchronously. We iteratively designed a tool, AI Assistant, with four versions of UIs corresponding to the two levels of explanations (with/without) and synchronization (sync/async). By adopting a hybrid wizard-of-oz approach to simulating an AI with reasonable performance, we conducted a mixed-method study with 24 UX evaluators identifying UX problems from usability test videos using AI Assistant. Our quantitative and qualitative results show that AI with explanations, regardless of being presented synchronously or asynchronously, provided better support for UX evaluators' analysis and was perceived more positively; when without explanations, synchronous AI better improved UX evaluators' performance and engagement compared to the asynchronous AI. Lastly, we present the design implications for AI-assisted UX evaluation and facilitating more effective human-AI collaboration.