CV | May 25
Detail Consistent Stage-Wise Distillation for Efficient 3D MRI Segmentation
Mengchen Fan, Baocheng Geng, Xi Xiao et al.
Deploying high-performing 3D medical image segmenters (e.g., nnU-Net) is often limited by memory footprint and inference latency. Compression is therefore necessary, but compact 3D encoders tend to lose fine structural cues (small lesions and sharp boundaries) as downsampling repeats across multi-resolution stages. We propose Detail Consistent Distillation (DCD), a stage-wise distillation framework that preserves structural detail across scales by aligning teacher-student features in a wavelet-decomposed representation. At each encoder stage, DCD distills directional detail components in the wavelet domain while leaving the coarse approximation comparatively unconstrained, avoiding over-regularization of global semantics. DCD is used only during training and introduces no inference-time overhead. Experiments on the BraTS 2024 and ISLES 2022 benchmarks demonstrate that our approach achieves superior performance in MRI segmentation using 3D multi-modal data. Code and implementation details for DCD are publicly available at https://github.com/ClinicaAlpha/DCD-3D-MedSeg.
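The idea of distilling only wavelet detail components can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single-level 1D Haar transform on a plain list of feature values, whereas the abstract describes 3D features at several encoder stages. The function names and toy feature vectors are hypothetical.

```python
import math


def haar_1d(signal):
    """Single-level 1D Haar transform: returns (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def detail_distillation_loss(teacher_feat, student_feat):
    """Mean squared error between teacher and student detail coefficients.

    The approximation (coarse) coefficients are deliberately ignored,
    mirroring the idea of leaving global content unconstrained.
    """
    _, t_detail = haar_1d(teacher_feat)
    _, s_detail = haar_1d(student_feat)
    return sum((t - s) ** 2 for t, s in zip(t_detail, s_detail)) / len(t_detail)


# Toy features (assumed values, for illustration only).
teacher = [1.0, 3.0, 2.0, 2.0, 5.0, 1.0, 0.0, 4.0]
# Same local structure as the teacher, shifted coarse level.
shifted_student = [x + 10.0 for x in teacher]
# Pairwise averages of the teacher: coarse level kept, local detail lost.
smoothed_student = [2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 2.0, 2.0]

loss_shifted = detail_distillation_loss(teacher, shifted_student)
loss_smoothed = detail_distillation_loss(teacher, smoothed_student)
```

In this toy setting the shifted student incurs no detail loss, while the smoothed student is penalized, which is the behavior a detail-only objective is meant to have.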
CV | Mar 30, 2023
NN-Copula-CD: A Copula-Guided Interpretable Neural Network for Change Detection in Heterogeneous Remote Sensing Images
Weiming Li, Xueqian Wang, Gang Li et al.
Change detection (CD) in heterogeneous remote sensing images has been widely used for disaster monitoring and land-use management. In the past decade, the heterogeneous CD problem has significantly benefited from the development of deep neural networks (DNNs). However, purely data-driven DNNs behave like black boxes, and their lack of interpretability limits their trustworthiness and controllability in most practical CD applications. As a powerful knowledge-driven tool, copula theory performs well in modeling relationships among random variables. To enhance the interpretability of existing neural networks for CD, we propose a knowledge-data-driven heterogeneous CD method based on a copula-guided neural network, named NN-Copula-CD. In NN-Copula-CD, the mathematical characteristics of copulas are employed as loss functions to supervise a neural network that learns the dependence between bi-temporal heterogeneous superpixel pairs; the changed regions are then identified via binary classification based on the degrees of dependence of all the superpixel pairs in the bi-temporal images. We conduct in-depth experiments on three datasets with heterogeneous images, where both quantitative and visual results demonstrate the effectiveness of our proposed NN-Copula-CD method.
SP | Jan 18, 2023
Sequential Processing of Observations in Human Decision-Making Systems
Nandan Sriranga, Baocheng Geng, Pramod K. Varshney
In this work, we consider a binary hypothesis testing problem involving a group of human decision-makers. Due to the nature of human behavior, each human decision-maker observes the phenomenon of interest sequentially up to a random length of time. The humans use a belief model to accumulate the log-likelihood ratios until they cease observing the phenomenon. The belief model characterizes how each human decision-maker perceives observations at different instants of time, i.e., some decision-makers may assign greater importance to earlier observations than to later ones, and vice versa. The global decision-maker is a machine that fuses the human decisions using the Chair-Varshney rule with different weights for the human decisions, where the weights are determined by the number of observations each human used to arrive at a decision.
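For readers unfamiliar with the Chair-Varshney rule, the sketch below shows its standard form for binary local decisions: each decision contributes a log-likelihood-ratio weight derived from that decision-maker's detection and false-alarm probabilities. How this work maps the number of observations to those weights is not stated in the abstract, so the operating points used here are hypothetical.

```python
import math


def chair_varshney_statistic(decisions, p_detect, p_false_alarm, prior_h1=0.5):
    """Log-likelihood-ratio fusion statistic for binary local decisions.

    decisions[i] is 1 (decides H1) or 0 (decides H0); p_detect[i] and
    p_false_alarm[i] are the assumed operating point of decision-maker i.
    """
    stat = math.log(prior_h1 / (1.0 - prior_h1))
    for u, pd, pf in zip(decisions, p_detect, p_false_alarm):
        if u == 1:
            stat += math.log(pd / pf)
        else:
            stat += math.log((1.0 - pd) / (1.0 - pf))
    return stat


def fuse(decisions, p_detect, p_false_alarm, prior_h1=0.5):
    """Global decision: 1 (H1) if the fusion statistic is positive, else 0."""
    stat = chair_varshney_statistic(decisions, p_detect, p_false_alarm, prior_h1)
    return 1 if stat > 0 else 0


# Hypothetical example: one reliable decision-maker says H1,
# two less reliable ones say H0.
decisions = [1, 0, 0]
p_detect = [0.95, 0.6, 0.6]
p_false_alarm = [0.05, 0.4, 0.4]

statistic = chair_varshney_statistic(decisions, p_detect, p_false_alarm)
global_decision = fuse(decisions, p_detect, p_false_alarm)
```

With these assumed numbers the reliable decision-maker outweighs the other two, so the fused decision is H1 even though a simple majority vote would choose H0.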
ML | Jan 27, 2025
Measuring Heterogeneity in Machine Learning with Distributed Energy Distance
Mengchen Fan, Baocheng Geng, Roman Shterenberg et al.
In distributed and federated learning, heterogeneity across data sources remains a major obstacle to effective model aggregation and convergence. We focus on feature heterogeneity and introduce energy distance as a sensitive measure for quantifying distributional discrepancies. While we show that energy distance is robust for detecting data distribution shifts, its direct use in large-scale systems can be prohibitively expensive. To address this, we develop Taylor approximations that preserve key theoretical quantitative properties while reducing computational overhead. Through simulation studies, we show how accurately capturing feature discrepancies boosts convergence in distributed learning. Finally, we propose a novel application of energy distance to assign penalty weights for aligning predictions across heterogeneous nodes, ultimately enhancing coordination in federated and distributed settings.
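As a point of reference, the energy distance statistic between two samples can be written as 2E‖X−Y‖ − E‖X−X′‖ − E‖Y−Y′‖. The sketch below is a direct sample estimate of that expression using pairwise Euclidean distances; it is not the Taylor approximation developed in the paper, and the function names and toy data are assumptions.

```python
import math


def mean_pairwise_distance(a, b):
    """Average Euclidean distance over all pairs (x in a, y in b)."""
    total = sum(math.dist(x, y) for x in a for y in b)
    return total / (len(a) * len(b))


def energy_distance(x, y):
    """Sample estimate of 2E|X-Y| - E|X-X'| - E|Y-Y'| (self-pairs included)."""
    return (
        2.0 * mean_pairwise_distance(x, y)
        - mean_pairwise_distance(x, x)
        - mean_pairwise_distance(y, y)
    )


# Toy 2D feature samples (assumed values).
node_a = [(0.0, 0.0), (1.0, 0.0)]
node_b = [(3.0, 0.0), (4.0, 0.0)]  # node_a shifted by 3 along the first axis

same = energy_distance(node_a, node_a)
shifted = energy_distance(node_a, node_b)
```

The double loop over all pairs costs on the order of the product of the sample sizes, which illustrates why a direct computation becomes expensive in large-scale systems.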
LG | Feb 11, 2025
PFedDST: Personalized Federated Learning with Decentralized Selection Training
Mengchen Fan, Keren Li, Tianyun Zhang et al.
Distributed Learning (DL) enables the training of machine learning models across multiple devices, yet it faces challenges like non-IID data distributions and device capability disparities, which can impede training efficiency. Communication bottlenecks further complicate traditional Federated Learning (FL) setups. To mitigate these issues, we introduce the Personalized Federated Learning with Decentralized Selection Training (PFedDST) framework. PFedDST enhances model training by allowing devices to strategically evaluate and select peers based on a comprehensive communication score. This score integrates loss, task similarity, and selection frequency, ensuring optimal peer connections. This selection strategy is tailored to increase local personalization and promote beneficial peer collaborations to strengthen the stability and efficiency of the training process. Our experiments demonstrate that PFedDST not only enhances model accuracy but also accelerates convergence. This approach outperforms state-of-the-art methods in handling data heterogeneity, delivering both faster and more effective training in diverse and decentralized systems.
SD | Aug 28, 2025
Full-Frequency Temporal Patching and Structured Masking for Enhanced Audio Classification
Aditya Makineni, Baocheng Geng, Qing Tian
Transformers and State-Space Models (SSMs) have advanced audio classification by modeling spectrograms as sequences of patches. However, existing models such as the Audio Spectrogram Transformer (AST) and Audio Mamba (AuM) adopt square patching from computer vision, which disrupts continuous frequency patterns and produces an excessive number of patches, slowing training and increasing computation. We propose Full-Frequency Temporal Patching (FFTP), a patching strategy that better matches the time-frequency asymmetry of spectrograms by spanning full frequency bands with localized temporal context, preserving harmonic structure and significantly reducing patch count and computation. We also introduce SpecMask, a patch-aligned spectrogram augmentation that combines full-frequency and localized time-frequency masks under a fixed masking budget, enhancing temporal robustness while preserving spectral continuity. When applied to both AST and AuM, our patching method with SpecMask improves mAP by up to +6.76 on AudioSet-18k and accuracy by up to +8.46 on SpeechCommandsV2, while reducing computation by up to 83.26%, demonstrating both performance and efficiency gains.
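The difference between square patching and full-frequency patching can be sketched on a toy spectrogram. The example assumes non-overlapping patches and a small 16 x 64 (frequency x time) grid stored as nested lists; the patch sizes and function names are hypothetical and do not reproduce the configurations used in the paper.

```python
def square_patches(spec, size):
    """Split a spectrogram (list of frequency rows) into size x size patches."""
    n_freq, n_time = len(spec), len(spec[0])
    patches = []
    for f in range(0, n_freq - size + 1, size):
        for t in range(0, n_time - size + 1, size):
            patches.append([row[t:t + size] for row in spec[f:f + size]])
    return patches


def full_frequency_patches(spec, width):
    """Split along time only: each patch spans every frequency bin."""
    n_time = len(spec[0])
    return [
        [row[t:t + width] for row in spec]
        for t in range(0, n_time - width + 1, width)
    ]


# Toy spectrogram: 16 frequency bins x 64 time frames (assumed sizes).
n_freq, n_time = 16, 64
spec = [[float(f * n_time + t) for t in range(n_time)] for f in range(n_freq)]

square = square_patches(spec, size=4)
full_band = full_frequency_patches(spec, width=4)

n_square = len(square)        # patches tile both frequency and time
n_full_band = len(full_band)  # patches tile time only
```

In this toy case the full-frequency layout yields a quarter as many patches, and each patch keeps all frequency bins of its time window together, so harmonic structure is not cut across patch boundaries.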
LG | May 6, 2024
Interpretable Data Fusion for Distributed Learning: A Representative Approach via Gradient Matching
Mengchen Fan, Baocheng Geng, Keren Li et al.
This paper introduces a representative-based approach for distributed learning that transforms multiple raw data points into a virtual representation. Unlike traditional distributed learning methods such as Federated Learning, which do not offer human interpretability, our method makes complex machine learning processes accessible and comprehensible. It achieves this by condensing extensive datasets into digestible formats, thus fostering intuitive human-machine interactions. Additionally, this approach maintains privacy and communication efficiency, and it matches the training performance of models using raw data. Simulation results show that our approach is competitive with or outperforms traditional Federated Learning in accuracy and convergence, especially in scenarios with complex models and a higher number of clients. This framework marks a step forward in integrating human intuition with machine intelligence, which potentially enhances human-machine learning interfaces and collaborative efforts.
CR | Sep 27, 2021
Enhanced Audit Bit Based Distributed Bayesian Detection in the Presence of Strategic Attacks
Chen Quan, Baocheng Geng, Yunghsiang S. Han et al.
This paper employs an audit bit based mechanism to mitigate the effect of Byzantine attacks. In this framework, the optimal attacking strategy for intelligent attackers is investigated for the traditional audit bit based scheme (TAS) to evaluate the robustness of the system. We show that an intelligent attacker can degrade the performance of TAS to that of a system without audit bits. To enhance the robustness of the system in the presence of intelligent attackers, we propose an enhanced audit bit based scheme (EAS). The optimal fusion rule for the proposed scheme is derived, and the detection performance of the system is evaluated via its probability of error. Simulation results show that the proposed EAS improves the robustness and the detection performance of the system. Moreover, based on EAS, we propose another scheme, the reduced audit bit based scheme (RAS), which further improves system performance. We derive the new optimal fusion rule, and the simulation results show that RAS outperforms EAS and TAS in terms of both robustness and detection performance. We then extend the proposed RAS to wide-area cluster-based distributed wireless sensor networks (CWSNs). Simulation results show that the proposed RAS significantly reduces the communication overhead between the sensors and the fusion center (FC), which prolongs the lifetime of the network.
HC | Sep 3, 2019
Prospect Theory Based Crowdsourcing for Classification in the Presence of Spammers
Baocheng Geng, Qunwei Li, Pramod K. Varshney
We consider the $M$-ary classification problem via crowdsourcing, where crowd workers respond to simple binary questions and the answers are aggregated via decision fusion. The workers have a reject option to skip a question when they lack the expertise or when their confidence of answering it correctly is low. We further consider that there are spammers in the crowd who respond to the questions with random guesses. Under a payment mechanism that encourages the reject option, we study the behavior of honest workers and spammers, whose objectives are to maximize their monetary rewards. To accurately characterize human behavioral aspects, we employ prospect theory to model the rationality of the crowd workers, whose perceptions of costs and probabilities are distorted by value and weight functions, respectively. Moreover, we estimate the number of spammers and employ a weighted majority voting decision rule, where we assign an optimal weight to every worker to maximize the system performance. The probability of correct classification and the asymptotic system performance are derived. We also provide simulation results to demonstrate the effectiveness of our approach.
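The value and weight functions of prospect theory can be sketched with the commonly used Tversky-Kahneman forms. The abstract does not say which functional forms or parameters the paper adopts, so the forms below and the parameter values (0.88, 2.25, 0.61, the classic estimates for gains) are assumptions for illustration.

```python
def value(x, alpha=0.88, beta=0.88, loss_aversion=2.25):
    """Perceived value of a gain (x >= 0) or loss (x < 0)."""
    if x >= 0:
        return x ** alpha
    return -loss_aversion * ((-x) ** beta)


def weight(p, gamma=0.61):
    """Perceived (distorted) probability for a true probability p in [0, 1]."""
    numerator = p ** gamma
    denominator = (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)
    return numerator / denominator


def perceived_utility(outcomes):
    """Sum of weight(p) * value(x) over (probability, payoff) pairs."""
    return sum(weight(p) * value(x) for p, x in outcomes)


# Hypothetical choice faced by a worker: answer (win or lose 10) or skip (0).
answer = [(0.5, 10.0), (0.5, -10.0)]
skip = [(1.0, 0.0)]

utility_answer = perceived_utility(answer)
utility_skip = perceived_utility(skip)
```

Under these assumed parameters a fair gamble is perceived as unattractive because losses loom larger than gains, which is the kind of distortion that affects whether a worker uses the reject option.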
LG | May 1, 2018
Decision Tree Design for Classification in Crowdsourcing Systems
Baocheng Geng, Qunwei Li, Pramod K. Varshney
In this paper, we present a novel sequential paradigm for classification in crowdsourcing systems. Because workers are unreliable and perform the tests with errors, we study the construction of decision trees that minimize the probability of misclassification. By exploiting the connection between the probability of misclassification and the entropy at each level of the decision tree, we propose two algorithms for decision tree design. Furthermore, we study the worker assignment problem, in which workers can be assigned to different tests of the decision tree to trade off classification cost against the resulting error performance. Numerical results are presented for illustration.