CVApr 26, 2022
Self-recoverable Adversarial Examples: A New Effective Protection Mechanism in Social NetworksJiawei Zhang, Jinwei Wang, Hao Wang et al.
Malicious intelligent algorithms greatly threaten the security of social users' privacy by detecting and analyzing the uploaded photos to social network platforms. The destruction to DNNs brought by the adversarial attack sparks the potential that adversarial examples serve as a new protection mechanism for privacy security in social networks. However, the existing adversarial example does not have recoverability for serving as an effective protection mechanism. To address this issue, we propose a recoverable generative adversarial network to generate self-recoverable adversarial examples. By modeling the adversarial attack and recovery as a united task, our method can minimize the error of the recovered examples while maximizing the attack ability, resulting in better recoverability of adversarial examples. To further boost the recoverability of these examples, we exploit a dimension reducer to optimize the distribution of adversarial perturbation. The experimental results prove that the adversarial examples generated by the proposed method present superior recoverability, attack ability, and robustness on different datasets and network architectures, which ensure its effectiveness as a protection mechanism in social networks.
CVMay 28
IP-Adapter Is All You Need: Towards Fine-Tuning-Free Diffusion-Based Talking Face GenerationHao Wu, Xiangyang Luo, Hao Wang et al.
With the rapid advancement of diffusion models, talking face generation has made remarkable progress. However, existing diffusion-based methods still require task-specific fine-tuning and large-scale audiovisual datasets, resulting in high computational costs that hinder scalability and accessibility of diffusion-based approaches across the research community. To address this, we propose a finetuning-free paradigm that directly performs talking face generation using the pretrained weights of Stable Diffusion and IP-Adapter. This backbone leverages the visual embedding capability of IP-Adapter to mine lip-related semantics from the pretrained Stable Diffusion. To address the challenges of identity drift, synchronization errors, and temporal instability, we also design three trainable-parameterfree components: (1) the Structurist, which explicitly disentangles and reassembles lip and appearance features to mitigate identity drift and appearance distortion; (2) the Structure Controller, which adaptively refines embeddings based on quasi-monotonic motion trends for precise lip synchronization; and (3) the Noise Sensor, which introduces Gaussian prior to detect and suppress flicker and jitter artifacts and enhance temporal consistency. Experimental results show that our method outperforms existing SOTA approaches in both lip-sync accuracy (at least 0.16 gain in PCLD) and visual fidelity (at least 0.7 improvement in FID), establishing a novel fine-tuning-free diffusion framework for talking face generation.
CVMar 8, 2023
Immune Defense: A Novel Adversarial Defense Mechanism for Preventing the Generation of Adversarial ExamplesJinwei Wang, Hao Wu, Haihua Wang et al.
The vulnerability of Deep Neural Networks (DNNs) to adversarial examples has been confirmed. Existing adversarial defenses primarily aim at preventing adversarial examples from attacking DNNs successfully, rather than preventing their generation. If the generation of adversarial examples is unregulated, images within reach are no longer secure and pose a threat to non-robust DNNs. Although gradient obfuscation attempts to address this issue, it has been shown to be circumventable. Therefore, we propose a novel adversarial defense mechanism, which is referred to as immune defense and is the example-based pre-defense. This mechanism applies carefully designed quasi-imperceptible perturbations to the raw images to prevent the generation of adversarial examples for the raw images, and thereby protecting both images and DNNs. These perturbed images are referred to as Immune Examples (IEs). In the white-box immune defense, we provide a gradient-based and an optimization-based approach, respectively. Additionally, the more complex black-box immune defense is taken into consideration. We propose Masked Gradient Sign Descent (MGSD) to reduce approximation error and stabilize the update to improve the transferability of IEs and thereby ensure their effectiveness against black-box adversarial attacks. The experimental results demonstrate that the optimization-based approach has superior performance and better visual quality in white-box immune defense. In contrast, the gradient-based approach has stronger transferability and the proposed MGSD significantly improve the transferability of baselines.
LGMar 15
Artificial intelligence-enabled single-lead ECG for non-invasive hyperkalemia detection: development, multicenter validation, and proof-of-concept deploymentGongzheng Tang, Qinghao Zhao, Guangkun Nie et al.
Hyperkalemia is a life-threatening electrolyte disorder that is common in patients with chronic kidney disease and heart failure, yet frequent monitoring remains difficult outside hospital settings. We developed and validated Pocket-K, a single-lead AI-ECG system initialized from the ECGFounder foundation model for non-invasive hyperkalemia screening and handheld deployment. In this multicentre observational study using routinely collected clinical ECG and laboratory data, 34,439 patients contributed 62,290 ECG--potassium pairs. Lead I data were used to fine-tune the model. Data from Peking University People's Hospital were divided into development and temporal validation sets, and data from The Second Hospital of Tianjin Medical University served as an independent external validation set. Hyperkalemia was defined as venous serum potassium > 5.5 mmol/L. Pocket-K achieved AUROCs of 0.936 in internal testing, 0.858 in temporal validation, and 0.808 in external validation. For KDIGO-defined moderate-to-severe hyperkalemia (serum potassium >= 6.0 mmol/L), AUROCs increased to 0.940 and 0.861 in the temporal and external sets, respectively. External negative predictive value exceeded 99.3%. Model-predicted high risk below the hyperkalemia threshold was more common in patients with chronic kidney disease and heart failure. A handheld prototype enabled near-real-time inference, supporting future prospective evaluation in native handheld and wearable settings.
AIMar 6, 2025Code
KidneyTalk-open: No-code Deployment of a Private Large Language Model with Medical Documentation-Enhanced Knowledge Database for Kidney DiseaseYongchao Long, Chao Yang, Gongzheng Tang et al.
Privacy-preserving medical decision support for kidney disease requires localized deployment of large language models (LLMs) while maintaining clinical reasoning capabilities. Current solutions face three challenges: 1) Cloud-based LLMs pose data security risks; 2) Local model deployment demands technical expertise; 3) General LLMs lack mechanisms to integrate medical knowledge. Retrieval-augmented systems also struggle with medical document processing and clinical usability. We developed KidneyTalk-open, a desktop system integrating three technical components: 1) No-code deployment of state-of-the-art (SOTA) open-source LLMs (such as DeepSeek-r1, Qwen2.5) via local inference engine; 2) Medical document processing pipeline combining context-aware chunking and intelligent filtering; 3) Adaptive Retrieval and Augmentation Pipeline (AddRep) employing agents collaboration for improving the recall rate of medical documents. A graphical interface was designed to enable clinicians to manage medical documents and conduct AI-powered consultations without technical expertise. Experimental validation on 1,455 challenging nephrology exam questions demonstrates AddRep's effectiveness: achieving 29.1% accuracy (+8.1% over baseline) with intelligent knowledge integration, while maintaining robustness through 4.9% rejection rate to suppress hallucinations. Comparative case studies with the mainstream products (AnythingLLM, Chatbox, GPT4ALL) demonstrate KidneyTalk-open's superior performance in real clinical query. KidneyTalk-open represents the first no-code medical LLM system enabling secure documentation-enhanced medical Q&A on desktop. Its designs establishes a new framework for privacy-sensitive clinical AI applications. The system significantly lowers technical barriers while improving evidence traceability, enabling more medical staff or patients to use SOTA open-source LLMs conveniently.
CVFeb 26, 2025
HDM: Hybrid Diffusion Model for Unified Image Anomaly DetectionZekang Weng, Jinjin Shi, Jinwei Wang et al.
Image anomaly detection plays a vital role in applications such as industrial quality inspection and medical imaging, where it directly contributes to improving product quality and system reliability. However, existing methods often struggle with complex and diverse anomaly patterns. In particular, the separation between generation and discrimination tasks limits the effective coordination between anomaly sample generation and anomaly region detection. To address these challenges, we propose a novel hybrid diffusion model (HDM) that integrates generation and discrimination into a unified framework. The model consists of three key modules: the Diffusion Anomaly Generation Module (DAGM), the Diffusion Discriminative Module (DDM), and the Probability Optimization Module (POM). DAGM generates realistic and diverse anomaly samples, improving their representativeness. DDM then applies a reverse diffusion process to capture the differences between generated and normal samples, enabling precise anomaly region detection and localization based on probability distributions. POM refines the probability distributions during both the generation and discrimination phases, ensuring high-quality samples are used for training. Extensive experiments on multiple industrial image datasets demonstrate that our method outperforms state-of-the-art approaches, significantly improving both image-level and pixel-level anomaly detection performance, as measured by AUROC.
LGJan 25, 2025
Development and Validation of a Dynamic Kidney Failure Prediction Model based on Deep Learning: A Real-World Study with External ValidationJingying Ma, Jinwei Wang, Lanlan Lu et al.
Background: Chronic kidney disease (CKD), a progressive disease with high morbidity and mortality, has become a significant global public health problem. Most existing models are static and fail to capture temporal trends in disease progression, limiting their ability to inform timely interventions. We address this gap by developing a dynamic model that leverages common longitudinal clinical indicators from real-world Electronic Health Records (EHRs) for real-time kidney failure prediction. Findings: A retrospective cohort of 4,587 patients from Yinzhou, China, was used for model development (2,752 patients for training, 917 patients for validation) and internal validation (918 patients), while external validation was conducted on a prospective PKUFH cohort (934 patients). The model demonstrated competitive performance across datasets, with an AUROC of 0.9311 (95%CI, 0.8873-0.9749) in the internal validation cohort and 0.8141 (95%CI, 0.7728-0.8554) in the external validation cohort, alongside progressively improving dynamic predictions, good calibration, and clinically consistent interpretability. KFDeep has been deployed on an open-access website and in primary care settings. Interpretation: The KFDeep model enables dynamic prediction of kidney failure without increasing clinical examination costs. It has been integrated into existing hospital systems, providing physicians with a continuously updated decision-support tool in routine care.
LGDec 18, 2021
GNN-Geo: A Graph Neural Network-based Fine-grained IP geolocation FrameworkShichang Ding, Xiangyang Luo, Jinwei Wang et al.
Rule-based fine-grained IP geolocation methods are hard to generalize in computer networks which do not follow hypothetical rules. Recently, deep learning methods, like multi-layer perceptron (MLP), are tried to increase generalization capabilities. However, MLP is not so suitable for graph-structured data like networks. MLP treats IP addresses as isolated instances and ignores the connection information, which limits geolocation accuracy. In this work, we research how to increase the generalization capability with an emerging graph deep learning method -- Graph Neural Network (GNN). First, IP geolocation is re-formulated as an attributed graph node regression problem. Then, we propose a GNN-based IP geolocation framework named GNN-Geo. GNN-Geo consists of a preprocessor, an encoder, messaging passing (MP) layers and a decoder. The preprocessor and encoder transform measurement data into the initial node embeddings. MP layers refine the initial node embeddings by modeling the connection information. The decoder maps the refined embeddings to nodes' locations and relieves the convergence problem by considering prior knowledge. The experiments in 8 real-world IPv4/IPv6 networks in North America, Europe and Asia show the proposed GNN-Geo clearly outperforms the state-of-art rule-based and learning-based baselines. This work verifies the great potential of GNN for fine-grained IP geolocation.
CVAug 28, 2017
Digital image splicing detection based on Markov features in QDCT and QWT domainRuxin Wang, Wei Lu, Shijun Xiang et al.
Image splicing detection is of fundamental importance in digital forensics and therefore has attracted increasing attention recently. In this paper, a color image splicing detection approach is proposed based on Markov transition probability of quaternion component separation in quaternion discrete cosine transform (QDCT) domain and quaternion wavelet transform (QWT) domain. Firstly, Markov features of the intra-block and inter-block between block QDCT coefficients are obtained from the real part and three imaginary parts of QDCT coefficients respectively. Then, additional Markov features are extracted from luminance (Y) channel in quaternion wavelet transform domain to characterize the dependency of position among quaternion wavelet subband coefficients. Finally, ensemble classifier (EC) is exploited to classify the spliced and authentic color images. The experiment results demonstrate that the proposed approach can outperforms some state-of-the-art methods.