YangTian Yan

CV
3 papers
586 citations
Novelty 52%
AI Score 40

3 Papers

IV Nov 12, 2022 Code
DeltaNet: Conditional Medical Report Generation for COVID-19 Diagnosis

Xian Wu, Shuxin Yang, Zhaopeng Qiu et al.

Fast screening and diagnosis are critical in COVID-19 patient treatment. In addition to the gold-standard RT-PCR test, radiological imaging such as X-ray and CT is an important means of patient screening and follow-up. However, given the large number of patients, writing reports becomes a heavy burden for radiologists. To reduce this workload, we propose DeltaNet to generate medical reports automatically. Unlike typical image captioning approaches that generate reports with an encoder and a decoder, DeltaNet applies a conditional generation process. In particular, given a medical image, DeltaNet generates a report in three steps: 1) retrieving related medical reports, i.e., the historical reports from the same or similar patients; 2) comparing the retrieved images with the current image to find the differences; 3) generating a new report that accommodates the identified differences, based on the conditional report. We evaluate DeltaNet on a COVID-19 dataset, where DeltaNet outperforms state-of-the-art approaches. Besides COVID-19, the proposed DeltaNet can be applied to other diseases as well. We validate its generalization capabilities on the public IU-Xray and MIMIC-CXR datasets for chest-related diseases. Code is available at https://github.com/LX-doctorAI1/DeltaNet.
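The first two of the three steps in the abstract above (retrieve, then compare) can be illustrated with a toy sketch. This is not the DeltaNet implementation: the feature vectors, the cosine-similarity retrieval, the element-wise difference, and every name below (`retrieve_most_similar`, `feature_difference`, `HISTORY`) are assumptions made for illustration only. The third step, generating the report from the conditional report and the differences, needs a trained decoder and is not sketched.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_most_similar(query_features, history):
    """Step 1 (assumed form): pick the historical case closest to the query image."""
    return max(
        history,
        key=lambda case: cosine_similarity(query_features, case["features"]),
    )


def feature_difference(query_features, reference_features):
    """Step 2 (assumed form): element-wise difference between the two images' features."""
    return [q - r for q, r in zip(query_features, reference_features)]


# Hypothetical toy data; real systems would use learned image embeddings.
HISTORY = [
    {"features": [1.0, 0.0, 0.0], "report": "placeholder report A"},
    {"features": [0.0, 1.0, 1.0], "report": "placeholder report B"},
]

current = [0.1, 0.9, 1.2]
conditional_case = retrieve_most_similar(current, HISTORY)
delta = feature_difference(current, conditional_case["features"])
```

In this toy setup, `conditional_case["report"]` plays the role of the conditional report, and `delta` stands in for the identified differences that a generator would condition on.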

CV Apr 1, 2023
Multi-view reconstruction of bullet time effect based on improved NSFF model

Linquan Yu, Yan Gao, Yangtian Yan et al.

Bullet time is a visual effect commonly used in film, television, and games that makes time appear to slow down or stop while preserving dynamic details in the scene. It usually requires multiple sets of cameras moving slowly with the subject and is synthesized in post-production, which is costly and cannot be reused. Dynamic-scene view reconstruction based on neural rendering fields can address this need, but most current methods reconstruct poorly because of blurred input images and overfitting of dynamic and static regions. Building on the NSFF algorithm, this paper reconstructs common time special-effects scenes in film and television from novel viewpoints. To improve the accuracy of the reconstructed images, a blur kernel is added to the network to model and analyze the blurring process, and the resulting sharp view is fed into NSFF. By using optical flow prediction information to suppress the dynamic network in a timely manner, the network is forced to improve the reconstruction of the dynamic and static networks independently, which improves its ability to understand and reconstruct dynamic and static scenes. To address overfitting of dynamic and static scenes, a new dynamic-static cross-entropy loss is designed. Experimental results show that, compared with the original NSFF and other novel-view reconstruction algorithms for dynamic scenes, the improved NSFF-RFCT achieves higher reconstruction accuracy and a better understanding of dynamic and static scenes.

LG Mar 28, 2025
Data-Free Universal Attack by Exploiting the Intrinsic Vulnerability of Deep Models

YangTian Yan, Jinyu Tian

Deep neural networks (DNNs) are susceptible to Universal Adversarial Perturbations (UAPs), which are instance-agnostic perturbations that can deceive a target model across a wide range of samples. Unlike instance-specific adversarial examples, UAPs present a greater challenge as they must generalize across different samples and models. Generating UAPs typically requires access to numerous examples, which is a strong assumption in real-world tasks. In this paper, we propose a novel data-free method called Intrinsic UAP (IntriUAP), which exploits the intrinsic vulnerabilities of deep models. We analyze a series of popular deep models composed of linear and nonlinear layers with a Lipschitz constant of 1, revealing that the vulnerability of these models is predominantly influenced by their linear components. Based on this observation, we leverage the ill-conditioned nature of the linear components by aligning the UAP with the right singular vectors corresponding to the maximum singular value of each linear layer. Remarkably, our method achieves highly competitive performance in attacking popular image classification deep models without using any image samples. We also evaluate the black-box attack performance of our method, showing that it matches the state-of-the-art baseline for data-free methods on models that conform to our theoretical framework. Beyond the data-free assumption, IntriUAP also operates under a weaker assumption, where the adversary can access only a few of the victim model's layers. Experiments demonstrate that the attack success rate decreases by only 4% when the adversary has access to just 50% of the linear layers in the victim model.
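The linear-algebra fact behind the abstract above can be shown for a single layer: among unit-norm inputs, the one that a linear map W amplifies most is the right singular vector of W with the largest singular value. The sketch below estimates that vector with power iteration on WᵀW in plain Python. It is not the IntriUAP method, and how the paper combines directions across layers is not stated in the abstract. The 2×2 `WEIGHT` matrix and all function names are hypothetical.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Transpose a matrix given as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def norm(vector):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in vector))


def top_right_singular_vector(weight, iterations=500, seed=0):
    """Power iteration on W^T W; returns (unit vector v, estimate of sigma_max)."""
    rng = random.Random(seed)
    weight_t = transpose(weight)
    v = [rng.gauss(0.0, 1.0) for _ in range(len(weight[0]))]
    scale = norm(v)
    v = [x / scale for x in v]
    for _ in range(iterations):
        u = matvec(weight_t, matvec(weight, v))
        scale = norm(u)
        v = [x / scale for x in u]
    return v, norm(matvec(weight, v))


# Hypothetical weights of one linear layer (singular values 3 and 1).
WEIGHT = [[3.0, 0.0], [0.0, 1.0]]

direction, sigma_max = top_right_singular_vector(WEIGHT)

# Any other unit-norm input is amplified no more than `direction`.
other = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]
other_gain = norm(matvec(WEIGHT, other))
```

For real layer weights, a library routine such as `numpy.linalg.svd` would normally replace the hand-written power iteration; plain Python is used here only to keep the sketch dependency-free.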