IV Mar 17, 2023
Mpox-AISM: AI-Mediated Super Monitoring for Mpox and Like-Mpox
Yubiao Yue, Minghua Jiang, Xinyue Zhang et al.
Swift and accurate diagnosis of early-stage monkeypox (mpox) is crucial to limiting its spread. However, the similarity between mpox and common skin disorders, together with the need for professional diagnosis, has hampered the diagnosis of early-stage patients and contributed to the mpox outbreak. To address this challenge, we propose "Super Monitoring", a real-time visualization technique that uses artificial intelligence (AI) and Internet technology to diagnose early-stage mpox cheaply, conveniently, and quickly. Concretely, AI-mediated "Super Monitoring" (mpox-AISM) integrates deep learning models, data augmentation, self-supervised learning, and cloud services. On publicly accessible datasets, mpox-AISM's Precision, Recall, Specificity, and F1-score in diagnosing mpox reach 99.3%, 94.1%, 99.9%, and 96.6%, respectively, and it achieves 94.51% accuracy in classifying mpox, six like-mpox skin disorders, and normal skin. With an Internet connection and a communication terminal, mpox-AISM has the potential to perform real-time, accurate diagnosis of early-stage mpox in real-world scenarios, thereby preventing mpox outbreaks.
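For readers less familiar with the four reported metrics, the following is a minimal sketch of how they are conventionally derived from one-vs-rest confusion counts for a single class. The function names and the example counts are hypothetical and are not taken from the paper; the only figures drawn from the abstract are the reported Precision and Recall, which are used to check that the reported F1-score is their harmonic mean.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """One-vs-rest metrics for a single class from confusion counts."""
    precision = tp / (tp + fp)      # of predicted positives, how many are correct
    recall = tp / (tp + fn)         # of actual positives, how many are found
    specificity = tn / (tn + fp)    # of actual negatives, how many are rejected
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and Recall as reported in the abstract (99.3%, 94.1%).
    print(round(f1_from(0.993, 0.941), 3))  # 0.966, matching the reported 96.6%
    # Hypothetical counts, for illustration only.
    print(binary_metrics(tp=8, fp=2, tn=85, fn=5))
```

The harmonic mean penalizes the gap between Precision and Recall, which is why the F1-score sits closer to the lower of the two values.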
CV Aug 25, 2023
Ultrafast-and-Ultralight ConvNet-Based Intelligent Monitoring System for Diagnosing Early-Stage Mpox Anytime and Anywhere
Yubiao Yue, Xiaoqiang Shi, Li Qin et al.
Due to the absence of more efficient diagnostic tools, the spread of mpox continues unchecked. Although related studies have demonstrated the high efficiency of deep learning models in diagnosing mpox, key aspects such as model inference speed and parameter size have always been overlooked. Herein, an ultrafast and ultralight network named Fast-MpoxNet is proposed. Fast-MpoxNet, with only 0.27M parameters, can process input images at 68 frames per second (FPS) on the CPU. To better detect subtle image differences and optimize model parameters, Fast-MpoxNet incorporates an attention-based feature fusion module and an enhancement strategy based on multiple auxiliary losses. Experimental results indicate that Fast-MpoxNet, using transfer learning and data augmentation, achieves 98.40% classification accuracy for four classes on the mpox dataset. Furthermore, its Recall for early-stage mpox is 93.65%. Most importantly, an application system named Mpox-AISM V2 is developed, suitable for both personal computers and smartphones. Mpox-AISM V2 can rapidly and accurately diagnose mpox and can be easily deployed in various scenarios to offer the public real-time mpox diagnosis services. This work has the potential to mitigate future mpox outbreaks and pave the way for developing real-time diagnostic tools in the healthcare field.
CV Apr 28, 2024
Out-of-distribution Detection in Medical Image Analysis: A survey
Zesheng Hong, Yubiao Yue, Yubin Chen et al.
Computer-aided diagnostics has benefited from the development of deep learning-based computer vision techniques in recent years. Traditional supervised deep learning methods assume that test samples are drawn from the same distribution as the training data. However, out-of-distribution samples can be encountered in real-world clinical scenarios, which may cause silent failures in deep learning-based medical image analysis tasks. Recently, research has explored various out-of-distribution (OOD) detection situations and techniques to enable trustworthy medical AI systems. In this survey, we systematically review the recent advances in OOD detection in medical image analysis. We first explore several factors that may cause a distributional shift when using a deep learning-based model in clinical scenarios, and define three different types of distributional shift on top of these factors. We then suggest a framework to categorize and characterize existing solutions, and review previous studies according to this methodology taxonomy. Our discussion also covers evaluation protocols and metrics, as well as open challenges and underexplored research directions.
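To make the notion of OOD detection concrete, here is a minimal sketch of one widely used baseline, maximum softmax probability (MSP) scoring: an input is flagged as OOD when the classifier's top softmax probability falls below a threshold. This is offered as an illustration only; the abstract does not say which methods the survey covers, and the function names, the example logits, and the 0.7 threshold are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def msp_score(logits):
    """Confidence score: the largest softmax probability."""
    return max(softmax(logits))


def is_ood(logits, threshold=0.7):
    """Flag the input as out-of-distribution when confidence is low.

    The threshold is a hypothetical value; in practice it would be tuned
    on held-out data.
    """
    return msp_score(logits) < threshold


if __name__ == "__main__":
    print(is_ood([8.0, 0.0, -1.0]))   # confident prediction
    print(is_ood([0.1, 0.0, -0.1]))   # near-uniform prediction
```

A low-confidence prediction is only a proxy for distribution shift, since a network can also be confidently wrong on OOD inputs, which is the silent failure described above.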
CV May 17, 2024
VideoQA-SC: Adaptive Semantic Communication for Video Question Answering
Jiangyuan Guo, Wei Chen, Yuxuan Sun et al.
Although semantic communication (SC) has shown its potential in efficiently transmitting multimodal data such as text, speech, and images, SC for videos has focused primarily on pixel-level reconstruction. However, these SC systems may be suboptimal for downstream intelligent tasks. Moreover, SC systems without pixel-level video reconstruction offer advantages in bandwidth efficiency and real-time performance across various intelligent tasks. The difficulty in designing such systems lies in extracting compact, task-related semantic representations and delivering them accurately over noisy channels. In this paper, we propose an end-to-end SC system, named VideoQA-SC, for video question answering (VideoQA) tasks. Our goal is to accomplish VideoQA tasks directly from video semantics transmitted over noisy or fading wireless channels, bypassing the need for video reconstruction at the receiver. To this end, we develop a spatiotemporal semantic encoder for effective video semantic extraction, and a learning-based bandwidth-adaptive deep joint source-channel coding (DJSCC) scheme for efficient and robust video semantic transmission. Experiments demonstrate that VideoQA-SC outperforms traditional and advanced DJSCC-based SC systems that rely on video reconstruction at the receiver under a wide range of channel conditions and bandwidth constraints. In particular, when the signal-to-noise ratio is low, VideoQA-SC improves the answer accuracy by 5.17% while saving almost 99.5% of the bandwidth, compared with the advanced DJSCC-based SC system. Our results show the great potential of SC system design for video applications.
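As an illustration of what a "noisy channel" at a given signal-to-noise ratio means in such experiments, the sketch below passes a real-valued signal through an additive white Gaussian noise (AWGN) channel whose noise power is set from a target SNR in dB. It is a simplified stand-in under stated assumptions, not the authors' channel model: the abstract also mentions fading channels, which are not modelled here, and the function name and example signal are hypothetical.

```python
import math
import random


def awgn(signal, snr_db, rng):
    """Add white Gaussian noise so that signal power / noise power matches snr_db."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))  # dB to linear ratio
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded for repeatability
    # Hypothetical unit-power signal: alternating +1 / -1 symbols.
    x = [1.0 if i % 2 == 0 else -1.0 for i in range(20000)]
    y = awgn(x, snr_db=10.0, rng=rng)
    measured = sum((a - b) ** 2 for a, b in zip(y, x)) / len(x)
    print(f"measured noise power: {measured:.3f} (target 0.100)")
```

Lowering `snr_db` raises the noise power relative to the signal, which is the low-SNR regime in which the abstract reports the accuracy gain.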
IT Jun 23, 2024
Belief Information based Deep Channel Estimation for Massive MIMO Systems
Jialong Xu, Liu Liu, Xin Wang et al.
In next-generation wireless communication systems, transmission rates must continue to rise to support emerging scenarios such as immersive communications. From the perspective of communication system evolution, multiple-input multiple-output (MIMO) technology remains pivotal for enhancing transmission rates. However, current MIMO systems rely on inserting pilot signals to achieve accurate channel estimation. As the number of transmit streams increases, the pilots consume a significant portion of transmission resources, severely reducing the spectral efficiency. In this correspondence, we propose a belief information based mechanism. By introducing a plug-and-play belief information module, existing single-antenna channel estimation networks can be seamlessly adapted to multi-antenna channel estimation and fully exploit the spatial correlation among multiple antennas. Experimental results demonstrate that the proposed method can either improve channel estimation performance by 1 to 2 dB or reduce pilot overhead by 1/3 to 1/2, particularly in bad channel conditions.
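For context on why pilots cost spectral efficiency, the sketch below shows classical least-squares (LS) pilot-based estimation of a single flat-fading channel coefficient, together with the fraction of transmitted symbols spent on pilots. This is the textbook baseline, not the belief information module proposed in the paper; the function names, the pilot sequence, the channel value, and the frame sizes are all hypothetical.

```python
def ls_channel_estimate(pilots_tx, pilots_rx):
    """Average the per-pilot least-squares estimates y / x of a flat channel h."""
    estimates = [y / x for x, y in zip(pilots_tx, pilots_rx)]
    return sum(estimates) / len(estimates)


def pilot_overhead(n_pilots, n_symbols):
    """Fraction of transmitted symbols that carry pilots instead of data."""
    return n_pilots / n_symbols


if __name__ == "__main__":
    h = 0.6 - 0.8j                          # hypothetical channel coefficient
    pilots_tx = [1 + 0j, -1 + 0j, 1j, -1j]  # hypothetical known pilot symbols
    pilots_rx = [h * x for x in pilots_tx]  # noiseless reception, for clarity
    print(ls_channel_estimate(pilots_tx, pilots_rx))
    # Hypothetical frame of 64 symbols: cutting pilots from 12 to 8 is a 1/3 reduction.
    print(pilot_overhead(12, 64), pilot_overhead(8, 64))
```

With noise present, averaging over more pilots improves the estimate but raises the overhead, which is the trade-off the abstract describes.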
CR Nov 5, 2021
Deep Joint Source-Channel Coding for Image Transmission with Visual Protection
Jialong Xu, Bo Ai, Wei Chen et al.
Joint source-channel coding (JSCC) has achieved great success due to the introduction of deep learning (DL). Compared to traditional separate source-channel coding (SSCC) schemes, the advantages of DL-based JSCC (DJSCC) include high spectrum efficiency, high reconstruction quality, and mitigation of the "cliff effect". However, unlike traditional SSCC schemes, DJSCC is difficult to couple with existing secure communication mechanisms (e.g., encryption-decryption mechanisms), which hinders the practical usage of this emerging technology. To this end, our paper proposes a novel method called DL-based joint protection and source-channel coding (DJPSCC) for images that can successfully protect the visual content of the plain image without significantly sacrificing image reconstruction performance. The idea of the design is to use a neural network to conduct visual protection, converting the plain image to a visually protected one while taking into account its interaction with DJSCC. During the training stage, the proposed DJPSCC method learns: 1) deep neural networks for image protection and image deprotection, and 2) an effective DJSCC network for image transmission in the protected domain. Compared to existing source protection methods applied with DJSCC transmission, the DJPSCC method achieves much better reconstruction performance.