Shiqi Xu

CL
h-index: 3
13 papers
1,684 citations
Novelty: 46%
AI Score: 51

13 Papers

IV · Mar 14, 2023
Digital staining in optical microscopy using deep learning -- a review

Lucas Kreiss, Shaowei Jiang, Xiang Li et al. · pku

Until recently, conventional biochemical staining had the undisputed status as a well-established benchmark for most biomedical problems related to clinical diagnostics, fundamental research and biotechnology. Despite this role as the gold standard, staining protocols face several challenges, such as a need for extensive, manual processing of samples, substantial time delays, altered tissue homeostasis, limited choice of contrast agents for a given sample, 2D imaging instead of 3D tomography and many more. Label-free optical technologies, on the other hand, do not rely on exogenous and artificial markers; they exploit intrinsic optical contrast mechanisms, where the specificity is typically less obvious to the human observer. Over the past few years, digital staining has emerged as a promising concept to use modern deep learning for the translation from optical contrast to established biochemical contrast of actual stainings. In this review article, we provide an in-depth analysis of the current state-of-the-art in this field, suggest methods of good practice, identify pitfalls and challenges and postulate promising advances towards potential future implementations and applications.

IV · Apr 4, 2022
Transient motion classification through turbid volumes via parallelized single-photon detection and deep contrastive embedding

Shiqi Xu, Wenhui Liu, Xi Yang et al.

Fast noninvasive probing of spatially varying decorrelating events, such as cerebral blood flow beneath the human skull, is an essential task in various scientific and clinical settings. One of the primary optical techniques used is diffuse correlation spectroscopy (DCS), whose classical implementation uses one or a few single-photon detectors, resulting in poor spatial localization accuracy and relatively low temporal resolution. Here, we propose a technique termed Classifying Rapid decorrelation Events via Parallelized single photon dEtection (CREPE), a new form of DCS that can probe and classify different decorrelating movements hidden underneath a turbid volume with high sensitivity using parallelized speckle detection from a $32\times32$ pixel SPAD array. We evaluate our setup by classifying different spatiotemporal-decorrelating patterns hidden beneath a 5 mm tissue-like phantom made with rapidly decorrelating dynamic scattering media. Twelve multi-mode fibers are used to collect scattered light from different positions on the surface of the tissue phantom. To validate our setup, we generate perturbed decorrelation patterns using both a digital micromirror device (DMD) modulated at multi-kilohertz rates and a vessel phantom containing flowing fluid. Along with a deep contrastive learning algorithm that outperforms classic unsupervised learning methods, we demonstrate that our approach can accurately detect and classify different transient decorrelation events (happening in 0.1-0.4 s) underneath turbid scattering media, without any data labeling. This has the potential to be applied to noninvasively monitor deep tissue motion patterns, for example identifying normal or abnormal cerebral blood flow events, at multi-Hertz rates within a compact and static detection probe.

OPTICS · Apr 25, 2022
Tensorial tomographic differential phase-contrast microscopy

Shiqi Xu, Xiang Dai, Xi Yang et al.

We report Tensorial Tomographic Differential Phase-Contrast microscopy (T2DPC), a quantitative label-free tomographic imaging method for simultaneous measurement of phase and anisotropy. T2DPC extends differential phase-contrast microscopy, a quantitative phase imaging technique, to highlight the vectorial nature of light. The method solves for the permittivity tensor of anisotropic samples from intensity measurements acquired with a standard microscope equipped with an LED matrix, a circular polarizer, and a polarization-sensitive camera. We demonstrate accurate volumetric reconstructions of refractive index, birefringence, and orientation for various validation samples, and show that the reconstructed polarization structures of a biological specimen are predictive of pathology.

CV · Apr 30 · Code
ClimateVID -- Social Media Videos Analysis and Challenges Involved

Shiqi Xu, Moritz Burmester, Katharina Prasse et al.

The pervasive growth of digital content, specifically short videos on social media platforms, has significantly altered how topics are discussed and understood in public discourse. In this work, we advance automated visual theme detection by assessing zero-shot and clustering capabilities on social media data. (1) We evaluated the capabilities of notable VLMs such as VideoChatGPT, PandaGPT, and VideoLLava using zero-shot image classification and compared their performance to the baseline provided by frame-wise CLIP image classification. (2) By treating clustering as a minimum cost multicut problem, we aim to uncover insightful patterns in an unsupervised manner. For both analysis strategies, we provide extensive evaluations and practical guidance to practitioners. While VLMs are currently not able to detect climate-change-specific classes, the clustering yields distinct visual frames. We find that both ConvNeXt V2 and DINOv2 produce meaningful clusters, with DINOv2 focusing more on style differences and abstract categories, while ConvNeXt V2 clusters differ in more fine-grained ways. Code available at https://github.com/KathPra/ClimateVID.git.

AI · May 7, 2024 · Code
Enhancing the Efficiency and Accuracy of Underlying Asset Reviews in Structured Finance: The Application of Multi-agent Framework

Xiangpeng Wan, Haicheng Deng, Kai Zou et al.

Structured finance, which involves restructuring diverse assets into securities like MBS, ABS, and CDOs, enhances capital market efficiency but presents significant due diligence challenges. This study explores the integration of artificial intelligence (AI) with traditional asset review processes to improve efficiency and accuracy in structured finance. Using both open-source and closed-source large language models (LLMs), we demonstrate that AI can effectively automate the verification of information between loan applications and bank statements. While closed-source models such as GPT-4 show superior performance, open-source models like LLAMA3 offer a cost-effective alternative. Dual-agent systems further increase accuracy, though this comes with higher operational costs. This research highlights AI's potential to minimize manual errors and streamline due diligence, suggesting a broader application of AI in financial document analysis and risk management.

CL · Dec 5, 2023
Inherent limitations of LLMs regarding spatial information

He Yan, Xinyao Hu, Xiangpeng Wan et al.

Despite the significant advancements in natural language processing capabilities demonstrated by large language models such as ChatGPT, their proficiency in comprehending and processing spatial information, especially within the domains of 2D and 3D route planning, remains notably underdeveloped. This paper investigates the inherent limitations of ChatGPT and similar models in spatial reasoning and navigation-related tasks, an area critical for applications ranging from autonomous vehicle guidance to assistive technologies for the visually impaired. In this paper, we introduce a novel evaluation framework complemented by a baseline dataset, meticulously crafted for this study. This dataset is structured around three key tasks: plotting spatial points, planning routes in two-dimensional (2D) spaces, and devising pathways in three-dimensional (3D) environments. We specifically developed this dataset to assess the spatial reasoning abilities of ChatGPT. Our evaluation reveals key insights into the model's capabilities and limitations in spatial understanding.

NI · Apr 9
Real-Time Cross-Layer Semantic Error Correction Using Language Models and Software-Defined Radio

Yuchen Pan, Yuyang Du, Yirun Wang et al.

As Language Models (LMs) advance, Semantic Error Correction (SEC) has emerged as a promising approach for reliable network designs. Yet existing methods prioritize intent over accuracy, falling short of verbatim recovery. Our recent work, Cross-Layer SEC (CL-SEC), addressed this by fusing physical-layer Log-Likelihood Ratios (LLRs) with semantic context, but its real-time feasibility remained unvalidated. This paper demonstrates CL-SEC on a live Software-Defined Radio (SDR) testbed, resolving implementation barriers with: 1) an SDR middleware enabling real-time LLR extraction from FPGA hardware, and 2) a generalized inference interface supporting modern encoder-decoder LMs. Real-world experiments confirm that the cross-layer fusion significantly outperforms either source alone.

CR · Apr 1
LightGuard: Transparent WiFi Security via Physical-Layer LiFi Key Bootstrapping

Shiqi Xu, Yuyang Du, Mingyue Zhang et al.

WiFi is inherently vulnerable to eavesdropping because RF signals may penetrate many physical boundaries, such as walls and floors. LiFi, by contrast, is an optical method confined to line-of-sight and blocked by opaque surfaces. We present LightGuard, a dual-link architecture built on this insight: cryptographic key establishment can be offloaded from WiFi to a physically confined LiFi channel to mitigate the risk of key exposure over RF. LightGuard derives session keys over a LiFi link and installs them on the WiFi interface, ensuring cryptographic material never traverses the open RF medium. A prototype with off-the-shelf WiFi NICs and our LiFi transceiver frontend validates the design.

CV · Oct 10, 2021
Increasing a microscope's effective field of view via overlapped imaging and machine learning

Xing Yao, Vinayak Pathak, Haoran Xi et al.

This work demonstrates a multi-lens microscopic imaging system that overlaps multiple independent fields of view on a single sensor for high-efficiency automated specimen analysis. Automatic detection, classification and counting of various morphological features of interest is now a crucial component of both biomedical research and disease diagnosis. While convolutional neural networks (CNNs) have dramatically improved the accuracy of counting cells and sub-cellular features from acquired digital image data, the overall throughput is still typically hindered by the limited space-bandwidth product (SBP) of conventional microscopes. Here, we show both in simulation and experiment that overlapped imaging and co-designed analysis software can achieve accurate detection of diagnostically-relevant features for several applications, including counting of white blood cells and the malaria parasite, leading to multi-fold increase in detection and processing throughput with minimal reduction in accuracy.

OPTICS · Jul 3, 2021
Imaging dynamics beneath turbid media via parallelized single-photon detection

Shiqi Xu, Xi Yang, Wenhui Liu et al.

Noninvasive optical imaging through dynamic scattering media has numerous important biomedical applications but remains a challenging task. While standard diffuse imaging methods measure optical absorption or fluorescent emission, it is also well-established that the temporal correlation of scattered coherent light diffuses through tissue much like optical intensity. Few works to date, however, have aimed to experimentally measure and process such temporal correlation data to demonstrate deep-tissue video reconstruction of decorrelation dynamics. In this work, we utilize a single-photon avalanche diode (SPAD) array camera to simultaneously monitor the temporal dynamics of speckle fluctuations at the single-photon level from 12 different phantom tissue surface locations delivered via a customized fiber bundle array. We then apply a deep neural network to convert the acquired single-photon measurements into video of scattering dynamics beneath rapidly decorrelating tissue phantoms. We demonstrate the ability to reconstruct images of transient (0.1-0.4 s) dynamic events occurring up to 8 mm beneath a decorrelating tissue phantom with millimeter-scale resolution, and highlight how our model can flexibly extend to monitor flow speed within buried phantom vessels.
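The "temporal correlation" quantity in this abstract is commonly summarized in DCS as the normalized intensity autocorrelation g2(τ) = ⟨I(t)·I(t+τ)⟩ / ⟨I⟩². The abstract does not say how the authors compute or preprocess it, so the following is only a minimal sketch under assumptions: the function name `g2_curve`, the `(pixels, time)` array layout, and the per-pixel averaging are illustrative choices, not the paper's pipeline.

```python
import numpy as np

def g2_curve(counts, max_lag):
    """Pixel-averaged normalized intensity autocorrelation for lags 1..max_lag.

    counts: array of shape (pixels, time) holding photon counts per time bin
            (hypothetical layout; every pixel must have a nonzero mean).
    """
    counts = np.asarray(counts, dtype=float)
    mean_sq = counts.mean(axis=1) ** 2              # <I>^2 for each pixel
    curves = []
    for lag in range(1, max_lag + 1):
        # <I(t) I(t + lag)> for each pixel, normalized by that pixel's <I>^2
        product = counts[:, :-lag] * counts[:, lag:]
        curves.append(product.mean(axis=1) / mean_sq)
    # Average the per-pixel curves: more pixels means a less noisy estimate
    return np.mean(np.array(curves), axis=1)
```

Averaging over many pixels is what lets a parallelized detector estimate the curve from a short time window; a faster decay of the curve toward 1 indicates faster decorrelation.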

CL · Mar 12, 2021
Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning

Jason Wei, Chengyu Huang, Soroush Vosoughi et al.

Few-shot text classification is a fundamental NLP task in which a model aims to classify text into a large number of categories, given only a few training examples per category. This paper explores data augmentation -- a technique particularly suitable for training with limited data -- for this few-shot, highly-multiclass text classification setting. On four diverse text classification tasks, we find that common data augmentation techniques can improve the performance of triplet networks by up to 3.0% on average. To further boost performance, we present a simple training strategy called curriculum data augmentation, which leverages curriculum learning by first training on only original examples and then introducing augmented data as training progresses. We explore a two-stage and a gradual schedule, and find that, compared with standard single-stage training, curriculum data augmentation trains faster, improves performance, and remains robust to high amounts of noising from augmentation.
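The two-stage schedule described above (original examples only, then original plus augmented) can be sketched as a batch generator. This is an illustration under assumptions, not the authors' code: `two_stage_batches`, `switch_step`, and the toy `drop_random_word` augmentation are hypothetical names and choices, and the triplet-network training itself is omitted.

```python
import random

def drop_random_word(text, rng):
    """Toy augmentation (an assumption): delete one randomly chosen word."""
    words = text.split()
    if len(words) <= 1:
        return text
    del words[rng.randrange(len(words))]
    return " ".join(words)

def two_stage_batches(originals, augment, num_steps, switch_step,
                      batch_size=4, seed=0):
    """Yield (step, batch) pairs following a two-stage curriculum.

    Before switch_step, batches are drawn from the original examples only.
    From switch_step onward, freshly augmented copies join the sampling pool.
    """
    rng = random.Random(seed)
    for step in range(num_steps):
        pool = list(originals)
        if step >= switch_step:
            pool += [augment(text, rng) for text in originals]
        yield step, rng.sample(pool, min(batch_size, len(pool)))
```

A gradual schedule could be approximated in the same structure by growing the share of augmented examples in `pool` with `step` instead of switching once.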

CL · Jan 14, 2021
Text Augmentation in a Multi-Task View

Jason Wei, Chengyu Huang, Shiqi Xu et al.

Traditional data augmentation aims to increase the coverage of the input distribution by generating augmented examples that strongly resemble original samples in an online fashion where augmented examples dominate training. In this paper, we propose an alternative perspective -- a multi-task view (MTV) of data augmentation -- in which the primary task trains on original examples and the auxiliary task trains on augmented examples. In MTV data augmentation, both original and augmented samples are weighted substantively during training, relaxing the constraint that augmented examples must resemble original data and thereby allowing us to apply stronger levels of augmentation. In empirical experiments using four common data augmentation techniques on three benchmark text classification datasets, we find that the MTV leads to higher and more robust performance improvements than traditional augmentation.

CV · Oct 31, 2018
Regularized Fourier Ptychography using an Online Plug-and-Play Algorithm

Yu Sun, Shiqi Xu, Yunzhe Li et al.

The plug-and-play priors (PnP) framework has recently been shown to achieve state-of-the-art results in regularized image reconstruction by leveraging a sophisticated denoiser within an iterative algorithm. In this paper, we propose a new online PnP algorithm for Fourier ptychographic microscopy (FPM) based on the fast iterative shrinkage/thresholding algorithm (FISTA). Specifically, the proposed algorithm uses only a subset of measurements at each iteration, which makes it scalable to a large set of measurements. We validate the algorithm by showing that it can lead to significant performance gains on both simulated and experimental data.