Renjie Zhou

CL
h-index3
5papers
189citations
Novelty41%
AI Score31

5 Papers

OPTICSAug 2, 2023Code
On the use of deep learning for phase recovery

Kaiqiang Wang, Li Song, Chutian Wang et al.

Phase recovery (PR) refers to calculating the phase of the light field from its intensity measurements. As exemplified from quantitative phase imaging and coherent diffraction imaging to adaptive optics, PR is essential for reconstructing the refractive index distribution or topography of an object and correcting the aberration of an imaging system. In recent years, deep learning (DL), often implemented through deep neural networks, has provided unprecedented support for computational imaging, leading to more efficient solutions for various PR problems. In this review, we first briefly introduce conventional methods for PR. Then, we review how DL provides support for PR from the following three stages, namely, pre-processing, in-processing, and post-processing. We also review how DL is used in phase image processing. Finally, we summarize the work in DL for PR and outlook on how to better use DL to improve the reliability and efficiency in PR. Furthermore, we present a live-updating resource (https://github.com/kqwang/phase-recovery) for readers to learn more about PR.

CLMar 14, 2022
WCL-BBCD: A Contrastive Learning and Knowledge Graph Approach to Named Entity Recognition

Renjie Zhou, Qiang Hu, Jian Wan et al.

Named Entity Recognition task is one of the core tasks of information extraction. Word ambiguity and word abbreviation are important reasons for the low recognition rate of named entities. In this paper, we propose a novel named entity recognition model WCL-BBCD (Word Contrastive Learning with BERT-BiLSTM-CRF-DBpedia), which incorporates the idea of contrastive learning. The model first trains the sentence pairs in the text, calculate similarity between sentence pairs, and fine-tunes BERT used for the named entity recognition task according to the similarity, so as to alleviate word ambiguity. Then, the fine-tuned BERT is combined with BiLSTM-CRF to perform the named entity recognition task. Finally, the recognition results are corrected in combination with prior knowledge such as knowledge graphs, so as to alleviate the low-recognition-rate problem caused by word abbreviations. The results of experimentals conducted on the CoNLL-2003 English dataset and OntoNotes V5 English dataset show that our model outperforms other similar models on.

OPTICSJul 22, 2023
High-performance real-world optical computing trained by in situ gradient-based model-free optimization

Guangyuan Zhao, Xin Shu, Renjie Zhou

Optical computing systems provide high-speed and low-energy data processing but face deficiencies in computationally demanding training and simulation-to-reality gaps. We propose a gradient-based model-free optimization (G-MFO) method based on a Monte Carlo gradient estimation algorithm for computationally efficient in situ training of optical computing systems. This approach treats an optical computing system as a black box and back-propagates the loss directly to the optical computing weights' probability distributions, circumventing the need for a computationally heavy and biased system simulation. Our experiments on diffractive optical computing systems show that G-MFO outperforms hybrid training on the MNIST and FMNIST datasets. Furthermore, we demonstrate image-free and high-speed classification of cells from their marker-free phase maps. Our method's model-free and high-performance nature, combined with its low demand for computational resources, paves the way for accelerating the transition of optical computing from laboratory demonstrations to practical, real-world applications.

IVOct 25, 2022
Neural Architecture Search generated Phase Retrieval Net for Real-time Off-axis Quantitative Phase Imaging

Xin Shu, Mengxuan Niu, Yi Zhang et al.

In off-axis Quantitative Phase Imaging (QPI), artificial neural networks have been recently applied for phase retrieval with aberration compensation and phase unwrapping. However, the involved neural network architectures are largely unoptimized and inefficient with low inference speed, which hinders the realization of real-time imaging. Here, we propose a Neural Architecture Search (NAS) generated Phase Retrieval Net (NAS-PRNet) for accurate and fast phase retrieval. NAS-PRNet is an encoder-decoder style neural network, automatically found from a large neural network architecture search space through NAS. By modifying the differentiable NAS scheme from SparseMask, we learn the optimized skip connections through gradient descent. Specifically, we implement MobileNet-v2 as the encoder and define a synthesized loss that incorporates phase reconstruction loss and network sparsity loss. NAS-PRNet has achieved high-fidelity phase retrieval by achieving a peak Signal-to-Noise Ratio (PSNR) of 36.7 dB and a Structural SIMilarity (SSIM) of 86.6% as tested on interferograms of biological cells. Notably, NAS-PRNet achieves phase retrieval in only 31 ms, representing 15x speedup over the most recent Mamba-UNet with only a slightly lower phase retrieval accuracy.

CLMay 20, 2025
Enhanced Multimodal Aspect-Based Sentiment Analysis by LLM-Generated Rationales

Jun Cao, Jiyi Li, Ziwei Yang et al.

There has been growing interest in Multimodal Aspect-Based Sentiment Analysis (MABSA) in recent years. Existing methods predominantly rely on pre-trained small language models (SLMs) to collect information related to aspects and sentiments from both image and text, with an aim to align these two modalities. However, small SLMs possess limited capacity and knowledge, often resulting in inaccurate identification of meaning, aspects, sentiments, and their interconnections in textual and visual data. On the other hand, Large language models (LLMs) have shown exceptional capabilities in various tasks by effectively exploring fine-grained information in multimodal data. However, some studies indicate that LLMs still fall short compared to fine-tuned small models in the field of ABSA. Based on these findings, we propose a novel framework, termed LRSA, which combines the decision-making capabilities of SLMs with additional information provided by LLMs for MABSA. Specifically, we inject explanations generated by LLMs as rationales into SLMs and employ a dual cross-attention mechanism for enhancing feature interaction and fusion, thereby augmenting the SLMs' ability to identify aspects and sentiments. We evaluated our method using two baseline models, numerous experiments highlight the superiority of our approach on three widely-used benchmarks, indicating its generalizability and applicability to most pre-trained models for MABSA.