QUANT-PH May 5
Two-stage Quantum Estimation and the Asymptotics of Quantum-enhanced Transmittance Sensing
Zihao Gong, Boulat A. Bash
We consider estimation of a single unknown parameter embedded in a quantum state. The quantum Cramér-Rao bound (QCRB) is the ultimate lower limit on the mean squared error of any unbiased estimator. While it can be achieved asymptotically for a large number of quantum state copies, the required measurement often depends on the true value of the parameter of interest. Prior work addresses this paradox using a two-stage approach. In the first stage, a preliminary estimate is obtained by applying, on a vanishing fraction of the quantum state copies, a sub-optimal measurement that does not depend on the parameter of interest. In the second stage, the preliminary estimate is used to construct the QCRB-achieving measurement, which is applied to the remaining quantum state copies. This is akin to two-step estimators for classical problems with nuisance parameters. Unfortunately, the original analysis imposes conditions that severely restrict the class of classical estimators that can be applied to the quantum measurement outcomes, hindering applications of this method. We relax these conditions to substantially broaden the class of usable estimators for single-parameter problems, at the cost of slightly weakening the asymptotic properties of the two-stage method. We also account for nuisance parameters. We apply our results to obtain the asymptotics of quantum-enhanced transmittance sensing.
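For orientation, the standard textbook form of the bound mentioned in this abstract can be written as follows. The symbols are assumptions introduced here for illustration and are not taken from the paper: $\rho_\theta$ is the parameter-dependent state, $n$ the number of copies, $\hat{\theta}_n$ an unbiased estimator, $L_\theta$ the symmetric logarithmic derivative, and $J_Q(\theta)$ the quantum Fisher information.

$$
\mathrm{Var}\big(\hat{\theta}_n\big) \ge \frac{1}{n\, J_Q(\theta)}, \qquad
J_Q(\theta) = \mathrm{Tr}\big[\rho_\theta L_\theta^2\big], \qquad
\frac{\partial \rho_\theta}{\partial \theta} = \frac{1}{2}\big(L_\theta \rho_\theta + \rho_\theta L_\theta\big).
$$

A measurement in the eigenbasis of $L_\theta$ attains the bound in the single-parameter case, and because $L_\theta$ generally depends on $\theta$, so does that measurement. This is the dependence on the true parameter value that the two-stage approach is designed to work around.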
QUANT-PH Apr 28
Quantum-enhanced Network Tomography
Yufei Zheng, Zihao Gong, Saikat Guha et al.
Network tomography refers to the use of inference techniques to determine internal network states from end-to-end probes. Quantum probes, implemented by sending blocks of $n$ coherent-state pulses augmented with continuous-variable (CV) squeezing ($n=1$) or weak temporal-mode entanglement ($n>1$) over a lossy channel to a receiver with homodyne detection capabilities, are known to carry information about the channel transmissivity. Assuming that a subset of nodes in an optical network can send and receive such probes through intermediate nodes with all-optical switching capabilities, we leverage these quantum probes to estimate link transmissivities. To determine how to route the probes in a network, we propose a probe construction algorithm that guarantees link identifiability while maximizing the number of information-orthogonal sets of transmissivities. A set of probes induces a Fisher information matrix (FIM). We then derive two metrics, the determinant of the FIM and the trace of its inverse, to evaluate the performance of the probes. In particular, our results can be used to characterize the quantum improvement in estimating link transmissivities in a general optical network.
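The two FIM-based metrics named in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes an additive path model in which each probe measures the sum of per-link log-transmissivities, so that the FIM is $R^\top \mathrm{diag}(w) R$ for a routing matrix $R$. The routing matrix, the per-probe information values, and the function names are all hypothetical.

```python
import numpy as np

def fisher_information(routing, per_probe_info):
    """FIM over link parameters under an assumed additive path model.

    routing[p, l] is 1 if probe p traverses link l, else 0.
    per_probe_info[p] is the (assumed) scalar Fisher information that
    probe p carries about its end-to-end path parameter.
    """
    R = np.asarray(routing, dtype=float)
    w = np.asarray(per_probe_info, dtype=float)
    return R.T @ (w[:, None] * R)

def design_metrics(fim):
    """Return (det J, trace of J^{-1}); raise if links are not identifiable."""
    if np.linalg.matrix_rank(fim) < fim.shape[0]:
        raise ValueError("FIM is singular: some links are not identifiable")
    return np.linalg.det(fim), np.trace(np.linalg.inv(fim))

# Hypothetical 3-link network probed along 3 paths (rows: probes, columns: links).
routing = [[1, 1, 0],
           [0, 1, 1],
           [1, 0, 1]]
per_probe_info = [2.0, 2.0, 2.0]  # illustrative values, not from the paper

fim = fisher_information(routing, per_probe_info)
det_j, trace_inv_j = design_metrics(fim)
print(det_j, trace_inv_j)
```

A larger determinant shrinks the volume of the joint confidence region, while a smaller trace of the inverse lowers the sum of the per-link Cramér-Rao bounds. A rank-deficient FIM signals that the chosen probes cannot separate some links.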
CV Apr 25, 2024
Motor Focus: Fast Ego-Motion Prediction for Assistive Visual Navigation
Hao Wang, Jiayou Qin, Xiwen Chen et al.
Assistive visual navigation systems for visually impaired individuals have become increasingly popular thanks to the rise of mobile computing. Most of these devices work by translating visual information into voice commands. In complex scenarios where multiple objects are present, it is imperative to prioritize object detection and provide immediate notifications for key entities in specific directions. This creates the need to identify the observer's motion direction (ego-motion) by processing visual information alone, which is the key contribution of this paper. Specifically, we introduce Motor Focus, a lightweight image-based framework that predicts ego-motion, that is, the movement intentions of humans (and humanoid machines), from their visual feeds, while filtering out camera motion without any camera calibration. To this end, we implement an optical flow-based pixel-wise temporal analysis method to compensate for the camera motion, with a Gaussian aggregation to smooth out the movement prediction area. To evaluate the performance, we collect a dataset of 50 clips of pedestrian scenes in 5 different scenarios. We test this framework with classical feature detectors such as SIFT and ORB for comparison. Our framework demonstrates its superiority in speed (> 40 FPS), accuracy (MAE = 60 pixels), and robustness (SNR = 23 dB), confirming its potential to enhance the usability of vision-based assistive navigation tools in complex environments.
CV Mar 19, 2024
VisionGPT: LLM-Assisted Real-Time Anomaly Detection for Safe Visual Navigation
Hao Wang, Jiayou Qin, Ashish Bastola et al.
This paper explores the potential of Large Language Models (LLMs) in zero-shot anomaly detection for safe visual navigation. With the assistance of the state-of-the-art real-time open-world object detection model Yolo-World and specialized prompts, the proposed framework can identify anomalies within camera-captured frames that include any possible obstacles, then generate concise, audio-delivered descriptions emphasizing abnormalities, assisting in safe visual navigation in complex circumstances. Moreover, the proposed framework leverages the advantages of LLMs and the open-vocabulary object detection model to achieve dynamic scenario switching, which allows users to transition smoothly from scene to scene and addresses a limitation of traditional visual navigation. Furthermore, this paper explores the performance contribution of different prompt components, provides a vision for future improvement in visual accessibility, and paves the way for LLMs in video anomaly detection and vision-language understanding.