OPTICS May 31
Breaking the Cascade: Compact Nonlinear Optical Computing with Single-Layer Encoder-Decoder Co-Localization
Yuntian Wang, Alexander Chen, Md Sadman Sakib Rahman et al.
We demonstrate that nonlinear computing can be achieved with a single linear diffractive surface under coherent illumination. We introduce a compact encoder-decoder co-localization (E+D) architecture in which an input-dependent dynamic encoder and a static optimized decoder are integrated within the same phase-only diffractive plane. Following free-space propagation, coherent interference between the encoder and decoder fields, combined with intensity detection, generates programmable nonlinear input-output mappings without requiring nonlinear optical materials or multiple diffractive layers. We prove that the proposed E+D optical processor is a universal approximator for arbitrary real-valued band-limited nonlinear functions and identify the physical factors governing its approximation fidelity, including the decoder degrees-of-freedom, detector aperture, and axial propagation distance. Crucially, we demonstrate that introducing a trained, frozen phase bias to the encoder region systematically enhances functional expressivity, providing robustness against coarse phase quantization on spatial light modulators. Using this framework, we accurately synthesize diverse nonlinear functions, including commonly used neural network activation functions and complex-valued nonlinear functions. Finally, we experimentally validate the proposed approach using a visible-light optical set-up trained through in situ learning, demonstrating the parallel approximation of 9 nonlinear functions in a single optical forward pass. By collapsing nonlinear optical computation into a single diffractive surface, the E+D architecture substantially reduces hardware and alignment complexity while preserving powerful function-approximation capabilities, providing a compact and scalable framework for analog information processing.
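The mechanism described in this abstract, a phase-encoded input interfering with a static decoder field before intensity detection, can be illustrated with a toy two-beam model. This is a minimal sketch and not the authors' model: the function name `detected_intensity`, the decoder amplitude and phase values, and the reduction of free-space propagation to a single complex sum are all assumptions made for illustration.

```python
import numpy as np

def detected_intensity(x, decoder_amp=1.0, decoder_phase=0.0):
    """Toy model (assumed): input x sets the encoder phase, the decoder field is fixed."""
    encoder_field = np.exp(1j * np.asarray(x, dtype=float))   # phase-only encoding of x
    decoder_field = decoder_amp * np.exp(1j * decoder_phase)  # static field, hypothetical values
    total_field = encoder_field + decoder_field               # coherent superposition (linear in the fields)
    return np.abs(total_field) ** 2                           # intensity detection (nonlinear in x)

x = np.linspace(0.0, np.pi, 5)
intensity = detected_intensity(x)
```

With unit amplitudes the detected intensity equals 2 + 2 cos(x), so the output is a nonlinear function of the input even though every optical element acts linearly on the field.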
OPTICS Aug 10, 2024
Unidirectional imaging with partially coherent light
Guangdong Ma, Che-Yung Shen, Jingxi Li et al.
Unidirectional imagers form images of input objects only in one direction, e.g., from field-of-view (FOV) A to FOV B, while blocking the image formation in the reverse direction, from FOV B to FOV A. Here, we report unidirectional imaging under spatially partially coherent light and demonstrate high-quality imaging only in the forward direction (A->B) with high power efficiency, while the image formation in the backward direction (B->A) is distorted and has low power efficiency. Our reciprocal design features a set of spatially engineered linear diffractive layers that are statistically optimized for partially coherent illumination with a given phase correlation length. Our analyses reveal that when illuminated by a partially coherent beam with a correlation length of ~1.5 w or larger, where w is the wavelength of light, diffractive unidirectional imagers achieve robust performance, exhibiting the desired asymmetric imaging performance between the forward and backward directions. A partially coherent unidirectional imager designed with a smaller correlation length of less than 1.5 w still supports unidirectional image transmission, but with a reduced figure of merit. These partially coherent diffractive unidirectional imagers are compact (axially spanning less than 75 w), polarization-independent, and compatible with various types of illumination sources, making them well-suited for applications in asymmetric visual information processing and communication.
CV Mar 15, 2024 Code
RCooper: A Real-world Large-scale Dataset for Roadside Cooperative Perception
Ruiyang Hao, Siqi Fan, Yingru Dai et al.
The value of roadside perception, which could extend the boundaries of autonomous driving and traffic management, has become increasingly prominent and acknowledged in recent years. However, existing roadside perception approaches focus only on single-infrastructure sensor systems, which cannot achieve a comprehensive understanding of a traffic area because of their limited sensing range and blind spots. For high-quality roadside perception, we need Roadside Cooperative Perception (RCooper) to achieve practical area-coverage roadside perception for restricted traffic areas. RCooper has its own domain-specific challenges, but further exploration is hindered by the lack of datasets. We hence release the first real-world, large-scale RCooper dataset to foster research on practical roadside cooperative perception, including detection and tracking. The manually annotated dataset comprises 50k images and 30k point clouds, including two representative traffic scenes (i.e., intersection and corridor). The constructed benchmarks demonstrate the effectiveness of roadside cooperative perception and indicate directions for further research. The code and dataset can be accessed at: https://github.com/AIR-THU/DAIR-RCooper.
OPTICS Mar 23
Compressive single-pixel imaging via a wavelength-multiplexed spatially incoherent diffractive optical processor
Xiao Wang, Yiyang Wu, Yuntian Wang et al.
Despite offering high sensitivity, a high signal-to-noise ratio, and a broad spectral range, single-pixel imaging (SPI) is limited by low measurement efficiency and long data-acquisition times. To address this, we propose a wavelength-multiplexed, spatially incoherent diffractive optical processor combined with a compact/shallow digital artificial neural network (ANN) to implement compressive SPI. Specifically, we model the bucket detection process in conventional SPI as a linear intensity transformation with spatially and spectrally varying point-spread functions. This transformation matrix is treated as a learnable parameter and jointly optimized with a shallow digital ANN composed of 2 hidden nonlinear layers. The wavelength-multiplexed diffractive processor is then configured via data-free optimization to approximate this pre-trained transformation matrix; after this optimization, the diffractive processor remains static/fixed. Upon multi-wavelength illumination and diffractive modulation, the target spatial information of the input object is spectrally encoded. A single-pixel detector captures the output spectral power at each illumination band, which is then rapidly decoded by the jointly trained digital ANN to reconstruct the input image. In addition to our numerical analyses demonstrating the feasibility of this approach, we experimentally validated its proof-of-concept using an array of light-emitting diodes (LEDs). Overall, this work demonstrates a computational imaging framework for compressive SPI that can be useful in applications such as biomedical imaging, autonomous devices, and remote sensing.
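The measurement model in this abstract (bucket detection as a linear intensity transformation, followed by a shallow decoder with 2 hidden nonlinear layers) can be sketched numerically. This is a minimal illustration under stated assumptions, not the authors' implementation: the image size, the number of wavelength bands, the hidden width, the ReLU activation, and the names `measure` and `decode` are all hypothetical, and the weights are random rather than jointly trained.

```python
import numpy as np

rng = np.random.default_rng(0)

n_pixels = 28 * 28     # assumed input image size (hypothetical)
n_wavelengths = 49     # assumed number of illumination bands (hypothetical)
hidden = 128           # assumed hidden-layer width (hypothetical)

# Non-negative matrix: row k holds the intensity weights that map the object
# to the single-pixel detector reading at wavelength k.
A = rng.random((n_wavelengths, n_pixels))

def measure(image, A):
    """Bucket detection as a linear intensity transform: one scalar per wavelength."""
    return A @ image.ravel()

def relu(z):
    return np.maximum(z, 0.0)

def decode(y, params):
    """Shallow decoder with 2 hidden nonlinear layers (random, untrained weights here)."""
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = relu(W1 @ y + b1)
    h2 = relu(W2 @ h1 + b2)
    return (W3 @ h2 + b3).reshape(28, 28)

params = [
    (rng.normal(0.0, 0.05, (hidden, n_wavelengths)), np.zeros(hidden)),
    (rng.normal(0.0, 0.05, (hidden, hidden)), np.zeros(hidden)),
    (rng.normal(0.0, 0.05, (n_pixels, hidden)), np.zeros(n_pixels)),
]

image = rng.random((28, 28))
y = measure(image, A)               # 49 spectral power readings for 784 pixels
reconstruction = decode(y, params)  # meaningful only after joint training of A and params
```

In the workflow the abstract describes, `A` and the decoder weights would be optimized together, and the diffractive processor would then be configured to approximate the learned `A`; neither step is shown here.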
OPTICS Mar 31
Large-scale nonlinear optical computing with incoherent light via linear diffractive systems
Alexander Chen, Yuntian Wang, Md Sadman Sakib Rahman et al.
Nonlinear computation is essential for various information processing tasks. Optical implementations are attractive because passive light propagation can manipulate high-dimensional signals with extreme throughput and parallelism; yet realizing nonlinear mappings in optical hardware remains challenging due to the weak nonlinearity of optical materials and the large intensities required to induce nonlinear interactions. This challenge is further amplified in many systems that operate with incoherent illumination, motivating a coherence-aware framework for scalable optical nonlinear processing. Here, we show that linear optical systems, in particular, optimized diffractive processors comprising passive surfaces, can perform large-scale nonlinear function approximation under spatially incoherent or partially coherent illumination, when preceded by intensity-only input encoding. We quantify how the accuracy of the nonlinear function approximation varies with the degree of parallelism, the number of diffractive layers, and the number of trainable diffractive features. Numerical results demonstrate snapshot computation of up to one million distinct nonlinear functions in a single forward pass through a diffractive processor, with the function outputs spatially multiplexed and read out using densely packed detectors at the output. We further provide a proof-of-concept experimental demonstration under incoherent illumination from a liquid crystal display (LCD), enabled by a model-free in situ learning strategy that jointly optimizes the diffractive profile and detector readout geometry in the presence of hardware imperfections and misalignments. Our findings establish diffractive processors as a massively parallel universal function approximator for both spatially incoherent and partially coherent illumination.
OPTICS Jan 15, 2024
Information hiding cameras: optical concealment of object information into ordinary images
Bijie Bai, Ryan Lee, Yuhang Li et al.
Data protection methods like cryptography, despite being effective, inadvertently signal the presence of secret communication, thereby drawing undue attention. Here, we introduce an optical information hiding camera integrated with an electronic decoder, optimized jointly through deep learning. This information hiding-decoding system employs a diffractive optical processor as its front-end, which transforms and hides input images in the form of ordinary-looking patterns that deceive/mislead human observers. This information hiding transformation is valid for infinitely many combinations of secret messages, all of which are transformed into ordinary-looking output patterns, achieved all-optically through passive light-matter interactions within the optical processor. By processing these ordinary-looking output images, a jointly-trained electronic decoder neural network accurately reconstructs the original information hidden within the deceptive output pattern. We numerically demonstrated our approach by designing an information hiding diffractive camera along with a jointly-optimized convolutional decoder neural network. The efficacy of this system was demonstrated under various lighting conditions and noise levels, showing its robustness. We further extended this information hiding camera to multi-spectral operation, allowing the concealment and decoding of multiple images at different wavelengths, all performed simultaneously in a single feed-forward operation. The feasibility of our framework was also demonstrated experimentally using THz radiation. This optical encoder-electronic decoder-based co-design provides a novel information hiding camera interface that is both high-speed and energy-efficient, offering an intriguing solution for visual information security.
CV Oct 18, 2025
Universal and Transferable Attacks on Pathology Foundation Models
Yuntian Wang, Xilin Yang, Che-Yung Shen et al.
We introduce Universal and Transferable Adversarial Perturbations (UTAP) for pathology foundation models that reveal critical vulnerabilities in their capabilities. Optimized using deep learning, UTAP comprises a fixed and weak noise pattern that, when added to a pathology image, systematically disrupts the feature representation capabilities of multiple pathology foundation models. As a result, UTAP induces performance drops in downstream tasks that utilize foundation models, including misclassification across a wide range of unseen data distributions. In addition to compromising the model performance, we demonstrate two key features of UTAP: (1) universality: its perturbation can be applied across diverse fields of view independent of the dataset that UTAP was developed on, and (2) transferability: its perturbation can successfully degrade the performance of various external, black-box pathology foundation models that it has never seen before. These two features indicate that UTAP is not a dedicated attack associated with a specific foundation model or image dataset, but rather constitutes a broad threat to various emerging pathology foundation models and their applications. We systematically evaluated UTAP across various state-of-the-art pathology foundation models on multiple datasets, causing a significant drop in their performance with visually imperceptible modifications to the input images using a fixed noise pattern. The development of these potent attacks establishes a critical, high-standard benchmark for model robustness evaluation, highlighting a need for advancing defense mechanisms and potentially providing the necessary assets for adversarial training to ensure the safe and reliable deployment of AI in pathology.
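The "fixed and weak noise pattern" added to every image can be illustrated with a short sketch. It only shows how a single pattern is applied across a batch; it does not show how UTAP is optimized. The function name `apply_universal_perturbation`, the max-norm budget `epsilon`, the image size, and the random pattern standing in for a trained one are all assumptions, not details from the abstract.

```python
import numpy as np

def apply_universal_perturbation(images, delta, epsilon=4.0 / 255.0):
    """Add one fixed pattern to every image, bounded in max-norm (assumed constraint)."""
    bounded = np.clip(delta, -epsilon, epsilon)   # keep the pattern weak
    return np.clip(images + bounded, 0.0, 1.0)    # stay in the valid pixel range

rng = np.random.default_rng(0)
images = rng.random((8, 64, 64, 3))         # batch of images scaled to [0, 1] (synthetic)
delta = rng.normal(0.0, 0.05, (64, 64, 3))  # stand-in for a trained pattern (random here)

# The same delta is broadcast over the whole batch, independent of image content.
perturbed = apply_universal_perturbation(images, delta)
```

Because the pattern does not depend on the input, it can be computed once and reused on any image of matching size, which is the sense of "universal" in this sketch.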
OPTICS Jun 3, 2025
Structural Vibration Monitoring with Diffractive Optical Processors
Yuntian Wang, Zafer Yilmaz, Yuhang Li et al.
Structural Health Monitoring (SHM) is vital for maintaining the safety and longevity of civil infrastructure, yet current solutions remain constrained by cost, power consumption, scalability, and the complexity of data processing. Here, we present a diffractive vibration monitoring system, integrating a jointly optimized diffractive layer with a shallow neural network-based backend to remotely extract 3D structural vibration spectra, offering a low-power, cost-effective and scalable solution. This architecture eliminates the need for dense sensor arrays or extensive data acquisition; instead, it uses a spatially-optimized passive diffractive layer that encodes 3D structural displacements into modulated light, captured by a minimal number of detectors and decoded in real-time by shallow and low-power neural networks to reconstruct the 3D displacement spectra of structures. The diffractive system's efficacy was demonstrated both numerically and experimentally using millimeter-wave illumination on a laboratory-scale building model with a programmable shake table. Our system achieves more than an order-of-magnitude improvement in accuracy over conventional optics or separately trained modules, establishing a foundation for high-throughput 3D monitoring of structures. Beyond SHM, the 3D vibration monitoring capabilities of this cost-effective and data-efficient framework establish a new computational sensing modality with potential applications in disaster resilience, aerospace diagnostics, and autonomous navigation, where energy efficiency, low latency, and high-throughput are critical.