SY May 7
Realization of Precise Perforating Using Dynamic Threshold and Physical Plausibility Algorithm for Self-Locating Perforating in Oil and Gas Wells
Si-Yu Xiao, Guo-Hui Ren, Tian-Hao Mao et al.
Accurate depth measurement is critical for targeting designated perforation intervals to maximize hydrocarbon recovery. While next-generation automated wireless perforating techniques reduce reliance on costly surface infrastructure and personnel, they lack the continuous depth correlation provided by conventional wireline cables. Consequently, correlating real-time casing collar locator (CCL) signals with a pre-recorded casing tally is essential for automatic depth determination. However, implementing this measurement remains challenging: downhole instruments must process CCL signals in real time to pick out collar signatures amid complex interference, a task severely restricted by the limited computational resources and power budget of high-temperature downhole electronics. To address these constraints, this work proposes the Dynamic Threshold and Physical Plausibility Depth Measurement and Perforation Control (DTPPMP) system. This integrated solution enables in situ depth calibration by correlating CCL signals with the casing tally using lightweight algorithms for dynamic-threshold-based collar recognition and physical plausibility verification. Field tests demonstrate a collar recognition F1 score of 98.6% at a throughput of 1000 Sa/s. Notably, the algorithm requires only 1.5 μs per sample, confirming its computational efficiency and suitability for deployment on resource-constrained, high-temperature downhole platforms.
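The abstract names dynamic-threshold collar recognition and physical plausibility verification but does not describe either algorithm, so the following is only a rough sketch of the general idea, not the DTPPMP method. It assumes a threshold that tracks a running estimate of the baseline and its mean absolute deviation, and it assumes that "plausible" means a new collar cannot follow the previous one by fewer than a minimum number of samples. The function name `detect_collars` and the parameters `k`, `alpha`, `min_gap`, and `warmup` are hypothetical.

```python
def detect_collars(samples, k=6.0, alpha=0.01, min_gap=200, warmup=50):
    """Return the onset indices of samples flagged as collar signatures.

    Assumptions (not taken from the paper): the threshold is k times a
    running mean absolute deviation, and a detection is accepted only if
    it lies at least `min_gap` samples after the last accepted collar.
    """
    if not samples:
        return []
    # Initialise the baseline statistics from a quiet warm-up window.
    head = samples[:warmup]
    mean = sum(head) / len(head)
    dev = sum(abs(x - mean) for x in head) / len(head)

    collars = []
    for i in range(warmup, len(samples)):
        excess = abs(samples[i] - mean)
        if excess > k * dev:
            # Plausibility check: collars cannot be closer than min_gap.
            if not collars or i - collars[-1] >= min_gap:
                collars.append(i)
        else:
            # Update the threshold only from non-collar samples.
            mean += alpha * (samples[i] - mean)
            dev += alpha * (excess - dev)
    return collars


# Synthetic trace: bounded pseudo-noise plus four pulses, one of which
# (at index 850) follows the previous pulse too closely to be a collar.
signal = [0.05 * ((i * 37 % 101) / 50.0 - 1.0) for i in range(1600)]
for start in (300, 800, 850, 1300):
    for j in range(5):
        signal[start + j] += 1.0

found = detect_collars(signal)
print(found)
```

On this synthetic trace the pulse at index 850 exceeds the threshold but is discarded by the spacing check. In a real tool the minimum gap would presumably be derived from joint lengths in the casing tally and the tool speed, which this sketch does not model.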
LG Jan 26
From Human Labels to Literature: Semi-Supervised Learning of NMR Chemical Shifts at Scale
Yongqi Jin, Yecheng Wang, Jun-jie Wang et al.
Accurate prediction of nuclear magnetic resonance (NMR) chemical shifts is fundamental to spectral analysis and molecular structure elucidation, yet existing machine learning methods rely on limited, labor-intensive atom-assigned datasets. We propose a semi-supervised framework that learns NMR chemical shifts from millions of literature-extracted spectra without explicit atom-level assignments, integrating a small amount of labeled data with large-scale unassigned spectra. We formulate chemical shift prediction from literature spectra as a permutation-invariant set supervision problem, and show that under commonly satisfied conditions on the loss function, optimal bipartite matching reduces to a sorting-based loss, enabling stable large-scale semi-supervised training beyond traditional curated datasets. Our models achieve substantially improved accuracy and robustness over state-of-the-art methods and exhibit stronger generalization on significantly larger and more diverse molecular datasets. Moreover, by incorporating solvent information at scale, our approach captures systematic solvent effects across common NMR solvents for the first time. Overall, our results demonstrate that large-scale unlabeled spectra mined from the literature can serve as a practical and effective data source for training NMR shift models, suggesting a broader role of literature-derived, weakly structured data in data-centric AI for science.
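The reduction from optimal bipartite matching to sorting can be checked on a toy case. The sketch below assumes a squared-error loss, which is one loss for which the sorted pairing of two sets of scalars is known to be an optimal matching; the exact conditions used in the paper are not stated in the abstract. The function names and the shift values are illustrative, not taken from the paper.

```python
from itertools import permutations


def sorted_set_loss(pred, target):
    """Set loss obtained by pairing the sorted predictions with the sorted targets."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(sorted(pred), sorted(target)))


def brute_force_matching_loss(pred, target):
    """Minimum squared-error loss over every one-to-one assignment (small sets only)."""
    return min(
        sum((p - t) ** 2 for p, t in zip(pred, perm))
        for perm in permutations(target)
    )


# Made-up predicted shifts per atom and an unassigned list of observed peaks.
predicted = [7.26, 1.20, 3.85, 2.10]
observed = [2.05, 7.30, 1.25, 3.90]

fast = sorted_set_loss(predicted, observed)
exact = brute_force_matching_loss(predicted, observed)
print(fast, exact)
```

The sorted version costs O(n log n) per molecule, whereas the brute-force search is factorial in the number of peaks, which is why a sorting-based loss is attractive for training on millions of spectra.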
LG Mar 24
SpecXMaster Technical Report
Yutang Ge, Yaning Cui, Hanzheng Li et al.
Intelligent spectroscopy serves as a pivotal element in AI-driven closed-loop scientific discovery, functioning as the critical bridge between matter structure and artificial intelligence. However, conventional expert-dependent spectral interpretation encounters substantial hurdles, including susceptibility to human bias and error, dependence on limited specialized expertise, and variability across interpreters. To address these challenges, we propose SpecXMaster, an intelligent framework leveraging Agentic Reinforcement Learning (RL) for NMR molecular spectral interpretation. SpecXMaster enables automated extraction of multiplicity information from both 1H and 13C spectra directly from raw FID (free induction decay) data. This end-to-end pipeline enables fully automated interpretation of NMR spectra into chemical structures. It demonstrates superior performance across multiple public NMR interpretation benchmarks and has been refined through iterative evaluations by professional chemical spectroscopists. We believe that SpecXMaster, as a novel methodological paradigm for spectral interpretation, will have a profound impact on the organic chemistry community.
LG May 28, 2025 Code
Weakly-Supervised Contrastive Learning for Imprecise Class Labels
Zi-Hao Zhou, Jun-Jie Wang, Tong Wei et al.
Contrastive learning has achieved remarkable success in learning effective representations, with supervised contrastive learning often outperforming self-supervised approaches. However, in real-world scenarios, data annotations are often ambiguous or inaccurate, meaning that class labels may not reliably indicate whether two examples belong to the same class. This limitation restricts the applicability of supervised contrastive learning. To address this challenge, we introduce the concept of "continuous semantic similarity" to define positive and negative pairs. Instead of directly relying on imprecise class labels, we measure the semantic similarity between example pairs, which quantifies how closely they belong to the same category by iteratively refining weak supervisory signals. Based on this concept, we propose a graph-theoretic framework for weakly-supervised contrastive learning, where semantic similarity serves as the graph weights. Our framework is highly versatile and can be applied to many weakly-supervised learning scenarios. We demonstrate its effectiveness through experiments in two common settings, i.e., noisy-label and partial-label learning, where existing methods can be easily integrated to significantly improve performance. Theoretically, we establish an error bound for our approach, showing that it can approximate supervised contrastive learning under mild conditions. The implementation code is available at https://github.com/Speechless-10308/WSC.
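To illustrate what replacing hard positive/negative pairs with continuous similarity weights can look like, here is a minimal sketch of a softmax-based contrastive loss in which each pair (i, j) is weighted by a similarity value in [0, 1]. This is an assumed, simplified formulation for illustration only; it is not the graph-theoretic loss from the paper or the code in the linked repository, and `weighted_contrastive_loss`, `tau`, and the example weights are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def weighted_contrastive_loss(z, w, tau=0.5):
    """Weighted average of -log p(j | i), with weights w[i][j] in [0, 1].

    p(j | i) is a softmax over cosine similarities to all other examples.
    With 0/1 weights this pulls together only hard-labelled positives;
    fractional weights express uncertain class agreement.
    """
    n = len(z)
    total, weight_sum = 0.0, 0.0
    for i in range(n):
        logits = {j: cosine(z[i], z[j]) / tau for j in range(n) if j != i}
        log_denom = math.log(sum(math.exp(v) for v in logits.values()))
        for j, logit in logits.items():
            total += w[i][j] * (log_denom - logit)
            weight_sum += w[i][j]
    return total / weight_sum


# Four toy embeddings forming two tight pairs: (0, 1) and (2, 3).
z = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

# Weights that agree with the geometry versus weights that contradict it.
w_good = [[0.0, 0.9, 0.1, 0.1], [0.9, 0.0, 0.1, 0.1],
          [0.1, 0.1, 0.0, 0.9], [0.1, 0.1, 0.9, 0.0]]
w_bad = [[0.0, 0.1, 0.9, 0.1], [0.1, 0.0, 0.1, 0.9],
         [0.9, 0.1, 0.0, 0.1], [0.1, 0.9, 0.1, 0.0]]

loss_good = weighted_contrastive_loss(z, w_good)
loss_bad = weighted_contrastive_loss(z, w_bad)
print(loss_good, loss_bad)
```

The loss is lower when the similarity weights match the structure of the embeddings, which is the behaviour a training signal of this kind relies on.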
SY Dec 28, 2025
A Neural Network-Based Real-time Casing Collar Recognition System for Downhole Instruments
Si-Yu Xiao, Xin-Di Zhao, Xiang-Zhan Wang et al.
Casing collar locator (CCL) measurements are widely used as reliable depth markers for positioning downhole instruments in cased-hole operations, enabling accurate depth control for operations such as perforation. However, autonomous collar recognition in downhole environments remains challenging: CCL signals are often corrupted by toolstring- or casing-induced magnetic interference, stringent size and power budgets limit the use of computationally intensive algorithms, and certain operations require real-time, in situ processing. To address these constraints, we propose Collar Recognition Nets (CRNs), a family of domain-specific lightweight 1-D convolutional neural networks for collar signature recognition from streaming CCL waveforms. With depthwise separable convolutions and input pooling, CRNs optimize efficiency without sacrificing accuracy. Our most compact model achieves an F1-score of 0.972 on field data with only 1,985 parameters and 8,208 MACs. Deployed on an ARM Cortex-M7-based embedded system using the TensorFlow Lite for Microcontrollers (TFLM) library, the model demonstrates a throughput of 1,000 inferences per second and a latency of 343.2 μs, confirming the feasibility of robust, autonomous, and real-time collar recognition under stringent downhole constraints.
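The parameter saving from depthwise separable convolutions follows from simple counting, sketched below for a single 1-D layer. The layer shape used here (16 input channels, 32 output channels, kernel width 9) is hypothetical and is not the CRN architecture, which the abstract does not specify. The last lines relate the two timing figures quoted in the abstract, under the assumption that inferences run one at a time.

```python
def conv1d_params(c_in, c_out, kernel, bias=True):
    """Parameters of a standard 1-D convolution layer."""
    return kernel * c_in * c_out + (c_out if bias else 0)


def separable_conv1d_params(c_in, c_out, kernel, bias=True):
    """Parameters of a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel * c_in + (c_in if bias else 0)   # one filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)   # mixes channels, kernel width 1
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
c_in, c_out, kernel = 16, 32, 9
standard = conv1d_params(c_in, c_out, kernel)
separable = separable_conv1d_params(c_in, c_out, kernel)
print(standard, separable, standard / separable)

# Figures quoted in the abstract: 343.2 microseconds per inference at
# 1,000 inferences per second. Assuming sequential execution, this is
# the fraction of each second spent on inference.
latency_s = 343.2e-6
rate_hz = 1000
busy_fraction = latency_s * rate_hz
print(busy_fraction)
```

For this hypothetical layer the separable form needs 704 parameters against 4,640 for the standard convolution. Under the sequential-execution assumption, the quoted latency and rate correspond to the processor being busy with inference roughly a third of the time.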
OPTICS Nov 28, 2025
Optical diffraction neural networks assisted computational ghost imaging through dynamic scattering media
Yue-Gang Li, Ze Zheng, Jun-jie Wang et al.
Ghost imaging leverages a single-pixel detector with no spatial resolution to acquire object echo intensity signals, which are correlated with illumination patterns to reconstruct an image. This architecture inherently mitigates scattering interference between the object and the detector but is sensitive to scattering between the light source and the object. To address this challenge, we propose a ghost imaging method assisted by optical diffraction neural networks (ODNNs) for imaging through dynamic scattering media. In our scheme, a set of fixed ODNNs, trained on simulated datasets, is incorporated into the experimental optical path to actively correct random distortions induced by dynamic scattering media. Experimental validation using rotating single-layer and double-layer ground glass confirms the feasibility and effectiveness of our approach. Furthermore, our scheme can also be combined with physics-prior-based reconstruction algorithms, enabling high-quality imaging under undersampled conditions. This work demonstrates a novel strategy for imaging through dynamic scattering media, which can be extended to other imaging systems.
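The correlation step described in the first sentence of the abstract can be sketched with the standard second-order estimator, G_j = <B P_j> - <B><P_j>, where B is the single-pixel (bucket) signal and P_j is the illumination at pixel j. The simulation below is an idealized, scattering-free toy with random patterns; it does not model dynamic scattering media or the ODNN correction, and the object, pattern count, and function name are assumptions made for illustration.

```python
import random


def ghost_image(patterns, bucket):
    """Correlate bucket signals with patterns: G[j] = <B * P_j> - <B><P_j>."""
    m = len(patterns)
    n = len(patterns[0])
    mean_b = sum(bucket) / m
    recon = []
    for j in range(n):
        mean_p = sum(p[j] for p in patterns) / m
        mean_bp = sum(b * p[j] for b, p in zip(bucket, patterns)) / m
        recon.append(mean_bp - mean_b * mean_p)
    return recon


# Toy 8x8 binary object: a 4x2 bright rectangle on a dark background.
side = 8
obj = [1.0 if 2 <= r <= 5 and 3 <= c <= 4 else 0.0
       for r in range(side) for c in range(side)]

# Random illumination patterns and the resulting single-pixel measurements.
rng = random.Random(0)
patterns = [[rng.random() for _ in obj] for _ in range(2000)]
bucket = [sum(p * o for p, o in zip(pattern, obj)) for pattern in patterns]

recon = ghost_image(patterns, bucket)
for r in range(side):
    print(" ".join("#" if recon[r * side + c] > 0.04 else "." for c in range(side)))
```

The detector never resolves the object spatially; the image emerges only from the statistics of many pattern and bucket pairs, which is also why distortion of the patterns before they reach the object degrades the result.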