Siyu Yu

CV
h-index49
7papers
41citations
Novelty58%
AI Score53

7 Papers

MLOct 31, 2024Code
Prospective Learning: Learning for a Dynamic Future

Ashwin De Silva, Rahul Ramesh, Rubing Yang et al.

In real-world applications, the distribution of the data, and our goals, evolve over time. The prevailing theoretical framework for studying machine learning, namely probably approximately correct (PAC) learning, largely ignores time. As a consequence, existing strategies to address the dynamic nature of data and goals exhibit poor real-world performance. This paper develops a theoretical framework called "Prospective Learning" that is tailored for situations when the optimal hypothesis changes over time. In PAC learning, empirical risk minimization (ERM) is known to be consistent. We develop a learner called Prospective ERM, which returns a sequence of predictors that make predictions on future data. We prove that the risk of prospective ERM converges to the Bayes risk under certain assumptions on the stochastic process generating the data. Prospective ERM, roughly speaking, incorporates time as an input in addition to the data. We show that standard ERM as done in PAC learning, without incorporating time, can result in failure to learn when distributions are dynamic. Numerical experiments illustrate that prospective ERM can learn synthetic and visual recognition problems constructed from MNIST and CIFAR-10. Code at https://github.com/neurodata/prolearn.

LGJul 10, 2025Code
Prospective Learning in Retrospect

Yuxin Bai, Cecelia Shuai, Ashwin De Silva et al.

In most real-world applications of artificial intelligence, the distributions of the data and the goals of the learners tend to change over time. The Probably Approximately Correct (PAC) learning framework, which underpins most machine learning algorithms, fails to account for dynamic data distributions and evolving objectives, often resulting in suboptimal performance. Prospective learning is a recently introduced mathematical framework that overcomes some of these limitations. We build on this framework to present preliminary results that improve the algorithm and numerical results, and extend prospective learning to sequential decision-making scenarios, specifically foraging. Code is available at: https://github.com/neurodata/prolearn2.

34.3SEMar 13
LogPTR: Variable-Aware Log Parsing with Pointer Network

Yifan Wu, Bingxu Chai, Siyu Yu et al.

Due to the sheer size of software logs, developers rely on automated log analysis. Log parsing, which parses semi-structured logs into a structured format, is a prerequisite of automated log analysis. However, existing log parsers are unsatisfactory when applied in practice because they 1) ignore categories of variables, and 2) need labor-intensive model tuning. To address these limitations, we propose LogPTR, a variable-aware log parser that can extract the static and dynamic parts in logs, and further identify categories of variables. The key of LogPTR is formulating log parsing as a text summarization problem and using a pointer mechanism to copy words from the log message and label tokens indicating categories of variables. The experimental results on widely-used benchmark datasets show that LogPTR outperforms state-of-the-art log parsers on both general log parsing that extracts log templates and variable-aware log parsing that further identifies categories of variables.

LGSep 29, 2025
LogAction: Consistent Cross-system Anomaly Detection through Logs via Active Domain Adaptation

Chiming Duan, Minghua He, Pei Xiao et al.

Log-based anomaly detection is a essential task for ensuring the reliability and performance of software systems. However, the performance of existing anomaly detection methods heavily relies on labeling, while labeling a large volume of logs is highly challenging. To address this issue, many approaches based on transfer learning and active learning have been proposed. Nevertheless, their effectiveness is hindered by issues such as the gap between source and target system data distributions and cold-start problems. In this paper, we propose LogAction, a novel log-based anomaly detection model based on active domain adaptation. LogAction integrates transfer learning and active learning techniques. On one hand, it uses labeled data from a mature system to train a base model, mitigating the cold-start issue in active learning. On the other hand, LogAction utilize free energy-based sampling and uncertainty-based sampling to select logs located at the distribution boundaries for manual labeling, thus addresses the data distribution gap in transfer learning with minimal human labeling efforts. Experimental results on six different combinations of datasets demonstrate that LogAction achieves an average 93.01% F1 score with only 2% of manual labels, outperforming some state-of-the-art methods by 26.28%. Website: https://logaction.github.io

NEJul 15, 2025
Biological Processing Units: Leveraging an Insect Connectome to Pioneer Biofidelic Neural Architectures

Siyu Yu, Zihan Qin, Tingshan Liu et al.

The complete connectome of the Drosophila larva brain offers a unique opportunity to investigate whether biologically evolved circuits can support artificial intelligence. We convert this wiring diagram into a Biological Processing Unit (BPU), a fixed recurrent network derived directly from synaptic connectivity. Despite its modest size 3,000 neurons and 65,000 weights between them), the unmodified BPU achieves 98% accuracy on MNIST and 58% on CIFAR-10, surpassing size-matched MLPs. Scaling the BPU via structured connectome expansions further improves CIFAR-10 performance, while modality-specific ablations reveal the uneven contributions of different sensory subsystems. On the ChessBench dataset, a lightweight GNN-BPU model trained on only 10,000 games achieves 60% move accuracy, nearly 10x better than any size transformer. Moreover, CNN-BPU models with ~2M parameters outperform parameter-matched Transformers, and with a depth-6 minimax search at inference, reach 91.7% accuracy, exceeding even a 9M-parameter Transformer baseline. These results demonstrate the potential of biofidelic neural architectures to support complex cognitive tasks and motivate scaling to larger and more intelligent connectomes in future work.

CVDec 1, 2017
A Novel Brain Decoding Method: a Correlation Network Framework for Revealing Brain Connections

Siyu Yu, Nanning Zheng, Yongqiang Ma et al.

Brain decoding is a hot spot in cognitive science, which focuses on reconstructing perceptual images from brain activities. Analyzing the correlations of collected data from human brain activities and representing activity patterns are two problems in brain decoding based on functional magnetic resonance imaging (fMRI) signals. However, existing correlation analysis methods mainly focus on the strength information of voxel, which reveals functional connectivity in the cerebral cortex. They tend to neglect the structural information that implies the intracortical or intrinsic connections; that is, structural connectivity. Hence, the effective connectivity inferred by these methods is relatively unilateral. Therefore, we proposed a correlation network (CorrNet) framework that could be flexibly combined with diverse pattern representation models. In the CorrNet framework, the topological correlation was introduced to reveal structural information. Rich correlations were obtained, which contributed to specifying the underlying effective connectivity. We also combined the CorrNet framework with a linear support vector machine (SVM) and a dynamic evolving spike neuron network (SNN) for pattern representation separately, thus providing a novel method for decoding cognitive activity patterns. Experimental results verified the reliability and robustness of our CorrNet framework and demonstrated that the new method achieved significant improvement in brain decoding over comparable methods.

CVMay 1, 2017
Detecting Drivable Area for Self-driving Cars: An Unsupervised Approach

Ziyi Liu, Siyu Yu, Xiao Wang et al.

It has been well recognized that detecting drivable area is central to self-driving cars. Most of existing methods attempt to locate road surface by using lane line, thereby restricting to drivable area on which have a clear lane mark. This paper proposes an unsupervised approach for detecting drivable area utilizing both image data from a monocular camera and point cloud data from a 3D-LIDAR scanner. Our approach locates initial drivable areas based on a "direction ray map" obtained by image-LIDAR data fusion. Besides, a fusion of the feature level is also applied for more robust performance. Once the initial drivable areas are described by different features, the feature fusion problem is formulated as a Markov network and a belief propagation algorithm is developed to perform the model inference. Our approach is unsupervised and avoids common hypothesis, yet gets state-of-the-art results on ROAD-KITTI benchmark. Experiments show that our unsupervised approach is efficient and robust for detecting drivable area for self-driving cars.