Dan Yu

CV
h-index: 17
12 papers
129 citations
Novelty: 37%
AI Score: 39

12 Papers

LG Oct 5, 2023
A 5' UTR Language Model for Decoding Untranslated Regions of mRNA and Function Predictions

Yanyi Chu, Dan Yu, Yupeng Li et al.

The 5' UTR, a regulatory region at the beginning of an mRNA molecule, plays a crucial role in regulating the translation process and impacts the protein expression level. Language models have showcased their effectiveness in decoding the functions of protein and genome sequences. Here, we introduced a language model for 5' UTR, which we refer to as the UTR-LM. The UTR-LM is pre-trained on endogenous 5' UTRs from multiple species and is further augmented with supervised information including secondary structure and minimum free energy. We fine-tuned the UTR-LM in a variety of downstream tasks. The model outperformed the best-known benchmark by up to 42% for predicting the Mean Ribosome Loading, and by up to 60% for predicting the Translation Efficiency and the mRNA Expression Level. The model also applies to identifying unannotated Internal Ribosome Entry Sites within the untranslated region and improves the AUPR from 0.37 to 0.52 compared to the best baseline. Further, we designed a library of 211 novel 5' UTRs with high predicted values of translation efficiency and evaluated them via a wet-lab assay. Experiment results confirmed that our top designs achieved a 32.5% increase in protein production level relative to well-established 5' UTR optimized for therapeutics.

DS May 3, 2016
A Computationally Optimal Randomized Proper Orthogonal Decomposition Technique

Dan Yu, Suman Chakravorty

In this paper, we consider the model reduction problem for large-scale systems, such as systems obtained through the discretization of partial differential equations. We propose a computationally optimal randomized proper orthogonal decomposition (RPOD*) technique to obtain the reduced order model by perturbing the primal and adjoint systems using Gaussian white noise. We show that the computation required by the RPOD* algorithm is orders of magnitude cheaper than that of the balanced proper orthogonal decomposition (BPOD) algorithm and the BPOD output projection algorithm, while the performance of the RPOD* algorithm is much better than that of the BPOD output projection algorithm. It is optimal in the sense that a minimal number of snapshots is needed. We also relate the RPOD* algorithm to random projection algorithms. The method is tested on two advection-diffusion equations.
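
As background for the abstract above, the following is a minimal sketch of ordinary POD by the method of snapshots, not the RPOD* algorithm itself (which additionally uses white-noise-perturbed primal and adjoint snapshots). The function name `dominant_pod_mode` and the toy snapshot data are assumptions made for illustration only.

```python
import math

def dominant_pod_mode(snapshots, iters=200):
    """Leading POD mode and its energy via the method of snapshots.

    snapshots: list of m state vectors (each a list of n floats).
    """
    m = len(snapshots)
    n = len(snapshots[0])
    # Gram matrix of snapshot inner products (m x m; small when m << n).
    gram = [[sum(a * b for a, b in zip(si, sj)) for sj in snapshots]
            for si in snapshots]
    # Power iteration for the dominant eigenvector of the Gram matrix.
    # Assumption: the uniform start vector is not orthogonal to that eigenvector.
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iters):
        w = [sum(gram[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("start vector lies in the null space of the Gram matrix")
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue (the energy captured by the mode).
    energy = sum(v[i] * sum(gram[i][j] * v[j] for j in range(m)) for i in range(m))
    # Lift the eigenvector back to state space and normalise to unit length.
    mode = [sum(v[k] * snapshots[k][i] for k in range(m)) / math.sqrt(energy)
            for i in range(n)]
    return mode, energy

# Toy data: three snapshots that are all multiples of the direction (0.6, 0.8).
mode, energy = dominant_pod_mode([[3.0, 4.0], [6.0, 8.0], [1.5, 2.0]])
```

Working with the small Gram matrix instead of the full state covariance is what makes snapshot-based methods practical when the state dimension is very large.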

DS Apr 4, 2016
An autoregressive (AR) model based stochastic unknown input realization and filtering technique

Dan Yu, Suman Chakravorty

This paper studies the state estimation problem of linear discrete-time systems with stochastic unknown inputs. The unknown input is a wide-sense stationary process, and no other prior information needs to be known. We propose an autoregressive (AR) model based unknown input realization technique that recovers the input statistics from the output data by solving an appropriate least squares problem, fits an AR model to the recovered input statistics, and constructs an innovations model of the unknown inputs using the eigensystem realization algorithm (ERA). An augmented state system is constructed and the standard Kalman filter is applied for state estimation. A reduced order model (ROM) filter is also introduced to reduce the computational cost of the Kalman filter. Two numerical examples are given to illustrate the procedure.
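
To illustrate the AR-fitting step only, here is a minimal least-squares sketch that fits AR coefficients directly to a sample sequence. It is an assumption-laden simplification: the paper fits the AR model to input statistics recovered from output data, which is not reproduced here, and the name `fit_ar` and the noise-free test sequence are hypothetical.

```python
def fit_ar(series, order):
    """Least-squares AR fit: series[t] ~ sum_k coeffs[k] * series[t - 1 - k]."""
    rows = [[series[t - 1 - k] for k in range(order)]
            for t in range(order, len(series))]
    targets = [series[t] for t in range(order, len(series))]
    # Normal equations (R^T R) a = R^T y.
    A = [[sum(row[i] * row[j] for row in rows) for j in range(order)]
         for i in range(order)]
    b = [sum(row[i] * y for row, y in zip(rows, targets)) for i in range(order)]
    # Gaussian elimination with partial pivoting.
    for col in range(order):
        pivot = max(range(col, order), key=lambda i: abs(A[i][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for i in range(col + 1, order):
            factor = A[i][col] / A[col][col]
            for j in range(col, order):
                A[i][j] -= factor * A[col][j]
            b[i] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * order
    for i in reversed(range(order)):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, order))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs

# Noise-free AR(2) sequence x[t] = 0.5 x[t-1] + 0.3 x[t-2], used as toy data.
x = [1.0, 2.0]
for _ in range(30):
    x.append(0.5 * x[-1] + 0.3 * x[-2])
coeffs = fit_ar(x, 2)
```

On this noise-free sequence the fit recovers the generating coefficients up to floating-point error; with noisy data the estimate is only approximate.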

CV Aug 26, 2021 Code
An Underwater Image Semantic Segmentation Method Focusing on Boundaries and a Real Underwater Scene Semantic Segmentation Dataset

Zhiwei Ma, Haojie Li, Zhihui Wang et al.

With the development of underwater object grabbing technology, high-accuracy underwater object recognition and segmentation has become a challenge. Existing underwater object detection technology can only give the general position of an object and cannot provide more detailed information such as the object's outline, which seriously affects grabbing efficiency. To address this problem, we label and establish the first underwater semantic segmentation dataset of real scenes (DUT-USEG: DUT Underwater Segmentation Dataset). The DUT-USEG dataset includes 6617 images, 1487 of which have semantic segmentation and instance segmentation annotations, and the remaining 5130 images have object detection box annotations. Based on this dataset, we propose a semi-supervised underwater semantic segmentation network focusing on boundaries (US-Net: Underwater Segmentation Network). By designing a pseudo label generator and a boundary detection subnetwork, this network achieves fine-grained learning of the boundaries between underwater objects and background, and improves segmentation in boundary areas. Experiments show that the proposed method improves by 6.7% in the three categories of holothurian, echinus, and starfish in the DUT-USEG dataset, and achieves state-of-the-art results. The DUT-USEG dataset will be released at https://github.com/baxiyi/DUT-USEG.

CR Apr 26
Breaking the Secret: Economic Interventions for Combating Collusion in Embodied Multi-Agent Systems

Qi Liu, Xiaohui Chen, Zhihui Zhao et al.

Collusion among autonomous agents poses a critical security threat in embodied multi-agent systems (MAS), where coordinated behaviors can deviate from global objectives and lead to real-world consequences. Existing defenses, primarily based on identity control or post-hoc behavior analysis, are insufficient to address such threats in embodied settings due to delayed feedback and noisy observations in physical environments, which make behavioral deviations difficult to detect accurately and in a timely manner. To address this challenge, we propose a mutagenic incentive intervention approach that mitigates collusion by reshaping agents' payoff structures. By rewarding agents who report collusive behavior and penalizing identified participants, the mechanism induces strategic defection and renders collusion unstable. We further design supporting mechanisms, including reporting deposits, smart contract-based reward enforcement, and encrypted communication, to ensure robustness against misuse of the incentive mechanism and retaliation from penalized agents. We implement the proposed approach in both simulated and real-world embodied environments. Experimental results show that our method effectively suppresses collusion by inducing defection, while preserving system efficiency. It achieves performance comparable to the non-collusion baseline and outperforms representative reactive defenses, thereby fulfilling the desired security objectives. These results demonstrate the effectiveness of proactive incentive design as a practical paradigm for securing embodied multi-agent systems.

CL Dec 16, 2024
Intention Knowledge Graph Construction for User Intention Relation Modeling

Jiaxin Bai, Zhaobo Wang, Junfei Cheng et al.

Understanding user intentions is challenging for online platforms. Recent work on intention knowledge graphs addresses this but often lacks focus on connecting intentions, which is crucial for modeling user behavior and predicting future actions. This paper introduces a framework to automatically generate an intention knowledge graph, capturing connections between user intentions. Using the Amazon m2 dataset, we construct an intention graph with 351 million edges, demonstrating high plausibility and acceptance. Our model effectively predicts new session intentions and enhances product recommendations, outperforming previous state-of-the-art methods and showcasing the approach's practical utility.

CV Apr 12, 2024
A Survey of Neural Network Robustness Assessment in Image Recognition

Jie Wang, Jun Ai, Minyan Lu et al.

In recent years, there has been significant attention given to the robustness assessment of neural networks. Robustness plays a critical role in ensuring reliable operation of artificial intelligence (AI) systems in complex and uncertain environments. Deep learning's robustness problem is particularly significant, highlighted by the discovery of adversarial attacks on image classification models. Researchers have dedicated efforts to evaluate robustness in diverse perturbation conditions for image recognition tasks. Robustness assessment encompasses two main techniques: robustness verification/certification for deliberate adversarial attacks and robustness testing for random data corruptions. In this survey, we present a detailed examination of both adversarial robustness (AR) and corruption robustness (CR) in neural network assessment. Analyzing current research papers and standards, we provide an extensive overview of robustness assessment in image recognition. Three essential aspects are analyzed: concepts, metrics, and assessment methods. We investigate the perturbation metrics and range representations used to measure the degree of perturbations on images, as well as the robustness metrics specifically for the robustness conditions of classification models. The strengths and limitations of the existing methods are also discussed, and some potential directions for future research are provided.

CV May 23, 2025
EMRA-proxy: Enhancing Multi-Class Region Semantic Segmentation in Remote Sensing Images with Attention Proxy

Yichun Yu, Yuqing Lan, Zhihuan Xing et al.

High-resolution remote sensing (HRRS) image segmentation is challenging due to complex spatial layouts and diverse object appearances. While CNNs excel at capturing local features, they struggle with long-range dependencies, whereas Transformers can model global context but often neglect local details and are computationally expensive. We propose a novel approach, Region-Aware Proxy Network (RAPNet), which consists of two components: Contextual Region Attention (CRA) and Global Class Refinement (GCR). Unlike traditional methods that rely on grid-based layouts, RAPNet operates at the region level for more flexible segmentation. The CRA module uses a Transformer to capture region-level contextual dependencies, generating a Semantic Region Mask (SRM). The GCR module learns a global class attention map to refine multi-class information, combining the SRM and attention map for accurate segmentation. Experiments on three public datasets show that RAPNet outperforms state-of-the-art methods, achieving superior multi-class segmentation accuracy.

LG Apr 17, 2019
Decoupled Data Based Approach for Learning to Control Nonlinear Dynamical Systems

Ran Wang, Karthikeya Parunandi, Dan Yu et al.

This paper addresses the problem of learning the optimal control policy for a nonlinear stochastic dynamical system with continuous state space, continuous action space and unknown dynamics. This class of problems is typically addressed in the stochastic adaptive control and reinforcement learning literature using model-based and model-free approaches, respectively. Both methods rely on solving a dynamic programming problem, either directly or indirectly, to find the optimal closed-loop control policy. The inherent 'curse of dimensionality' associated with dynamic programming also makes these approaches computationally difficult. This paper proposes a novel decoupled data-based control (D2C) algorithm that addresses this problem using a decoupled, 'open loop - closed loop', approach. First, an open-loop deterministic trajectory optimization problem is solved using a black-box simulation model of the dynamical system. Then, a closed-loop control is developed around this open-loop trajectory by linearizing the dynamics about this nominal trajectory. By virtue of linearization, a linear quadratic regulator based algorithm can be used for this closed-loop control. We show that the performance of the D2C algorithm is approximately optimal. Moreover, simulation performance suggests a significant reduction in training time compared to other state-of-the-art algorithms.
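
The linearization step can be sketched for a scalar system as follows. This is an illustrative assumption, not the paper's identification procedure: it uses central finite differences on a hypothetical black-box one-step simulator `step(x, u)`, and both `linearize` and the toy dynamics are made-up names for the example.

```python
def linearize(step, x_nom, u_nom, eps=1e-6):
    """Central-difference linearization of a scalar black-box simulator.

    Returns (a, b) such that, near the nominal point,
    step(x, u) ~ step(x_nom, u_nom) + a * (x - x_nom) + b * (u - u_nom).
    """
    a = (step(x_nom + eps, u_nom) - step(x_nom - eps, u_nom)) / (2.0 * eps)
    b = (step(x_nom, u_nom + eps) - step(x_nom, u_nom - eps)) / (2.0 * eps)
    return a, b

def toy_step(x, u):
    # Hypothetical nonlinear dynamics x+ = x + 0.1 * (-x^3 + u), for illustration only.
    return x + 0.1 * (-x ** 3 + u)

# Linearize about the nominal point x = 1, u = 0.
a, b = linearize(toy_step, 1.0, 0.0)
```

Repeating this at each point of the nominal trajectory yields a time-varying linear model about which a linear quadratic regulator can be designed.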

SY Sep 10, 2018
A Decoupled Data Based Approach to Stochastic Optimal Control Problems

Dan Yu, Mohammadhussein Rafieisakhaei, Suman Chakravorty

This paper studies the stochastic optimal control problem for systems with unknown dynamics. A novel decoupled data based control (D2C) approach is proposed, which solves the problem in a decoupled "open loop-closed loop" fashion that is shown to be near-optimal. First, an open-loop deterministic trajectory optimization problem is solved with a standard nonlinear programming (NLP) solver, using a black-box simulation model of the dynamical system. Then a Linear Quadratic Regulator (LQR) controller is designed for the nominal trajectory-dependent linearized system, which is learned using input-output experimental data. Computational examples are used to illustrate the performance of the proposed approach on three benchmark problems.
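
The closed-loop half of such a scheme can be sketched with the standard finite-horizon discrete-time LQR recursion. The example below is a scalar, textbook Riccati recursion under assumed cost weights, not the paper's implementation; `tv_lqr_gains` and all numeric values are hypothetical.

```python
def tv_lqr_gains(a, b, q, r, q_final):
    """Backward Riccati recursion for the scalar time-varying system
    dx[t+1] = a[t] * dx[t] + b[t] * du[t],
    with stage cost q * dx^2 + r * du^2 and terminal cost q_final * dx^2.

    Returns gains K[t] for the feedback law du[t] = -K[t] * dx[t].
    """
    p = q_final  # cost-to-go weight at the final time
    gains = [0.0] * len(a)
    for t in reversed(range(len(a))):
        denom = r + b[t] * p * b[t]
        gains[t] = b[t] * p * a[t] / denom
        p = q + a[t] * p * a[t] - (a[t] * p * b[t]) ** 2 / denom
    return gains

# Two-step horizon with a[t] = b[t] = 1 and unit weights.
gains = tv_lqr_gains([1.0, 1.0], [1.0, 1.0], q=1.0, r=1.0, q_final=1.0)
```

Here `dx` and `du` denote deviations from the nominal open-loop trajectory, so the feedback only corrects for departures from the optimized plan.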

SY Jul 11, 2017
A Separation-Based Design to Data-Driven Control for Large-Scale Partially Observed Systems

Dan Yu, Mohammadhussein Rafieisakhaei, Suman Chakravorty

This paper studies the partially observed stochastic optimal control problem for systems with state dynamics governed by Partial Differential Equations (PDEs), which leads to an extremely large problem. First, an open-loop deterministic trajectory optimization problem is solved using a black-box simulation model of the dynamical system. Next, a Linear Quadratic Gaussian (LQG) controller is designed for the nominal trajectory-dependent linearized system, which is identified using input-output experimental data consisting of the impulse responses of the optimized nominal system. A computational nonlinear heat example is used to illustrate the performance of the approach.

SY May 27, 2017
Stochastic Feedback Control of Systems with Unknown Nonlinear Dynamics

Dan Yu, Mohammadhussein Rafieisakhaei, Suman Chakravorty

This paper studies the stochastic optimal control problem for systems with unknown dynamics. First, an open-loop deterministic trajectory optimization problem is solved without knowing the explicit form of the dynamical system. Next, a Linear Quadratic Gaussian (LQG) controller is designed for the nominal trajectory-dependent linearized system, such that under a small noise assumption, the actual states remain close to the optimal trajectory. The trajectory-dependent linearized system is identified using input-output experimental data consisting of the impulse responses of the nominal system. A computational example is given to illustrate the performance of the proposed approach.