Jinyuan Sun

CR
6 papers
117 citations
Novelty 49%
AI Score 46

6 Papers

79.7 QM May 5 Code
ProtDBench: A Unified Benchmark of Protein Binder Design and Evaluation

Cong Liu, Milong Ren, Jiaqi Guan et al.

Recent advances in de novo protein binder design have enabled increasing experimental validation, yet reported in silico metrics remain difficult to interpret or compare across studies due to non-standardized evaluation protocols. We introduce ProtDBench, a standardized and throughput-aware evaluation framework for protein binder design. ProtDBench defines unified benchmark tasks, evaluation protocols, and success criteria, enabling systematic analysis of how evaluation design influences observed performance. Using a large wet-lab annotated dataset, we analyze commonly used structure prediction models as evaluation verifiers, revealing substantial verifier-dependent bias and limited agreement under identical filtering protocols. We then benchmark representative open-source generative binder design methods across ten diverse protein targets under a fixed evaluation protocol. Beyond per-sequence success rates, ProtDBench incorporates throughput-aware metrics based on a fixed 24-hour budget, as well as cluster-level success criteria to account for structural diversity. Together, these results expose systematic differences induced by filtering rules, success definitions, and throughput-aware evaluation, and highlight trade-offs between computational efficiency, success rate, and structural diversity. Overall, ProtDBench provides a fair and reproducible evaluation pipeline that supports systematic and controlled comparison of protein binder design methods under realistic evaluation settings.
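To illustrate why a throughput-aware metric can rank methods differently from a per-sequence success rate, the sketch below assumes that the expected number of successes under a fixed budget is designs per hour times budget hours times success rate. This formula is an assumption for illustration, not ProtDBench's definition, and the function name and numbers are hypothetical.

```python
def expected_successes(designs_per_hour: float,
                       success_rate: float,
                       budget_hours: float = 24.0) -> float:
    """Expected successful designs within a fixed compute budget.

    Assumed formula (not taken from ProtDBench):
        throughput * budget * per-sequence success rate
    """
    if designs_per_hour < 0 or budget_hours < 0:
        raise ValueError("throughput and budget must be non-negative")
    if not 0.0 <= success_rate <= 1.0:
        raise ValueError("success_rate must lie in [0, 1]")
    return designs_per_hour * budget_hours * success_rate


# Hypothetical methods: one slow but accurate, one fast but less accurate.
slow_accurate = expected_successes(designs_per_hour=10, success_rate=0.20)
fast_noisy = expected_successes(designs_per_hour=100, success_rate=0.05)

# Under this assumed metric, the faster method yields more successes in 24 hours
# even though its per-sequence success rate is four times lower.
print(slow_accurate, fast_noisy)
```

Under these hypothetical numbers, the per-sequence view favors the first method and the budgeted view favors the second, which is the kind of trade-off the abstract describes.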

62.4 CR May 28
Token Inflation: How Dishonest Providers Can Overcharge for Large Language Model Usage

Shahinul Hoque, Jinghuai Zhang, Jinyuan Sun et al.

Per-token billing is now the standard pricing model for commercial large language models (LLMs), so the honesty of reported token counts directly affects what users pay. We show that this kind of billing is hard to audit by design: providers hide the model, the tokenizer, and the execution to protect their IP, mitigate jailbreaks, and preserve user privacy, which means an auditor can only inspect proofs the provider supplies. The audit therefore reduces to a consistency check on the provider's own reports. We call this a trust paradox: every audit must trust some artifact, but current frameworks trust exactly the ones a provider has the strongest reason to manipulate. We study three recent token auditing frameworks and show that a provider with ordinary commercial capabilities can systematically inflate billed token counts. In the most permissive setting, hidden reasoning usage can be inflated by 1,469% on average without detection. At current frontier reasoning prices, that turns a $100 honest bill into roughly a $1,569 bill on the same query. Even when the user can see the full reasoning string, tokenization ambiguity alone still allows 50.85% over-reporting below the detection threshold. These results suggest the problem is not in any specific auditor but in any audit whose evidence comes from the audited party. Restoring honest billing will require verification that ties reported token counts to evidence the provider does not control, such as trusted execution attestation, cryptographic proofs of inference, or third-party re-execution.
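The bill figures above follow from simple arithmetic. The sketch below assumes the bill scales linearly with the reported token count and that the entire bill is subject to the stated inflation. The function name is hypothetical.

```python
def billed_amount(honest_bill: float, inflation_pct: float) -> float:
    """Bill after token counts are over-reported by `inflation_pct` percent.

    Assumes cost is proportional to reported tokens and that the whole
    bill is inflated by the same percentage.
    """
    return honest_bill * (1.0 + inflation_pct / 100.0)


# 1,469% inflation on a $100 honest bill.
print(round(billed_amount(100.0, 1469.0), 2))  # 1569.0

# 50.85% over-reporting from tokenization ambiguity alone.
print(round(billed_amount(100.0, 50.85), 2))   # 150.85
```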

LG Nov 24, 2020
Latent Group Structured Multi-task Learning

Xiangyu Niu, Yifan Sun, Jinyuan Sun

In multi-task learning (MTL), we improve the performance of key machine learning algorithms by training various tasks jointly. When the number of tasks is large, modeling task structure can further refine the task relationship model. For example, tasks can often be grouped based on metadata or via simple preprocessing steps like K-means. In this paper, we present our group structured latent-space multi-task learning model, which encourages group structured tasks defined by prior information. We use an alternating minimization method to learn the model parameters. Experiments are conducted on both synthetic and real-world datasets, showing competitive performance against single-task learning (where each group is trained separately) and other MTL baselines.

CR Jun 15, 2020
BubbleMap: Privilege Mapping for Behavior-based Implicit Authentication Systems

Yingyuan Yang, Xueli Huang, Jiangnan Li et al.

Leveraging users' behavioral data sampled by various sensors during the identification process, implicit authentication (IA) relieves users from explicit actions such as remembering and entering passwords. Various IA schemes have been proposed based on different behavioral and contextual features such as gait, touch, and GPS. However, existing IA schemes suffer from false positives, i.e., falsely accepting an adversary, and false negatives, i.e., falsely rejecting the legitimate user, due to changes in users' behavior and noise. To deal with this problem, we propose BubbleMap (BMap), a framework that can be seamlessly incorporated into any existing IA system to balance security (reducing false positives) and usability (reducing false negatives), as well as to reduce the equal error rate (EER). To evaluate the proposed framework, we implemented BMap on five state-of-the-art IA systems. We also conducted an experiment in a real-world environment from 2016 to 2020. Most of the experimental results show that BMap can greatly enhance the IA schemes' performance in terms of EER, security, and usability, with a small penalty in energy consumption.
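The security and usability terms above map onto standard error rates. The sketch below computes the false acceptance rate, false rejection rate, and an approximate EER from authentication scores. It is a generic illustration, not BubbleMap itself; the function names and the convention that a higher score means "more likely the legitimate user" are assumptions.

```python
from typing import Sequence, Tuple


def far_frr(genuine: Sequence[float],
            impostor: Sequence[float],
            threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) when scores >= threshold are accepted."""
    # False acceptance (false positive): an impostor score passes the threshold.
    far = sum(s >= threshold for s in impostor) / len(impostor)
    # False rejection (false negative): a genuine score falls below it.
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine: Sequence[float],
                     impostor: Sequence[float]) -> float:
    """Approximate EER: scan observed scores as thresholds and take the
    point where FAR and FRR are closest, averaging the two rates."""
    thresholds = sorted(set(genuine) | set(impostor))
    far, frr = min(
        (far_frr(genuine, impostor, t) for t in thresholds),
        key=lambda rates: abs(rates[0] - rates[1]),
    )
    return (far + frr) / 2.0


# Hypothetical scores for the legitimate user and for adversaries.
genuine_scores = [0.6, 0.4]
impostor_scores = [0.5, 0.3]
print(equal_error_rate(genuine_scores, impostor_scores))
```

Raising the threshold lowers the false acceptance rate at the cost of more false rejections, which is the balance between security and usability that the abstract refers to.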

CR Jun 13, 2020
EchoIA: Implicit Authentication System Based on User Feedback

Yingyuan Yang, Xueli Huang, Jiangnan Li et al.

Implicit authentication (IA) transparently authenticates users by utilizing their behavioral data sampled from various sensors. By constantly analyzing the current user's behavior to identify illegitimate users, IA adds another layer of protection to the smart device. Due to the diversity of human behavior, existing research tends to utilize many different features simultaneously to identify users, which is less efficient. Irrelevant features may increase system delay and reduce authentication accuracy. However, dynamically choosing the most suitable features for each user (personal features) requires massive computation, especially in a real environment. In this paper, we propose EchoIA to find personal features with a small amount of computation by utilizing user feedback. In the authentication phase, our approach maintains transparency, which is the major advantage of IA. Over the past two years, we conducted a comprehensive experiment to evaluate EchoIA. We compared it with other state-of-the-art IA schemes in terms of authentication accuracy and efficiency. The experimental results show that EchoIA has better authentication accuracy (93%) and lower energy consumption (23-hour battery lifetime) than other IA schemes.

CR Aug 3, 2018
Dynamic Detection of False Data Injection Attack in Smart Grid using Deep Learning

Xiangyu Niu, Jiangnan Li, Jinyuan Sun

Modern advances in sensor, computing, and communication technologies enable various smart grid applications. The heavy dependence on communication technology has highlighted the vulnerability of the electricity grid to false data injection (FDI) attacks that can bypass bad data detection mechanisms. Existing mitigations in power systems either focus on redundant measurements or protect a set of basic measurements. These methods make specific assumptions about FDI attacks, which are often restrictive and inadequate to deal with modern cyber threats. In the proposed approach, a deep learning based framework is used to detect injected data measurements. Our time-series anomaly detector adopts a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network. To effectively estimate system variables, our approach observes both data measurements and network-level features to jointly learn system states. The proposed system is tested on the IEEE 39-bus system. Experimental analysis shows that the deep learning algorithm can identify anomalies that cannot be detected by traditional state estimation bad data detection.
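To show why residual-based bad data detection can be bypassed, the sketch below uses the classic construction from the FDI literature: an injection of the form a = h * c shifts the state estimate but leaves the measurement residual unchanged. It is a toy single-state, least-squares simplification with hypothetical values, not the paper's CNN and LSTM detector or the IEEE 39-bus model.

```python
import math
from typing import List


def dot(u: List[float], v: List[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def estimate_state(h: List[float], z: List[float]) -> float:
    """Least-squares estimate of a single state x from z ~ h * x."""
    return dot(h, z) / dot(h, h)


def residual_norm(h: List[float], z: List[float]) -> float:
    """Norm of z - h * x_hat, the quantity a residual test thresholds."""
    x_hat = estimate_state(h, z)
    return math.sqrt(sum((zi - hi * x_hat) ** 2 for hi, zi in zip(h, z)))


# Hypothetical measurement model and noisy readings for a true state near 1.0.
h = [1.0, 2.0, -1.0]
z = [1.02, 1.97, -1.01]

# Stealthy injection: a = h * c lies in the column space of h.
c = 0.5
z_stealthy = [zi + hi * c for hi, zi in zip(h, z)]

# Naive injection: corrupt one measurement without following the model.
z_naive = [z[0] + 0.5, z[1], z[2]]

print(residual_norm(h, z), residual_norm(h, z_stealthy), residual_norm(h, z_naive))
```

The stealthy injection biases the estimate by c while the residual stays the same, so a threshold on the residual cannot flag it, whereas the naive injection inflates the residual. This gap motivates detectors that use additional evidence such as temporal patterns.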