Simeng Qi

2papers

2 Papers

DCFeb 14

ACE-Bench: A Lightweight Benchmark for Evaluating Azure SDK Usage Correctness

Wenxing Zhu, Simeng Qi, Junkui Chen et al.

We present ACE-Bench (Azure SDK Coding Evaluation Benchmark), an execution-free benchmark that provides fast, reproducible pass or fail signals for whether large language model (LLM)-based coding agents use Azure SDKs correctly-without provisioning cloud resources or maintaining fragile end-to-end test environments. ACE-Bench turns official Azure SDK documentation examples into self-contained coding tasks and validates solutions with task-specific atomic criteria: deterministic regex checks that enforce required API usage patterns and reference-based LLM-judge checks that capture semantic workflow constraints. This design makes SDK-centric evaluation practical in day-to-day development and CI: it reduces evaluation cost, improves repeatability, and scales to new SDKs and languages as documentation evolves. Using a lightweight coding agent, we benchmark multiple state-of-the-art LLMs and quantify the benefit of retrieval in an MCP-enabled augmented setting, showing consistent gains from documentation access while highlighting substantial cross-model differences.

SDFeb 18, 2016

Audio Recording Device Identification Based on Deep Learning

Simeng Qi, Zheng Huang, Yan Li et al.

In this paper we present a research on identification of audio recording devices from background noise, thus providing a method for forensics. The audio signal is the sum of speech signal and noise signal. Usually, people pay more attention to speech signal, because it carries the information to deliver. So a great amount of researches have been dedicated to getting higher Signal-Noise-Ratio (SNR). There are many speech enhancement algorithms to improve the quality of the speech, which can be seen as reducing the noise. However, noises can be regarded as the intrinsic fingerprint traces of an audio recording device. These digital traces can be characterized and identified by new machine learning techniques. Therefore, in our research, we use the noise as the intrinsic features. As for the identification, multiple classifiers of deep learning methods are used and compared. The identification result shows that the method of getting feature vector from the noise of each device and identifying them with deep learning techniques is viable, and well-preformed.