Kyu Hyung Lee

CR
3papers
22citations
Novelty53%
AI Score43

3 Papers

90.8SEApr 19
Multi-LLM Orchestration for High-Quality Code Generation: Exploiting Complementary Model Strengths

Huashan Chen, Zhenyu Qi, Haotang Li et al.

Large Language Models (LLMs) have become central to automated code generation, yet existing approaches operate within a single-LLM paradigm: one model is selected and applied throughout the entire generation process. We observe that different LLMs exhibit complementary strengths: no single model dominates across all programming languages, algorithmic problem categories, or development stages. Multi-LLM collaboration, structured as per-stage, per-category routing rather than majority voting, produces higher-quality code than any individual model. Based on this observation, we propose PerfOrch, a multi-agent orchestration system that decomposes code generation into four collaborative agents: categorization, generation, debugging, and refinement. Each agent maintains a Memory module: a ranking matrix indexed by programming language and problem category, constructed from offline profiling and consulted at runtime to select the most suitable model for each task. We evaluate PerfOrch on two benchmarks, HumanEval-X and EffiBench-X, totaling 2,500 problems across five languages (Python, Java, C++, Go, and Rust). PerfOrch achieves average pass@1 rates of 97.19% on HumanEval-X and 95.83% on EffiBench-X, improving over the strongest single-model pipeline by 1.22-14.58 percentage points across languages. Notably, Memory rankings constructed solely from HumanEval-X profiling generalize to the entirely unseen EffiBench-X benchmark without re-profiling, demonstrating that the complementary-strength patterns PerfOrch exploits are properties of the models rather than artifacts of a specific problem distribution. Beyond correctness, PerfOrch improves execution time for 61-90% of solved problems with mean speedups of 4.7-29.9%, matching the refinement coverage of exhaustive multi-model evaluation at roughly half the token cost.

CRSep 3, 2019Code
GrAALF:Supporting Graphical Analysis of Audit Logs for Forensics

Omid Setayeshfar, Christian Adkins, Matthew Jones et al.

System-level audit logs often play a critical role in computer forensics. They capture low-level interactions between programs and users in much detail, making them a rich source of insight and provenance on malicious user activity. However, using these logs to discover and understand malicious activities when a typical computer generates more than 2.5 million system events hourly is both compute and time-intensive. We introduce a graphical system called GrAALF for efficiently loading, storing, processing, querying, and displaying system events to support computer forensics. In comparison to other related systems such as AIQL [13] and SAQL [12], GrAALF offers the flexibility of multiple backend storage solutions, easy-to-use and intuitive querying of logs, and the ability to trace back longer sequences of system events in (near) real-time to help identify and isolate attacks. Equally important, both AIQL and SAQL are not available for public use, whereas GrAALF is open-source. GrAALF offers the choice of compactly storing the logs in main memory, in a relational database system, in a hybrid main memory-database system, and a graph-based database. We compare the responsiveness of each of these options, using multiple huge system-call log files. Next, in multiple real-world attack scenarios, we demonstrate the efficacy and usefulness of GrAALF in identifying the attack and discovering its provenance. Consequently, GrAALF offers a robust solution for analysis of audit logs to support computer forensics.

CRFeb 15, 2020
Measuring Abuse in Web Push Advertising

Karthika Subramani, Xingzi Yuan, Omid Setayeshfar et al.

The rapid growth of online advertising has fueled the growth of ad-blocking software, such as new ad-blocking and privacy-oriented browsers or browser extensions. In response, both ad publishers and ad networks are constantly trying to pursue new strategies to keep up their revenues. To this end, ad networks have started to leverage the Web Push technology enabled by modern web browsers. As web push notifications (WPNs) are relatively new, their role in ad delivery has not been yet studied in depth. Furthermore, it is unclear to what extent WPN ads are being abused for malvertising (i.e., to deliver malicious ads). In this paper, we aim to fill this gap. Specifically, we propose a system called PushAdMiner that is dedicated to (1) automatically registering for and collecting a large number of web-based push notifications from publisher websites, (2) finding WPN-based ads among these notifications, and (3) discovering malicious WPN-based ad campaigns. Using PushAdMiner, we collected and analyzed 21,541 WPN messages by visiting thousands of different websites. Among these, our system identified 572 WPN ad campaigns, for a total of 5,143 WPN-based ads that were pushed by a variety of ad networks. Furthermore, we found that 51% of all WPN ads we collected are malicious, and that traditional ad-blockers and malicious URL filters are remarkably ineffective against WPN-based malicious ads, leaving a significant abuse vector unchecked.