Han Dai

16.0DSMay 6

On Tight FPT Time Approximation Algorithms for k-Clustering Problems

Han Dai, Shi Li, Sijin Peng

Following recent advances in combining approximation algorithms with fixed-parameter tractability (FPT), we study FPT-time approximation algorithms for minimum-norm $k$-clustering problems, parameterized by the number $k$ of open facilities. For the capacitated setting, we give a tight $(3+ε)$-approximation for the general-norm capacitated $k$-clustering problem in FPT-time parameterized by $k$ and $ε$. Prior to our work, such a result was only known for the capacitated $k$-median problem [CL, ICALP, 2019]. As a special case, our result yields an FPT-time $3$-approximation for capacitated $k$-center. The problem has not been studied in the FPT-time setting, with the previous best known polynomial-time approximation ratio being 9 [ABCG, MP, 2015]. In the uncapacitated setting, we consider the $top$-$cn$ norm $k$-clustering problem, where the goal of the problem is to minimize the $top$-$cn$ norm of the connection distance vector. Our main result is a tight $\big(1 + \frac 2{ec} + ε\big)$-approximation algorithm for the problem with $c \in \big(\frac1e, 1\big]$. (For the case $c \leq \frac1e$, there is a simple tight $(3+ε)$-approximation.) Our framework can be easily extended to give a tight $\left(3, 1+\frac2e + ε\right)$-bicriteria approximation for the ($k$-center, $k$-median) problem in FPT time, improving the previous best polynomial-time $(4, 8)$ guarantee [AB, WAOA, 2017]. All results are based on a unified framework: computing a $(1+ε)$-approximate solution using $O\left(\frac{k\log n}ε\right)$ facilities $S$ via LP rounding, sampling a few client representatives $R$ based on the solution $S$, guessing a few pivots from $S \cup R$ and some radius information on the pivots, and solving the problem using the guesses. We believe this framework can lead to further results on $k$-clustering problems.

6.5CRMay 4

SCRIBE: Practical Static Binary Patching via Binary-Aware Recompilation of Decompiled Code

Han Dai, Soumyakant Priyadarshan, Abdullah Imran et al.

When source code or the original toolchain is unavailable, patching binaries is difficult because it requires editing low-level assembly code directly. As an alternative, one can decompile the binary, apply the patch at the source level, and then recompile the modified code. However, as this paper demonstrates, this workflow is hindered by pervasive syntactic and semantic inaccuracies in the output of modern decompilers, many of which prior work has overlooked. To address these challenges, we present SCRIBE, a patching framework that handles syntactic and semantic issues in decompiled code, improving both recompilation success and correctness. SCRIBE's novel "binary-aware" recompilation approach repairs semantic inaccuracies in decompiler output by leveraging information extracted directly from the original binary. In our evaluation, SCRIBE resolved approximately 81% of previously incorrect functions produced by the Hex-Rays decompiler, demonstrating the effectiveness of its approach. Moreover, we show that, using SCRIBE, it is possible to patch 13 of 14 real-world CVEs without access to the original source code and without performing any manual binary editing. To further validate our findings, we conducted a user study with 18 participants. Using SCRIBE, participants achieved 100% patching success, compared to 3.7% without it. Finally, we asked three large language models to generate source-level patches via SCRIBE; all three achieved 100% success when using the framework, demonstrating its potential to enable fully automated patching. Overall, these results indicate that SCRIBE makes source-level patching of binaries accessible and reliable, even without access to the original source.

Han Dai

2 Papers