Tien Nguyen

SE
h-index: 17
Papers: 5
Citations: 112
Novelty: 44%
AI Score: 39

5 Papers

SE | Jun 8, 2019 | Code
Recovering Variable Names for Minified Code with Usage Contexts

Hieu Tran, Ngoc Tran, Son Nguyen et al.

In modern Web technology, JavaScript (JS) code plays an important role. To avoid exposing the original source code, the variable names in JS code deployed in the wild are often replaced by short, meaningless names, making the code extremely difficult to understand and analyze manually. This paper presents JSNeat, an information retrieval (IR)-based approach to recover the variable names in minified JS code. JSNeat follows a data-driven approach to recover names by searching for them in a large corpus of open-source JS code. We use three types of context to match a variable in the given minified code against the corpus: the context of the properties and roles of the variable, the context of that variable and its relations with other variables under recovery, and the context of the task of the function to which the variable contributes. We performed several empirical experiments to evaluate JSNeat on a dataset of more than 322K JS files with 1M functions and 3.5M variables with 176K unique variable names. We found that JSNeat achieves a high accuracy of 69.1%, a relative improvement of 66.1% and 43% over two state-of-the-art approaches, JSNice and JSNaughty, respectively. Recovering names for a file or for a variable with JSNeat is twice as fast as with JSNice and 4x as fast as with JSNaughty.

LG | Nov 2, 2024
From Federated Learning to Quantum Federated Learning for Space-Air-Ground Integrated Networks

Vu Khanh Quy, Nguyen Minh Quy, Tran Thi Hoai et al.

6G wireless networks are expected to provide seamless, data-based connections that cover space-air-ground and underwater networks. As a core part of future 6G networks, Space-Air-Ground Integrated Networks (SAGINs) have been envisioned to provide countless real-time intelligent applications. To realize this, integrating AI techniques into SAGINs is an inevitable trend. Due to the distributed and heterogeneous architecture of SAGINs, federated learning (FL) and then quantum FL (QFL) are emerging AI model training techniques for enabling future privacy-enhanced and computation-efficient SAGINs. In this work, we explore the vision of using FL/QFL in SAGINs. We present a few representative applications enabled by the integration of FL and QFL in SAGINs. A case study of QFL over UAV networks is also given, showing the merit of the quantum-enabled training approach over the conventional FL benchmark. Research challenges, along with standardization for QFL adoption in future SAGINs, are also highlighted.

IT | Nov 1, 2024
Wireless Federated Learning over UAV-enabled Integrated Sensing and Communication

Shaba Shaon, Tien Nguyen, Lina Mohjazi et al.

This paper studies a new latency optimization problem in unmanned aerial vehicle (UAV)-enabled federated learning (FL) with integrated sensing and communication. In this setup, distributed UAVs participate in model training using sensed data and collaborate with a base station (BS), which serves as the FL aggregator, to build a global model. The objective is to minimize the FL system latency over UAV networks by jointly optimizing the UAVs' trajectories and the resource allocation of both the UAVs and the BS. The formulated optimization problem is difficult to solve due to its non-convexity. Hence, we develop a simple yet efficient iterative algorithm to find a high-quality approximate solution by leveraging block coordinate descent and successive convex approximation techniques. Simulation results demonstrate the effectiveness of our proposed joint optimization strategy under practical parameter settings, reducing the system latency by up to 68.54% compared to benchmark schemes.
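
The aggregation step at the base station can be illustrated with a minimal sketch. The abstract does not state which aggregation rule the paper uses, so the FedAvg-style weighted average below is an assumption, and the function name `fedavg` and the toy weights are hypothetical.

```python
def fedavg(client_weights, client_sizes):
    """Weighted average of client model vectors (FedAvg-style; assumed rule).

    client_weights: list of equal-length lists of floats, one per UAV.
    client_sizes:   number of local samples each UAV trained on.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical UAVs with toy 2-parameter models and 1 and 3 samples.
global_model = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_model)  # [2.5, 3.5]
```

UAVs holding more sensed data pull the global model closer to their local update, which is why the sample counts appear as weights.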

SE | Apr 6
Assessing Large Language Models for Stabilizing Numerical Expression in Scientific Software

Tien Nguyen, Muhammad Ali Gulzar, Kirshanthan Sundararajah

Scientific software relies on high-precision computation, yet finite floating-point representations can introduce precision errors that propagate in safety-critical domains. Despite the growing use of large language models (LLMs) in scientific applications, their reliability in handling floating-point numerical stability has not been systematically evaluated. This paper evaluates LLMs' reasoning on high-precision numerical computation through two numerical stabilization tasks: (1) detecting instability in numerical expressions by generating error-inducing inputs (detection), and (2) rewriting expressions to improve numerical stability (stabilization). Using popular numerical benchmarks, we assess six LLMs on nearly 2,470 numerical structures, including nested conditionals, high-precision literals, and multi-variable arithmetic. Our results show that LLMs are as effective as state-of-the-art traditional approaches in detecting and stabilizing numerically unstable computations. More notably, LLMs outperform baseline methods precisely where the latter fail: of the 17.4% (431) of expressions where the baseline does not improve accuracy, LLMs successfully stabilize 422 (97.9%), and they achieve greater stability than the baseline on 65.4% (1,615) of all expressions. However, LLMs struggle with control flow and high-precision literals, consistently removing such structures rather than reasoning about their numerical implications, whereas they perform substantially better on purely symbolic expressions. Together, these findings suggest that LLMs are effective at stabilizing expressions that classical techniques cannot, yet struggle when exact numerical magnitudes and control-flow semantics must be precisely reasoned about, as such concrete patterns are rarely encountered during training.
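
The two tasks can be illustrated with a textbook case of catastrophic cancellation. This expression is an assumption chosen for illustration and is not taken from the paper's benchmarks; the function names `naive` and `stabilized` are hypothetical.

```python
import math


def naive(x):
    # Subtracting two nearly equal square roots loses significant digits.
    return math.sqrt(x + 1) - math.sqrt(x)


def stabilized(x):
    # Algebraically equivalent rewrite (multiply by the conjugate),
    # which avoids the subtraction entirely.
    return 1.0 / (math.sqrt(x + 1) + math.sqrt(x))


# Detection: an error-inducing input. In double precision, 1e16 + 1
# rounds back to 1e16, so the naive form returns exactly 0.0.
x = 1e16
print(naive(x))
print(stabilized(x))
```

Here `x = 1e16` plays the role of a detected error-inducing input, and `stabilized` is the kind of rewrite the stabilization task asks for: same mathematical value, but without the cancelling subtraction.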

HC | Nov 11, 2014
User Session Identification Based on Strong Regularities in Inter-activity Time

Aaron Halfaker, Os Keyes, Daniel Kluver et al.

Session identification is a common strategy used to develop metrics for web analytics and behavioral analyses of user-facing systems. Past work has either argued that session identification strategies based on an inactivity threshold are inherently arbitrary or advocated that thresholds be set at about 30 minutes. In this work, we demonstrate a strong regularity in the temporal rhythms of user-initiated events across several different domains of online activity (including video gaming, search, page views, and volunteer contributions). We describe a methodology for identifying clusters of user activity and argue that the regularity with which these activity clusters appear implies a good rule-of-thumb inactivity threshold of about 1 hour. We conclude with implications that these temporal rhythms may have for system design, based on our observations and theories of goal-directed human activity.
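
Threshold-based session identification can be sketched as follows, assuming a single user's event timestamps and the 1-hour rule of thumb from the abstract. The function name `split_sessions` and the sample events are hypothetical; the paper's clustering methodology for deriving the threshold is not reproduced here.

```python
from datetime import datetime, timedelta


def split_sessions(timestamps, threshold=timedelta(hours=1)):
    """Group event timestamps into sessions.

    A new session starts whenever the gap since the previous event
    exceeds the inactivity threshold.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= threshold:
            sessions[-1].append(ts)   # still within the current session
        else:
            sessions.append([ts])     # gap too long: open a new session
    return sessions


# Hypothetical events for one user on one day.
events = [
    datetime(2014, 11, 11, 9, 0),
    datetime(2014, 11, 11, 9, 20),
    datetime(2014, 11, 11, 10, 15),
    datetime(2014, 11, 11, 12, 0),
]
sessions = split_sessions(events)
print([len(s) for s in sessions])  # [3, 1]
```

With a 30-minute threshold the 55-minute gap before 10:15 would split the first session in two, which shows how sensitive session counts are to the chosen cutoff.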