Hailong Jiang

DC
h-index17
7papers
8citations
Novelty39%
AI Score46

7 Papers

33.6DCMar 28
CarbonEdge: Carbon-Aware Deep Learning Inference Framework for Sustainable Edge Computing

Guilin Zhang, Wulan Guo, Ziqi Tan et al.

Deep learning applications at the network edge lead to a significant growth in AI-related carbon emissions, presenting a critical sustainability challenge. The existing edge computing frameworks optimize for latency and throughput, but they largely ignore the environmental impact of inference workloads. This paper introduces CarbonEdge, a carbon-aware deep learning inference framework that extends adaptive model partitioning with carbon footprint estimation and green scheduling apabilities. We propose a carbon-aware scheduling algorithm that extends traditional weighted scoring with a carbon efficiency metric, supporting a tunable performance--carbon trade-off (demonstrated via weight sweep). Experimental evaluations on Docker-simulated heterogeneous edge environments show that CarbonEdge-Green mode achieves a 22.9% reduction in carbon emissions compared to monolithic execution. The framework achieves 1.3x improvement in carbon efficiency (245.8 vs 189.5 inferences per gram CO2) with negligible scheduling overhead (0.03ms per task). These results highlight the framework's potential for sustainable edge AI deployment, providing researchers and practitioners a tool to quantify and minimize the environmental footprint of distributed deep learning inference.

4.5CYApr 27
Workplace Demands and Emotional Expression Among Early Childhood Educators: A Computational Analysis of Professional Online Discourse

Hailong Jiang

Early childhood educators work in settings characterized by heavy regulation, emotional labor, staffing instability, and low pay. Although these conditions are well documented in survey-based research, less is known about how they manifest in the day-to-day language educators use in peer spaces. This study examines 7,506 posts from r/ECEProfessionals, a large online community used by early childhood education practitioners. Using a structured, computer-assisted thematic coding workflow and transformer-based emotion classification, posts were organized into 15 themes and mapped onto an adapted Job Demands-Resources (JD-R) framework. Across the corpus, 56.7% of posts centered on demands when task-level and core job demands were combined, compared with 33.6% focused on resources and 9.6% on career conditions. Emotion estimates indicated a broadly neutral tone overall; however, fear emerged as the most prominent non-neutral emotion. Demand-related categories also exhibited higher levels of sadness and anger than resource-related categories. These findings suggest that professional online discourse in early childhood education reflects a work environment structured more around strain than support. The study offers a practical framework for examining how occupational conditions are discussed and emotionally experienced in large-scale professional texts.

CLJan 23, 2025
Emotions, Context, and Substance Use in Adolescents: A Large Language Model Analysis of Reddit Posts

Jianfeng Zhu, Hailong Jiang, Yulan Wang et al.

Early substance use during adolescence increases the risk of later substance use disorders and mental health problems, yet the emotional and contextual factors driving these behaviors remain poorly understood. This study analyzed 23000 substance-use related posts and an equal number of non-substance posts from Reddit's r/teenagers community (2018-2022). Posts were annotated for six discrete emotions (sadness, anger, joy, guilt, fear, disgust) and contextual factors (family, peers, school) using large language models (LLMs). Statistical analyses compared group differences, and interpretable machine learning (SHAP) identified key predictors of substance-use discussions. LLM-assisted thematic coding further revealed latent psychosocial themes linking emotions with contexts. Negative emotions, especially sadness, guilt, fear, and disgust, were significantly more common in substance-use posts, while joy dominated non-substance discussions. Guilt and shame diverged in function: guilt often reflected regret and self-reflection, whereas shame reinforced risky behaviors through peer performance. Peer influence emerged as the strongest contextual factor, closely tied to sadness, fear, and guilt. Family and school environments acted as both risk and protective factors depending on relational quality and stress levels. Overall, adolescent substance-use discussions reflected a dynamic interplay of emotion, social context, and coping behavior. By integrating statistical analysis, interpretable models, and LLM-based thematic exploration, this study demonstrates the value of mixed computational approaches for uncovering the emotional and contextual mechanisms underlying adolescent risk behavior.

LGFeb 7, 2025
Can Large Language Models Understand Intermediate Representations in Compilers?

Hailong Jiang, Jianfeng Zhu, Yao Wan et al.

Intermediate Representations (IRs) play a critical role in compiler design and program analysis, yet their comprehension by Large Language Models (LLMs) remains underexplored. In this paper, we present an explorative empirical study evaluating the capabilities of six state-of-the-art LLMs: GPT-4, GPT-3, DeepSeek, Gemma 2, Llama 3, and Code Llama, in understanding IRs. Specifically, we assess model performance across four core tasks: control flow graph reconstruction, decompilation, code summarization, and execution reasoning. While LLMs exhibit competence in parsing IR syntax and identifying high-level structures, they consistently struggle with instruction-level reasoning, especially in control flow reasoning, loop handling, and dynamic execution. Common failure modes include misinterpreting branching instructions, omitting critical operations, and relying on heuristic reasoning rather than precise instruction-level logic. Our findings highlight the need for IR-specific enhancements in LLM design. We recommend fine-tuning on structured IR datasets and integrating control-flow-sensitive architectures to improve model effectiveness. All experimental data and source code are publicly available at

CRNov 25, 2025
Readout-Side Bypass for Residual Hybrid Quantum-Classical Models

Guilin Zhang, Wulan Guo, Ziqi Tan et al.

Quantum machine learning (QML) promises compact and expressive representations, but suffers from the measurement bottleneck - a narrow quantum-to-classical readout that limits performance and amplifies privacy risk. We propose a lightweight residual hybrid architecture that concatenates quantum features with raw inputs before classification, bypassing the bottleneck without increasing quantum complexity. Experiments show our model outperforms pure quantum and prior hybrid models in both centralized and federated settings. It achieves up to +55% accuracy improvement over quantum baselines, while retaining low communication cost and enhanced privacy robustness. Ablation studies confirm the effectiveness of the residual connection at the quantum-classical interface. Our method offers a practical, near-term pathway for integrating quantum models into privacy-sensitive, resource-constrained settings like federated edge learning.

DCOct 22, 2025
Serverless GPU Architecture for Enterprise HR Analytics: A Production-Scale BDaaS Implementation

Guilin Zhang, Wulan Guo, Ziqi Tan et al.

Industrial and government organizations increasingly depend on data-driven analytics for workforce, finance, and regulated decision processes, where timeliness, cost efficiency, and compliance are critical. Distributed frameworks such as Spark and Flink remain effective for massive-scale batch or streaming analytics but introduce coordination complexity and auditing overheads that misalign with moderate-scale, latency-sensitive inference. Meanwhile, cloud providers now offer serverless GPUs, and models such as TabNet enable interpretable tabular ML, motivating new deployment blueprints for regulated environments. In this paper, we present a production-oriented Big Data as a Service (BDaaS) blueprint that integrates a single-node serverless GPU runtime with TabNet. The design leverages GPU acceleration for throughput, serverless elasticity for cost reduction, and feature-mask interpretability for IL4/FIPS compliance. We conduct benchmarks on the HR, Adult, and BLS datasets, comparing our approach against Spark and CPU baselines. Our results show that GPU pipelines achieve up to 4.5x higher throughput, 98x lower latency, and 90% lower cost per 1K inferences compared to Spark baselines, while compliance mechanisms add only ~5.7 ms latency with p99 < 22 ms. Interpretability remains stable under peak load, ensuring reliable auditability. Taken together, these findings provide a compliance-aware benchmark, a reproducible Helm-packaged blueprint, and a decision framework that demonstrate the practicality of secure, interpretable, and cost-efficient serverless GPU analytics for regulated enterprise and government settings.

HCDec 31, 2021
BatchLens: A Visualization Approach for Analyzing Batch Jobs in Cloud Systems

Shaolun Ruan, Yong Wang, Hailong Jiang et al.

Cloud systems are becoming increasingly powerful and complex. It is highly challenging to identify anomalous execution behaviors and pinpoint problems by examining the overwhelming intermediate results/states in complex application workflows. Domain scientists urgently need a friendly and functional interface to understand the quality of the computing services and the performance of their applications in real time. To meet these needs, we explore data generated by job schedulers and investigate general performance metrics (e.g., utilization of CPU, memory and disk I/O). Specifically, we propose an interactive visual analytics approach, BatchLens, to provide both providers and users of cloud service with an intuitive and effective way to explore the status of system batch jobs and help them conduct root-cause analysis of anomalous behaviors in batch jobs. We demonstrate the effectiveness of BatchLens through a case study on the public Alibaba bench workload trace datasets.