Hailong Li

LG
h-index59
11papers
134citations
Novelty45%
AI Score43

11 Papers

GRJun 1
MidSurfNet: Learnable Face Pairing and Interference Implicit Fields for Generalized Mid-surface Abstraction

Li Ye, Xinhang Zhou, Xingyu Yang et al.

Mid-surface abstraction is essential for finite element analysis of thin-walled CAD models. Existing face pairing-based methods rely on handcrafted geometric heuristics, yet real-world industrial models frequently exhibit multi-wall-thickness regions, self-matching face configurations, and demand for non-center offset surfaces--scenarios where rule-based approaches consistently fail. We present MidSurfNet, a learning-augmented framework that addresses these limitations through two novel components: (1) a neural face pairing module that learns to predict face pair confidence from geometric and topological features, handling complex pairing scenarios beyond rule-based methods; and (2) an interference implicit field that represents mid-surfaces as the interference of two signed distance functions, enabling generalized offset control for flexible positioning in downstream CAE/FEA-oriented workflows. We construct a large-scale mid-surface dataset containing over 1,500 manually annotated CAD models. Experiments demonstrate that MidSurfNet achieves 87.32% face pairing accuracy and successfully handles multi-wall-thickness (61.90% completion) and self-matching (52.94% completion) scenarios that confound all existing methods. Furthermore, MidSurfNet provides a learning-based approach to generalized mid-surface abstraction with arbitrary offset control for CAE-oriented applications.

AIAug 22, 2024
SQL-GEN: Bridging the Dialect Gap for Text-to-SQL Via Synthetic Data And Model Merging

Mohammadreza Pourreza, Ruoxi Sun, Hailong Li et al.

Recent advances in Text-to-SQL have largely focused on the SQLite dialect, neglecting the diverse landscape of SQL dialects like BigQuery and PostgreSQL. This limitation is due to the diversity in SQL syntaxes and functions, along with the high cost of collecting and curating SQL-specific training data. To address this, we introduce SQL-GEN, a framework for generating high-quality synthetic training data for any SQL dialect, guided by readily available dialect-specific tutorials. SQL-GEN significantly improves cross-dialect Text-to-SQL performance, boosting execution accuracy by up to 20\% over existing methods. This performance gain narrows the gap with models trained on large-scale human-annotated data. Furthermore, combining synthetic data from SQL-GEN with human-annotated data yields additional improvements of up to 5.6\%. To unify multi-dialect capabilities within a single model, we propose a novel Mixture-of-Experts (MoE) initialization that leverages the shared knowledge across dialects. Our approach merges self-attention layers from dialect-specific models and initializes expert gates using dialect-specific keywords. This leads to a versatile model optimized for multiple SQL dialects, outperforming single-dialect models and significantly enhancing overall performance.

IVFeb 20, 2023
A Novel Collaborative Self-Supervised Learning Method for Radiomic Data

Zhiyuan Li, Hailong Li, Anca L. Ralescu et al.

The computer-aided disease diagnosis from radiomic data is important in many medical applications. However, developing such a technique relies on annotating radiological images, which is a time-consuming, labor-intensive, and expensive process. In this work, we present the first novel collaborative self-supervised learning method to solve the challenge of insufficient labeled radiomic data, whose characteristics are different from text and image data. To achieve this, we present two collaborative pretext tasks that explore the latent pathological or biological relationships between regions of interest and the similarity and dissimilarity information between subjects. Our method collaboratively learns the robust latent feature representations from radiomic data in a self-supervised manner to reduce human annotation efforts, which benefits the disease diagnosis. We compared our proposed method with other state-of-the-art self-supervised learning methods on a simulation study and two independent datasets. Extensive experimental results demonstrated that our method outperforms other self-supervised learning methods on both classification and regression tasks. With further refinement, our method shows the potential advantage in automatic disease diagnosis with large-scale unlabeled data available.

CVSep 18, 2023
Self-supervised Multi-view Clustering in Computer Vision: A Survey

Jiatai Wang, Zhiwei Xu, Xuewen Yang et al.

Multi-view clustering (MVC) has had significant implications in cross-modal representation learning and data-driven decision-making in recent years. It accomplishes this by leveraging the consistency and complementary information among multiple views to cluster samples into distinct groups. However, as contrastive learning continues to evolve within the field of computer vision, self-supervised learning has also made substantial research progress and is progressively becoming dominant in MVC methods. It guides the clustering process by designing proxy tasks to mine the representation of image and video data itself as supervisory information. Despite the rapid development of self-supervised MVC, there has yet to be a comprehensive survey to analyze and summarize the current state of research progress. Therefore, this paper explores the reasons and advantages of the emergence of self-supervised MVC and discusses the internal connections and classifications of common datasets, data issues, representation learning methods, and self-supervised learning methods. This paper does not only introduce the mechanisms for each category of methods but also gives a few examples of how these techniques are used. In the end, some open problems are pointed out for further investigation and development.

LGMar 29, 2025
Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL

Mohammadreza Pourreza, Shayan Talaei, Ruoxi Sun et al.

Text-to-SQL is a challenging task involving multiple reasoning-intensive subtasks, including natural language understanding, database schema comprehension, and precise SQL query formulation. Existing approaches often rely on handcrafted reasoning paths with inductive biases that can limit their overall effectiveness. Motivated by the recent success of reasoning-enhanced models such as DeepSeek R1 and OpenAI o1, which effectively leverage reward-driven self-exploration to enhance reasoning capabilities and generalization, we propose a novel set of partial rewards tailored specifically for the Text-to-SQL task. Our reward set includes schema-linking, AI feedback, n-gram similarity, and syntax check, explicitly designed to address the reward sparsity issue prevalent in reinforcement learning (RL). Leveraging group relative policy optimization (GRPO), our approach explicitly encourages large language models (LLMs) to develop intrinsic reasoning skills necessary for accurate SQL query generation. With models of different sizes, we demonstrate that RL-only training with our proposed rewards consistently achieves higher accuracy and superior generalization compared to supervised fine-tuning (SFT). Remarkably, our RL-trained 14B-parameter model significantly outperforms larger proprietary models, e.g. o3-mini by 4% and Gemini-1.5-Pro-002 by 3% on the BIRD benchmark. These highlight the efficacy of our proposed RL-training framework with partial rewards for enhancing both accuracy and reasoning capabilities in Text-to-SQL tasks.

CVMar 15, 2024
RadCLIP: Enhancing Radiologic Image Analysis through Contrastive Language-Image Pre-training

Zhixiu Lu, Hailong Li, Nehal A. Parikh et al.

The integration of artificial intelligence (AI) with radiology marks a transformative era in medicine. Vision foundation models have been adopted to enhance radiologic imaging analysis. However, the distinct complexities of radiologic 2D and 3D radiologic data pose unique challenges that existing models, pre-trained on general non-medical images, fail to address adequately. To bridge this gap and capitalize on the diagnostic precision required in radiologic imaging, we introduce Radiologic Contrastive Language-Image Pre-training (RadCLIP): a cross-modal vision-language foundational model that harnesses Vision Language Pre-training (VLP) framework to improve radiologic image analysis. Building upon Contrastive Language-Image Pre-training (CLIP), RadCLIP incorporates a slice pooling mechanism tailored for volumetric image analysis and is pre-trained using a large and diverse dataset of radiologic image-text pairs. The RadCLIP was pre-trained to effectively align radiologic images with their corresponding text annotations, creating a robust vision backbone for radiologic images. Extensive experiments demonstrate RadCLIP's superior performance in both uni-modal radiologic image classification and cross-modal image-text matching, highlighting its significant promise for improving diagnostic accuracy and efficiency in clinical settings. Our Key contributions include curating a large dataset with diverse radiologic 2D/3D radiologic image-text pairs, a slice pooling adapter using an attention mechanism for integrating 2D images, and comprehensive evaluations of RadCLIP on various radiologic downstream tasks.

IVDec 22, 2023
Joint Self-Supervised and Supervised Contrastive Learning for Multimodal MRI Data: Towards Predicting Abnormal Neurodevelopment

Zhiyuan Li, Hailong Li, Anca L. Ralescu et al.

The integration of different imaging modalities, such as structural, diffusion tensor, and functional magnetic resonance imaging, with deep learning models has yielded promising outcomes in discerning phenotypic characteristics and enhancing disease diagnosis. The development of such a technique hinges on the efficient fusion of heterogeneous multimodal features, which initially reside within distinct representation spaces. Naively fusing the multimodal features does not adequately capture the complementary information and could even produce redundancy. In this work, we present a novel joint self-supervised and supervised contrastive learning method to learn the robust latent feature representation from multimodal MRI data, allowing the projection of heterogeneous features into a shared common space, and thereby amalgamating both complementary and analogous information across various modalities and among similar subjects. We performed a comparative analysis between our proposed method and alternative deep multimodal learning approaches. Through extensive experiments on two independent datasets, the results demonstrated that our method is significantly superior to several other deep multimodal learning methods in predicting abnormal neurodevelopment. Our method has the capability to facilitate computer-aided diagnosis within clinical practice, harnessing the power of multimodal data.

LGMar 24, 2025
RLCAD: Reinforcement Learning Training Gym for Revolution Involved CAD Command Sequence Generation

Xiaolong Yin, Xingyu Lu, Jiahang Shen et al.

A CAD command sequence is a typical parametric design paradigm in 3D CAD systems where a model is constructed by overlaying 2D sketches with operations such as extrusion, revolution, and Boolean operations. Although there is growing academic interest in the automatic generation of command sequences, existing methods and datasets only support operations such as 2D sketching, extrusion,and Boolean operations. This limitation makes it challenging to represent more complex geometries. In this paper, we present a reinforcement learning (RL) training environment (gym) built on a CAD geometric engine. Given an input boundary representation (B-Rep) geometry, the policy network in the RL algorithm generates an action. This action, along with previously generated actions, is processed within the gym to produce the corresponding CAD geometry, which is then fed back into the policy network. The rewards, determined by the difference between the generated and target geometries within the gym, are used to update the RL network. Our method supports operations beyond sketches, Boolean, and extrusion, including revolution operations. With this training gym, we achieve state-of-the-art (SOTA) quality in generating command sequences from B-Rep geometries.

CLSep 22, 2025
One Agent to Serve All: a Lite-Adaptive Stylized AI Assistant for Millions of Multi-Style Official Accounts

Xingyu Fan, Feifei Li, Wenhui Que et al.

Conversational agents deployed in industrial-scale official account platforms must generate responses that are both contextually grounded and stylistically aligned-requirements that existing methods struggle to meet. Chain-of-thought (CoT) prompting induces significant latency due to multi-turn reasoning; per-account fine-tuning is computationally prohibitive; and long prompt-based methods degrade the model's ability to grasp injected context and style. In this paper, we propose WeStar, a lite-adaptive framework for stylized contextual question answering that scales to millions of official accounts. WeStar combines context-grounded generation via RAG with style-aware generation using Parametric RAG (PRAG), where LoRA modules are dynamically activated per style cluster. Our contributions are fourfold: (1) We introduce WeStar, a unified framework capable of serving large volumes of official accounts with minimal overhead. (2) We propose a multi-dimensional, cluster-based parameter sharing scheme that enables compact style representation while preserving stylistic diversity. (3) We develop a style-enhanced Direct Preference Optimization (SeDPO) method to optimize each style cluster's parameters for improved generation quality. (4) Experiments on a large-scale industrial dataset validate the effectiveness and efficiency of WeStar, underscoring its pracitical value in real-world deployment.

LGFeb 8, 2022
A Novel Ontology-guided Attribute Partitioning Ensemble Learning Model for Early Prediction of Cognitive Deficits using Quantitative Structural MRI in Very Preterm Infants

Zhiyuan Li, Hailong Li, Adebayo Braimah et al.

Structural magnetic resonance imaging studies have shown that brain anatomical abnormalities are associated with cognitive deficits in preterm infants. Brain maturation and geometric features can be used with machine learning models for predicting later neurodevelopmental deficits. However, traditional machine learning models would suffer from a large feature-to-instance ratio (i.e., a large number of features but a small number of instances/samples). Ensemble learning is a paradigm that strategically generates and integrates a library of machine learning classifiers and has been successfully used on a wide variety of predictive modeling problems to boost model performance. Attribute (i.e., feature) bagging method is the most commonly used feature partitioning scheme, which randomly and repeatedly draws feature subsets from the entire feature set. Although attribute bagging method can effectively reduce feature dimensionality to handle the large feature-to-instance ratio, it lacks consideration of domain knowledge and latent relationship among features. In this study, we proposed a novel Ontology-guided Attribute Partitioning (OAP) method to better draw feature subsets by considering the domain-specific relationship among features. With the better partitioned feature subsets, we developed an ensemble learning framework, which is referred to as OAP-Ensemble Learning (OAP-EL). We applied the OAP-EL to predict cognitive deficits at 2 years of age using quantitative brain maturation and geometric features obtained at term equivalent age in very preterm infants. We demonstrated that the proposed OAP-EL approach significantly outperformed the peer ensemble learning and traditional machine learning approaches.

LGJul 28, 2018
SEA: A Combined Model for Heat Demand Prediction

Jiyang Xie, Jiaxin Guo, Zhanyu Ma et al.

Heat demand prediction is a prominent research topic in the area of intelligent energy networks. It has been well recognized that periodicity is one of the important characteristics of heat demand. Seasonal-trend decomposition based on LOESS (STL) algorithm can analyze the periodicity of a heat demand series, and decompose the series into seasonal and trend components. Then, predicting the seasonal and trend components respectively, and combining their predictions together as the heat demand prediction is a possible way to predict heat demand. In this paper, STL-ENN-ARIMA (SEA), a combined model, was proposed based on the combination of the Elman neural network (ENN) and the autoregressive integrated moving average (ARIMA) model, which are commonly applied to heat demand prediction. ENN and ARIMA are used to predict seasonal and trend components, respectively. Experimental results demonstrate that the proposed SEA model has a promising performance.