LG May 7
Full-Spectrum Graph Neural Network: Expressive and Scalable
Xiaohan Wang, Deyu Bo, Longlong Li et al.
It is well established that spectral graph neural networks (GNNs) can universally approximate node signals; however, their expressive power remains bounded by the 1-dimensional Weisfeiler-Lehman test, which is mirrored in their lack of universality for higher-order signals. To go beyond this bound, we propose the Full-Spectrum GNN (FSpecGNN), a second-order generalization of classical spectral GNNs. FSpecGNN advances spectral filtering in two ways: (1) it lifts the signal from the node domain to the node-pair domain; and (2) it extends the univariate spectral filter over eigenvalues to a bivariate filter over eigenvalue pairs. We show that classical spectral GNNs arise as a diagonal special case of FSpecGNN, and prove that FSpecGNN can be at most as expressive as Local 2-GNN while universally approximating node-pair signals, the latter being particularly beneficial for heterophilic graph learning. Moreover, FSpecGNN admits scalable implementations that avoid explicit node-pair-level computations; combined with a low-rank approximation that reduces full-spectrum convolution to a combination of polynomial spectral filters, it enables learning on large graphs. Empirically, FSpecGNN validates the predicted expressivity and delivers strong performance on heterophilic benchmarks.
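One common way to write a bivariate spectral filter on node-pair signals is sketched below. It is an illustrative assumption, not necessarily the paper's exact definition. It uses the eigendecomposition \(L = U Λ U^\top\) of a graph Laplacian; the symbols \(X\), \(G\), \(g\), \(h\), and \(k\) are notation introduced only for this sketch.

```latex
% Classical (univariate) spectral filter on a node signal x \in \mathbb{R}^{n}
y = U \, g(\Lambda) \, U^{\top} x = \sum_{i=1}^{n} g(\lambda_i) \, u_i u_i^{\top} x

% Assumed bivariate filter on a node-pair signal X \in \mathbb{R}^{n \times n}
Y = U \left( G \odot \left( U^{\top} X U \right) \right) U^{\top},
\qquad G_{ij} = g(\lambda_i, \lambda_j)

% Separable special case g(\lambda, \mu) = h(\lambda) \, k(\mu)
Y = h(L) \, X \, k(L)
```

In the separable case, \(G\) is the outer product of the vectors \(h(λ_i)\) and \(k(λ_j)\), so the filter reduces to two univariate filters applied to the rows and columns of \(X\). A sum of a few such products would be one way to obtain a low-rank form built from polynomial filters; the abstract does not say whether FSpecGNN uses exactly this construction.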
CV Mar 2, 2023
Photovoltaic Panel Defect Detection Based on Ghost Convolution with BottleneckCSP and Tiny Target Prediction Head Incorporating YOLOv5
Longlong Li, Zhifeng Wang, Tingting Zhang
Photovoltaic (PV) panel surface-defect detection technology is crucial for the PV industry to perform smart maintenance. Using computer vision technology to detect PV panel surface defects can ensure better accuracy while reducing the workload of traditional worker field inspections. However, multiple tiny defects on the PV panel surface and the high similarity between different defects make it challenging to accurately identify and detect such defects. This paper proposes an approach named Ghost convolution with BottleneckCSP and a tiny target prediction head incorporating YOLOv5 (GBH-YOLOv5) for PV panel defect detection. To ensure better accuracy on multiscale targets, the BottleneckCSP module is introduced, a prediction head for tiny target detection is added to alleviate missed tiny defects, and Ghost convolution is used to improve the model inference speed and reduce the number of parameters. First, the original image is compressed and cropped to enlarge the defect size physically. Then, the processed images are input into GBH-YOLOv5, and the depth features are extracted through network processing based on Ghost convolution, the application of the BottleneckCSP module, and the prediction head of tiny targets. Finally, the extracted features are classified by a Feature Pyramid Network (FPN) and a Path Aggregation Network (PAN) structure. We also compare our method with state-of-the-art methods to verify its effectiveness. The proposed PV panel surface-defect detection network improves the mAP performance by at least 27.8%.
LG Feb 2
MGKAN: Predicting Asymmetric Drug-Drug Interactions via a Multimodal Graph Kolmogorov-Arnold Network
Kunyi Fan, Mengjie Chen, Longlong Li et al.
Predicting drug-drug interactions (DDIs) is essential for safe pharmacological treatments. Previous graph neural network (GNN) models leverage molecular structures and interaction networks but mostly rely on linear aggregation and symmetric assumptions, limiting their ability to capture nonlinear and heterogeneous patterns. We propose MGKAN, a Graph Kolmogorov-Arnold Network that introduces learnable basis functions into asymmetric DDI prediction. MGKAN replaces conventional MLP transformations with KAN-driven basis functions, enabling more expressive and nonlinear modeling of drug relationships. To capture pharmacological dependencies, MGKAN integrates three network views (an asymmetric DDI network, a co-interaction network, and a biochemical similarity network) with role-specific embeddings to preserve directional semantics. A fusion module combines linear attention and nonlinear transformation to enhance representational capacity. On two benchmark datasets, MGKAN outperforms seven state-of-the-art baselines. Ablation studies and case studies confirm its predictive accuracy and effectiveness in modeling directional drug effects.
LG Apr 29
Cheeger-Hodge Contrastive Learning for Structurally Robust Graph Representation Learning
Mengyang Zhao, Longlong Li, Cunquan Qu
Graph Contrastive Learning (GCL) has emerged as a prominent framework for unsupervised graph representation learning. However, relying on augmentation design alone to define the invariances learned by GCL can be brittle under structural perturbations. To address this issue, we propose Cheeger-Hodge Contrastive Learning (CHCL), a framework that aligns a perturbation-stable Cheeger-Hodge joint signature across augmented views for robust graph representation learning. The proposed signature combines a Cheeger-inspired connectivity signature derived from the algebraic connectivity \(λ_2\) with the low-frequency spectrum of the 1-Hodge Laplacian, thereby capturing both global connectivity and higher-order structural information. By aligning encoder representations with the proposed Cheeger-Hodge joint signature across augmented views, CHCL learns graph embeddings that are robust to local structural perturbations. Extensive experiments on standard benchmarks and transfer settings demonstrate that CHCL consistently improves performance, robustness, and generalization.
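The algebraic connectivity \(λ_2\) mentioned above is the second-smallest eigenvalue of the graph Laplacian, and Cheeger-type inequalities relate it to how hard the graph is to cut. The following is a minimal, dependency-free sketch of estimating \(λ_2\) by power iteration. The function name, the shifted-matrix trick, and the deterministic start vector are choices made for this illustration, and the abstract does not specify how CHCL turns \(λ_2\) into its signature.

```python
import math


def algebraic_connectivity(adj, iters=2000, tol=1e-12):
    """Estimate lambda_2 of the combinatorial Laplacian L = D - A.

    adj is a symmetric 0/1 adjacency matrix (list of lists) with n >= 2.
    Power iteration runs on M = c*I - L restricted to vectors orthogonal
    to the all-ones vector, so the dominant eigenvalue of M is c - lambda_2.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    # Largest Laplacian eigenvalue is at most 2*max(deg); the +1 keeps M
    # strictly positive definite so iterates never collapse to zero.
    c = 2.0 * max(deg) + 1.0

    def laplacian_times(v):
        return [deg[i] * v[i] - sum(adj[i][j] * v[j] for j in range(n))
                for i in range(n)]

    def center_and_normalize(v):
        mean = sum(v) / n
        v = [x - mean for x in v]  # remove the all-ones component
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    # Deterministic start; assumed not orthogonal to the lambda_2 eigenspace.
    v = center_and_normalize([float(i) for i in range(n)])
    lam = float("inf")
    for _ in range(iters):
        lv = laplacian_times(v)
        new_lam = sum(a * b for a, b in zip(v, lv))  # Rayleigh quotient
        if abs(new_lam - lam) < tol:
            return new_lam
        v = center_and_normalize([c * x - y for x, y in zip(v, lv)])
        lam = new_lam
    return lam
```

For a 4-node path graph this estimate approaches \(2 - \sqrt{2}\), the known closed-form value. In practice a dense or sparse eigensolver from a numerical library would usually replace the hand-written iteration.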
LG Oct 15, 2024
KA-GNN: Kolmogorov-Arnold Graph Neural Networks for Molecular Property Prediction
Longlong Li, Yipeng Zhang, Guanghui Wang et al.
As key models in geometric deep learning, graph neural networks have demonstrated enormous power in molecular data analysis. Recently, a specially designed learning scheme, known as the Kolmogorov-Arnold Network (KAN), has shown unique potential for the improvement of model accuracy, efficiency, and explainability. Here we propose the first non-trivial Kolmogorov-Arnold Network-based Graph Neural Networks (KA-GNNs), including KAN-based graph convolutional networks (KA-GCN) and KAN-based graph attention networks (KA-GAT). The essential idea is to utilize KAN's unique power to optimize GNN architectures at three major levels: node embedding, message passing, and readout. Further, building on the strong approximation capability of Fourier series, we develop a Fourier series-based KAN model and provide a rigorous mathematical proof of the robust approximation capability of this Fourier KAN architecture. To validate our KA-GNNs, we consider seven of the most widely used benchmark datasets for molecular property prediction and compare extensively with existing state-of-the-art models. We find that our KA-GNNs can outperform traditional GNN models. More importantly, our Fourier KAN module can not only increase the model accuracy but also reduce the computational time. This work not only highlights the great power of KA-GNNs in molecular property prediction but also provides a novel geometric deep learning framework for general non-Euclidean data analysis.
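To make the Fourier series-based KAN idea concrete, here is a minimal sketch of a single layer in which every input-to-output edge carries a learnable truncated Fourier series instead of a scalar weight. The function name, the coefficient layout, and the absence of bias or normalization terms are assumptions for illustration; the abstract does not give the paper's exact parameterization.

```python
import math


def fourier_kan_layer(x, a, b):
    """Apply one Fourier-series KAN layer (illustrative layout).

    x: list of d_in input values.
    a, b: coefficient arrays indexed [j][i][k-1] for output j, input i,
          and harmonic k = 1..K (cosine and sine terms respectively).
    Returns y with y[j] = sum_i sum_k a*cos(k*x_i) + b*sin(k*x_i).
    """
    out = []
    for a_j, b_j in zip(a, b):
        total = 0.0
        for x_i, a_ji, b_ji in zip(x, a_j, b_j):
            # Each edge (i -> j) evaluates its own truncated Fourier series.
            for k, (a_k, b_k) in enumerate(zip(a_ji, b_ji), start=1):
                total += a_k * math.cos(k * x_i) + b_k * math.sin(k * x_i)
        out.append(total)
    return out
```

Stacking such layers, with the coefficients trained by gradient descent, gives a network whose nonlinearity lives on the edges rather than in fixed node activations.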
CV Jan 14, 2025
Make-A-Character 2: Animatable 3D Character Generation From a Single Image
Lin Liu, Yutong Wang, Jiahao Chen et al.
This report introduces Make-A-Character 2, an advanced system for generating high-quality 3D characters from single portrait photographs, ideal for game development and digital human applications. Make-A-Character 2 builds upon its predecessor by incorporating several significant improvements for image-based head generation. We utilize the IC-Light method to correct non-ideal illumination in input photos and apply neural network-based color correction to harmonize skin tones between the photos and game engine renders. We also employ the Hierarchical Representation Network to capture high-frequency facial structures and conduct adaptive skeleton calibration for accurate and expressive facial animations. The entire image-to-3D-character generation process takes less than 2 minutes. Furthermore, we leverage a transformer architecture to generate co-speech facial and gesture actions, enabling real-time conversation with the generated character. These technologies have been integrated into our conversational AI avatar products.
LG May 14, 2025
Rhomboid Tiling for Geometric Graph Deep Learning
Yipeng Zhang, Longlong Li, Kelin Xia
Graph Neural Networks (GNNs) have proven effective for learning from graph-structured data through their neighborhood-based message passing framework. Many hierarchical graph clustering pooling methods modify this framework by introducing clustering-based strategies, enabling the construction of more expressive and powerful models. However, all of these message passing frameworks rely heavily on the connectivity structure of graphs, limiting their ability to capture the rich geometric features inherent in geometric graphs. To address this, we propose Rhomboid Tiling (RT) clustering, a novel clustering method based on the rhomboid tiling structure, which performs clustering by leveraging the complex geometric information of the data and effectively extracts its higher-order geometric structures. Moreover, we design RTPool, a hierarchical graph clustering pooling model based on RT clustering for graph classification tasks. The proposed model demonstrates superior performance, outperforming 21 state-of-the-art competitors on all 7 benchmark datasets.
LG Mar 23
MISApp: Multi-Hop Intent-Aware Session Graph Learning for Next App Prediction
Yunchi Yang, Longlong Li, Jianliang Wu et al.
Predicting the next mobile app a user will launch is essential for proactive mobile services. Yet accurate prediction remains challenging in real-world settings, where user intent can shift rapidly within short sessions and user-specific historical profiles are often sparse or unavailable, especially under cold-start conditions. Existing approaches mainly model app usage as sequential behavior or local session transitions, limiting their ability to capture higher-order structural dependencies and evolving session intent. To address this issue, we propose MISApp, a profile-free framework for next app prediction based on multi-hop session graph learning. MISApp constructs multi-hop session graphs to capture transition dependencies at different structural ranges, learns session representations through lightweight graph propagation, incorporates temporal and spatial context to characterize session conditions, and captures intent evolution from recent interactions. Experiments on two real-world app usage datasets show that MISApp consistently outperforms competitive baselines under both standard and cold-start settings, while maintaining a favorable balance between predictive accuracy and practical efficiency. Further analyses show that the learned hop-level attention weights align well with structural relevance, offering interpretable evidence for the effectiveness of the proposed multi-hop modeling strategy.
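One plausible reading of "multi-hop session graphs" is sketched below: for each hop distance, an edge links an app to the app launched that many steps later in the same session. The function name, the directed counted-edge representation, and the example app names are hypothetical; MISApp's actual graph construction may differ.

```python
from collections import Counter


def multi_hop_edges(session, max_hop):
    """Return {hop: Counter of (src, dst) pairs} for hops 1..max_hop.

    A hop-h edge links the app at position i to the app at position i + h,
    and the Counter value records how often that transition occurs.
    """
    graphs = {}
    for hop in range(1, max_hop + 1):
        # zip pairs session[i] with session[i + hop] and stops at the end.
        graphs[hop] = Counter(zip(session, session[hop:]))
    return graphs


if __name__ == "__main__":
    example = ["mail", "maps", "chat", "mail"]
    for hop, edges in multi_hop_edges(example, 2).items():
        print(hop, dict(edges))
```

The hop-1 graph captures immediate transitions, while larger hops expose longer-range dependencies that a model could weight separately, for instance with the hop-level attention the abstract mentions.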
LG Oct 29, 2025
Dynamically Weighted Momentum with Adaptive Step Sizes for Efficient Deep Network Training
Zhifeng Wang, Longlong Li, Chunyan Zeng
Within the current sphere of deep learning research, despite the extensive application of optimization algorithms such as Stochastic Gradient Descent (SGD) and Adaptive Moment Estimation (Adam), there remains a pronounced inadequacy in their capability to address fluctuations in learning efficiency, meet the demands of complex models, and tackle non-convex optimization issues. These challenges primarily arise from the algorithms' limitations in handling complex data structures and models, for instance, difficulties in selecting an appropriate learning rate, avoiding local optima, and navigating through high-dimensional spaces. To address these issues, this paper introduces a novel optimization algorithm named DWMGrad. This algorithm, building on the foundations of traditional methods, incorporates a dynamic guidance mechanism reliant on historical data to dynamically update momentum and learning rates. This allows the optimizer to flexibly adjust its reliance on historical information, adapting to various training scenarios. This strategy not only enables the optimizer to better adapt to changing environments and task complexities but also, as validated through extensive experimentation, demonstrates DWMGrad's ability to achieve faster convergence rates and higher accuracies under a multitude of scenarios.
LG May 21, 2025
Beyond Node Attention: Multi-Scale Harmonic Encoding for Feature-Wise Graph Message Passing
Longlong Li, Cunquan Qu, Guanghui Wang
Conventional Graph Neural Networks (GNNs) aggregate neighbor embeddings as holistic vectors, lacking the ability to identify fine-grained, direction-specific feature relevance. We propose MSH-GNN (Multi-Scale Harmonic Graph Neural Network), a novel architecture that performs feature-wise adaptive message passing through node-specific harmonic projections. For each node, MSH-GNN dynamically projects neighbor features onto frequency-sensitive directions determined by the target node's own representation. These projections are further modulated using learnable sinusoidal encodings at multiple frequencies, enabling the model to capture both smooth and oscillatory structural patterns across scales. A frequency-aware attention pooling mechanism is introduced to emphasize spectrally and structurally salient nodes during readout. Theoretically, we prove that MSH-GNN approximates shift-invariant kernels and matches the expressive power of the 1-Weisfeiler-Lehman (1-WL) test. Empirically, MSH-GNN consistently outperforms state-of-the-art models on a wide range of graph and node classification tasks. Furthermore, in challenging classification settings involving joint variations in graph topology and spectral frequency, MSH-GNN excels at capturing structural asymmetries and high-frequency modulations, enabling more accurate graph discrimination.
LG Feb 24, 2025
TGT: A Temporal Gating Transformer for Smartphone App Usage Prediction
Longlong Li, Cunquan Qu, Guanghui Wang
Accurately predicting smartphone app usage is challenging due to the sparsity and irregularity of user behavior, especially under cold-start and low-activity conditions. Existing approaches mostly rely on static or attention-only architectures, which struggle to model fine-grained temporal dynamics. We propose TGT, a Transformer framework equipped with a temporal gating module that conditions hidden representations on the hour-of-day. Unlike conventional time embeddings, temporal gating adaptively rescales feature dimensions in a time-aware manner, working orthogonally to self-attention and strengthening temporal sensitivity. TGT further incorporates a context-aware encoder that integrates session sequences and user profiles into a unified representation. Experiments on two real-world datasets, Tsinghua App Usage and LSApp, demonstrate that TGT significantly outperforms 15 competitive baselines, achieving notable gains in HR@1 and maintaining robustness under cold-start scenarios. Beyond accuracy, analysis of gating vectors uncovers interpretable daily usage rhythms, showing that TGT learns human-consistent patterns of app behavior. These results establish TGT as both a powerful and interpretable framework for time-aware app usage prediction.
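The temporal gating idea, rescaling feature dimensions as a function of hour-of-day, can be sketched as follows. The sigmoid gate, the 24-row lookup table, and all names here are assumptions made for illustration; the abstract does not specify how TGT parameterizes its gate.

```python
import math


def temporal_gate(hidden, hour, gate_params):
    """Rescale each hidden dimension by a sigmoid gate chosen by hour-of-day.

    hidden: list of d feature values.
    hour: integer in 0..23.
    gate_params: 24 rows of d learnable logits (hypothetical layout).
    """
    logits = gate_params[hour]
    gates = [1.0 / (1.0 + math.exp(-z)) for z in logits]  # values in (0, 1)
    return [g * h for g, h in zip(gates, hidden)]
```

Unlike an additive time embedding, the gate multiplies the hidden state elementwise, so it can suppress or pass individual feature dimensions depending on the time of day while leaving self-attention unchanged.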