Hongxu Chen

h-index32

29papers

2,001citations

Novelty52%

AI Score51

Ranked #16,302 of 194,257 authors (top 8%)#4,109 in LG (top 10%)

29 Papers

13.0LGJul 1, 2022

Generating Counterfactual Hard Negative Samples for Graph Contrastive Learning

Haoran Yang, Hongxu Chen, Sixiao Zhang et al.

Graph contrastive learning has emerged as a powerful tool for unsupervised graph representation learning. The key to the success of graph contrastive learning is to acquire high-quality positive and negative samples as contrasting pairs for the purpose of learning underlying structural semantics of the input graph. Recent works usually sample negative samples from the same training batch with the positive samples, or from an external irrelevant graph. However, a significant limitation lies in such strategies, which is the unavoidable problem of sampling false negative samples. In this paper, we propose a novel method to utilize \textbf{C}ounterfactual mechanism to generate artificial hard negative samples for \textbf{G}raph \textbf{C}ontrastive learning, namely \textbf{CGC}, which has a different perspective compared to those sampling-based strategies. We utilize counterfactual mechanism to produce hard negative samples, which ensures that the generated samples are similar to, but have labels that different from the positive sample. The proposed method achieves satisfying results on several datasets compared to some traditional unsupervised graph learning methods and some SOTA graph contrastive learning methods. We also conduct some supplementary experiments to give an extensive illustration of the proposed method, including the performances of CGC with different hard negative samples and evaluations for hard negative samples generated with different similarity measurements.

4.6LGJul 24, 2022

Mitigating the Performance Sacrifice in DP-Satisfied Federated Settings through Graph Contrastive Learning

Haoran Yang, Xiangyu Zhao, Muyang Li et al.

Currently, graph learning models are indispensable tools to help researchers explore graph-structured data. In academia, using sufficient training data to optimize a graph model on a single device is a typical approach for training a capable graph learning model. Due to privacy concerns, however, it is infeasible to do so in real-world scenarios. Federated learning provides a practical means of addressing this limitation by introducing various privacy-preserving mechanisms, such as differential privacy (DP) on the graph edges. However, although DP in federated graph learning can ensure the security of sensitive information represented in graphs, it usually causes the performance of graph learning models to degrade. In this paper, we investigate how DP can be implemented on graph edges and observe a performance decrease in our experiments. In addition, we note that DP on graph edges introduces noise that perturbs graph proximity, which is one of the graph augmentations in graph contrastive learning. Inspired by this, we propose leveraging graph contrastive learning to alleviate the performance drop resulting from DP. Extensive experiments conducted with four representative graph models on five widely used benchmark datasets show that contrastive learning indeed alleviates the models' DP-induced performance drops.

11.4LGJun 1

HMPO: Hybrid Median-length Policy Optimization for Chain-of-Thought Compression

Minghui Zheng, Hongxu Chen, Huimin Ren et al.

Large language models achieve remarkable performance via extended chain-of-thought (CoT) reasoning, yet this lengthy process incurs substantial inference overhead. Existing CoT compression methods struggle with inflexible manual length budgets, computationally expensive multi-stage training pipelines, and fragile scalability restricted to small models. We propose HMPO (Hybrid Median-length Policy Optimization), a cost-effective, single-stage reinforcement learning framework. HMPO efficiently compresses CoT via three synergistic components: an adaptive median-based budget derived from successful rollouts to eliminate manual tuning, a cosine-decay token reward for smooth length penalization, and a multiplicative reward formulation that substantially mitigates trivial reward hacking by strictly prioritizing answer correctness. Trained exclusively on mathematical data, HMPO generalizes seamlessly across math, code, science, and instruction-following tasks. Extensive experiments scaling from 9B to 122B parameters across dense and Mixture-of-Experts (MoE) architectures demonstrate that HMPO achieves 19%--46% token compression with negligible accuracy degradation, all while drastically reducing training costs compared to existing multi-stage baselines.

6.6LGOct 25, 2023Code

Defense Against Model Extraction Attacks on Recommender Systems

Sixiao Zhang, Hongzhi Yin, Hongxu Chen et al.

The robustness of recommender systems has become a prominent topic within the research community. Numerous adversarial attacks have been proposed, but most of them rely on extensive prior knowledge, such as all the white-box attacks or most of the black-box attacks which assume that certain external knowledge is available. Among these attacks, the model extraction attack stands out as a promising and practical method, involving training a surrogate model by repeatedly querying the target model. However, there is a significant gap in the existing literature when it comes to defending against model extraction attacks on recommender systems. In this paper, we introduce Gradient-based Ranking Optimization (GRO), which is the first defense strategy designed to counter such attacks. We formalize the defense as an optimization problem, aiming to minimize the loss of the protected target model while maximizing the loss of the attacker's surrogate model. Since top-k ranking lists are non-differentiable, we transform them into swap matrices which are instead differentiable. These swap matrices serve as input to a student model that emulates the surrogate model's behavior. By back-propagating the loss of the student model, we obtain gradients for the swap matrices. These gradients are used to compute a swap loss, which maximizes the loss of the student model. We conducted experiments on three benchmark datasets to evaluate the performance of GRO, and the results demonstrate its superior effectiveness in defending against model extraction attacks.

4.0IRJul 17, 2024Code

Watermarking Recommender Systems

Sixiao Zhang, Cheng Long, Wei Yuan et al.

Recommender systems embody significant commercial value and represent crucial intellectual property. However, the integrity of these systems is constantly challenged by malicious actors seeking to steal their underlying models. Safeguarding against such threats is paramount to upholding the rights and interests of the model owner. While model watermarking has emerged as a potent defense mechanism in various domains, its direct application to recommender systems remains unexplored and non-trivial. In this paper, we address this gap by introducing Autoregressive Out-of-distribution Watermarking (AOW), a novel technique tailored specifically for recommender systems. Our approach entails selecting an initial item and querying it through the oracle model, followed by the selection of subsequent items with small prediction scores. This iterative process generates a watermark sequence autoregressively, which is then ingrained into the model's memory through training. To assess the efficacy of the watermark, the model is tasked with predicting the subsequent item given a truncated watermark sequence. Through extensive experimentation and analysis, we demonstrate the superior performance and robust properties of AOW. Notably, our watermarking technique exhibits high-confidence extraction capabilities and maintains effectiveness even in the face of distillation and fine-tuning processes.

31.1CLFeb 22, 2022Code

DialMed: A Dataset for Dialogue-based Medication Recommendation

Zhenfeng He, Yuqiang Han, Zhenqiu Ouyang et al.

Medication recommendation is a crucial task for intelligent healthcare systems. Previous studies mainly recommend medications with electronic health records (EHRs). However, some details of interactions between doctors and patients may be ignored or omitted in EHRs, which are essential for automatic medication recommendation. Therefore, we make the first attempt to recommend medications with the conversations between doctors and patients. In this work, we construct DIALMED, the first high-quality dataset for medical dialogue-based medication recommendation task. It contains 11,996 medical dialogues related to 16 common diseases from 3 departments and 70 corresponding common medications. Furthermore, we propose a Dialogue structure and Disease knowledge aware Network (DDN), where a QA Dialogue Graph mechanism is designed to model the dialogue structure and the knowledge graph is used to introduce external disease knowledge. The extensive experimental results demonstrate that the proposed method is a promising solution to recommend medications with medical dialogues. The dataset and code are available at https://github.com/f-window/DialMed.

32.1CVApr 9, 2024Code

MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection

Haoyang He, Yuhu Bai, Jiangning Zhang et al.

Recent advancements in anomaly detection have seen the efficacy of CNN- and transformer-based approaches. However, CNNs struggle with long-range dependencies, while transformers are burdened by quadratic computational complexity. Mamba-based models, with their superior long-range modeling and linear efficiency, have garnered substantial attention. This study pioneers the application of Mamba to multi-class unsupervised anomaly detection, presenting MambaAD, which consists of a pre-trained encoder and a Mamba decoder featuring (Locality-Enhanced State Space) LSS modules at multi-scales. The proposed LSS module, integrating parallel cascaded (Hybrid State Space) HSS blocks and multi-kernel convolutions operations, effectively captures both long-range and local information. The HSS block, utilizing (Hybrid Scanning) HS encoders, encodes feature maps into five scanning methods and eight directions, thereby strengthening global connections through the (State Space Model) SSM. The use of Hilbert scanning and eight directions significantly improves feature sequence modeling. Comprehensive experiments on six diverse anomaly detection datasets and seven metrics demonstrate state-of-the-art performance, substantiating the method's effectiveness. The code and models are available at https://lewandofskee.github.io/projects/MambaAD.

19.0CVDec 11, 2023Code

DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection

Haoyang He, Jiangning Zhang, Hongxu Chen et al.

Reconstruction-based approaches have achieved remarkable outcomes in anomaly detection. The exceptional image reconstruction capabilities of recently popular diffusion models have sparked research efforts to utilize them for enhanced reconstruction of anomalous images. Nonetheless, these methods might face challenges related to the preservation of image categories and pixel-wise structural integrity in the more practical multi-class setting. To solve the above problems, we propose a Difusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection, which consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor. Firstly, The SG network is proposed for reconstructing anomalous regions while preserving the original image's semantic information. Secondly, we introduce Spatial-aware Feature Fusion (SFF) block to maximize reconstruction accuracy when dealing with extensively reconstructed areas. Thirdly, the input and reconstructed images are processed by a pre-trained feature extractor to generate anomaly maps based on features extracted at different scales. Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach which surpasses the state-of-the-art methods, e.g., achieving 96.8/52.6 and 97.2/99.0 (AUROC/AP) for localization and detection respectively on multi-class MVTec-AD dataset. Code will be available at https://lewandofskee.github.io/projects/diad.

14.7CVNov 24, 2024Code

MobileMamba: Lightweight Multi-Receptive Visual Mamba Network

Haoyang He, Jiangning Zhang, Yuxuan Cai et al.

Previous research on lightweight models has primarily focused on CNNs and Transformer-based designs. CNNs, with their local receptive fields, struggle to capture long-range dependencies, while Transformers, despite their global modeling capabilities, are limited by quadratic computational complexity in high-resolution scenarios. Recently, state-space models have gained popularity in the visual domain due to their linear computational complexity. Despite their low FLOPs, current lightweight Mamba-based models exhibit suboptimal throughput. In this work, we propose the MobileMamba framework, which balances efficiency and performance. We design a three-stage network to enhance inference speed significantly. At a fine-grained level, we introduce the Multi-Receptive Field Feature Interaction(MRFFI) module, comprising the Long-Range Wavelet Transform-Enhanced Mamba(WTE-Mamba), Efficient Multi-Kernel Depthwise Convolution(MK-DeConv), and Eliminate Redundant Identity components. This module integrates multi-receptive field information and enhances high-frequency detail extraction. Additionally, we employ training and testing strategies to further improve performance and efficiency. MobileMamba achieves up to 83.6% on Top-1, surpassing existing state-of-the-art methods which is maximum x21 faster than LocalVim on GPU. Extensive experiments on high-resolution downstream tasks demonstrate that MobileMamba surpasses current efficient models, achieving an optimal balance between speed and accuracy.

14.2LGNov 21, 2024

IterIS: Iterative Inference-Solving Alignment for LoRA Merging

Hongxu Chen, Runshi Li, Bowei Zhu et al.

Low-rank adaptations (LoRA) are widely used to fine-tune large models across various domains for specific downstream tasks. While task-specific LoRAs are often available, concerns about data privacy and intellectual property can restrict access to training data, limiting the acquisition of a multi-task model through gradient-based training. In response, LoRA merging presents an effective solution by combining multiple LoRAs into a unified adapter while maintaining data privacy. Prior works on LoRA merging primarily frame it as an optimization problem, yet these approaches face several limitations, including the rough assumption about input features utilized in optimization, massive sample requirements, and the unbalanced optimization objective. These limitations can significantly degrade performance. To address these, we propose a novel optimization-based method, named IterIS: 1) We formulate LoRA merging as an advanced optimization problem to mitigate the rough assumption. Additionally, we employ an iterative inference-solving framework in our algorithm. It can progressively refine the optimization objective for improved performance. 2) We introduce an efficient regularization term to reduce the need for massive sample requirements (requiring only 1-5% of the unlabeled samples compared to prior methods). 3) We utilize adaptive weights in the optimization objective to mitigate potential unbalances in LoRA merging process. Our method demonstrates significant improvements over multiple baselines and state-of-the-art methods in composing tasks for text-to-image diffusion, vision-language models, and large language models. Furthermore, our layer-wise algorithm can achieve convergence with minimal steps, ensuring efficiency in both memory and computation.

15.1LGFeb 17, 2022Code

Graph Masked Autoencoders with Transformers

Sixiao Zhang, Hongxu Chen, Haoran Yang et al.

Recently, transformers have shown promising performance in learning graph representations. However, there are still some challenges when applying transformers to real-world scenarios due to the fact that deep transformers are hard to train from scratch and the quadratic memory consumption w.r.t. the number of nodes. In this paper, we propose Graph Masked Autoencoders (GMAEs), a self-supervised transformer-based model for learning graph representations. To address the above two challenges, we adopt the masking mechanism and the asymmetric encoder-decoder design. Specifically, GMAE takes partially masked graphs as input, and reconstructs the features of the masked nodes. The encoder and decoder are asymmetric, where the encoder is a deep transformer and the decoder is a shallow transformer. The masking mechanism and the asymmetric design make GMAE a memory-efficient model compared with conventional transformers. We show that, when serving as a conventional self-supervised graph representation model, GMAE achieves state-of-the-art performance on both the graph classification task and the node classification task under common downstream evaluation protocols. We also show that, compared with training in an end-to-end manner from scratch, we can achieve comparable performance after pre-training and fine-tuning using GMAE while simplifying the training process.

16.1LGJan 20, 2022Code

Unsupervised Graph Poisoning Attack via Contrastive Loss Back-propagation

Sixiao Zhang, Hongxu Chen, Xiangguo Sun et al.

Graph contrastive learning is the state-of-the-art unsupervised graph representation learning framework and has shown comparable performance with supervised approaches. However, evaluating whether the graph contrastive learning is robust to adversarial attacks is still an open problem because most existing graph adversarial attacks are supervised models, which means they heavily rely on labels and can only be used to evaluate the graph contrastive learning in a specific scenario. For unsupervised graph representation methods such as graph contrastive learning, it is difficult to acquire labels in real-world scenarios, making traditional supervised graph attack methods difficult to be applied to test their robustness. In this paper, we propose a novel unsupervised gradient-based adversarial attack that does not rely on labels for graph contrastive learning. We compute the gradients of the adjacency matrices of the two views and flip the edges with gradient ascent to maximize the contrastive loss. In this way, we can fully use multiple views generated by the graph contrastive learning models and pick the most informative edges without knowing their labels, and therefore can promisingly support our model adapted to more kinds of downstream tasks. Extensive experiments show that our attack outperforms unsupervised baseline attacks and has comparable performance with supervised attacks in multiple downstream tasks including node classification and link prediction. We further show that our attack can be transferred to other graph representation models as well.

16.5LGJan 19, 2022

Dual Space Graph Contrastive Learning

Haoran Yang, Hongxu Chen, Shirui Pan et al.

Unsupervised graph representation learning has emerged as a powerful tool to address real-world problems and achieves huge success in the graph learning domain. Graph contrastive learning is one of the unsupervised graph representation learning methods, which recently attracts attention from researchers and has achieved state-of-the-art performances on various tasks. The key to the success of graph contrastive learning is to construct proper contrasting pairs to acquire the underlying structural semantics of the graph. However, this key part is not fully explored currently, most of the ways generating contrasting pairs focus on augmenting or perturbating graph structures to obtain different views of the input graph. But such strategies could degrade the performances via adding noise into the graph, which may narrow down the field of the applications of graph contrastive learning. In this paper, we propose a novel graph contrastive learning method, namely \textbf{D}ual \textbf{S}pace \textbf{G}raph \textbf{C}ontrastive (DSGC) Learning, to conduct graph contrastive learning among views generated in different spaces including the hyperbolic space and the Euclidean space. Since both spaces have their own advantages to represent graph data in the embedding spaces, we hope to utilize graph contrastive learning to bridge the spaces and leverage advantages from both sides. The comparison experiment results show that DSGC achieves competitive or better performances among all the datasets. In addition, we conduct extensive experiments to analyze the impact of different graph encoders on DSGC, giving insights about how to better leverage the advantages of contrastive learning between different spaces.

28.8LGJan 17, 2022Code

Towards Unsupervised Deep Graph Structure Learning

Yixin Liu, Yu Zheng, Daokun Zhang et al.

In recent years, graph neural networks (GNNs) have emerged as a successful tool in a variety of graph-related applications. However, the performance of GNNs can be deteriorated when noisy connections occur in the original graph structures; besides, the dependence on explicit structures prevents GNNs from being applied to general unstructured scenarios. To address these issues, recently emerged deep graph structure learning (GSL) methods propose to jointly optimize the graph structure along with GNN under the supervision of a node classification task. Nonetheless, these methods focus on a supervised learning scenario, which leads to several problems, i.e., the reliance on labels, the bias of edge distribution, and the limitation on application tasks. In this paper, we propose a more practical GSL paradigm, unsupervised graph structure learning, where the learned graph topology is optimized by data itself without any external guidance (i.e., labels). To solve the unsupervised GSL problem, we propose a novel StrUcture Bootstrapping contrastive LearnIng fraMEwork (SUBLIME for abbreviation) with the aid of self-supervised contrastive learning. Specifically, we generate a learning target from the original data as an "anchor graph", and use a contrastive loss to maximize the agreement between the anchor graph and the learned graph. To provide persistent guidance, we design a novel bootstrapping mechanism that upgrades the anchor graph with learned structures during model learning. We also design a series of graph learners and post-processing schemes to model the structures to learn. Extensive experiments on eight benchmark datasets demonstrate the significant effectiveness of our proposed SUBLIME and high quality of the optimized graphs.

14.1LGJan 7, 2022

GenLabel: Mixup Relabeling using Generative Models

Jy-yong Sohn, Liang Shang, Hongxu Chen et al.

Mixup is a data augmentation method that generates new data points by mixing a pair of input data. While mixup generally improves the prediction performance, it sometimes degrades the performance. In this paper, we first identify the main causes of this phenomenon by theoretically and empirically analyzing the mixup algorithm. To resolve this, we propose GenLabel, a simple yet effective relabeling algorithm designed for mixup. In particular, GenLabel helps the mixup algorithm correctly label mixup samples by learning the class-conditional data distribution using generative models. Via extensive theoretical and empirical analysis, we show that mixup, when used together with GenLabel, can effectively resolve the aforementioned phenomenon, improving the generalization performance and the adversarial robustness.

7.5IRNov 24, 2021

Reinforcement Learning based Path Exploration for Sequential Explainable Recommendation

Yicong Li, Hongxu Chen, Yile Li et al.

Recent advances in path-based explainable recommendation systems have attracted increasing attention thanks to the rich information provided by knowledge graphs. Most existing explainable recommendations only utilize static knowledge graphs and ignore the dynamic user-item evolutions, leading to less convincing and inaccurate explanations. Although there are some works that realize that modelling user's temporal sequential behaviour could boost the performance and explainability of the recommender systems, most of them either only focus on modelling user's sequential interactions within a path or independently and separately of the recommendation mechanism. In this paper, we propose a novel Temporal Meta-path Guided Explainable Recommendation leveraging Reinforcement Learning (TMER-RL), which utilizes reinforcement item-item path modelling between consecutive items with attention mechanisms to sequentially model dynamic user-item evolutions on dynamic knowledge graph for explainable recommendation. Compared with existing works that use heavy recurrent neural networks to model temporal information, we propose simple but effective neural networks to capture users' historical item features and path-based context to characterize the next purchased item. Extensive evaluations of TMER on two real-world datasets show state-of-the-art performance compared against recent strong baselines.

21.5LGOct 29, 2021Code

Improving Fairness via Federated Learning

Yuchen Zeng, Hongxu Chen, Kangwook Lee

Recently, lots of algorithms have been proposed for learning a fair classifier from decentralized data. However, many theoretical and algorithmic questions remain open. First, is federated learning necessary, i.e., can we simply train locally fair classifiers and aggregate them? In this work, we first propose a new theoretical framework, with which we demonstrate that federated learning can strictly boost model fairness compared with such non-federated algorithms. We then theoretically and empirically show that the performance tradeoff of FedAvg-based fair learning algorithms is strictly worse than that of a fair classifier trained on centralized data. To bridge this gap, we propose FedFB, a private fair learning algorithm on decentralized data. The key idea is to modify the FedAvg protocol so that it can effectively mimic the centralized fair learning. Our experimental results show that FedFB significantly outperforms existing approaches, sometimes matching the performance of the centrally trained model.

25.9CVSep 9, 2021Code

Is Attention Better Than Matrix Decomposition?

Zhengyang Geng, Meng-Hao Guo, Hongxu Chen et al.

As an essential ingredient of modern deep learning, attention mechanism, especially self-attention, plays a vital role in the global correlation discovery. However, is hand-crafted attention irreplaceable when modeling the global context? Our intriguing finding is that self-attention is not better than the matrix decomposition (MD) model developed 20 years ago regarding the performance and computational cost for encoding the long-distance dependencies. We model the global context issue as a low-rank recovery problem and show that its optimization algorithms can help design global information blocks. This paper then proposes a series of Hamburgers, in which we employ the optimization algorithms for solving MDs to factorize the input representations into sub-matrices and reconstruct a low-rank embedding. Hamburgers with different MDs can perform favorably against the popular global context module self-attention when carefully coping with gradients back-propagated through MDs. Comprehensive experiments are conducted in the vision tasks where it is crucial to learn the global context, including semantic segmentation and image generation, demonstrating significant improvements over self-attention and its variants.

16.6IRSep 7, 2021

Hyper Meta-Path Contrastive Learning for Multi-Behavior Recommendation

Haoran Yang, Hongxu Chen, Lin Li et al.

User purchasing prediction with multi-behavior information remains a challenging problem for current recommendation systems. Various methods have been proposed to address it via leveraging the advantages of graph neural networks (GNNs) or multi-task learning. However, most existing works do not take the complex dependencies among different behaviors of users into consideration. They utilize simple and fixed schemes, like neighborhood information aggregation or mathematical calculation of vectors, to fuse the embeddings of different user behaviors to obtain a unified embedding to represent a user's behavioral patterns which will be used in downstream recommendation tasks. To tackle the challenge, in this paper, we first propose the concept of hyper meta-path to construct hyper meta-paths or hyper meta-graphs to explicitly illustrate the dependencies among different behaviors of a user. How to obtain a unified embedding for a user from hyper meta-paths and avoid the previously mentioned limitations simultaneously is critical. Thanks to the recent success of graph contrastive learning, we leverage it to learn embeddings of user behavior patterns adaptively instead of assigning a fixed scheme to understand the dependencies among different behaviors. A new graph contrastive learning based framework is proposed by coupling with hyper meta-paths, namely HMG-CR, which consistently and significantly outperforms all baselines in extensive comparison experiments.

17.5IRMay 19, 2021Code

Where are we in embedding spaces? A Comprehensive Analysis on Network Embedding Approaches for Recommender Systems

Sixiao Zhang, Hongxu Chen, Xiao Ming et al.

Hyperbolic space and hyperbolic embeddings are becoming a popular research field for recommender systems. However, it is not clear under what circumstances the hyperbolic space should be considered. To fill this gap, This paper provides theoretical analysis and empirical results on when and where to use hyperbolic space and hyperbolic embeddings in recommender systems. Specifically, we answer the questions that which type of models and datasets are more suited for hyperbolic space, as well as which latent size to choose. We evaluate our answers by comparing the performance of Euclidean space and hyperbolic space on different latent space models in both general item recommendation domain and social recommendation domain, with 6 widely used datasets and different latent sizes. Additionally, we propose a new metric learning based recommendation method called SCML and its hyperbolic version HSCML. We evaluate our conclusions regarding hyperbolic space on SCML and show the state-of-the-art performance of hyperbolic space by comparing HSCML with other baseline methods.

8.5IRApr 15, 2021

Hyperbolic Neural Collaborative Recommender

Anchen Li, Bo Yang, Hongxu Chen et al.

This paper explores the use of hyperbolic geometry and deep learning techniques for recommendation. We present Hyperbolic Neural Collaborative Recommender (HNCR), a deep hyperbolic representation learning method that exploits mutual semantic relations among users/items for collaborative filtering (CF) tasks. HNCR contains two major phases: neighbor construction and recommendation framework. The first phase introduces a neighbor construction strategy to construct a semantic neighbor set for each user and item according to the user-item historical interaction. In the second phase, we develop a deep framework based on hyperbolic geometry to integrate constructed neighbor sets into recommendation. Via a series of extensive experiments, we show that HNCR outperforms its Euclidean counterpart and state-of-the-art baselines.

3.3SPJan 8, 2021

FENet: A Frequency Extraction Network for Obstructive Sleep Apnea Detection

Guanhua Ye, Hongzhi Yin, Tong Chen et al.

Obstructive Sleep Apnea (OSA) is a highly prevalent but inconspicuous disease that seriously jeopardizes the health of human beings. Polysomnography (PSG), the gold standard of detecting OSA, requires multiple specialized sensors for signal collection, hence patients have to physically visit hospitals and bear the costly treatment for a single detection. Recently, many single-sensor alternatives have been proposed to improve the cost efficiency and convenience. Among these methods, solutions based on RR-interval (i.e., the interval between two consecutive pulses) signals reach a satisfactory balance among comfort, portability and detection accuracy. In this paper, we advance RR-interval based OSA detection by considering its real-world practicality from energy perspectives. As photoplethysmogram (PPG) pulse sensors are commonly equipped on smart wrist-worn wearable devices (e.g., smart watches and wristbands), the energy efficiency of the detection model is crucial to fully support an overnight observation on patients. This creates challenges as the PPG sensors are unable to keep collecting continuous signals due to the limited battery capacity on smart wrist-worn devices. Therefore, we propose a novel Frequency Extraction Network (FENet), which can extract features from different frequency bands of the input RR-interval signals and generate continuous detection results with downsampled, discontinuous RR-interval signals. With the help of the one-to-multiple structure, FENet requires only one-third of the operation time of the PPG sensor, thus sharply cutting down the energy consumption and enabling overnight diagnosis. Experimental results on real OSA datasets reveal the state-of-the-art performance of FENet.

3.3SINov 25, 2020Code

Interpretable Signed Link Prediction with Signed Infomax Hyperbolic Graph

Yadan Luo, Zi Huang, Hongxu Chen et al.

Signed link prediction in social networks aims to reveal the underlying relationships (i.e. links) among users (i.e. nodes) given their existing positive and negative interactions observed. Most of the prior efforts are devoted to learning node embeddings with graph neural networks (GNNs), which preserve the signed network topology by message-passing along edges to facilitate the downstream link prediction task. Nevertheless, the existing graph-based approaches could hardly provide human-intelligible explanations for the following three questions: (1) which neighbors to aggregate, (2) which path to propagate along, and (3) which social theory to follow in the learning process. To answer the aforementioned questions, in this paper, we investigate how to reconcile the \textit{balance} and \textit{status} social rules with information theory and develop a unified framework, termed as Signed Infomax Hyperbolic Graph (\textbf{SIHG}). By maximizing the mutual information between edge polarities and node embeddings, one can identify the most representative neighboring nodes that support the inference of edge sign. Different from existing GNNs that could only group features of friends in the subspace, the proposed SIHG incorporates the signed attention module, which is also capable of pushing hostile users far away from each other to preserve the geometry of antagonism. The polarity of the learned edge attention maps, in turn, provide interpretations of the social theories used in each aggregation. In order to model high-order user relations and complex hierarchies, the node embeddings are projected and measured in a hyperbolic space with a lower distortion. Extensive experiments on four signed network benchmarks demonstrate that the proposed SIHG framework significantly outperforms the state-of-the-arts in signed link prediction.

17.0SIJun 2, 2020

Multi-level Graph Convolutional Networks for Cross-platform Anchor Link Prediction

Hongxu Chen, Hongzhi Yin, Xiangguo Sun et al.

Cross-platform account matching plays a significant role in social network analytics, and is beneficial for a wide range of applications. However, existing methods either heavily rely on high-quality user generated content (including user profiles) or suffer from data insufficiency problem if only focusing on network topology, which brings researchers into an insoluble dilemma of model selection. In this paper, to address this problem, we propose a novel framework that considers multi-level graph convolutions on both local network structure and hypergraph structure in a unified manner. The proposed method overcomes data insufficiency problem of existing work and does not necessarily rely on user demographic information. Moreover, to adapt the proposed method to be capable of handling large-scale social networks, we propose a two-phase space reconciliation mechanism to align the embedding spaces in both network partitioning based parallel training and account matching across different social networks. Extensive experiments have been conducted on two large-scale real-life social networks. The experimental results demonstrate that the proposed method outperforms the state-of-the-art models with a big margin.

1.2LGJan 6, 2020

A Block-based Generative Model for Attributed Networks Embedding

Xueyan Liu, Bo Yang, Wenzhuo Song et al.

Attributed network embedding has attracted plenty of interest in recent years. It aims to learn task-independent, low-dimensional, and continuous vectors for nodes preserving both topology and attribute information. Most of the existing methods, such as random-walk based methods and GCNs, mainly focus on the local information, i.e., the attributes of the neighbours. Thus, they have been well studied for assortative networks (i.e., networks with communities) but ignored disassortative networks (i.e., networks with multipartite, hubs, and hybrid structures), which are common in the real world. To enable model both assortative and disassortative networks, we propose a block-based generative model for attributed network embedding from a probability perspective. Specifically, the nodes are assigned to several blocks wherein the nodes in the same block share the similar linkage patterns. These patterns can define assortative networks containing communities or disassortative networks with the multipartite, hub, or any hybrid structures. To preserve the attribute information, we assume that each node has a hidden embedding related to its assigned block. We use a neural network to characterize the nonlinearity between node embeddings and node attributes. We perform extensive experiments on real-world and synthetic attributed networks. The results show that our proposed method consistently outperforms state-of-the-art embedding methods for both clustering and classification tasks, especially on disassortative networks.

2.7LGOct 29, 2019

Hyperbolic Node Embedding for Signed Networks

Wenzhuo Song, Hongxu Chen, Xueyan Liu et al.

Signed network embedding methods aim to learn vector representations of nodes in signed networks. However, existing algorithms only managed to embed networks into low-dimensional Euclidean spaces whereas many intrinsic features of signed networks are reported more suitable for non-Euclidean spaces. For instance, previous works did not consider the hierarchical structures of networks, which is widely witnessed in real-world networks. In this work, we answer an open question that whether the hyperbolic space is a better choice to accommodate signed networks and learn embeddings that can preserve the corresponding special characteristics. We also propose a non-Euclidean signed network embedding method based on structural balance theory and Riemannian optimization, which embeds signed networks into a Poincaré ball in a hyperbolic space. This space enables our approach to capture underlying hierarchy of nodes in signed networks because it can be seen as a continuous tree. We empirically compare our method against six Euclidean-based baselines in three tasks on seven real-world datasets, and the results show the effectiveness of our method.

26.8SESep 4, 2018

DeepHunter: Hunting Deep Neural Network Defects via Coverage-Guided Fuzzing

Xiaofei Xie, Lei Ma, Felix Juefei-Xu et al.

In company with the data explosion over the past decade, deep neural network (DNN) based software has experienced unprecedented leap and is becoming the key driving force of many novel industrial applications, including many safety-critical scenarios such as autonomous driving. Despite great success achieved in various human intelligence tasks, similar to traditional software, DNNs could also exhibit incorrect behaviors caused by hidden defects causing severe accidents and losses. In this paper, we propose DeepHunter, an automated fuzz testing framework for hunting potential defects of general-purpose DNNs. DeepHunter performs metamorphic mutation to generate new semantically preserved tests, and leverages multiple plugable coverage criteria as feedback to guide the test generation from different perspectives. To be scalable towards practical-sized DNNs, DeepHunter maintains multiple tests in a batch, and prioritizes the tests selection based on active feedback. The effectiveness of DeepHunter is extensively investigated on 3 popular datasets (MNIST, CIFAR-10, ImageNet) and 7 DNNs with diverse complexities, under a large set of 6 coverage criteria as feedback. The large-scale experiments demonstrate that DeepHunter can (1) significantly boost the coverage with guidance; (2) generate useful tests to detect erroneous behaviors and facilitate the DNN model quality evaluation; (3) accurately capture potential defects during DNN quantization for platform migration.

0.7LGOct 14, 2017

When Point Process Meets RNNs: Predicting Fine-Grained User Interests with Mutual Behavioral Infectivity

Tong Chen, Lin Wu, Yang Wang et al.

Predicting fine-grained interests of users with temporal behavior is important to personalization and information filtering applications. However, existing interest prediction methods are incapable of capturing the subtle degreed user interests towards particular items, and the internal time-varying drifting attention of individuals is not studied yet. Moreover, the prediction process can also be affected by inter-personal influence, known as behavioral mutual infectivity. Inspired by point process in modeling temporal point process, in this paper we present a deep prediction method based on two recurrent neural networks (RNNs) to jointly model each user's continuous browsing history and asynchronous event sequences in the context of inter-user behavioral mutual infectivity. Our model is able to predict the fine-grained interest from a user regarding a particular item and corresponding timestamps when an occurrence of event takes place. The proposed approach is more flexible to capture the dynamic characteristic of event sequences by using the temporal point process to model event data and timely update its intensity function by RNNs. Furthermore, to improve the interpretability of the model, the attention mechanism is introduced to emphasize both intra-personal and inter-personal behavior influence over time. Experiments on real datasets demonstrate that our model outperforms the state-of-the-art methods in fine-grained user interest prediction.

5.1PLSep 27, 2017

A Permission-Dependent Type System for Secure Information Flow Analysis

Hongxu Chen, Alwen Tiu, Zhiwu Xu et al.

We introduce a novel type system for enforcing secure information flow in an imperative language. Our work is motivated by the problem of statically checking potential information leakage in Android applications. To this end, we design a lightweight type system featuring Android permission model, where the permissions are statically assigned to applications and are used to enforce access control in the applications. We take inspiration from a type system by Banerjee and Naumann (BN) to allow security types to be dependent on the permissions of the applications. A novel feature of our type system is a typing rule for conditional branching induced by permission testing, which introduces a merging operator on security types, allowing more precise security policies to be enforced. The soundness of our type system is proved with respect to a notion of noninterference. In addition, a type inference algorithm is presented for the underlying security type system, by reducing the inference problem to a constraint solving problem in the lattice of security types.