AIJun 3Code
Online Skill Learning for Web Agents via State-Grounded Dynamic RetrievalJiaxi Li, Ke Deng, Yun Wang et al.
Language agents increasingly rely on reusable skills to improve multi-step web automation across related tasks. A growing line of work studies online skill learning, where agents continually induce skills from previous task trajectories and reuse them in future tasks on the fly. However, existing methods mainly reuse skills at the task-level: a fixed set of skills is retrieved based on the initial task instruction and then held fixed throughout execution. This static strategy is misaligned with web execution, where the appropriate next action depends not only on the task goal but also on the current webpage state, which often transitions into situations that the initial skills fail to cover. To address this gap, we propose State-Grounded Dynamic Retrieval (SGDR), an online skill learning method that enables stepwise skill reuse for web agents. SGDR consists of three components: a sliding-window extraction process that turns completed trajectories into reusable sub-procedures invokable at intermediate execution states, a dual text-code representation that connects skill retrieval with executable action, and a state-grounded dynamic retrieval mechanism that matches skills to both the task goal and the current webpage state. Experiments on WebArena across five domains show that SGDR consistently outperforms strong baselines, achieving average success rates of 37.5% with GPT-4.1 and 24.3% with Qwen3-4B, corresponding to relative gains of 10.6% and 10.0% over the strongest baseline, respectively. The code is available at https://github.com/plusnli/skill-dynamic-retrieval.
CVApr 7, 2023Code
Construction of unbiased dental template and parametric dental model for precision digital dentistryLei Ma, Jingyang Zhang, Ke Deng et al.
Dental template and parametric dental models are important tools for various applications in digital dentistry. However, constructing an unbiased dental template and accurate parametric dental models remains a challenging task due to the complex anatomical and morphological dental structures and also low volume ratio of the teeth. In this study, we develop an unbiased dental template by constructing an accurate dental atlas from CBCT images with guidance of teeth segmentation. First, to address the challenges, we propose to enhance the CBCT images and their segmentation images, including image cropping, image masking and segmentation intensity reassigning. Then, we further use the segmentation images to perform co-registration with the CBCT images to generate an accurate dental atlas, from which an unbiased dental template can be generated. By leveraging the unbiased dental template, we construct parametric dental models by estimating point-to-point correspondences between the dental models and employing Principal Component Analysis to determine shape subspaces of the parametric dental models. A total of 159 CBCT images of real subjects are collected to perform the constructions. Experimental results demonstrate effectiveness of our proposed method in constructing unbiased dental template and parametric dental model. The developed dental template and parametric dental models are available at https://github.com/Marvin0724/Teeth_template.
CVJul 18, 2022Code
ExAgt: Expert-guided Augmentation for Representation Learning of Traffic ScenariosLakshman Balasubramanian, Jonas Wurst, Robin Egolf et al.
Representation learning in recent years has been addressed with self-supervised learning methods. The input data is augmented into two distorted views and an encoder learns the representations that are invariant to distortions -- cross-view prediction. Augmentation is one of the key components in cross-view self-supervised learning frameworks to learn visual representations. This paper presents ExAgt, a novel method to include expert knowledge for augmenting traffic scenarios, to improve the learnt representations without any human annotation. The expert-guided augmentations are generated in an automated fashion based on the infrastructure, the interactions between the EGO and the traffic participants and an ideal sensor model. The ExAgt method is applied in two state-of-the-art cross-view prediction methods and the representations learnt are tested in downstream tasks like classification and clustering. Results show that the ExAgt method improves representation learning compared to using only standard augmentations and it provides a better representation space stability. The code is available at https://github.com/lab176344/ExAgt.
LGAug 4, 2023Code
Intensity-free Integral-based Learning of Marked Temporal Point ProcessesSishun Liu, Ke Deng, Xiuzhen Zhang et al.
In the marked temporal point processes (MTPP), a core problem is to parameterize the conditional joint PDF (probability distribution function) $p^*(m,t)$ for inter-event time $t$ and mark $m$, conditioned on the history. The majority of existing studies predefine intensity functions. Their utility is challenged by specifying the intensity function's proper form, which is critical to balance expressiveness and processing efficiency. Recently, there are studies moving away from predefining the intensity function -- one models $p^*(t)$ and $p^*(m)$ separately, while the other focuses on temporal point processes (TPPs), which do not consider marks. This study aims to develop high-fidelity $p^*(m,t)$ for discrete events where the event marks are either categorical or numeric in a multi-dimensional continuous space. We propose a solution framework IFIB (\underline{I}ntensity-\underline{f}ree \underline{I}ntegral-\underline{b}ased process) that models conditional joint PDF $p^*(m,t)$ directly without intensity functions. It remarkably simplifies the process to compel the essential mathematical restrictions. We show the desired properties of IFIB and the superior experimental results of IFIB on real-world and synthetic datasets. The code is available at \url{https://github.com/StepinSilence/IFIB}.
NIJun 16, 2023
TSNet-SAC: Leveraging Transformers for Efficient Task SchedulingKe Deng, Zhiyuan He, Hao Zhang et al.
In future 6G Mobile Edge Computing (MEC), autopilot systems require the capability of processing multimodal data with strong interdependencies. However, traditional heuristic algorithms are inadequate for real-time scheduling due to their requirement for multiple iterations to derive the optimal scheme. We propose a novel TSNet-SAC based on Transformer, that utilizes heuristic algorithms solely to guide the training of TSNet. Additionally, a Sliding Augment Component (SAC) is introduced to enhance the robustness and resolve algorithm defects. Furthermore, the Extender component is designed to handle multi-scale training data and provide network scalability, enabling TSNet to adapt to different access scenarios. Simulation demonstrates that TSNet-SAC outperforms existing networks in accuracy and robustness, achieving superior scheduling-making latency compared to heuristic algorithms.
IRJul 3, 2024
CRUISE on Quantum Computing for Feature Selection in Recommender SystemsJiayang Niu, Jie Li, Ke Deng et al.
Using Quantum Computers to solve problems in Recommender Systems that classical computers cannot address is a worthwhile research topic. In this paper, we use Quantum Annealers to address the feature selection problem in recommendation algorithms. This feature selection problem is a Quadratic Unconstrained Binary Optimization(QUBO) problem. By incorporating Counterfactual Analysis, we significantly improve the performance of the item-based KNN recommendation algorithm compared to using pure Mutual Information. Extensive experiments have demonstrated that the use of Counterfactual Analysis holds great promise for addressing such problems.
MLMay 13
Adaptive Kernel Density Estimation with Pre-trainingRuitong Zhang, Ke Deng
Density estimation in high-dimensional settings is an important and challenging statistical problem.Traditional methods based on kernel smoothing are inefficient in high dimensions due to the difficulties in specifying appropriate location-adaptive kernels. In this work, we introduce pre-training, a key idea behind many cutting-edge AI technologies, to the context of non-parametric density estimation. By establishing a pre-trained neural network that can recommend an appropriate location-adaptive kernel for each sample point, efficient density estimation with adaptive kernels is achieved in high dimensions. A wide range of numerical experiments show that this strategy is highly effective for improving density-estimation accuracy, when the target distribution is close to the distribution family for pre-training. When the target distribution is substantially different from the pre-training distribution family, the benefit from the proposed pre-training strategy may be diluted, but can be reactivated by an additional fine-tuning procedure.
CVFeb 10
LARV: Data-Free Layer-wise Adaptive Rescaling Veneer for Model MergingXinyu Wang, Ke Deng, Fei Dou et al.
Model merging aims to combine multiple fine-tuned models into a single multi-task model without access to training data. Existing task-vector merging methods such as TIES, TSV-M, and Iso-C/CTS differ in their aggregation rules but treat all layers nearly uniformly. This assumption overlooks the strong layer-wise heterogeneity in large vision transformers, where shallow layers are sensitive to interference while deeper layers encode stable task-specific features. We introduce LARV, a training-free, data-free, merger-agnostic Layer-wise Adaptive Rescaling Veneer that plugs into any task-vector merger and assigns a per-layer scale to each task vector before aggregation, and show it consistently boosts diverse merging rules. LARV adaptively suppresses shallow-layer interference and amplifies deeper-layer alignment using a simple deterministic schedule, requiring no retraining or modification to existing mergers. To our knowledge, this is the first work to perform layer-aware scaling for task-vector merging. LARV computes simple data-free layer proxies and turns them into scales through a lightweight rule; we study several instantiations within one framework (e.g., tiered two/three-level scaling with fixed values, or continuous mappings) and show that tiered choices offer the best robustness, while continuous mappings remain an ablation. LARV is orthogonal to the base merger and adds negligible cost. On FusionBench with Vision Transformers, LARV consistently improves all task-vector baselines across 8/14/20-task settings; for example, Iso-C + LARV reaches 85.9% on ViT-B/32, 89.2% on ViT-B/16, and 92.6% on ViT-L/14. Layerwise analysis and corruption tests further indicate that LARV suppresses shallow-layer interference while modestly amplifying deeper, task-stable features, turning model merging into a robust, layer-aware procedure rather than a uniform one.
LGOct 23, 2025Code
Addressing Mark Imbalance in Integration-free Neural Marked Temporal Point ProcessesSishun Liu, Ke Deng, Yongli Ren et al.
Marked Temporal Point Process (MTPP) has been well studied to model the event distribution in marked event streams, which can be used to predict the mark and arrival time of the next event. However, existing studies overlook that the distribution of event marks is highly imbalanced in many real-world applications, with some marks being frequent but others rare. The imbalance poses a significant challenge to the performance of the next event prediction, especially for events of rare marks. To address this issue, we propose a thresholding method, which learns thresholds to tune the mark probability normalized by the mark's prior probability to optimize mark prediction, rather than predicting the mark directly based on the mark probability as in existing studies. In conjunction with this method, we predict the mark first and then the time. In particular, we develop a novel neural MTPP model to support effective time sampling and estimation of mark probability without computationally expensive numerical improper integration. Extensive experiments on real-world datasets demonstrate the superior performance of our solution against various baselines for the next event mark and time prediction. The code is available at https://github.com/undes1red/IFNMTPP.
AISep 8, 2025Code
Tree of Agents: Improving Long-Context Capabilities of Large Language Models through Multi-Perspective ReasoningSong Yu, Xiaofei Xu, Ke Deng et al.
Large language models (LLMs) face persistent challenges when handling long-context tasks, most notably the lost in the middle issue, where information located in the middle of a long input tends to be underutilized. Some existing methods that reduce input have the risk of discarding key information, while others that extend context windows often lead to attention dispersion. To address these limitations, we propose Tree of Agents (TOA), a multi-agent reasoning framework that segments the input into chunks processed by independent agents. Each agent generates its local cognition, then agents dynamically exchange information for collaborative reasoning along tree-structured paths. TOA enables agents to probe different reasoning orders for multi-perspective understanding, effectively mitigating position bias and reducing hallucinations. To improve processing efficiency, we incorporate prefix-hash caching and adaptive pruning strategies, achieving significant performance improvements with comparable API overhead. Experiments show that TOA, powered by compact LLaMA3.1-8B, significantly outperforms multiple baselines and demonstrates comparable performance to the latest and much larger commercial models, such as Gemini1.5-pro, on various long-context tasks. Code is available at https://github.com/Aireduce952/Tree-of-Agents.
MLMar 29
Energy Score-Guided Neural Gaussian Mixture Model for Predictive Uncertainty QuantificationYang Yang, Chunlin Ji, Haoyang Li et al.
Quantifying predictive uncertainty is essential for real world machine learning applications, especially in scenarios requiring reliable and interpretable predictions. Many common parametric approaches rely on neural networks to estimate distribution parameters by optimizing the negative log likelihood. However, these methods often encounter challenges like training instability and mode collapse, leading to poor estimates of the mean and variance of the target output distribution. In this work, we propose the Neural Energy Gaussian Mixture Model (NE-GMM), a novel framework that integrates Gaussian Mixture Model (GMM) with Energy Score (ES) to enhance predictive uncertainty quantification. NE-GMM leverages the flexibility of GMM to capture complex multimodal distributions and leverages the robustness of ES to ensure well calibrated predictions in diverse scenarios. We theoretically prove that the hybrid loss function satisfies the properties of a strictly proper scoring rule, ensuring alignment with the true data distribution, and establish generalization error bounds, demonstrating that the model's empirical performance closely aligns with its expected performance on unseen data. Extensive experiments on both synthetic and real world datasets demonstrate the superiority of NE-GMM in terms of both predictive accuracy and uncertainty quantification.
LGNov 13, 2023
Explainable History Distillation by Marked Temporal Point ProcessSishun Liu, Ke Deng, Yan Wang et al.
Explainability of machine learning models is mandatory when researchers introduce these commonly believed black boxes to real-world tasks, especially high-stakes ones. In this paper, we build a machine learning system to automatically generate explanations of happened events from history by \gls{ca} based on the \acrfull{tpp}. Specifically, we propose a new task called \acrfull{ehd}. This task requires a model to distill as few events as possible from observed history. The target is that the event distribution conditioned on left events predicts the observed future noticeably worse. We then regard distilled events as the explanation for the future. To efficiently solve \acrshort{ehd}, we rewrite the task into a \gls{01ip} and directly estimate the solution to the program by a model called \acrfull{model}. This work fills the gap between our task and existing works, which only spot the difference between factual and counterfactual worlds after applying a predefined modification to the environment. Experiment results on Retweet and StackOverflow datasets prove that \acrshort{model} significantly outperforms other \acrshort{ehd} baselines and can reveal the rationale underpinning real-world processes.
SIJan 28, 2024
Harnessing Network Effect for Fake News Mitigation: Selecting Debunkers via Self-Imitation LearningXiaofei Xu, Ke Deng, Michael Dann et al.
This study aims to minimize the influence of fake news on social networks by deploying debunkers to propagate true news. This is framed as a reinforcement learning problem, where, at each stage, one user is selected to propagate true news. A challenging issue is episodic reward where the "net" effect of selecting individual debunkers cannot be discerned from the interleaving information propagation on social networks, and only the collective effect from mitigation efforts can be observed. Existing Self-Imitation Learning (SIL) methods have shown promise in learning from episodic rewards, but are ill-suited to the real-world application of fake news mitigation because of their poor sample efficiency. To learn a more effective debunker selection policy for fake news mitigation, this study proposes NAGASIL - Negative sampling and state Augmented Generative Adversarial Self-Imitation Learning, which consists of two improvements geared towards fake news mitigation: learning from negative samples, and an augmented state representation to capture the "real" environment state by integrating the current observed state with the previous state-action pairs from the same campaign. Experiments on two social networks show that NAGASIL yields superior performance to standard GASIL and state-of-the-art fake news mitigation models.
IROct 20, 2024
Performance-Driven QUBO for Recommender Systems on Quantum AnnealersJiayang Niu, Jie Li, Ke Deng et al.
We propose Counterfactual Analysis Quadratic Unconstrained Binary Optimization (CAQUBO) to solve QUBO problems for feature selection in recommender systems. CAQUBO leverages counterfactual analysis to measure the impact of individual features and feature combinations on model performance and employs the measurements to construct the coefficient matrix for a quantum annealer to select the optimal feature combinations for recommender systems, thereby improving their final recommendation performance. By establishing explicit connections between features and the recommendation performance, the proposed approach demonstrates superior performance compared to the state-of-the-art quantum annealing methods. Extensive experiments indicate that integrating quantum computing with counterfactual analysis holds great promise for addressing these challenges.
AIJul 25, 2025
Alignment and Safety in Large Language Models: Safety Mechanisms, Training Paradigms, and Emerging ChallengesHaoran Lu, Luyang Fang, Ruidong Zhang et al.
Due to the remarkable capabilities and growing impact of large language models (LLMs), they have been deeply integrated into many aspects of society. Thus, ensuring their alignment with human values and intentions has emerged as a critical challenge. This survey provides a comprehensive overview of practical alignment techniques, training protocols, and empirical findings in LLM alignment. We analyze the development of alignment methods across diverse paradigms, characterizing the fundamental trade-offs between core alignment objectives. Our analysis shows that while supervised fine-tuning enables basic instruction-following, preference-based methods offer more flexibility for aligning with nuanced human intent. We discuss state-of-the-art techniques, including Direct Preference Optimization (DPO), Constitutional AI, brain-inspired methods, and alignment uncertainty quantification (AUQ), highlighting their approaches to balancing quality and efficiency. We review existing evaluation frameworks and benchmarking datasets, emphasizing limitations such as reward misspecification, distributional robustness, and scalable oversight. We summarize strategies adopted by leading AI labs to illustrate the current state of practice. We conclude by outlining open problems in oversight, value pluralism, robustness, and continuous alignment. This survey aims to inform both researchers and practitioners navigating the evolving landscape of LLM alignment.
CVApr 7, 2024
Efficient Surgical Tool Recognition via HMM-Stabilized Deep LearningHaifeng Wang, Hao Xu, Jun Wang et al.
Recognizing various surgical tools, actions and phases from surgery videos is an important problem in computer vision with exciting clinical applications. Existing deep-learning-based methods for this problem either process each surgical video as a series of independent images without considering their dependence, or rely on complicated deep learning models to count for dependence of video frames. In this study, we revealed from exploratory data analysis that surgical videos enjoy relatively simple semantic structure, where the presence of surgical phases and tools can be well modeled by a compact hidden Markov model (HMM). Based on this observation, we propose an HMM-stabilized deep learning method for tool presence detection. A wide range of experiments confirm that the proposed approaches achieve better performance with lower training and running costs, and support more flexible ways to construct and utilize training data in scenarios where not all surgery videos of interest are extensively labelled. These results suggest that popular deep learning approaches with over-complicated model structures may suffer from inefficient utilization of data, and integrating ingredients of deep learning and statistical learning wisely may lead to more powerful algorithms that enjoy competitive performance, transparent interpretation and convenient model training simultaneously.
LGAug 11, 2025
MemoryKT: An Integrative Memory-and-Forgetting Method for Knowledge TracingMingrong Lin, Ke Deng, Zhengyang Wu et al.
Knowledge Tracing (KT) is committed to capturing students' knowledge mastery from their historical interactions. Simulating students' memory states is a promising approach to enhance both the performance and interpretability of knowledge tracing models. Memory consists of three fundamental processes: encoding, storage, and retrieval. Although forgetting primarily manifests during the storage stage, most existing studies rely on a single, undifferentiated forgetting mechanism, overlooking other memory processes as well as personalized forgetting patterns. To address this, this paper proposes memoryKT, a knowledge tracing model based on a novel temporal variational autoencoder. The model simulates memory dynamics through a three-stage process: (i) Learning the distribution of students' knowledge memory features, (ii) Reconstructing their exercise feedback, while (iii) Embedding a personalized forgetting module within the temporal workflow to dynamically modulate memory storage strength. This jointly models the complete encoding-storage-retrieval cycle, significantly enhancing the model's perception capability for individual differences. Extensive experiments on four public datasets demonstrate that our proposed approach significantly outperforms state-of-the-art baselines.
LGFeb 1, 2025
HoP: Homeomorphic Polar Learning for Hard Constrained OptimizationKe Deng, Hanwen Zhang, Jin Lu et al.
Constrained optimization demands highly efficient solvers which promotes the development of learn-to-optimize (L2O) approaches. As a data-driven method, L2O leverages neural networks to efficiently produce approximate solutions. However, a significant challenge remains in ensuring both optimality and feasibility of neural networks' output. To tackle this issue, we introduce Homeomorphic Polar Learning (HoP) to solve the star-convex hard-constrained optimization by embedding homeomorphic mapping in neural networks. The bijective structure enables end-to-end training without extra penalty or correction. For performance evaluation, we evaluate HoP's performance across a variety of synthetic optimization tasks and real-world applications in wireless communications. In all cases, HoP achieves solutions closer to the optimum than existing L2O methods while strictly maintaining feasibility.
LGNov 24, 2025
Multi-Agent Cross-Entropy Method with Monotonic Nonlinear Critic DecompositionYan Wang, Ke Deng, Yongli Ren
Cooperative multi-agent reinforcement learning (MARL) commonly adopts centralized training with decentralized execution (CTDE), where centralized critics leverage global information to guide decentralized actors. However, centralized-decentralized mismatch (CDM) arises when the suboptimal behavior of one agent degrades others' learning. Prior approaches mitigate CDM through value decomposition, but linear decompositions allow per-agent gradients at the cost of limited expressiveness, while nonlinear decompositions improve representation but require centralized gradients, reintroducing CDM. To overcome this trade-off, we propose the multi-agent cross-entropy method (MCEM), combined with monotonic nonlinear critic decomposition (NCD). MCEM updates policies by increasing the probability of high-value joint actions, thereby excluding suboptimal behaviors. For sample efficiency, we extend off-policy learning with a modified k-step return and Retrace. Analysis and experiments demonstrate that MCEM outperforms state-of-the-art methods across both continuous and discrete action benchmarks.
CVNov 21, 2025
Shape-preserving Tooth Segmentation from CBCT Images Using Deep Learning with Semantic and Shape AwarenessZongrui Ji, Zhiming Cui, Na Li et al.
Background:Accurate tooth segmentation from cone beam computed tomography (CBCT) images is crucial for digital dentistry but remains challenging in cases of interdental adhesions, which cause severe anatomical shape distortion. Methods: To address this, we propose a deep learning framework that integrates semantic and shape awareness for shape-preserving segmentation. Our method introduces a target-tooth-centroid prompted multi-label learning strategy to model semantic relationships between teeth, reducing shape ambiguity. Additionally, a tooth-shape-aware learning mechanism explicitly enforces morphological constraints to preserve boundary integrity. These components are unified via multi-task learning, jointly optimizing segmentation and shape preservation. Results: Extensive evaluations on internal and external datasets demonstrate that our approach significantly outperforms existing methods. Conclusions: Our approach effectively mitigates shape distortions and providing anatomically faithful tooth boundaries.
IROct 17, 2025
Enhance Large Language Models as Recommendation Systems with Collaborative FilteringZhisheng Yang, Xiaofei Xu, Ke Deng et al.
As powerful tools in Natural Language Processing (NLP), Large Language Models (LLMs) have been leveraged for crafting recommendations to achieve precise alignment with user preferences and elevate the quality of the recommendations. The existing approaches implement both non-tuning and tuning strategies. Compared to following the tuning strategy, the approaches following the non-tuning strategy avoid the relatively costly, time-consuming, and expertise-requiring process of further training pre-trained LLMs on task-specific datasets, but they suffer the issue of not having the task-specific business or local enterprise knowledge. To the best of our knowledge, none of the existing approaches following the non-tuning strategy explicitly integrates collaborative filtering, one of the most successful recommendation techniques. This study aims to fill the gap by proposing critique-based LLMs as recommendation systems (Critic-LLM-RS). For our purpose, we train a separate machine-learning model called Critic that implements collaborative filtering for recommendations by learning from the interactions between many users and items. The Critic provides critiques to LLMs to significantly refine the recommendations. Extensive experiments have verified the effectiveness of Critic-LLM-RS on real datasets.
LGOct 1, 2025
Rehearsal-free and Task-free Online Continual Learning With Contrastive PromptAopeng Wang, Ke Deng, Yongli Ren et al.
The main challenge of continual learning is \textit{catastrophic forgetting}. Because of processing data in one pass, online continual learning (OCL) is one of the most difficult continual learning scenarios. To address catastrophic forgetting in OCL, some existing studies use a rehearsal buffer to store samples and replay them in the later learning process, other studies do not store samples but assume a sequence of learning tasks so that the task identities can be explored. However, storing samples may raise data security or privacy concerns and it is not always possible to identify the boundaries between learning tasks in one pass of data processing. It motivates us to investigate rehearsal-free and task-free OCL (F2OCL). By integrating prompt learning with an NCM classifier, this study has effectively tackled catastrophic forgetting without storing samples and without usage of task boundaries or identities. The extensive experimental results on two benchmarks have demonstrated the effectiveness of the proposed method.
LGAug 16, 2025
Learning Marked Temporal Point Process Explanations based on Counterfactual and Factual ReasoningSishun Liu, Ke Deng, Xiuzhen Zhang et al.
Neural network-based Marked Temporal Point Process (MTPP) models have been widely adopted to model event sequences in high-stakes applications, raising concerns about the trustworthiness of outputs from these models. This study focuses on Explanation for MTPP, aiming to identify the minimal and rational explanation, that is, the minimum subset of events in history, based on which the prediction accuracy of MTPP matches that based on full history to a great extent and better than that based on the complement of the subset. This study finds that directly defining Explanation for MTPP as counterfactual explanation or factual explanation can result in irrational explanations. To address this issue, we define Explanation for MTPP as a combination of counterfactual explanation and factual explanation. This study proposes Counterfactual and Factual Explainer for MTPP (CFF) to solve Explanation for MTPP with a series of deliberately designed techniques. Experiments demonstrate the correctness and superiority of CFF over baselines regarding explanation quality and processing efficiency.
IRDec 20, 2021
CSSR: A Context-Aware Sequential Software Service Recommendation ModelMingwei Zhang, Jiayuan Liu, Weipu Zhang et al.
We propose a novel software service recommendation model to help users find their suitable repositories in GitHub. Our model first designs a novel context-induced repository graph embedding method to leverage rich contextual information of repositories to alleviate the difficulties caused by the data sparsity issue. It then leverages sequence information of user-repository interactions for the first time in the software service recommendation field. Specifically, a deep-learning based sequential recommendation technique is adopted to capture the dynamics of user preferences. Comprehensive experiments have been conducted on a large dataset collected from GitHub against a list of existing methods. The results illustrate the superiority of our method in various aspects.
LGOct 22, 2021
MANDERA: Malicious Node Detection in Federated Learning via RankingWanchuang Zhu, Benjamin Zi Hao Zhao, Simon Luo et al.
Byzantine attacks hinder the deployment of federated learning algorithms. Although we know that the benign gradients and Byzantine attacked gradients are distributed differently, to detect the malicious gradients is challenging due to (1) the gradient is high-dimensional and each dimension has its unique distribution and (2) the benign gradients and the attacked gradients are always mixed (two-sample test methods cannot apply directly). To address the above, for the first time, we propose MANDERA which is theoretically guaranteed to efficiently detect all malicious gradients under Byzantine attacks with no prior knowledge or history about the number of attacked nodes. More specifically, we transfer the original updating gradient space into a ranking matrix. By such an operation, the scales of different dimensions of the gradients in the ranking space become identical. The high-dimensional benign gradients and the malicious gradients can be easily separated. The effectiveness of MANDERA is further confirmed by experimentation on four Byzantine attack implementations (Gaussian, Zero Gradient, Sign Flipping, Shifted Mean), comparing with state-of-the-art defenses. The experiments cover both IID and Non-IID datasets.
LGAug 22, 2021
ExamGAN and Twin-ExamGAN for Exam Script GenerationZhengyang Wu, Ke Deng, Judy Qiu et al.
Nowadays, the learning management system (LMS) has been widely used in different educational stages from primary to tertiary education for student administration, documentation, tracking, reporting, and delivery of educational courses, training programs, or learning and development programs. Towards effective learning outcome assessment, the exam script generation problem has attracted many attentions and been investigated recently. But the research in this field is still in its early stage. There are opportunities to further improve the quality of generated exam scripts in various aspects. In particular, two essential issues have been ignored largely by existing solutions. First, given a course, it is unknown yet how to generate an exam script which can result in a desirable distribution of student scores in a class (or across different classes). Second, while it is frequently encountered in practice, it is unknown so far how to generate a pair of high quality exam scripts which are equivalent in assessment (i.e., the student scores are comparable by taking either of them) but have significantly different sets of questions. To fill the gap, this paper proposes ExamGAN (Exam Script Generative Adversarial Network) to generate high quality exam scripts, and then extends ExamGAN to T-ExamGAN (Twin-ExamGAN) to generate a pair of high quality exam scripts. Based on extensive experiments on three benchmark datasets, it has verified the superiority of proposed solutions in various aspects against the state-of-the-art. Moreover, we have conducted a case study which demonstrated the effectiveness of proposed solution in a real teaching scenario.
CVMay 17, 2021
Traffic Scenario Clustering by Iterative Optimisation of Self-Supervised Networks Using a Random Forest Activation Pattern SimilarityLakshman Balasubramanian, Jonas Wurst, Michael Botsch et al.
Traffic scenario categorisation is an essential component of automated driving, for e.\,g., in motion planning algorithms and their validation. Finding new relevant scenarios without handcrafted steps reduce the required resources for the development of autonomous driving dramatically. In this work, a method is proposed to address this challenge by introducing a clustering technique based on a novel data-adaptive similarity measure, called Random Forest Activation Pattern (RFAP) similarity. The RFAP similarity is generated using a tree encoding scheme in a Random Forest algorithm. The clustering method proposed in this work takes into account that there are labelled scenarios available and the information from the labelled scenarios can help to guide the clustering of unlabelled scenarios. It consists of three steps. First, a self-supervised Convolutional Neural Network~(CNN) is trained on all available traffic scenarios using a defined self-supervised objective. Second, the CNN is fine-tuned for classification of the labelled scenarios. Third, using the labelled and unlabelled scenarios an iterative optimisation procedure is performed for clustering. In the third step at each epoch of the iterative optimisation, the CNN is used as a feature generator for an unsupervised Random Forest. The trained forest, in turn, provides the RFAP similarity to adapt iteratively the feature generation process implemented by the CNN. Extensive experiments and ablation studies have been done on the highD dataset. The proposed method shows superior performance compared to baseline clustering techniques.
CVMay 17, 2021
Open-set Recognition based on the Combination of Deep Learning and Ensemble Method for Detecting Unknown Traffic ScenariosLakshman Balasubramanian, Friedrich Kruber, Michael Botsch et al.
An understanding and classification of driving scenarios are important for testing and development of autonomous driving functionalities. Machine learning models are useful for scenario classification but most of them assume that data received during the testing are from one of the classes used in the training. This assumption is not true always because of the open environment where vehicles operate. This is addressed by a new machine learning paradigm called open-set recognition. Open-set recognition is the problem of assigning test samples to one of the classes used in training or to an unknown class. This work proposes a combination of Convolutional Neural Networks (CNN) and Random Forest (RF) for open set recognition of traffic scenarios. CNNs are used for the feature generation and the RF algorithm along with extreme value theory for the detection of known and unknown classes. The proposed solution is featured by exploring the vote patterns of trees in RF instead of just majority voting. By inheriting the ensemble nature of RF, the vote pattern of all trees combined with extreme value theory is shown to be well suited for detecting unknown classes. The proposed method has been tested on the highD and OpenTraffic datasets and has demonstrated superior performance in various aspects compared to existing solutions.
IRSep 18, 2020
A Knowledge Graph based Approach for Mobile Application RecommendationMingwei Zhang, Jiawei Zhao, Hai Dong et al.
With the rapid prevalence of mobile devices and the dramatic proliferation of mobile applications (apps), app recommendation becomes an emergent task that would benefit both app users and stockholders. How to effectively organize and make full use of rich side information of users and apps is a key challenge to address the sparsity issue for traditional approaches. To meet this challenge, we proposed a novel end-to-end Knowledge Graph Convolutional Embedding Propagation Model (KGEP) for app recommendation. Specifically, we first designed a knowledge graph construction method to model the user and app side information, then adopted KG embedding techniques to capture the factual triplet-focused semantics of the side information related to the first-order structure of the KG, and finally proposed a relation-weighted convolutional embedding propagation model to capture the recommendation-focused semantics related to high-order structure of the KG. Extensive experiments conducted on a real-world dataset validate the effectiveness of the proposed approach compared to the state-of-the-art recommendation approaches.
MLJul 20, 2020
Multi-level Training and Bayesian Optimization for Economical Hyperparameter OptimizationYang Yang, Ke Deng, Michael Zhu
Hyperparameters play a critical role in the performances of many machine learning methods. Determining their best settings or Hyperparameter Optimization (HPO) faces difficulties presented by the large number of hyperparameters as well as the excessive training time. In this paper, we develop an effective approach to reducing the total amount of required training time for HPO. In the initialization, the nested Latin hypercube design is used to select hyperparameter configurations for two types of training, which are, respectively, heavy training and light training. We propose a truncated additive Gaussian process model to calibrate approximate performance measurements generated by light training, using accurate performance measurements generated by heavy training. Based on the model, a sequential model-based algorithm is developed to generate the performance profile of the configuration space as well as find optimal ones. Our proposed approach demonstrates competitive performance when applied to optimize synthetic examples, support vector machines, fully connected networks and convolutional neural networks.
CLOct 19, 2019
An Improved Historical Embedding without AlignmentXiaofei Xu, Ke Deng, Fei Hu et al.
Many words have evolved in meaning as a result of cultural and social change. Understanding such changes is crucial for modelling language and cultural evolution. Low-dimensional embedding methods have shown promise in detecting words' meaning change by encoding them into dense vectors. However, when exploring semantic change of words over time, these methods require the alignment of word embeddings across different time periods. This process is computationally expensive, prohibitively time consuming and suffering from contextual variability. In this paper, we propose a new and scalable method for encoding words from different time periods into one dense vector space. This can greatly improve performance when it comes to identifying words that have changed in meaning over time. We evaluated our method on dataset from Google Books N-gram. Our method outperformed three other popular methods in terms of the number of words correctly identified to have changed in meaning. Additionally, we provide an intuitive visualization of the semantic evolution of some words extracted by our method
LGOct 1, 2019
NGEMM: Optimizing GEMM for Deep Learning via Compiler-based TechniquesWenlei Bao, Li-Wen Chang, Yang Chen et al.
Quantization has emerged to be an effective way to significantly boost the performance of deep neural networks (DNNs) by utilizing low-bit computations. Despite having lower numerical precision, quantized DNNs are able to reduce both memory bandwidth and computation cycles with little losses of accuracy. Integer GEMM (General Matrix Multiplication) is critical to running quantized DNN models efficiently, as GEMM operations often dominate the computations in these models. Various approaches have been developed by leveraging techniques such as vectorization and memory layout to improve the performance of integer GEMM. However, these existing approaches are not fast enough in certain scenarios. We developed NGEMM, a compiler-based GEMM implementation for accelerating lower-precision training and inference. NGEMM has better use of the vector units by avoiding unnecessary vector computation that is introduced during tree reduction. We compared NGEMM's performance with the state-of-art BLAS libraries such as MKL. Our experimental results showed that NGEMM outperformed MKL non-pack and pack version by an average of 1.86x and 1.16x, respectively. We have applied NGEMM to a number of production services in Microsoft.
MLApr 6, 2017
DIMM-SC: A Dirichlet mixture model for clustering droplet-based single cell transcriptomic dataZhe Sun, Ting Wang, Ke Deng et al.
Motivation: Single cell transcriptome sequencing (scRNA-Seq) has become a revolutionary tool to study cellular and molecular processes at single cell resolution. Among existing technologies, the recently developed droplet-based platform enables efficient parallel processing of thousands of single cells with direct counting of transcript copies using Unique Molecular Identifier (UMI). Despite the technology advances, statistical methods and computational tools are still lacking for analyzing droplet-based scRNA-Seq data. Particularly, model-based approaches for clustering large-scale single cell transcriptomic data are still under-explored. Methods: We developed DIMM-SC, a Dirichlet Mixture Model for clustering droplet-based Single Cell transcriptomic data. This approach explicitly models UMI count data from scRNA-Seq experiments and characterizes variations across different cell clusters via a Dirichlet mixture prior. An expectation-maximization algorithm is used for parameter inference. Results: We performed comprehensive simulations to evaluate DIMM-SC and compared it with existing clustering methods such as K-means, CellTree and Seurat. In addition, we analyzed public scRNA-Seq datasets with known cluster labels and in-house scRNA-Seq datasets from a study of systemic sclerosis with prior biological knowledge to benchmark and validate DIMM-SC. Both simulation studies and real data applications demonstrated that overall, DIMM-SC achieves substantially improved clustering accuracy and much lower clustering variability compared to other existing clustering methods. More importantly, as a model-based approach, DIMM-SC is able to quantify the clustering uncertainty for each single cell, facilitating rigorous statistical inference and biological interpretations, which are typically unavailable from existing clustering methods.