LG · Jan 20, 2023 · Code
Who Should I Engage with At What Time? A Missing Event Aware Temporal Graph Neural Network
Mingyi Liu, Zhiying Tu, Xiaofei Xu et al.
Temporal graph neural networks have recently received significant attention due to their wide application scenarios, such as bioinformatics, knowledge graphs, and social networks. Several temporal graph neural networks achieve remarkable results. However, these works focus on future event prediction and assume that all historical events are observable. In real-world applications, events are not always observable, and estimating event time is as important as predicting future events. In this paper, we propose MTGN, a missing event-aware temporal graph neural network, which uniformly models the evolving graph structure and the timing of events to support predicting what will happen in the future and when it will happen. MTGN models the dynamics of both observed and missing events as two coupled temporal point processes, thereby incorporating the effects of missing events into the network. Experimental results on several real-world temporal graphs demonstrate that MTGN significantly outperforms existing methods, with up to 89% and 112% more accurate time and link prediction. Code can be found at https://github.com/HIT-ICES/TNNLS-MTGN.
AI · Jan 29, 2023 · Code
HeroNet: A Hybrid Retrieval-Generation Network for Conversational Bots
Bolin Zhang, Yunzhe Xu, Zhiying Tu et al.
Using natural language, conversational bots offer unprecedented ways to address many challenges in areas such as information searching, item recommendation, and question answering. Existing bots are usually developed through retrieval-based or generation-based approaches, each of which has its own advantages and disadvantages. To combine these two approaches, we propose a hybrid retrieval-generation network (HeroNet) built on three ideas: 1) To produce high-quality sentence representations, HeroNet performs multi-task learning on two subtasks: Similar Queries Discovery and Query-Response Matching. Specifically, the retrieval performance is improved while the model size is reduced by training two lightweight, task-specific adapter modules that share only one underlying T5-Encoder model. 2) By introducing adversarial training, HeroNet is able to solve both the retrieval and generation tasks simultaneously while maximizing the performance of each. 3) The retrieval results are used as prior knowledge to improve the generation performance, while the generated results are scored by the discriminator and their scores are integrated into the generator's cross-entropy loss function. The experimental results on an open dataset demonstrate the effectiveness of HeroNet, and our code is available at https://github.com/TempHero/HeroNet.git
SE · May 23
SmellDoc: Extending Elastic Stack for Microservice Bad Smell Detection and Visualization
Yongchao Xing, Weipan Yang, Yiming Lv et al.
Microservices have become a mainstream architectural paradigm, yet microservice bad smells can significantly harm maintainability and performance. Existing detection tools often produce obscure outputs and lack effective integration with runtime observability, making it difficult for operators to interpret results and take timely action. To address this gap, we propose SmellDoc, a customized framework based on Elastic Stack. SmellDoc extends the native observability dashboard with a microservice bad smell detection plugin, integrating detection, knowledge, and health monitoring. It introduces a Custom-Business-Collector to capture business-level metrics, a Re-integration Collector to aggregate heterogeneous runtime data, and detection components that combine static and runtime analyses. SmellDoc supports a knowledge base of 84 smell types and enables detection of 24 representative smells across architectural, runtime, and performance categories. Results are visualized in Kibana through multiple views, providing operators with actionable insights. Case studies on a benchmark microservice system demonstrate that SmellDoc is effective and usable in detecting, visualizing, and analyzing smells, thus enhancing runtime observability and accelerating troubleshooting to maintain a high level of Quality of Service.
AI · Mar 29, 2022
Requirements Elicitation in Cognitive Service for Recommendation
Bolin Zhang, Zhiying Tu, Yunzhe Xu et al.
Nowadays, cognitive services provide a more interactive way to understand users' requirements via human-machine conversation. In other words, a cognitive service has to capture users' requirements from their utterances and respond to them with relevant and suitable service resources. To this end, two phases must be applied: I. sequence planning and real-time detection of user requirements; II. service resource selection and response generation. Existing works ignore the potential connection between these two phases. To model their connection, a Two-Phase Requirement Elicitation Method is proposed. For phase I, this paper proposes a user requirement elicitation framework (URef) to plan a potential requirement sequence grounded on the user profile and personal knowledge base before the conversation. In addition, it can predict the user's true requirement and judge whether the requirement is completed based on the user's utterance during the conversation. For phase II, this paper proposes an attention-based response generation model, SaRSNet. It selects the appropriate resource (i.e., knowledge triple) in line with the requirement predicted by URef, and then generates a suitable response for recommendation. Experimental results on the open dataset DuRecDial show significant improvement over the baseline, which demonstrates the effectiveness of the proposed methods.
CL · May 29, 2025 · Code
ScEdit: Script-based Assessment of Knowledge Editing
Xinye Li, Zunwen Zheng, Qian Zhang et al.
Knowledge Editing (KE) has gained increasing attention, yet current KE tasks remain relatively simple. Under current evaluation frameworks, many editing methods achieve exceptionally high scores, sometimes nearing perfection. However, few studies integrate KE into real-world application scenarios (e.g., recent interest in LLM-as-agent). To support our analysis, we introduce a novel script-based benchmark -- ScEdit (Script-based Knowledge Editing Benchmark) -- which encompasses both counterfactual and temporal edits. We integrate token-level and text-level evaluation methods, comprehensively analyzing existing KE techniques. The benchmark extends traditional fact-based ("What"-type question) evaluation to action-based ("How"-type question) evaluation. We observe that all KE methods exhibit a drop in performance on established metrics and face challenges on text-level metrics, indicating a challenging task. Our benchmark is available at https://github.com/asdfo123/ScEdit.
AI · Aug 1, 2024
HBot: A Chatbot for Healthcare Applications in Traditional Chinese Medicine Based on Human Body 3D Visualization
Bolin Zhang, Zhiwei Yi, Jiahao Wang et al.
The unique diagnosis and treatment techniques and remarkable clinical efficacy of traditional Chinese medicine (TCM) give it an important role in elderly care and healthcare, especially in the rehabilitation of some common chronic diseases of the elderly. Therefore, building a TCM chatbot for healthcare applications will help users obtain consultation services in a direct and natural way. However, concepts such as acupuncture points (acupoints) and meridians frequently appear in TCM consultations and cannot be displayed intuitively in text. To this end, we develop a healthcare chatbot (HBot) based on a 3D human body model and a knowledge graph, which provides conversational services such as knowledge Q&A, prescription recommendation, moxibustion therapy recommendation, and acupoint search. When specific acupoints are involved in the conversation between the user and HBot, the 3D body jumps to the corresponding acupoints and highlights them. Moreover, HBot can also be used in training scenarios to accelerate the teaching of TCM by intuitively displaying acupuncture points and knowledge cards. The demonstration video is available at https://www.youtube.com/watch?v=UhQhutSKkTU . Our code and dataset are publicly available at Gitee: https://gitee.com/plabrolin/interactive-3d-acup.git
CL · Feb 11
Beyond Confidence: The Rhythms of Reasoning in Generative Models
Deyuan Liu, Zecheng Wang, Zhanyue Qin et al.
Large Language Models (LLMs) exhibit impressive capabilities yet suffer from sensitivity to slight input context variations, hampering reliability. Conventional metrics like accuracy and perplexity fail to assess local prediction robustness, as normalized output probabilities can obscure the underlying resilience of an LLM's internal state to perturbations. We introduce the Token Constraint Bound ($\delta_{\mathrm{TCB}}$), a novel metric that quantifies the maximum internal state perturbation an LLM can withstand before its dominant next-token prediction significantly changes. Intrinsically linked to output embedding space geometry, $\delta_{\mathrm{TCB}}$ provides insights into the stability of the model's internal predictive commitment. Our experiments show $\delta_{\mathrm{TCB}}$ correlates with effective prompt engineering and uncovers critical prediction instabilities missed by perplexity during in-context learning and text generation. $\delta_{\mathrm{TCB}}$ offers a principled, complementary approach to analyze and potentially improve the contextual stability of LLM predictions.
CL · Jan 8, 2020 · Code
LTP: A New Active Learning Strategy for CRF-Based Named Entity Recognition
Mingyi Liu, Zhiying Tu, Tong Zhang et al.
In recent years, deep learning has achieved great success in many natural language processing tasks, including named entity recognition. The shortcoming is that a large amount of manually annotated data is usually required. Previous studies have demonstrated that active learning can reduce the cost of data annotation, but there is still plenty of room for improvement. In real applications, we found that existing uncertainty-based active learning strategies have two shortcomings. Firstly, these strategies prefer, explicitly or implicitly, to choose long sequences, which increases the annotation burden on annotators. Secondly, some strategies require intrusive modifications to the model to generate additional information for sample selection, which increases the developer's workload and the model's training and prediction time. In this paper, we first examine traditional active learning strategies on several typical datasets in the specific case of BiLSTM-CRF, which is widely used in named entity recognition. Then we propose an uncertainty-based active learning strategy called Lowest Token Probability (LTP), which combines the input and output of the CRF to select informative instances. LTP is a simple and powerful strategy that does not favor long sequences and does not require intrusive changes to the model. We test LTP on multiple datasets, and the experiments show that LTP performs slightly better than traditional strategies on both sentence-level accuracy and entity-level F1-score while requiring noticeably fewer annotated tokens. Related code has been released at https://github.com/HIT-ICES/AL-NER
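As a rough illustration of the selection idea in the abstract above, the sketch below scores each unlabeled sentence by the lowest probability among its tokens' predicted labels and picks the lowest-scoring sentences for annotation. The function names, the toy probabilities, and the assumption that per-token probabilities have already been computed by the tagger are all hypothetical; the authors' actual implementation is in the linked repository and may differ.

```python
from typing import List, Sequence


def lowest_token_probability(token_probs: Sequence[float]) -> float:
    """Score a sentence by its least confident token (lower = more uncertain)."""
    if not token_probs:
        raise ValueError("a sentence must contain at least one token")
    return min(token_probs)


def select_for_annotation(pool: List[Sequence[float]], k: int) -> List[int]:
    """Return indices of the k sentences with the lowest score."""
    ranked = sorted(range(len(pool)),
                    key=lambda i: lowest_token_probability(pool[i]))
    return ranked[:k]


# Hypothetical per-token probabilities of the predicted labels.
pool = [
    [0.99, 0.97, 0.95],        # short and confident
    [0.98, 0.40, 0.96, 0.97],  # one clearly uncertain token
    [0.90] * 12,               # long, but no token is very uncertain
]
selected = select_for_annotation(pool, k=1)
```

Because the score is a minimum rather than a sum over tokens, sentence length alone does not raise a sentence's priority in this sketch.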
CL · Feb 4, 2024
A Survey on Data Selection for LLM Instruction Tuning
Bolin Zhang, Jiahao Wang, Qianlong Du et al.
Instruction tuning is a vital step in training large language models (LLMs), so how to enhance its effect has received increased attention. Existing works indicate that the quality of the dataset is more crucial than its quantity during instruction tuning of LLMs. Therefore, many recent studies explore methods for selecting high-quality subsets from instruction datasets, aiming to reduce training costs and enhance the instruction-following capabilities of LLMs. This paper presents a comprehensive survey on data selection for LLM instruction tuning. Firstly, we introduce the widely used instruction datasets. Then, we propose a new taxonomy of data selection methods and provide a detailed introduction to recent advances, together with the evaluation strategies and results of data selection methods. Finally, we emphasize the open challenges and present new frontiers of this task.
CL · Mar 28, 2024
Checkpoint Merging via Bayesian Optimization in LLM Pretraining
Deyuan Liu, Zecheng Wang, Bingning Wang et al.
The rapid proliferation of large language models (LLMs) such as GPT-4 and Gemini underscores the intense demand for resources during their training, posing significant challenges due to substantial computational and environmental costs. To alleviate this issue, we propose checkpoint merging in LLM pretraining. This method utilizes LLM checkpoints with shared training trajectories and searches an extensive space for the best merging weight via Bayesian optimization. Through various experiments, we demonstrate that: (1) our proposed methodology can augment pretraining, offering substantial benefits at minimal cost; (2) our proposed methodology, despite requiring a given held-out dataset, still demonstrates robust generalization across diverse domains, a pivotal aspect of pretraining.
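To make the idea of a merging weight concrete, the sketch below linearly interpolates two checkpoints and picks the weight that minimizes a held-out loss. It is only an illustration under stated assumptions: a plain grid search stands in for the Bayesian optimization used in the paper, and the checkpoint layout, the function names, and the toy loss are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Checkpoint = Dict[str, List[float]]  # hypothetical layout: name -> flat weights


def merge_checkpoints(a: Checkpoint, b: Checkpoint, weight: float) -> Checkpoint:
    """Return weight * a + (1 - weight) * b, parameter by parameter."""
    if a.keys() != b.keys():
        raise ValueError("checkpoints must share the same parameter names")
    return {
        name: [weight * x + (1.0 - weight) * y for x, y in zip(a[name], b[name])]
        for name in a
    }


def search_merging_weight(
    a: Checkpoint,
    b: Checkpoint,
    held_out_loss: Callable[[Checkpoint], float],
    steps: int = 21,
) -> Tuple[float, float]:
    """Grid search over [0, 1]; a stand-in for Bayesian optimization."""
    candidates = [i / (steps - 1) for i in range(steps)]
    best = min(candidates,
               key=lambda w: held_out_loss(merge_checkpoints(a, b, w)))
    return best, held_out_loss(merge_checkpoints(a, b, best))


# Toy example: two "checkpoints" and a quadratic loss around a target vector.
ckpt_early: Checkpoint = {"w": [0.0, 2.0]}
ckpt_late: Checkpoint = {"w": [1.0, 1.0]}
target = [0.75, 1.25]


def toy_loss(ckpt: Checkpoint) -> float:
    return sum((p - t) ** 2 for p, t in zip(ckpt["w"], target))


best_weight, best_loss = search_merging_weight(ckpt_early, ckpt_late, toy_loss)
```

In practice, each loss evaluation is expensive for an LLM, which is the usual motivation for replacing an exhaustive grid with a sample-efficient optimizer.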
CL · May 21, 2025
LFTF: Locating First and Then Fine-Tuning for Mitigating Gender Bias in Large Language Models
Zhanyue Qin, Yue Ding, Deyuan Liu et al.
Large Language Models (LLMs) have attracted widespread attention due to their powerful performance. However, due to unavoidable exposure to socially biased data during training, LLMs tend to exhibit social biases, particularly gender bias. To better explore and quantify the degree of gender bias in LLMs, we propose a pair of datasets named GenBiasEval and GenHintEval. GenBiasEval evaluates the degree of gender bias in LLMs and is accompanied by an evaluation metric named AFGB-Score (Absolutely Fair Gender Bias Score). GenHintEval assesses whether LLMs can provide responses consistent with prompts that contain gender hints, along with the accompanying evaluation metric UB-Score (UnBias Score). In addition, to mitigate gender bias in LLMs more effectively, we present the LFTF (Locating First and Then Fine-Tuning) algorithm. The algorithm first ranks specific LLM blocks by their relevance to gender bias in descending order using a metric called BMI (Block Mitigating Importance Score). Based on this ranking, the block most strongly associated with gender bias is then fine-tuned using a carefully designed loss function. Numerous experiments show that our proposed LFTF algorithm can significantly mitigate gender bias in LLMs while maintaining their general capabilities.
CL · Jun 24, 2024
UNO Arena for Evaluating Sequential Decision-Making Capability of Large Language Models
Zhanyue Qin, Haochuan Wang, Deyuan Liu et al.
Sequential decision-making refers to algorithms that take into account the dynamics of the environment, where early decisions affect subsequent decisions. With large language models (LLMs) demonstrating powerful capabilities across tasks, we cannot help but ask: can current LLMs effectively make sequential decisions? To answer this question, we propose the UNO Arena, based on the card game UNO, to evaluate the sequential decision-making capability of LLMs, and we explain in detail why we choose UNO. In UNO Arena, we evaluate the sequential decision-making capability of LLMs dynamically with novel metrics based on Monte Carlo methods. We set up random players, DQN-based reinforcement learning players, and LLM players (e.g., GPT-4, Gemini-pro) for comparison testing. Furthermore, to improve the sequential decision-making capability of LLMs, we propose the TUTRI player, which has LLMs reflect on their own actions with a summary of the game history and the game strategy. Numerous experiments demonstrate that the TUTRI player achieves a notable breakthrough in sequential decision-making performance compared to the vanilla LLM player.
CL · Jun 24, 2024
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging
Deyuan Liu, Zhanyue Qin, Hairu Wang et al.
While large language models (LLMs) excel in many domains, their complexity and scale challenge deployment in resource-limited environments. Current compression techniques, such as parameter pruning, often fail to effectively utilize the knowledge from pruned parameters. To address these challenges, we propose Manifold-Based Knowledge Alignment and Layer Merging Compression (MKA), a novel approach that uses manifold learning and the Normalized Pairwise Information Bottleneck (NPIB) measure to merge similar layers, reducing model size while preserving essential performance. We evaluate MKA on multiple benchmark datasets and various LLMs. Our findings show that MKA not only preserves model performance but also achieves substantial compression ratios, outperforming traditional pruning methods. Moreover, when coupled with quantization, MKA delivers even greater compression. Specifically, on the MMLU dataset using the Llama3-8B model, MKA achieves a compression ratio of 43.75% with a minimal performance decrease of only 2.82%. The proposed MKA method offers a resource-efficient and performance-preserving model compression technique for LLMs.
SE · Aug 21, 2021
Data Correction and Evolution Analysis of the ProgrammableWeb Service Ecosystem
Mingyi Liu, Zhiying Tu, Yeqi Zhu et al.
Evolution analysis of Web service ecosystems has become a critical problem as the frequency of service changes on the Internet increases rapidly. Developers need to understand these evolution patterns to assist their decision-making on service selection. ProgrammableWeb is a popular Web service ecosystem on which several evolution analyses have been conducted in the literature. However, existing studies have ignored the quality issues of the ProgrammableWeb dataset and the issue of service obsolescence. In this study, we first report the quality issues identified in the ProgrammableWeb dataset from our empirical study. Then, we propose a novel method to correct the relevant evolution analysis data by estimating the life cycle of application programming interfaces (APIs) and mashups. We also show how to use three different dynamic network models in the service ecosystem evolution analysis based on the corrected ProgrammableWeb dataset. Our experimental experience reiterates the quality issues of the original ProgrammableWeb dataset and highlights several research opportunities.
AI · Aug 7, 2021
DySR: A Dynamic Representation Learning and Aligning based Model for Service Bundle Recommendation
Mingyi Liu, Zhiying Tu, Xiaofei Xu et al.
An increasing number and diversity of services are available, which poses significant challenges to effective service reuse during requirement satisfaction. Many service bundle recommendation studies have achieved remarkable results. However, there is still plenty of room for improvement in the performance of these methods. The fundamental problem with these studies is that they ignore the evolution of services over time and the representation gap between services and requirements. In this paper, we propose a dynamic representation learning and aligning based model called DySR to tackle these issues. DySR eliminates the representation gap between services and requirements by learning a transformation function and obtains service representations in an evolving social environment through dynamic graph representation learning. Extensive experiments conducted on a real-world dataset from ProgrammableWeb show that DySR outperforms existing state-of-the-art methods in commonly used evaluation metrics, improving $F1@5$ from $36.1\%$ to $69.3\%$.
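For readers unfamiliar with the $F1@5$ metric quoted above, the sketch below shows how F1@k is commonly computed for a single recommendation list. The abstract does not state the exact definition the authors use, so this is an assumption, and the service names are made up for illustration.

```python
from typing import List


def f1_at_k(recommended: List[str], relevant: List[str], k: int = 5) -> float:
    """Harmonic mean of precision@k and recall@k for one recommendation list."""
    top_k = recommended[:k]
    relevant_set = set(relevant)
    hits = sum(1 for item in top_k if item in relevant_set)
    if hits == 0:
        return 0.0
    precision = hits / len(top_k)      # fraction of the top k that is relevant
    recall = hits / len(relevant_set)  # fraction of relevant items retrieved
    return 2 * precision * recall / (precision + recall)


# Hypothetical ranked recommendations and ground-truth bundle.
recommended = ["maps", "sms", "payments", "weather", "photos", "email"]
relevant = ["maps", "payments", "email"]
score = f1_at_k(recommended, relevant, k=5)
```

Here two of the three relevant services appear in the top five, so precision is 2/5, recall is 2/3, and the resulting F1@5 is 0.5. A reported dataset-level figure would typically average this score over all test requirements.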
LG · Jun 3, 2021
Learning Representation over Dynamic Graph using Aggregation-Diffusion Mechanism
Mingyi Liu, Zhiying Tu, Xiaofei Xu et al.
Representation learning on evolving graphs has recently received significant attention due to its wide application scenarios, such as bioinformatics, knowledge graphs, and social networks. The propagation of information in graphs is important in learning dynamic graph representations, and most existing methods achieve this by aggregation. However, relying only on aggregation to propagate information in dynamic graphs can delay information propagation and thus affect the performance of the method. To alleviate this problem, we propose an aggregation-diffusion (AD) mechanism in which a node, after updating its embedding through aggregation, actively propagates information to its neighbors by diffusion. In experiments on two real-world datasets in the dynamic link prediction task, the AD mechanism outperforms the baseline models that only use aggregation to propagate information. We further conduct extensive experiments to discuss the influence of different factors in the AD mechanism.
SE · Sep 4, 2020
Domain Priori Knowledge based Integrated Solution Design for Internet of Services
Hanchuan Xu, Xiao Wang, Yuxin Wang et al.
Various types of services, such as web APIs, IoT services, O2O services, and many others, have flooded the Internet. Interconnections among these services have resulted in a new phenomenon called the "Internet of Services" (IoS). With IoS, people do not need to request multiple services themselves to fulfill their daily requirements; instead, an IoS platform is responsible for constructing integrated solutions for them. Since user requirements (URs) are usually coarse-grained and transboundary, IoS platforms have to integrate services from multiple domains to fulfill the requirements. Considering that there are too many available services in IoS, a big challenge is how to find a tradeoff between the construction efficiency and the precision of final solutions. To address this challenge, we introduce a framework and a platform for transboundary user requirement oriented solution design in IoS. The main idea is to make use of domain priori knowledge derived from the commonness and similarities among massive historical URs and among historical integrated service solutions (ISSs). Priori knowledge is classified into three types: requirement patterns (RPs), service patterns (SPs), and the probabilistic matching matrix (PMM) between RPs and SPs. A UR is modeled in the form of an intention tree (ITree) along with a set of constraints on intention nodes, and then optimal RPs are selected to cover the ITree as much as possible. By taking advantage of the PMM, a set of SPs is filtered out and composed to form the final ISS. Finally, the design of a platform supporting the above process is introduced.
AI · Sep 3, 2020
User Intention Recognition and Requirement Elicitation Method for Conversational AI Services
Junrui Tian, Zhiying Tu, Zhongjie Wang et al.
In recent years, chatbots have become a new type of intelligent terminal that guides users to consume services. However, the most common criticism is that the services they provide are not what users expect or most expect. This defect is mostly due to two problems: one is the incompleteness and uncertainty of users' requirement expression caused by information asymmetry; the other is that the diversity of service resources makes service selection difficult. A conversational bot is a typical mesh device, so guided multi-round Q&A is the most effective way to elicit user requirements. However, complex Q&A with too many rounds is tedious and leads to a bad user experience. Therefore, we aim to obtain user requirements as accurately as possible in as few rounds as possible. To achieve this, a user intention recognition method based on a Knowledge Graph (KG) was developed for fuzzy requirement inference, and a requirement elicitation method based on Granular Computing was proposed for dialog policy generation. Experimental results show that these two methods can effectively reduce the number of conversation rounds and can quickly and accurately identify the user intention.
SE · Aug 30, 2016
A New Paradigm of Software Service Engineering in the Era of Big Data and Big Service
Xiaofei Xu, Gianmario Motta, Xianzhi Wang et al.
Servitization has been one of the most significant trends reshaping the information world and society in recent years. The need to collect, store, process, and share Big Data has led to massive software resources being developed and made accessible as web-based services to facilitate these processes. These services that handle Big Data come from various domains and heterogeneous networks, and converge into a huge, complicated service network (or ecosystem), called the Big Service. The key issue facing the big data and big service ecosystem is how to optimally configure and operate the related service resources to serve the specific requirements of possible applications, i.e., how to reuse the existing service resources effectively and efficiently to develop new applications or software services that meet the massive individualized requirements of end-users. Based on an analysis of the big service ecosystem, we present in this paper a new paradigm for software service engineering, RE2SEP (Requirement-Engineering Two-Phase of Service Engineering Paradigm), which includes three components: service-oriented requirement engineering, domain-oriented service engineering, and a software service development approach. RE2SEP enables the rapid design and implementation of service solutions to match the requirement propositions of massive individualized customers in the Big Service ecosystem. A case study on people's mobility service in a smart city environment is given to demonstrate the application of RE2SEP. RE2SEP can potentially revolutionize traditional life-cycle oriented software engineering, leading to a new approach to software service engineering.