CLFeb 19, 2023
HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained TransformersChen Liang, Haoming Jiang, Zheng Li et al. · gatech
Knowledge distillation has been shown to be a powerful model compression approach to facilitate the deployment of pre-trained language models in practice. This paper focuses on task-agnostic distillation. It produces a compact pre-trained model that can be easily fine-tuned on various tasks with small computational costs and memory footprints. Despite the practical benefits, task-agnostic distillation is challenging. Since the teacher model has a significantly larger capacity and stronger representation power than the student model, it is very difficult for the student to produce predictions that match the teacher's over a massive amount of open-domain training data. Such a large prediction discrepancy often diminishes the benefits of knowledge distillation. To address this challenge, we propose Homotopic Distillation (HomoDistil), a novel task-agnostic distillation approach equipped with iterative pruning. Specifically, we initialize the student model from the teacher model, and iteratively prune the student's neurons until the target width is reached. Such an approach maintains a small discrepancy between the teacher's and student's predictions throughout the distillation process, which ensures the effectiveness of knowledge transfer. Extensive experiments demonstrate that HomoDistil achieves significant improvements on existing baselines.
IRAug 7, 2023
Heterogeneous Knowledge Fusion: A Novel Approach for Personalized Recommendation via LLMBin Yin, Junjie Xie, Yu Qin et al.
The analysis and mining of user heterogeneous behavior are of paramount importance in recommendation systems. However, the conventional approach of incorporating various types of heterogeneous behavior into recommendation models leads to feature sparsity and knowledge fragmentation issues. To address this challenge, we propose a novel approach for personalized recommendation via Large Language Model (LLM), by extracting and fusing heterogeneous knowledge from user heterogeneous behavior information. In addition, by combining heterogeneous knowledge and recommendation tasks, instruction tuning is performed on LLM for personalized recommendations. The experimental results demonstrate that our method can effectively integrate user heterogeneous behavior and significantly improve recommendation performance.
CLSep 9, 2024Code
LegiLM: A Fine-Tuned Legal Language Model for Data ComplianceLinkai Zhu, Lu Yang, Chaofan Li et al.
Ensuring compliance with international data protection standards for privacy and data security is a crucial but complex task, often requiring substantial legal expertise. This paper introduces LegiLM, a novel legal language model specifically tailored for consulting on data or information compliance. LegiLM leverages a pre-trained GDPR Fines dataset and has been fine-tuned to automatically assess whether particular actions or events breach data security and privacy regulations. By incorporating a specialized dataset that includes global data protection laws, meticulously annotated policy documents, and relevant privacy policies, LegiLM is optimized for addressing data compliance challenges. The model integrates advanced legal reasoning methods and information retrieval enhancements to enhance accuracy and reliability in practical legal consulting scenarios. Our evaluation using a custom benchmark dataset demonstrates that LegiLM excels in detecting data regulation breaches, offering sound legal justifications, and recommending necessary compliance modifications, setting a new benchmark for AI-driven legal compliance solutions. Our resources are publicly available at https://github.com/DAOLegalAI/LegiLM
IRAug 21, 2024
LARR: Large Language Model Aided Real-time Scene Recommendation with Semantic UnderstandingZhizhong Wan, Bin Yin, Junjie Xie et al.
Click-Through Rate (CTR) prediction is crucial for Recommendation System(RS), aiming to provide personalized recommendation services for users in many aspects such as food delivery, e-commerce and so on. However, traditional RS relies on collaborative signals, which lacks semantic understanding to real-time scenes. We also noticed that a major challenge in utilizing Large Language Models (LLMs) for practical recommendation purposes is their efficiency in dealing with long text input. To break through the problems above, we propose Large Language Model Aided Real-time Scene Recommendation(LARR), adopt LLMs for semantic understanding, utilizing real-time scene information in RS without requiring LLM to process the entire real-time scene text directly, thereby enhancing the efficiency of LLM-based CTR modeling. Specifically, recommendation domain-specific knowledge is injected into LLM and then RS employs an aggregation encoder to build real-time scene information from separate LLM's outputs. Firstly, a LLM is continual pretrained on corpus built from recommendation data with the aid of special tokens. Subsequently, the LLM is fine-tuned via contrastive learning on three kinds of sample construction strategies. Through this step, LLM is transformed into a text embedding model. Finally, LLM's separate outputs for different scene features are aggregated by an encoder, aligning to collaborative signals in RS, enhancing the performance of recommendation model.
HCFeb 24, 2025
Teleology-Driven Affective Computing: A Causal Framework for Sustained Well-BeingBin Yin, Chong-Yi Liu, Liya Fu et al.
Affective computing has made significant strides in emotion recognition and generation, yet current approaches mainly focus on short-term pattern recognition and lack a comprehensive framework to guide affective agents toward long-term human well-being. To address this, we propose a teleology-driven affective computing framework that unifies major emotion theories (basic emotion, appraisal, and constructivist approaches) under the premise that affect is an adaptive, goal-directed process that facilitates survival and development. Our framework emphasizes aligning agent responses with both personal/individual and group/collective well-being over extended timescales. We advocate for creating a "dataverse" of personal affective events, capturing the interplay between beliefs, goals, actions, and outcomes through real-world experience sampling and immersive virtual reality. By leveraging causal modeling, this "dataverse" enables AI systems to infer individuals' unique affective concerns and provide tailored interventions for sustained well-being. Additionally, we introduce a meta-reinforcement learning paradigm to train agents in simulated environments, allowing them to adapt to evolving affective concerns and balance hierarchical goals - from immediate emotional needs to long-term self-actualization. This framework shifts the focus from statistical correlations to causal reasoning, enhancing agents' ability to predict and respond proactively to emotional challenges, and offers a foundation for developing personalized, ethically aligned affective systems that promote meaningful human-AI interactions and societal well-being.