CV | Oct 12, 2022 | Code
Masked Motion Encoding for Self-Supervised Video Representation Learning
Xinyu Sun, Peihao Chen, Liangwei Chen et al.
Learning discriminative video representations from unlabeled videos is challenging but crucial for video analysis. The latest attempts seek to learn a representation model by predicting the appearance contents in the masked regions. However, simply masking and recovering appearance contents may not be sufficient to model temporal clues, as the appearance contents can be easily reconstructed from a single frame. To overcome this limitation, we present Masked Motion Encoding (MME), a new pre-training paradigm that reconstructs both appearance and motion information to explore temporal clues. In MME, we focus on addressing two critical challenges to improve the representation performance: 1) how to represent possible long-term motion across multiple frames; and 2) how to obtain fine-grained temporal clues from sparsely sampled videos. Motivated by the fact that humans are able to recognize an action by tracking objects' position changes and shape changes, we propose to reconstruct a motion trajectory that represents these two kinds of change in the masked regions. In addition, given the sparse video input, we require the model to reconstruct dense motion trajectories in both spatial and temporal dimensions. Pre-trained with our MME paradigm, the model is able to anticipate long-term and fine-grained motion details. Code is available at https://github.com/XinyuSun/MME.
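As a rough illustration of the "position change" half of a motion trajectory, the sketch below turns a tracked point's per-frame positions into frame-to-frame displacements. The abstract does not say how points are tracked or how the reconstruction target is encoded, so the function name `position_trajectory` and the input layout are hypothetical, not the authors' implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def position_trajectory(points: List[Point]) -> List[Point]:
    """Return frame-to-frame (dx, dy) displacements for one tracked point.

    Hypothetical helper: `points` holds the (x, y) position of the same
    point in consecutive frames. The actual MME target may differ.
    """
    return [
        (x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]


# A point tracked over three frames yields two displacement vectors.
track = [(10.0, 5.0), (12.0, 5.0), (15.0, 7.0)]
trajectory = position_trajectory(track)
```

A trajectory over T frames has T - 1 displacement vectors, so a longer tracking window gives the model a longer-term motion target to predict.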
SE | May 31, 2025 | Code
CoQuIR: A Comprehensive Benchmark for Code Quality-Aware Information Retrieval
Jiahui Geng, Fengyu Cai, Shaobo Cui et al.
Code retrieval is essential in modern software development, as it boosts code reuse and accelerates debugging. However, current benchmarks primarily emphasize functional relevance while neglecting critical dimensions of software quality. Motivated by this gap, we introduce CoQuIR, the first large-scale, multilingual benchmark specifically designed to evaluate quality-aware code retrieval across four key dimensions: correctness, efficiency, security, and maintainability. CoQuIR provides fine-grained quality annotations for 42,725 queries and 134,907 code snippets in 11 programming languages, and is accompanied by two quality-centric evaluation metrics: Pairwise Preference Accuracy and Margin-based Ranking Score. Using CoQuIR, we benchmark 23 retrieval models, covering both open-source and proprietary systems, and find that even top-performing models frequently fail to distinguish buggy or insecure code from its more robust counterparts. Furthermore, we conduct preliminary investigations into training methods that explicitly encourage retrievers to recognize code quality. Using synthetic datasets, we demonstrate promising improvements in quality-aware metrics across various models, without sacrificing semantic relevance. Downstream code generation experiments further validate the effectiveness of our approach. Overall, our work highlights the importance of integrating quality signals into code retrieval systems, laying the groundwork for more trustworthy and robust software development tools.
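The abstract names Pairwise Preference Accuracy without defining it. One plausible reading, sketched below as an assumption, is the fraction of (higher-quality, lower-quality) snippet pairs for which the retriever assigns the higher-quality snippet the larger score. The function name, the tie handling, and the sample scores are illustrative only and are not taken from the benchmark.

```python
from typing import Sequence, Tuple


def pairwise_preference_accuracy(
    score_pairs: Sequence[Tuple[float, float]],
) -> float:
    """Fraction of pairs where the preferred snippet outscores the other.

    Assumed definition: each pair is (score of higher-quality snippet,
    score of lower-quality snippet) for the same query. Ties count as
    failures here, which is a design choice of this sketch.
    """
    if not score_pairs:
        raise ValueError("need at least one score pair")
    wins = sum(1 for good, bad in score_pairs if good > bad)
    return wins / len(score_pairs)


# Made-up retriever scores: two correct preferences, one reversal, one tie.
pairs = [(0.9, 0.4), (0.3, 0.7), (0.8, 0.8), (0.6, 0.1)]
ppa = pairwise_preference_accuracy(pairs)
```

Under this reading, a retriever that ignores quality entirely would score near 0.5 on pairs that are equally relevant, which is why a metric of this kind can expose failures that relevance-only metrics hide.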
CL | Oct 21, 2020
A Weighted Heterogeneous Graph Based Dialogue System
Xinyan Zhao, Liangwei Chen, Huanhuan Chen
Knowledge-based dialogue systems have attracted increasing research interest in diverse applications. However, for disease diagnosis, the widely used knowledge graph struggles to represent symptom-symptom relations and symptom-disease relations, since the edges of a traditional knowledge graph are unweighted. Most research on disease diagnosis dialogue systems relies heavily on data-driven methods and statistical features, and lacks a deep understanding of symptom-disease and symptom-symptom relations. To tackle this issue, this work presents a weighted heterogeneous graph-based dialogue system for disease diagnosis. Specifically, we build a weighted heterogeneous graph based on symptom co-occurrence and a proposed symptom frequency-inverse disease frequency. This work then proposes a graph-based deep Q-network (Graph-DQN) for dialogue management. By combining a Graph Convolutional Network (GCN) with DQN to learn the embeddings of diseases and symptoms from both the structural and attribute information in the weighted heterogeneous graph, Graph-DQN can better capture the symptom-disease and symptom-symptom relations. Experimental results show that the proposed dialogue system rivals the state-of-the-art models. More importantly, the proposed dialogue system can complete the task with fewer dialogue turns and possesses a better ability to distinguish diseases with similar symptoms.
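The abstract does not give a formula for symptom frequency-inverse disease frequency. The sketch below assumes it mirrors TF-IDF: a symptom's relative frequency within one disease's cases, multiplied by the log of (number of diseases / number of diseases showing that symptom). The function name `sf_idf`, the record layout, and the toy data are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def sf_idf(records: Dict[str, List[List[str]]]) -> Dict[Tuple[str, str], float]:
    """Weight each (symptom, disease) edge by a TF-IDF-style score.

    Assumed formula, by analogy with TF-IDF (not taken from the paper):
    weight = (share of the disease's symptom mentions) * log(N / n_symptom),
    where N is the number of diseases and n_symptom counts the diseases
    in which the symptom appears at least once.
    """
    n_diseases = len(records)
    # Number of distinct diseases in which each symptom appears.
    diseases_with = Counter()
    for cases in records.values():
        diseases_with.update({s for case in cases for s in case})

    weights: Dict[Tuple[str, str], float] = {}
    for disease, cases in records.items():
        counts = Counter(s for case in cases for s in case)
        total = sum(counts.values())
        for symptom, count in counts.items():
            idf = math.log(n_diseases / diseases_with[symptom])
            weights[(symptom, disease)] = (count / total) * idf
    return weights


# Toy patient records: disease -> list of per-patient symptom lists.
records = {
    "flu": [["fever", "cough"], ["fever", "headache"]],
    "cold": [["cough", "sneezing"], ["cough"]],
}
weights = sf_idf(records)
```

With this weighting, a symptom shared by every disease (here "cough") gets weight zero, while a symptom specific to one disease gets a positive weight, which matches the stated goal of distinguishing diseases with similar symptoms.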
CL | Oct 2, 2020
Continual Learning for Natural Language Generation in Task-oriented Dialog Systems
Fei Mi, Liangwei Chen, Mengjie Zhao et al.
Natural language generation (NLG) is an essential component of task-oriented dialog systems. Despite the recent success of neural approaches for NLG, they are typically developed in an offline manner for particular domains. To better fit real-life applications where new data come in a stream, we study NLG in a "continual learning" setting, in which a model incrementally expands its knowledge to new domains or functionalities. The major challenge towards this goal is catastrophic forgetting, meaning that a continually trained model tends to forget the knowledge it has learned before. To this end, we propose a method called ARPER (Adaptively Regularized Prioritized Exemplar Replay) that replays prioritized historical exemplars, together with an adaptive regularization technique based on Elastic Weight Consolidation. Extensive experiments to continually learn new domains and intents are conducted on MultiWoZ-2.0 to benchmark ARPER against a wide range of techniques. Empirical results demonstrate that ARPER significantly outperforms other methods by effectively mitigating catastrophic forgetting.
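For context, standard Elastic Weight Consolidation adds a quadratic penalty that discourages parameters important to earlier tasks from drifting: (lambda / 2) * sum_i F_i * (theta_i - theta_old_i)^2, where F_i is a diagonal Fisher information estimate. The sketch below computes only that standard penalty on plain Python lists. How ARPER adapts the regularization strength is not described in the abstract and is not modeled here, and the numbers are made up.

```python
from typing import Sequence


def ewc_penalty(
    params: Sequence[float],
    old_params: Sequence[float],
    fisher: Sequence[float],
    lam: float,
) -> float:
    """Standard EWC penalty: (lam / 2) * sum(F_i * (theta_i - theta_old_i)^2).

    `fisher` holds per-parameter importance estimates from earlier tasks.
    This is the generic EWC term, not ARPER's adaptive variant.
    """
    if not (len(params) == len(old_params) == len(fisher)):
        raise ValueError("all inputs must have the same length")
    return 0.5 * lam * sum(
        f * (p - q) ** 2 for p, q, f in zip(params, old_params, fisher)
    )


# Only the first parameter moved, so only its Fisher weight contributes.
penalty = ewc_penalty(
    params=[1.0, 2.0],
    old_params=[0.0, 2.0],
    fisher=[4.0, 10.0],
    lam=0.5,
)
```

In training, this term would be added to the loss on the new domain, so parameters with large Fisher values stay close to their earlier values while less important ones remain free to adapt.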