CLJul 1, 2024
A Global-Local Attention Mechanism for Relation ClassificationYiping Sun
Relation classification, a crucial component of relation extraction, involves identifying connections between two entities. Previous studies have predominantly focused on integrating the attention mechanism into relation classification at a global scale, overlooking the importance of the local context. To address this gap, this paper introduces a novel global-local attention mechanism for relation classification, which enhances global attention with a localized focus. Additionally, we propose innovative hard and soft localization mechanisms to identify potential keywords for local attention. By incorporating both hard and soft localization strategies, our approach offers a more nuanced and comprehensive understanding of the contextual cues that contribute to effective relation classification. Our experimental results on the SemEval-2010 Task 8 dataset highlight the superior performance of our method compared to previous attention-based approaches in relation classification.
22.3IRMay 11
CCD-Level and Load-Aware Thread Orchestration for In-Memory Vector ANNS on Multi-Core CPUsYuchen Huang, Baiteng Ma, Yiping Sun et al.
Vector approximate nearest neighbor search (ANNS) underpins search engines, recommendation systems, and advertising services. Recent advances in ANNS indexes make CPU a cost-effective choice for serving million-scale, in-memory vector search, yet per-core throughput remains constrained by memory access latency of vector reading and the compute intensity of distance evaluations in production deployments. With the growing scale of the business and advances in hardware, modern CCD-based multi-core CPUs have been widely deployed for high throughput in our services. However, we find that simply increasing core counts does not yield optimal performance scaling. To improve the efficiency of more cores from the CCD-based architecture, we analyze the distributions of real-world requests in our production environments. We observe high access locality in vector search in our online services and low cache utilization, resulting from overlooking the multi-chiplet nature of CCD based CPUs. Hence, we propose a workload- and hardware-aware thread orchestration framework at CCD-level that (i) provides a uniform interface for both inter-query parallel HNSW search and intra-query parallel IVF search, (ii) achieves cache-friendly and workload-adaptive mapping of task dispatching, and (iii) employs CCD-aware task stealing to address load imbalance. Applied to real production workloads from search, recommendation, and advertising services of Xiaohongshu (RedNote), our approach delivers up to 3.7x higher throughput and 30-90% reductions in P50 and P999 latency. In detail, compared with the original framework, the cache-miss ratio decreases by 6-30%, and the total CPU stall is reduced by 20-80%.
AIMar 9, 2024
Enhancing Multi-Hop Knowledge Graph Reasoning through Reward Shaping TechniquesChen Li, Haotian Zheng, Yiping Sun et al.
In the realm of computational knowledge representation, Knowledge Graph Reasoning (KG-R) stands at the forefront of facilitating sophisticated inferential capabilities across multifarious domains. The quintessence of this research elucidates the employment of reinforcement learning (RL) strategies, notably the REINFORCE algorithm, to navigate the intricacies inherent in multi-hop KG-R. This investigation critically addresses the prevalent challenges introduced by the inherent incompleteness of Knowledge Graphs (KGs), which frequently results in erroneous inferential outcomes, manifesting as both false negatives and misleading positives. By partitioning the Unified Medical Language System (UMLS) benchmark dataset into rich and sparse subsets, we investigate the efficacy of pre-trained BERT embeddings and Prompt Learning methodologies to refine the reward shaping process. This approach not only enhances the precision of multi-hop KG-R but also sets a new precedent for future research in the field, aiming to improve the robustness and accuracy of knowledge inference within complex KG frameworks. Our work contributes a novel perspective to the discourse on KG reasoning, offering a methodological advancement that aligns with the academic rigor and scholarly aspirations of the Natural journal, promising to invigorate further advancements in the realm of computational knowledge representation.
IRApr 10, 2024
TransTARec: Time-Adaptive Translating Embedding Model for Next POI RecommendationYiping Sun
The rapid growth of location acquisition technologies makes Point-of-Interest(POI) recommendation possible due to redundant user check-in records. In this paper, we focus on next POI recommendation in which next POI is based on previous POI. We observe that time plays an important role in next POI recommendation but is neglected in the recent proposed translating embedding methods. To tackle this shortage, we propose a time-adaptive translating embedding model (TransTARec) for next POI recommendation that naturally incorporates temporal influence, sequential dynamics, and user preference within a single component. Methodologically, we treat a (previous timestamp, user, next timestamp) triplet as a union translation vector and develop a neural-based fusion operation to fuse user preference and temporal influence. The superiority of TransTARec, which is confirmed by extensive experiments on real-world datasets, comes from not only the introduction of temporal influence but also the direct unification with user preference and sequential dynamics.
LGFeb 22, 2020
Interpretable Crowd Flow Prediction with Spatial-Temporal Self-AttentionHaoxing Lin, Weijia Jia, Yongjian You et al.
Crowd flow prediction has been increasingly investigated in intelligent urban computing field as a fundamental component of urban management system. The most challenging part of predicting crowd flow is to measure the complicated spatial-temporal dependencies. A prevalent solution employed in current methods is to divide and conquer the spatial and temporal information by various architectures (e.g., CNN/GCN, LSTM). However, this strategy has two disadvantages: (1) the sophisticated dependencies are also divided and therefore partially isolated; (2) the spatial-temporal features are transformed into latent representations when passing through different architectures, making it hard to interpret the predicted crowd flow. To address these issues, we propose a Spatial-Temporal Self-Attention Network (STSAN) with an ST encoding gate that calculates the entire spatial-temporal representation with positional and time encodings and therefore avoids dividing the dependencies. Furthermore, we develop a Multi-aspect attention mechanism that applies scaled dot-product attention over spatial-temporal information and measures the attention weights that explicitly indicate the dependencies. Experimental results on traffic and mobile data demonstrate that the proposed method reduces inflow and outflow RMSE by 16% and 8% on the Taxi-NYC dataset compared to the SOTA baselines.
LGDec 13, 2019
Spatial-Temporal Self-Attention Network for Flow PredictionHaoxing Lin, Weijia Jia, Yiping Sun et al.
Flow prediction (e.g., crowd flow, traffic flow) with features of spatial-temporal is increasingly investigated in AI research field. It is very challenging due to the complicated spatial dependencies between different locations and dynamic temporal dependencies among different time intervals. Although measurements of both dependencies are employed, existing methods suffer from the following two problems. First, the temporal dependencies are measured either uniformly or bias against long-term dependencies, which overlooks the distinctive impacts of short-term and long-term temporal dependencies. Second, the existing methods capture spatial and temporal dependencies independently, which wrongly assumes that the correlations between these dependencies are weak and ignores the complicated mutual influences between them. To address these issues, we propose a Spatial-Temporal Self-Attention Network (ST-SAN). As the path-length of attending long-term dependency is shorter in the self-attention mechanism, the vanishing of long-term temporal dependencies is prevented. In addition, since our model relies solely on attention mechanisms, the spatial and temporal dependencies can be simultaneously measured. Experimental results on real-world data demonstrate that, in comparison with state-of-the-art methods, our model reduces the root mean square errors by 9% in inflow prediction and 4% in outflow prediction on Taxi-NYC data, which is very significant compared to the previous improvement.