TBIN: Modeling Long Textual Behavior Data for CTR Prediction
This work addresses the challenge of handling long textual behavior data in CTR prediction for recommendation systems, representing an incremental improvement over existing LM-based methods.
The paper tackles the problem of modeling long user behavior data for CTR prediction by proposing TBIN, which combines efficient locality-sensitive hashing with shifted chunk-based self-attention to avoid truncation and to dynamically activate diverse user interests. Offline and online experiments on a real-world food recommendation platform demonstrate its effectiveness.
Click-through rate (CTR) prediction plays a pivotal role in the success of recommendations. Inspired by the recent success of language models (LMs), a growing body of work improves prediction by organizing user behavior data in a \textbf{textual} format and using LMs to understand user interest at a semantic level. While promising, these works have to truncate the textual data to limit the quadratic computational overhead of self-attention in LMs. However, prior studies have shown that long user behavior data can significantly benefit CTR prediction. In addition, these works typically condense a user's diverse interests into a single feature vector, which limits the expressive capability of the model. In this paper, we propose a \textbf{T}extual \textbf{B}ehavior-based \textbf{I}nterest Chunking \textbf{N}etwork (TBIN), which tackles the above limitations by combining an efficient locality-sensitive hashing algorithm with a shifted chunk-based self-attention. The resulting diverse user interests are dynamically activated to produce a user interest representation tailored to the target item. Finally, the results of both offline and online experiments on a real-world food recommendation platform demonstrate the effectiveness of TBIN.
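The abstract does not specify the architecture in detail, so the following is only a minimal NumPy sketch of the general idea, not TBIN itself. It assumes random-hyperplane hashing as the locality-sensitive hash, attention without learned projections, mean pooling to obtain one interest vector per chunk, and a softmax over dot products as the target-aware activation. All function names and parameters (`simhash_codes`, `chunk_self_attention`, `interest_chunks`, `activate`, `chunk_size`, `n_bits`) are hypothetical.

```python
import numpy as np

def simhash_codes(x, n_bits, rng):
    # Assumed LSH variant: random hyperplanes; the sign pattern of the
    # projections is packed into an integer bucket id per behavior.
    planes = rng.standard_normal((x.shape[1], n_bits))
    bits = (x @ planes > 0).astype(np.int64)
    return bits @ (1 << np.arange(n_bits))

def chunk_self_attention(x, chunk_size, shift=0):
    # Self-attention restricted to fixed-size chunks, so the cost grows
    # with n * chunk_size rather than n ** 2. A non-zero shift rolls the
    # sequence so that chunk boundaries move between layers.
    n, d = x.shape
    assert n % chunk_size == 0, "sketch assumes n is a multiple of chunk_size"
    rolled = np.roll(x, -shift, axis=0)
    chunks = rolled.reshape(n // chunk_size, chunk_size, d)
    scores = chunks @ chunks.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    out = (weights @ chunks).reshape(n, d)
    return np.roll(out, shift, axis=0)  # undo the shift

def interest_chunks(behaviors, chunk_size=4, n_bits=4, seed=0):
    # Sort behaviors by hash code so similar ones tend to share a chunk,
    # apply plain and shifted chunk attention, then pool each chunk.
    rng = np.random.default_rng(seed)
    codes = simhash_codes(behaviors, n_bits, rng)
    order = np.argsort(codes, kind="stable")
    h = chunk_self_attention(behaviors[order], chunk_size, shift=0)
    h = chunk_self_attention(h, chunk_size, shift=chunk_size // 2)
    return h.reshape(-1, chunk_size, h.shape[1]).mean(axis=1)

def activate(interests, target):
    # Assumed activation: softmax over interest-target dot products,
    # giving a target-dependent weighted sum of interest vectors.
    scores = interests @ target / np.sqrt(target.shape[0])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ interests

rng = np.random.default_rng(1)
behaviors = rng.standard_normal((16, 8))  # 16 behavior embeddings, dim 8
interests = interest_chunks(behaviors)    # one vector per chunk
user_vec = activate(interests, behaviors[0])
```

With 16 behaviors and a chunk size of 4, each attention layer computes four 4x4 score matrices instead of one 16x16 matrix, which illustrates how chunking avoids the quadratic cost that otherwise forces truncation.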