Forecasting Clicks in Digital Advertising: Multimodal Inputs and Interpretable Outputs
This work addresses the need for more accurate and interpretable click forecasts in digital advertising, which affect revenue and campaign strategy, though the contribution appears incremental because it builds on existing multimodal approaches.
The paper tackles click-volume forecasting in digital advertising by integrating multimodal inputs, including textual logs, and reports improved accuracy and reasoning quality over baselines on a large-scale industry dataset.
Forecasting click volume is a key task in digital advertising, influencing both revenue and campaign strategy. Traditional time series models rely solely on numerical data, often overlooking rich contextual information embedded in textual elements, such as keyword updates. We present a multimodal forecasting framework that combines click data with textual logs from real-world ad campaigns and generates human-interpretable explanations alongside numeric predictions. Reinforcement learning is used to improve comprehension of textual information and enhance fusion of modalities. Experiments on a large-scale industry dataset show that our method outperforms baselines in both accuracy and reasoning quality.
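To make the idea of combining click data with textual logs concrete, here is a minimal sketch of one way the two modalities could be aligned by date into a single model input. It is not the paper's pipeline: the `LogEntry` type, the `build_multimodal_input` function, the line format, and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class LogEntry:
    """A hypothetical campaign log record: the day of a change and its description."""
    day: date
    text: str


def build_multimodal_input(start, clicks, logs):
    """Pair each day's click count with the textual log entries from that day.

    `clicks` is a list of daily click counts beginning at `start`.
    Returns one text line per day, suitable as input to a text-based forecaster.
    """
    # Group log texts by day so each day can be looked up directly.
    by_day = {}
    for entry in logs:
        by_day.setdefault(entry.day, []).append(entry.text)

    lines = []
    for offset, count in enumerate(clicks):
        day = start + timedelta(days=offset)
        notes = "; ".join(by_day.get(day, [])) or "no changes logged"
        lines.append(f"{day.isoformat()} clicks={count} log: {notes}")
    return "\n".join(lines)


# Illustrative, made-up values (not taken from the paper's dataset).
history = build_multimodal_input(
    date(2024, 1, 1),
    [120, 135, 90],
    [LogEntry(date(2024, 1, 2), "added keyword 'running shoes'")],
)
print(history)
```

A purely numerical time series model would see only the sequence 120, 135, 90. The aligned representation also carries the keyword update next to the day it happened, which is the kind of context the abstract says such models overlook.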