AD3: Attentive Deep Document Dater
The paper tackles the problem of predicting document creation dates when metadata is missing or unreliable. It proposes AD3, an attention-based neural system that achieves state-of-the-art results on multiple real-world datasets.
The work addresses a practical issue for web document analysis tasks such as summarization and event extraction, though the contribution is incremental because it builds on existing neural methods.
Knowledge of the creation date of documents facilitates several tasks, such as summarization, event extraction, and temporally focused information extraction. Unfortunately, for most documents on the Web, the time-stamp metadata is either missing or cannot be trusted. Thus, predicting creation time from the document content itself is an important task. In this paper, we propose Attentive Deep Document Dater (AD3), an attention-based neural document dating system that utilizes both context and temporal information in documents in a flexible and principled manner. We perform extensive experiments on multiple real-world datasets to demonstrate the effectiveness of AD3 over neural and non-neural baselines.
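To make the idea of attention-based document dating concrete, the following is a minimal sketch of generic dot-product attention pooling followed by a softmax over candidate years. It is not the AD3 architecture, which the abstract does not specify in detail. The function names, the query vector, and the per-year weight rows are all hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(token_vectors, query):
    """Weight each token vector by its dot product with a query vector.

    Assumption: `query` stands in for a learned attention parameter.
    Returns the pooled document vector and the attention weights.
    """
    scores = [sum(t_i * q_i for t_i, q_i in zip(t, query)) for t in token_vectors]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * t[d] for w, t in zip(weights, token_vectors)) for d in range(dim)
    ]
    return pooled, weights


def predict_year(token_vectors, query, year_weights, years):
    """Score each candidate year with a linear layer over the pooled vector.

    Assumption: `year_weights` holds one hypothetical weight row per year.
    Returns the most probable year and the full probability distribution.
    """
    pooled, _ = attention_pool(token_vectors, query)
    logits = [sum(p * w for p, w in zip(pooled, row)) for row in year_weights]
    probs = softmax(logits)
    best = max(range(len(years)), key=lambda i: probs[i])
    return years[best], probs
```

In this sketch, document dating is framed as classification over a fixed set of candidate years, and the attention weights indicate which tokens contributed most to the prediction. A real system would learn the query and year weights from data rather than take them as inputs.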