Chronocept: Instilling a Sense of Time in Machines
Chronocept addresses a foundational gap in AI's temporal reasoning, with relevance to applications such as knowledge grounding and fact-checking. The contribution appears incremental, however, because it builds on existing temporal reasoning concepts and adds a new benchmark.
The paper tackles AI's inability to reason about temporal validity by introducing Chronocept, the first benchmark that models temporal validity as a continuous probability distribution over time. Its two datasets show strong inter-annotator agreement (84% and 89%), and its baselines outperform classification-based approaches.
Human cognition is deeply intertwined with a sense of time, known as Chronoception. This sense allows us to judge how long facts remain valid and when knowledge becomes outdated. Despite progress in vision, language, and motor control, AI still struggles to reason about temporal validity. We introduce Chronocept, the first benchmark to model temporal validity as a continuous probability distribution over time. Using skew-normal curves fitted along semantically decomposed temporal axes, Chronocept captures nuanced patterns of emergence, decay, and peak relevance. It includes two datasets: Benchmark I (atomic facts) and Benchmark II (multi-sentence passages). Annotations show strong inter-annotator agreement (84% and 89%). Our baselines predict curve parameters - location, scale, and skewness - enabling interpretable, generalizable learning and outperforming classification-based approaches. Chronocept fills a foundational gap in AI's temporal reasoning, supporting applications in knowledge grounding, fact-checking, retrieval-augmented generation (RAG), and proactive agents. Code and data are publicly available.
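To make the curve parameterization concrete, the following is a minimal sketch of a skew-normal density driven by the three parameters the abstract names (location, scale, skewness). It is not the authors' released code: the function names, the example parameter values, and the treatment of the time axis as a plain numeric coordinate are all assumptions made for illustration.

```python
import math


def skew_normal_pdf(x, location, scale, skewness):
    """Standard skew-normal density evaluated at x.

    location: shifts the curve along the time axis
    scale:    spreads the curve (must be positive)
    skewness: shape parameter; 0 gives a symmetric normal curve,
              positive values lengthen the right tail
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    z = (x - location) / scale
    # Standard normal density at z.
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    # Standard normal CDF at skewness * z, written with the error function.
    big_phi = 0.5 * (1.0 + math.erf(skewness * z / math.sqrt(2.0)))
    return (2.0 / scale) * phi * big_phi


def validity_curve(location, scale, skewness, start, stop, steps=200):
    """Sample the density on an evenly spaced grid (hypothetical helper)."""
    step = (stop - start) / (steps - 1)
    xs = [start + i * step for i in range(steps)]
    return [(x, skew_normal_pdf(x, location, scale, skewness)) for x in xs]


if __name__ == "__main__":
    # Illustrative values only: a curve that rises quickly and decays slowly.
    curve = validity_curve(location=0.0, scale=1.0, skewness=3.0,
                           start=-4.0, stop=6.0)
    peak_x, peak_y = max(curve, key=lambda point: point[1])
    print(f"approximate peak at x={peak_x:.2f}, density={peak_y:.3f}")
```

Under this reading, a model that predicts the three numbers fully determines the curve. That is one way to understand the abstract's claim that parameter prediction is more interpretable than assigning a discrete class label.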