CL · Jan 10, 2025

Automating Date Format Detection for Data Visualization

arXiv:2501.05640v1 · 5 citations · h-index: 1 · 2025 International Conference on Advanced Machine Learning and Data Science (AMLDS)
Originality: Incremental advance
AI Analysis

This paper addresses the data preparation challenge that users face in data visualization and analytics, though the contribution is incremental because it builds on existing methods for format detection.

The paper tackles date parsing as a bottleneck in analytic workflows by presenting two algorithms for automatic date format detection. Both achieve over 90% accuracy on a large corpus of data columns, which streamlines data preparation.

Data preparation, specifically date parsing, is a significant bottleneck in analytic workflows. To address this, we present two algorithms, one based on minimum entropy and the other on natural language modeling, that automatically derive date formats from string data. These algorithms achieve over 90% accuracy on a large corpus of data columns, streamlining the data preparation process within visualization environments. The minimum entropy approach is particularly fast, providing interactive feedback. Our methods simplify date format extraction, making them suitable for integration into data visualization tools and databases.
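The abstract does not spell out either algorithm, so the following is only a minimal sketch of how an entropy-guided detector could work, not the paper's method. It assumes a small hypothetical list of candidate `strptime` patterns (`CANDIDATE_FORMATS`), prefers the candidate that parses the most values in a column, and breaks ties by the lowest Shannon entropy of the parsed month field. That tie-breaker is an assumed heuristic, chosen because the month field has at most 12 distinct values and tends to vary less than the day field.

```python
import math
from collections import Counter
from datetime import datetime

# Hypothetical candidate list (an assumption for illustration);
# a real tool would cover far more patterns.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%m/%d/%Y", "%d %b %Y"]


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def try_parse(value, fmt):
    """Return a datetime if value matches fmt, otherwise None."""
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        return None


def detect_date_format(column):
    """Pick the candidate that parses the most values; break ties by the
    lowest entropy of the parsed month field (an assumed heuristic,
    not taken from the paper)."""
    best = None
    for fmt in CANDIDATE_FORMATS:
        parsed = [d for d in (try_parse(v, fmt) for v in column) if d is not None]
        if not parsed:
            continue
        coverage = len(parsed) / len(column)
        month_entropy = shannon_entropy([d.month for d in parsed])
        # Tuples compare left to right: higher coverage wins first,
        # then lower month entropy.
        score = (-coverage, month_entropy)
        if best is None or score < best[0]:
            best = (score, fmt)
    return best[1] if best else None
```

For a column such as `["13/01/2024", "14/01/2024"]`, only the day-first pattern parses, so coverage alone decides. For an ambiguous column such as `["01/02/2024", "01/03/2024", "01/04/2024"]`, both slash patterns parse every value, and the tie-breaker selects `%m/%d/%Y` because it yields a constant month.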
