CV, IR, LG | Oct 19, 2024

Visual Navigation of Digital Libraries: Retrieval and Classification of Images in the National Library of Norway's Digitised Book Collection

arXiv:2410.14969v1 | 1 citation | h-index: 4 | CHR
Originality: Synthesis-oriented
AI Analysis

This work addresses how accessible visual materials in digital libraries are to users and archivists. Its contribution is incremental, since it applies existing methods to a new dataset.

The researchers address the problem of exploring images in digitised library collections by building a proof-of-concept image search application for the National Library of Norway's pre-1900 books. They compare Vision Transformer (ViT), CLIP, and SigLIP embeddings and find that SigLIP slightly outperforms the other two in both retrieval and classification. SigLIP-based classification can also help clean image datasets produced by a digitisation pipeline.

Digital tools for text analysis have long been essential for the searchability and accessibility of digitised library collections. Recent computer vision advances have introduced similar capabilities for visual materials, with deep learning-based embeddings showing promise for analysing visual heritage. Given that many books feature visuals in addition to text, taking advantage of these breakthroughs is critical to making library collections open and accessible. In this work, we present a proof-of-concept image search application for exploring images in the National Library of Norway's pre-1900 books, comparing Vision Transformer (ViT), Contrastive Language-Image Pre-training (CLIP), and Sigmoid loss for Language-Image Pre-training (SigLIP) embeddings for image retrieval and classification. Our results show that the application performs well for exact image retrieval, with SigLIP embeddings slightly outperforming CLIP and ViT in both retrieval and classification tasks. Additionally, SigLIP-based image classification can aid in cleaning image datasets from a digitisation pipeline.
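The retrieval and classification tasks described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes image embeddings have already been computed by some model (ViT, CLIP, or SigLIP) and stored as rows of a NumPy array, and it uses cosine similarity with a k-nearest-neighbour majority vote, which is one common way to use such embeddings. The function names `normalize`, `retrieve`, and `classify` and the random placeholder embeddings are hypothetical.

```python
import numpy as np


def normalize(vectors):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


def retrieve(query, index, k=5):
    """Return the indices and cosine similarities of the k most similar rows."""
    q = normalize(np.asarray(query, dtype=float).reshape(1, -1))
    sims = (normalize(np.asarray(index, dtype=float)) @ q.T).ravel()
    top = np.argsort(-sims)[:k]  # highest similarity first
    return top, sims[top]


def classify(query, index, labels, k=5):
    """Assign the majority label among the k nearest neighbours of the query."""
    top, _ = retrieve(query, index, k)
    values, counts = np.unique(np.asarray(labels)[top], return_counts=True)
    return values[np.argmax(counts)]


# Placeholder data (assumption): random vectors stand in for real embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 64))

# Exact image retrieval: querying with an indexed image returns it first.
neighbours, scores = retrieve(embeddings[7], embeddings, k=3)
print(neighbours, scores)
```

In this setup, swapping the embedding model only changes how the rows of `embeddings` are produced; the search and voting logic stays the same, which is what makes a side-by-side comparison of embedding models straightforward.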

Code Implementations: 1 repo
