CV | CL | Oct 22, 2024

Benchmarking Large Language Models for Image Classification of Marine Mammals

arXiv:2410.19848v1 | 4 citations | h-index: 6 | Has Code | 2024 IEEE International Conference on Knowledge Graph (ICKG)
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific gap in ecological monitoring by providing a new dataset and incremental improvements in classification methods for marine mammals.

The authors build a benchmark dataset of 1,423 marine mammal images across 65 categories and evaluate several classification approaches, finding that a novel LLM-based multi-agent system improves on the performance of traditional models and LLMs.

As Artificial Intelligence (AI) has developed rapidly over the past few decades, the new generation of AI, Large Language Models (LLMs) trained on massive datasets, has achieved ground-breaking performance in many applications. Further progress has been made in multimodal LLMs, with many datasets created to evaluate LLMs with vision abilities. However, none of those datasets focuses solely on marine mammals, which are indispensable for ecological equilibrium. In this work, we build a benchmark dataset with 1,423 images of 65 kinds of marine mammals, where each animal is assigned a single class at each of three levels of granularity: species-level, medium-level, and group-level. Moreover, we evaluate several approaches for classifying these marine mammals: (1) machine learning (ML) algorithms using embeddings provided by neural networks, (2) influential pre-trained neural networks, (3) zero-shot models: CLIP and LLMs, and (4) a novel LLM-based multi-agent system (MAS). The results demonstrate the strengths of traditional models and LLMs in different aspects, and the MAS can further improve the classification performance. The dataset is available on GitHub: https://github.com/yeyimilk/LLM-Vision-Marine-Animals.git.
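The multi-level labelling described in the abstract can be illustrated with a minimal Python sketch. It assumes every species maps to exactly one medium-level and one group-level class, and it scores species predictions at all three levels. The taxonomy entries, the `Label` class, and the `accuracy_by_level` function are hypothetical and chosen for illustration; they are not taken from the dataset or the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """One class per level for a single animal (illustrative structure)."""
    species: str
    medium: str
    group: str


# Hypothetical taxonomy for illustration only; the dataset's real label
# names and groupings may differ.
TAXONOMY = {
    "bottlenose dolphin": Label("bottlenose dolphin", "dolphin", "cetacean"),
    "orca": Label("orca", "dolphin", "cetacean"),
    "humpback whale": Label("humpback whale", "baleen whale", "cetacean"),
    "harbor seal": Label("harbor seal", "true seal", "pinniped"),
}

LEVELS = ("species", "medium", "group")


def accuracy_by_level(true_species, predicted_species):
    """Return accuracy at each level, given species-level truth and predictions."""
    if len(true_species) != len(predicted_species):
        raise ValueError("truth and predictions must have the same length")
    if not true_species:
        raise ValueError("at least one example is required")
    scores = {}
    for level in LEVELS:
        # A prediction counts as correct at a level when its class at that
        # level matches the class of the true species at the same level.
        correct = sum(
            getattr(TAXONOMY[t], level) == getattr(TAXONOMY[p], level)
            for t, p in zip(true_species, predicted_species)
        )
        scores[level] = correct / len(true_species)
    return scores


truth = ["bottlenose dolphin", "humpback whale", "harbor seal", "orca"]
preds = ["orca", "humpback whale", "harbor seal", "humpback whale"]
scores = accuracy_by_level(truth, preds)
print(scores)
```

In this toy example a species-level mistake can still be correct at a coarser level, which is one reason to report results per level rather than as a single number.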

Code Implementations: 1 repo