CV, AI | Sep 19, 2023

A multimodal deep learning architecture for smoking detection with a small data approach

Robert Lakatos, Peter Pollner, Andras Hajdu, Tamas Joo

arXiv:2309.10561v11.513 citations | h-index: 7

Originality: Incremental advance

AI Analysis

The paper addresses the need for unbiased, reproducible detection of hidden smoking-related content in media. The advance is incremental: it builds on existing deep learning methods and applies them with a small data approach.

The paper tackles the problem of detecting covert tobacco advertisements in media by developing a multimodal deep learning model that integrates text and image processing with human reinforcement. The model achieves 74% accuracy for images and 98% for text.

Introduction: Covert tobacco advertisements often prompt regulatory measures. This paper shows that artificial intelligence, particularly deep learning, has great potential for detecting hidden advertising and allows unbiased, reproducible, and fair quantification of tobacco-related media content. Methods: We propose an integrated text and image processing model based on deep learning, generative methods, and human reinforcement, which can detect smoking cases in both textual and visual formats, even with little available training data. Results: Our model can achieve 74% accuracy for images and 98% for text. Furthermore, our system allows expert intervention in the form of human reinforcement. Conclusions: Pre-trained multimodal, image, and text processing models available through deep learning make it possible to detect smoking in different media even with little training data.
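
The Methods sentence describes three cooperating parts: a text model, an image model, and an expert who can intervene. The abstract does not say how these are wired together, so the following is only a minimal sketch under stated assumptions. Every name in it (`Item`, `Decision`, `detect_smoking`, the `margin` parameter) is hypothetical; taking the maximum of the two scores and routing borderline cases to an expert are illustrative choices, not the authors' architecture; the two scorers are toy stand-ins for pre-trained models.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Item:
    """A media item with optional text and optional raw image bytes."""
    text: Optional[str] = None
    image: Optional[bytes] = None


@dataclass
class Decision:
    smoking: bool   # final label
    score: float    # fused model score in [0, 1]
    source: str     # "model" or "expert"


def detect_smoking(
    item: Item,
    text_model: Callable[[str], float],
    image_model: Callable[[bytes], float],
    expert: Optional[Callable[[Item], bool]] = None,
    threshold: float = 0.5,
    margin: float = 0.2,
) -> Decision:
    """Score each available modality, fuse by max (an assumption),
    and defer to the expert when the fused score is near the threshold."""
    scores = []
    if item.text is not None:
        scores.append(text_model(item.text))
    if item.image is not None:
        scores.append(image_model(item.image))
    if not scores:
        raise ValueError("item has neither text nor image")

    score = max(scores)
    uncertain = abs(score - threshold) < margin
    if uncertain and expert is not None:
        # Human reinforcement: the expert's label overrides the model.
        return Decision(expert(item), score, "expert")
    return Decision(score >= threshold, score, "model")


# Toy stand-ins for pre-trained models (hypothetical, for illustration only).
def toy_text_model(text: str) -> float:
    return 0.95 if "cigarette" in text.lower() else 0.05


def toy_image_model(image: bytes) -> float:
    return 0.55  # deliberately borderline


confident = detect_smoking(
    Item(text="A man lights a cigarette"), toy_text_model, toy_image_model
)
reviewed = detect_smoking(
    Item(image=b"\x00"), toy_text_model, toy_image_model,
    expert=lambda item: False,
)
```

In this sketch, `confident` is decided by the model alone because its score is far from the threshold, while `reviewed` falls inside the margin and takes the expert's label. A real system would also feed expert labels back into training, which this sketch omits.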
