Bogdan Tomoyuki Nassu

h-index: 11

3 papers

156 citations

Novelty: 48%

AI Score: 42

Ranked #59,614 of 194,257 authors (top 31%); #20,562 in CV (top 35%)

3 Papers

1.4 | CV | Sep 30, 2022 | Code

Image-Based Detection of Modifications in Gas Pump PCBs with Deep Convolutional Autoencoders

Diulhio Candido de Oliveira, Bogdan Tomoyuki Nassu, Marco Aurelio Wehrmeister

In this paper, we introduce an approach for detecting modifications in assembled printed circuit boards based on photographs taken without tight control over perspective and illumination conditions. One instance of this problem is the visual inspection of gas pump PCBs, which can be modified by fraudsters wishing to deceive customers or evade taxes. Given the uncontrolled environment and the huge number of possible modifications, we address the problem as a case of anomaly detection, proposing an approach that is directed towards the characteristics of that scenario, while being well-suited for other similar applications. The proposed approach employs a deep convolutional autoencoder trained to reconstruct images of an unmodified board, but which remains unable to do the same for images showing modifications. By comparing the input image with its reconstruction, it is possible to segment anomalies and modifications in a pixel-wise manner. Experiments performed on a dataset built to represent real-world situations (and which we will make publicly available) show that our approach outperforms other state-of-the-art approaches for anomaly segmentation in the considered scenario, while producing comparable results on the popular MVTec-AD dataset for a more general object anomaly detection task.
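The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `anomaly_mask`, the mean absolute per-pixel error, and the fixed `threshold` are all assumptions made for illustration, and the toy `reference` array stands in for the output of a trained autoencoder.

```python
import numpy as np


def anomaly_mask(image, reconstruction, threshold=0.2):
    """Flag pixels whose reconstruction error exceeds `threshold`.

    Hypothetical helper: the error measure (mean absolute difference
    over channels) and the fixed threshold are illustrative assumptions.
    """
    image = np.asarray(image, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    if image.shape != reconstruction.shape:
        raise ValueError("image and reconstruction must have the same shape")
    error = np.abs(image - reconstruction)
    if error.ndim == 3:
        # Collapse the channel axis so the mask is one value per pixel.
        error = error.mean(axis=-1)
    return error > threshold


# Toy data: a flat gray "board" and a photo with one altered 2x2 patch.
# `reference` plays the role of the autoencoder's reconstruction.
reference = np.full((8, 8, 3), 0.5)
photo = reference.copy()
photo[2:4, 5:7, :] = 1.0

mask = anomaly_mask(photo, reference)
```

In this toy case only the altered patch is flagged, because the stand-in reconstruction matches the unmodified board everywhere else.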

13.5 | CV | Jan 9, 2021 | Code

Active Fire Detection in Landsat-8 Imagery: a Large-Scale Dataset and a Deep-Learning Study

Gabriel Henrique de Almeida Pereira, André Minoro Fusioka, Bogdan Tomoyuki Nassu et al.

Active fire detection in satellite imagery is of critical importance to the management of environmental conservation policies, supporting decision-making and law enforcement. This is a well-established field, with many techniques being proposed over the years, usually based on pixel or region-level comparisons involving sensor-specific thresholds and neighborhood statistics. In this paper, we address the problem of active fire detection using deep learning techniques. In recent years, deep learning techniques have enjoyed enormous success in many fields, but their use for active fire detection is relatively new, with open questions and demand for datasets and architectures for evaluation. This paper addresses these issues by introducing a new large-scale dataset for active fire detection, with over 150,000 image patches (more than 200 GB of data) extracted from Landsat-8 images captured around the world in August and September 2020, containing wildfires in several locations. The dataset was split into two parts, and contains 10-band spectral images with associated outputs, produced by three well-known handcrafted algorithms for active fire detection in the first part, and manually annotated masks in the second part. We also present a study on how different convolutional neural network architectures can be used to approximate these handcrafted algorithms, and how models trained on automatically segmented patches can be combined to achieve better performance than the original algorithms, with the best combination achieving 87.2% precision and 92.4% recall on our manually annotated dataset. The proposed dataset, source code and trained models are available on GitHub (https://github.com/pereira-gha/activefire), creating opportunities for further advances in the field.
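To make the reported metrics concrete, the sketch below computes pixel-wise precision and recall for a combined fire mask. The abstract does not say how the models are combined, so the majority vote used here is an assumption, and `majority_vote`, `precision_recall`, and the toy masks are hypothetical names and data, not part of the released code.

```python
import numpy as np


def majority_vote(masks):
    """Combine binary masks: a pixel is positive if most masks agree.

    Majority voting is an illustrative assumption, not necessarily
    the combination rule used in the paper.
    """
    stacked = np.stack([np.asarray(m, dtype=bool) for m in masks])
    return stacked.sum(axis=0) * 2 > len(masks)


def precision_recall(predicted, truth):
    """Pixel-wise precision and recall of a predicted binary mask."""
    predicted = np.asarray(predicted, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.count_nonzero(predicted & truth)   # fire pixels found
    fp = np.count_nonzero(predicted & ~truth)  # false alarms
    fn = np.count_nonzero(~predicted & truth)  # fire pixels missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy masks standing in for three model outputs and a manual annotation.
truth = np.array([[1, 1, 0], [0, 0, 0]])
model_a = np.array([[1, 1, 0], [0, 1, 0]])
model_b = np.array([[1, 0, 0], [0, 0, 0]])
model_c = np.array([[1, 1, 1], [0, 0, 0]])

combined = majority_vote([model_a, model_b, model_c])
precision, recall = precision_recall(combined, truth)
```

In this toy example each individual mask has either a false alarm or a missed pixel, while the vote removes both, which is the kind of gain a combination of models can offer.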

6.2 | CV | Aug 23, 2025

Do Multimodal LLMs See Sentiment?

Neemias B. da Silva, John Harrison, Rodrigo Minetto et al.

Understanding how visual content communicates sentiment is critical in an era where online interaction is increasingly dominated by this kind of media on social platforms. However, this remains a challenging problem, as sentiment perception is closely tied to complex, scene-level semantics. In this paper, we propose an original framework, MLLMsent, to investigate the sentiment reasoning capabilities of Multimodal Large Language Models (MLLMs) through three perspectives: (1) using those MLLMs for direct sentiment classification from images; (2) associating them with pre-trained LLMs for sentiment analysis on automatically generated image descriptions; and (3) fine-tuning the LLMs on sentiment-labeled image descriptions. Experiments on a recent and established benchmark demonstrate that our proposal, particularly the fine-tuned approach, achieves state-of-the-art results, outperforming Lexicon-, CNN-, and Transformer-based baselines by up to 30.9%, 64.8%, and 42.4%, respectively, across different levels of evaluators' agreement and sentiment polarity categories. Remarkably, in a cross-dataset test, without any training on these new data, our model still outperforms the best runner-up, which was trained directly on them, by up to 8.26%. These results highlight the potential of the proposed visual reasoning scheme for advancing affective computing, while also establishing new benchmarks for future research.