Murilo Gustineli

h-index: 2

Papers: 7

Citations: 9

Novelty: 11%

AI Score: 32

Ranked #128,170 of 194,257 authors (top 66%); #42,432 in CV (top 72%)

7 Papers

17.1 · CV · Jul 16 · Code

Multi-Scale ViT Inference with Habitat-Fit Priors and kNN Retrieval for Multi-Species Plant Identification

Alper Erten, Murilo Gustineli, Adrian Cheung

This paper describes DS@GT ARC's third-place solution to the PlantCLEF 2026 challenge on multi-species plant identification in vegetation quadrat images, where systems must predict every species present in high-resolution (~3000 x 3000 pixel) plot photographs while training only on single-label images of individual plants. The pipeline is built around a fine-tuned DINOv2 ViT-L/14 classifier applied over a multi-scale tile decomposition of each quadrat, with per-tile predictions blended with a FAISS kNN retriever and post-processed by source-aware temporal fusion across repeated plot visits, a habitat-fit demotion that injects geographic and altitude priors from the training data, and a South-Western Europe geographic mask. Habitat-fit demotion and multi-scale aggregation are the largest individual contributors in the ablations. Two complementary training-centric directions, a cross-region transformer with noisy-student distillation on the LUCAS dataset and a label-as-query transformer decoder over synthetic CLS-domain pseudo-quadrats, yielded null results. An inference-time augmentation with instance-aware segmentation crops also did not improve performance. The selected submission reaches a private-leaderboard macro-F1 of 0.43902 (third place; public 0.51096); an unselected configuration of the same pipeline scored above 0.45 on the private set. Code: https://github.com/dsgt-arc/plantclef-2026.
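The blending of per-tile classifier predictions with a kNN retriever can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps FAISS for a brute-force cosine search in pure Python, and the function names (`knn_scores`, `blend`), the neighbour-share scoring, and the mixing weight `alpha` are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_scores(
    query: Sequence[float],
    gallery: List[Tuple[Sequence[float], str]],
    k: int,
) -> Dict[str, float]:
    """Score each label by its share of the k nearest gallery embeddings.

    Brute-force stand-in for an approximate-nearest-neighbour index.
    """
    ranked = sorted(gallery, key=lambda item: cosine(query, item[0]), reverse=True)[:k]
    scores: Dict[str, float] = defaultdict(float)
    for _, label in ranked:
        scores[label] += 1.0 / len(ranked)
    return dict(scores)


def blend(
    classifier_probs: Dict[str, float],
    retrieval_scores: Dict[str, float],
    alpha: float,
) -> Dict[str, float]:
    """Linear mix: alpha * classifier + (1 - alpha) * retrieval (assumed scheme)."""
    labels = set(classifier_probs) | set(retrieval_scores)
    return {
        label: alpha * classifier_probs.get(label, 0.0)
        + (1.0 - alpha) * retrieval_scores.get(label, 0.0)
        for label in labels
    }
```

A tile whose classifier output is uncertain can be nudged toward the species of its nearest training embeddings by calling `blend(probs, knn_scores(embedding, gallery, k), alpha)`; how the actual pipeline weights the two sources is not stated in the abstract.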

13.1 · SD · Jul 16 · Code

Can Tokens Compete? Token Representations against Supervised CNN Backbones for BirdCLEF+ 2026

Anthony Miyaguchi, Murilo Gustineli, Adrian Cheung

This paper details the DS@GT ARC team's approach to BirdCLEF+ 2026, multi-label detection of animal vocalizations in soundscapes from the Pantanal wetlands. The 2026 edition adds about an hour of labeled soundscapes, shifting the task toward supervised pipelines fit to the labeled set. First, we build a competitive supervised baseline that ensembles a frozen Perch v2 backbone, a trained HGNetV2-B0 sound-event-detection network, and a non-bird prototypical head, reaching a private leaderboard score of 0.936 at rank 1894 within a 90-minute CPU budget. Second, we ask whether token-based representations can compete, contrasting codec representations from neural audio codecs against semantic representations from foundational embeddings. We compare two bioacoustic specialist models against four token-based encoders trained on AudioSet. The repository for this work can be found at https://github.com/dsgt-arc/birdclef-2026.

9.6 · CV · Jul 8, 2024 · Code

Multi-Label Plant Species Classification with Self-Supervised Vision Transformers

Murilo Gustineli, Anthony Miyaguchi, Ian Stalter

We present a transfer learning approach using a self-supervised Vision Transformer (DINOv2) for the PlantCLEF 2024 competition, focusing on multi-label plant species classification. Our method leverages both base and fine-tuned DINOv2 models to extract generalized feature embeddings. We train classifiers on these embeddings to predict multiple plant species within a single image. To address the computational challenges of the large-scale dataset, we employ Spark for distributed data processing, ensuring efficient memory management and processing across a cluster of workers. Our data processing pipeline transforms images into grids of tiles, classifies each tile, and aggregates these predictions into a consolidated set of probabilities. Our results demonstrate the efficacy of combining transfer learning with advanced data processing techniques for multi-label image classification tasks. Our code is available at https://github.com/dsgt-kaggle-clef/plantclef-2024.
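The tile-then-aggregate step described in this abstract can be sketched in a few lines. This is an illustrative sketch only: the helper names `tile_boxes` and `aggregate_max` are hypothetical, and taking the per-species maximum over tiles is an assumed aggregation rule, since the abstract does not say how the tile predictions are consolidated.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower) in pixels


def tile_boxes(width: int, height: int, grid: int) -> List[Box]:
    """Split a width x height image into a grid x grid set of tile boxes."""
    boxes: List[Box] = []
    for row in range(grid):
        for col in range(grid):
            left = col * width // grid
            upper = row * height // grid
            right = (col + 1) * width // grid
            lower = (row + 1) * height // grid
            boxes.append((left, upper, right, lower))
    return boxes


def aggregate_max(tile_probs: List[Dict[str, float]]) -> Dict[str, float]:
    """Keep, for each species, the highest probability seen in any tile.

    Assumed rule: a species present in one tile is present in the image.
    """
    merged: Dict[str, float] = {}
    for probs in tile_probs:
        for species, p in probs.items():
            merged[species] = max(p, merged.get(species, 0.0))
    return merged
```

Each box has the `(left, upper, right, lower)` layout that image libraries commonly accept for cropping, so every tile can be cropped, classified, and its output passed to `aggregate_max`.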

3.7 · CV · Jul 8, 2024 · Code

Transfer Learning with Self-Supervised Vision Transformers for Snake Identification

Anthony Miyaguchi, Murilo Gustineli, Austin Fischer et al.

We present our approach for the SnakeCLEF 2024 competition to predict snake species from images. We explore and use Meta's DINOv2 vision transformer model for feature extraction to tackle species' high variability and visual similarity in a dataset of 182,261 images. We perform exploratory analysis on embeddings to understand their structure, and train a linear classifier on the embeddings to predict species. Despite achieving a score of 39.69, our results show promise for DINOv2 embeddings in snake identification. All code for this project is available at https://github.com/dsgt-kaggle-clef/snakeclef-2024.

6.9 · LG · Apr 6, 2022

A survey on recently proposed activation functions for Deep Learning

Murilo Gustineli

Artificial neural networks (ANNs), typically referred to as neural networks, are a class of machine learning algorithms inspired by the biological structure of the human brain, and they have achieved widespread success. Neural networks are inherently powerful due to their ability to learn complex function approximations from data. This ability has had an impact across disciplines, including image recognition, speech recognition, natural language processing, and others. Activation functions are a crucial sub-component of neural networks. They define the output of a node in the network given a set of inputs. This survey discusses the main concepts of activation functions in neural networks: a brief introduction to deep neural networks, a summary of what activation functions are and how they are used, their most common properties, the different types of activation functions, and some of their challenges, limitations, and alternative solutions, followed by final remarks.
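As a small illustration of how an activation function defines the output of a node given its inputs, the sketch below computes a weighted sum plus bias and passes it through an activation. Sigmoid and ReLU are chosen here only as familiar examples; the abstract does not list which functions the survey covers, and `node_output` is a hypothetical helper name.

```python
import math
from typing import Callable, Sequence


def sigmoid(z: float) -> float:
    """Logistic sigmoid: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z: float) -> float:
    """Rectified linear unit: passes positive inputs, zeroes out the rest."""
    return max(0.0, z)


def node_output(
    inputs: Sequence[float],
    weights: Sequence[float],
    bias: float,
    activation: Callable[[float], float],
) -> float:
    """Output of one node: activation(sum_i w_i * x_i + b)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)
```

With the same inputs and weights, swapping `relu` for `sigmoid` changes only the final nonlinearity, which is the design choice the survey is concerned with.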

2.3 · SD · Jun 29, 2023 · Code

Transfer Learning with Semi-Supervised Dataset Annotation for Birdcall Classification

Anthony Miyaguchi, Nathan Zhong, Murilo Gustineli et al.

We present working notes on transfer learning with semi-supervised dataset annotation for the BirdCLEF 2023 competition, focused on identifying African bird species in recorded soundscapes. Our approach utilizes existing off-the-shelf models, BirdNET and MixIT, to address representation and labeling challenges in the competition. We explore the embedding space learned by BirdNET and propose a process to derive an annotated dataset for supervised learning. Our experiments involve various models and feature engineering approaches to maximize performance on the competition leaderboard. The results demonstrate the effectiveness of our approach in classifying bird species and highlight the potential of transfer learning and semi-supervised dataset annotation in similar tasks.

3.6 · CV · Jul 8, 2025 · Code

Tile-Based ViT Inference with Visual-Cluster Priors for Zero-Shot Multi-Species Plant Identification

Murilo Gustineli, Anthony Miyaguchi, Adrian Cheung et al.

We describe DS@GT's second-place solution to the PlantCLEF 2025 challenge on multi-species plant identification in vegetation quadrat images. Our pipeline combines (i) a fine-tuned Vision Transformer ViTD2PC24All for patch-level inference, (ii) a 4x4 tiling strategy that aligns patch size with the network's 518x518 receptive field, and (iii) domain-prior adaptation through PaCMAP + K-Means visual clustering and geolocation filtering. Tile predictions are aggregated by majority vote and re-weighted with cluster-specific Bayesian priors, yielding a macro-averaged F1 of 0.348 (private leaderboard) while requiring no additional training. All code, configuration files, and reproducibility scripts are publicly available at https://github.com/dsgt-arc/plantclef-2025.
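The aggregation described here, a majority vote over tiles followed by re-weighting with a prior, might look roughly like the sketch below. It is a simplified reading under stated assumptions, not the authors' code: `majority_vote`, `reweight_with_prior`, the vote-share scoring, and the small `floor` value for species missing from the prior are all hypothetical choices.

```python
from collections import Counter
from typing import Dict, List


def majority_vote(tile_top_species: List[List[str]]) -> Dict[str, float]:
    """Fraction of tiles in which each species appears among the top predictions."""
    votes = Counter(s for tile in tile_top_species for s in set(tile))
    n_tiles = len(tile_top_species)
    return {species: count / n_tiles for species, count in votes.items()}


def reweight_with_prior(
    scores: Dict[str, float],
    prior: Dict[str, float],
    floor: float = 1e-6,
) -> Dict[str, float]:
    """Multiply each score by a prior probability and renormalise to sum to 1.

    `prior` stands in for a cluster-specific species prior; species absent
    from it receive the small `floor` value (an assumption).
    """
    weighted = {s: v * prior.get(s, floor) for s, v in scores.items()}
    total = sum(weighted.values())
    return {s: w / total for s, w in weighted.items()}
```

Multiplying by a prior and renormalising is the standard Bayes-rule form of posterior proportional to likelihood times prior; the exact form used in the paper is not given in the abstract.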