Tuomas Eerola

CV
h-index50
18papers
204citations
Novelty35%
AI Score39

18 Papers

16.4CVMar 17Code
Cross-modal learning for plankton recognition

Joona Kareinen, Veikka Immonen, Tuomas Eerola et al.

This paper considers self-supervised cross-modal coordination as a strategy enabling utilization of multiple modalities and large volumes of unlabeled plankton data to build models for plankton recognition. Automated imaging instruments facilitate the continuous collection of plankton image data on a large scale. Current methods for automatic plankton image recognition rely primarily on supervised approaches, which require labeled training sets that are labor-intensive to collect. On the other hand, some modern plankton imaging instruments complement image information with optical measurement data, such as scatter and fluorescence profiles, which currently are not widely utilized in plankton recognition. In this work, we explore the possibility of using such measurement data to guide the learning process without requiring manual labeling. Inspired by the concepts behind Contrastive Language-Image Pre-training, we train encoders for both modalities using only binary supervisory information indicating whether a given image and profile originate from the same particle or from different particles. For plankton recognition, we employ a small labeled gallery of known plankton species combined with a $k$-NN classifier. This approach yields a recognition model that is inherently multimodal, i.e., capable of utilizing information extracted from both image and profile data. We demonstrate that the proposed method achieves high recognition accuracy while requiring only a minimal number of labeled images. Furthermore, we show that the approach outperforms an image-only self-supervised baseline. Code available at https://github.com/Jookare/cross-modal-plankton.

CVJun 5, 2022
SealID: Saimaa ringed seal re-identification dataset

Ekaterina Nepovinnykh, Tuomas Eerola, Vincent Biard et al.

Wildlife camera traps and crowd-sourced image material provide novel possibilities to monitor endangered animal species. However, massive image volumes that these methods produce are overwhelming for researchers to go through manually which calls for automatic systems to perform the analysis. The analysis task that has gained the most attention is the re-identification of individuals, as it allows, for example, to study animal migration or to estimate the population size. The Saimaa ringed seal (Pusa hispida saimensis) is an endangered subspecies only found in the Lake Saimaa, Finland, and is one of the few existing freshwater seal species. Ringed seals have permanent pelage patterns that are unique to each individual which can be used for the identification of individuals. Large variation in poses further exacerbated by the deformable nature of seals together with varying appearance and low contrast between the ring pattern and the rest of the pelage makes the Saimaa ringed seal re-identification task very challenging, providing a good benchmark to evaluate state-of-the-art re-identification methods. Therefore, we make our Saimaa ringed seal image (SealID) dataset (N=57) publicly available for research purposes. In this paper, the dataset is described, the evaluation protocol for re-identification methods is proposed, and the results for two baseline methods HotSpotter and NORPPA are provided. The SealID dataset has been made publicly available.

CVJun 6, 2022
NORPPA: NOvel Ringed seal re-identification by Pelage Pattern Aggregation

Ekaterina Nepovinnykh, Ilia Chelak, Tuomas Eerola et al.

We propose a method for Saimaa ringed seal (Pusa hispida saimensis) re-identification. Access to large image volumes through camera trapping and crowdsourcing provides novel possibilities for animal monitoring and conservation and calls for automatic methods for analysis, in particular, when re-identifying individual animals from the images. The proposed method NOvel Ringed seal re-identification by Pelage Pattern Aggregation (NORPPA) utilizes the permanent and unique pelage pattern of Saimaa ringed seals and content-based image retrieval techniques. First, the query image is preprocessed, and each seal instance is segmented. Next, the seal's pelage pattern is extracted using a U-net encoder-decoder based method. Then, CNN-based affine invariant features are embedded and aggregated into Fisher Vectors. Finally, the cosine distance between the Fisher Vectors is used to find the best match from a database of known individuals. We perform extensive experiments of various modifications of the method on a new challenging Saimaa ringed seals re-identification dataset. The proposed method is shown to produce the best re-identification accuracy on our dataset in comparisons with alternative approaches.

CVMar 15, 2023
Towards Phytoplankton Parasite Detection Using Autoencoders

Simon Bilik, Daniel Batrakhanov, Tuomas Eerola et al.

Phytoplankton parasites are largely understudied microbial components with a potentially significant ecological impact on phytoplankton bloom dynamics. To better understand their impact, we need improved detection methods to integrate phytoplankton parasite interactions in monitoring aquatic ecosystems. Automated imaging devices usually produce high amount of phytoplankton image data, while the occurrence of anomalous phytoplankton data is rare. Thus, we propose an unsupervised anomaly detection system based on the similarity of the original and autoencoder-reconstructed samples. With this approach, we were able to reach an overall F1 score of 0.75 in nine phytoplankton species, which could be further improved by species-specific fine-tuning. The proposed unsupervised approach was further compared with the supervised Faster R-CNN based object detector. With this supervised approach and the model trained on plankton species and anomalies, we were able to reach the highest F1 score of 0.86. However, the unsupervised approach is expected to be more universal as it can detect also unknown anomalies and it does not require any annotated anomalous data that may not be always available in sufficient quantities. Although other studies have dealt with plankton anomaly detection in terms of non-plankton particles, or air bubble detection, our paper is according to our best knowledge the first one which focuses on automated anomaly detection considering putative phytoplankton parasites or infections.

CVAug 11, 2023
Combining feature aggregation and geometric similarity for re-identification of patterned animals

Veikka Immonen, Ekaterina Nepovinnykh, Tuomas Eerola et al.

Image-based re-identification of animal individuals allows gathering of information such as migration patterns of the animals over time. This, together with large image volumes collected using camera traps and crowdsourcing, opens novel possibilities to study animal populations. For many species, the re-identification can be done by analyzing the permanent fur, feather, or skin patterns that are unique to each individual. In this paper, we address the re-identification by combining two types of pattern similarity metrics: 1) pattern appearance similarity obtained by pattern feature aggregation and 2) geometric pattern similarity obtained by analyzing the geometric consistency of pattern similarities. The proposed combination allows to efficiently utilize both the local and global pattern features, providing a general re-identification approach that can be applied to a wide variety of different pattern types. In the experimental part of the work, we demonstrate that the method achieves promising re-identification accuracies for Saimaa ringed seals and whale sharks.

CVFeb 8, 2024
DAPlankton: Benchmark Dataset for Multi-instrument Plankton Recognition via Fine-grained Domain Adaptation

Daniel Batrakhanov, Tuomas Eerola, Kaisa Kraft et al.

Plankton recognition provides novel possibilities to study various environmental aspects and an interesting real-world context to develop domain adaptation (DA) methods. Different imaging instruments cause domain shift between datasets hampering the development of general plankton recognition methods. A promising remedy for this is DA allowing to adapt a model trained on one instrument to other instruments. In this paper, we present a new DA dataset called DAPlankton which consists of phytoplankton images obtained with different instruments. Phytoplankton provides a challenging DA problem due to the fine-grained nature of the task and high class imbalance in real-world datasets. DAPlankton consists of two subsets. DAPlankton_LAB contains images of cultured phytoplankton providing a balanced dataset with minimal label uncertainty. DAPlankton_SEA consists of images collected from the Baltic Sea providing challenging real-world data with large intra-class variance and class imbalance. We further present a benchmark comparison of three widely used DA methods.

CVMar 14, 2025
Open-Set Plankton Recognition

Joona Kareinen, Annaliina Skyttä, Tuomas Eerola et al.

This paper considers open-set recognition (OSR) of plankton images. Plankton include a diverse range of microscopic aquatic organisms that have an important role in marine ecosystems as primary producers and as a base of food webs. Given their sensitivity to environmental changes, fluctuations in plankton populations offer valuable information about oceans' health and climate change motivating their monitoring. Modern automatic plankton imaging devices enable the collection of large-scale plankton image datasets, facilitating species-level analysis. Plankton species recognition can be seen as an image classification task and is typically solved using deep learning-based image recognition models. However, data collection in real aquatic environments results in imaging devices capturing a variety of non-plankton particles and plankton species not present in the training set. This creates a challenging fine-grained OSR problem, characterized by subtle differences between taxonomically close plankton species. We address this challenge by conducting extensive experiments on three OSR approaches using both phyto- and zooplankton images analyzing also on the effect of the rejection thresholds for OSR. The results demonstrate that high OSR accuracy can be obtained promoting the use of these methods in operational plankton research. We have made the data publicly available to the research community.

CVMay 24, 2024
Understanding the Impact of Training Set Size on Animal Re-identification

Aleksandr Algasov, Ekaterina Nepovinnykh, Tuomas Eerola et al.

Recent advancements in the automatic re-identification of animal individuals from images have opened up new possibilities for studying wildlife through camera traps and citizen science projects. Existing methods leverage distinct and permanent visual body markings, such as fur patterns or scars, and typically employ one of two strategies: local features or end-to-end learning. In this study, we delve into the impact of training set size by conducting comprehensive experiments across six different methods and five animal species. While it is well known that end-to-end learning-based methods surpass local feature-based methods given a sufficient amount of good-quality training data, the challenge of gathering such datasets for wildlife animals means that local feature-based methods remain a more practical approach for many species. We demonstrate the benefits of both local feature and end-to-end learning-based approaches and show that species-specific characteristics, particularly intra-individual variance, have a notable effect on training data requirements.

CVMar 14, 2025
Self-Supervised Pretraining for Fine-Grained Plankton Recognition

Joona Kareinen, Tuomas Eerola, Kaisa Kraft et al.

Plankton recognition is an important computer vision problem due to plankton's essential role in ocean food webs and carbon capture, highlighting the need for species-level monitoring. However, this task is challenging due to its fine-grained nature and dataset shifts caused by different imaging instruments and varying species distributions. As new plankton image datasets are collected at an increasing pace, there is a need for general plankton recognition models that require minimal expert effort for data labeling. In this work, we study large-scale self-supervised pretraining for fine-grained plankton recognition. We first employ masked autoencoding and a large volume of diverse plankton image data to pretrain a general-purpose plankton image encoder. Then we utilize fine-tuning to obtain accurate plankton recognition models for new datasets with a very limited number of labeled training images. Our experiments show that self-supervised pretraining with diverse plankton data clearly increases plankton recognition accuracy compared to standard ImageNet pretraining when the amount of training data is limited. Moreover, the accuracy can be further improved when unlabeled target data is available and utilized during the pretraining.

CVJun 18, 2025
Unsupervised Pelage Pattern Unwrapping for Animal Re-identification

Aleksandr Algasov, Ekaterina Nepovinnykh, Fedor Zolotarev et al.

Existing individual re-identification methods often struggle with the deformable nature of animal fur or skin patterns which undergo geometric distortions due to body movement and posture changes. In this paper, we propose a geometry-aware texture mapping approach that unwarps pelage patterns, the unique markings found on an animal's skin or fur, into a canonical UV space, enabling more robust feature matching. Our method uses surface normal estimation to guide the unwrapping process while preserving the geometric consistency between the 3D surface and the 2D texture space. We focus on two challenging species: Saimaa ringed seals (Pusa hispida saimensis) and leopards (Panthera pardus). Both species have distinctive yet highly deformable fur patterns. By integrating our pattern-preserving UV mapping with existing re-identification techniques, we demonstrate improved accuracy across diverse poses and viewing angles. Our framework does not require ground truth UV annotations and can be trained in a self-supervised manner. Experiments on seal and leopard datasets show up to a 5.4% improvement in re-identification accuracy.

CVMar 27, 2025
Multimodal surface defect detection from wooden logs for sawing optimization

Bořek Reich, Matej Kunda, Fedor Zolotarev et al.

We propose a novel, good-quality, and less demanding method for detecting knots on the surface of wooden logs using multimodal data fusion. Knots are a primary factor affecting the quality of sawn timber, making their detection fundamental to any timber grading or cutting optimization system. While X-ray computed tomography provides accurate knot locations and internal structures, it is often too slow or expensive for practical use. An attractive alternative is to use fast and cost-effective log surface measurements, such as laser scanners or RGB cameras, to detect surface knots and estimate the internal structure of wood. However, due to the small size of knots and noise caused by factors, such as bark and other natural variations, detection accuracy often remains low when only one measurement modality is used. In this paper, we demonstrate that by using a data fusion pipeline consisting of separate streams for RGB and point cloud data, combined by a late fusion module, higher knot detection accuracy can be achieved compared to using either modality alone. We further propose a simple yet efficient sawing angle optimization method that utilizes surface knot detections and cross-correlation to minimize the amount of unwanted arris knots, demonstrating its benefits over randomized sawing angles.

CVMar 18, 2025
Towards synthetic generation of realistic wooden logs

Fedor Zolotarev, Borek Reich, Tuomas Eerola et al.

In this work, we propose a novel method to synthetically generate realistic 3D representations of wooden logs. Efficient sawmilling heavily relies on accurate measurement of logs and the distribution of knots inside them. Computed Tomography (CT) can be used to obtain accurate information about the knots but is often not feasible in a sawmill environment. A promising alternative is to utilize surface measurements and machine learning techniques to predict the inner structure of the logs. However, obtaining enough training data remains a challenge. We focus mainly on two aspects of log generation: the modeling of knot growth inside the tree, and the realistic synthesis of the surface including the regions, where the knots reach the surface. This results in the first log synthesis approach capable of generating both the internal knot and external surface structures of wood. We demonstrate that the proposed mathematical log model accurately fits to real data obtained from CT scans and enables the generation of realistic logs.

CVMar 18, 2025
Deep Unsupervised Segmentation of Log Point Clouds

Fedor Zolotarev, Tuomas Eerola, Tomi Kauppi

In sawmills, it is essential to accurately measure the raw material, i.e. wooden logs, to optimise the sawing process. Earlier studies have shown that accurate predictions of the inner structure of the logs can be obtained using just surface point clouds produced by a laser scanner. This provides a cost-efficient and fast alternative to the X-ray CT-based measurement devices. The essential steps in analysing log point clouds is segmentation, as it forms the basis for finding the fine surface details that provide the cues about the inner structure of the log. We propose a novel Point Transformer-based point cloud segmentation technique that learns to find the points belonging to the log surface in unsupervised manner. This is obtained using a loss function that utilises the geometrical properties of a cylinder while taking into account the shape variation common in timber logs. We demonstrate the accuracy of the method on wooden logs, but the approach could be utilised also on other cylindrical objects.

CVMar 5, 2025
Mineral segmentation using electron microscope images and spectral sampling through multimodal graph neural networks

Samuel Repka, Bořek Reich, Fedor Zolotarev et al.

We propose a novel Graph Neural Network-based method for segmentation based on data fusion of multimodal Scanning Electron Microscope (SEM) images. In most cases, Backscattered Electron (BSE) images obtained using SEM do not contain sufficient information for mineral segmentation. Therefore, imaging is often complemented with point-wise Energy-Dispersive X-ray Spectroscopy (EDS) spectral measurements that provide highly accurate information about the chemical composition but that are time-consuming to acquire. This motivates the use of sparse spectral data in conjunction with BSE images for mineral segmentation. The unstructured nature of the spectral data makes most traditional image fusion techniques unsuitable for BSE-EDS fusion. We propose using graph neural networks to fuse the two modalities and segment the mineral phases simultaneously. Our results demonstrate that providing EDS data for as few as 1% of BSE pixels produces accurate segmentation, enabling rapid analysis of mineral samples. The proposed data fusion pipeline is versatile and can be adapted to other domains that involve image data and point-wise measurements.

CVMay 19, 2023
Survey of Automatic Plankton Image Recognition: Challenges, Existing Solutions and Future Perspectives

Tuomas Eerola, Daniel Batrakhanov, Nastaran Vatankhah Barazandeh et al.

Planktonic organisms are key components of aquatic ecosystems and respond quickly to changes in the environment, therefore their monitoring is vital to understand the changes in the environment. Yet, monitoring plankton at appropriate scales still remains a challenge, limiting our understanding of functioning of aquatic systems and their response to changes. Modern plankton imaging instruments can be utilized to sample at high frequencies, enabling novel possibilities to study plankton populations. However, manual analysis of the data is costly, time consuming and expert based, making such approach unsuitable for large-scale application and urging for automatic solutions. The key problem related to the utilization of plankton datasets through image analysis is plankton recognition. Despite the large amount of research done, automatic methods have not been widely adopted for operational use. In this paper, a comprehensive survey on existing solutions for automatic plankton recognition is presented. First, we identify the most notable challenges that that make the development of plankton recognition systems difficult. Then, we provide a detailed description of solutions for these challenges proposed in plankton recognition literature. Finally, we propose a workflow to identify the specific challenges in new datasets and the recommended approaches to address them. For many of the challenges, applicable solutions exist. However, important challenges remain unsolved: 1) the domain shift between the datasets hindering the development of a general plankton recognition system that would work across different imaging instruments, 2) the difficulty to identify and process the images of previously unseen classes, and 3) the uncertainty in expert annotations that affects the training of the machine learning models for recognition. These challenges should be addressed in the future research.

CVMay 28, 2021
EDEN: Deep Feature Distribution Pooling for Saimaa Ringed Seals Pattern Matching

Ilia Chelak, Ekaterina Nepovinnykh, Tuomas Eerola et al.

In this paper, pelage pattern matching is considered to solve the individual re-identification of the Saimaa ringed seals. Animal re-identification together with the access to large amount of image material through camera traps and crowd-sourcing provide novel possibilities for animal monitoring and conservation. We propose a novel feature pooling approach that allow aggregating the local pattern features to get a fixed size embedding vector that incorporate global features by taking into account the spatial distribution of features. This is obtained by eigen decomposition of covariances computed for probability mass functions representing feature maps. Embedding vectors can then be used to find the best match in the database of known individuals allowing animal re-identification. The results show that the proposed pooling method outperforms the existing methods on the challenging Saimaa ringed seal image data.

CVJun 3, 2019
Resolving Overlapping Convex Objects in Silhouette Images by Concavity Analysis and Gaussian Process

Sahar Zafari, Mariia Murashkina, Tuomas Eerola et al.

Segmentation of overlapping convex objects has various applications, for example, in nanoparticles and cell imaging. Often the segmentation method has to rely purely on edges between the background and foreground making the analyzed images essentially silhouette images. Therefore, to segment the objects, the method needs to be able to resolve the overlaps between multiple objects by utilizing prior information about the shape of the objects. This paper introduces a novel method for segmentation of clustered partially overlapping convex objects in silhouette images. The proposed method involves three main steps: pre-processing, contour evidence extraction, and contour estimation. Contour evidence extraction starts by recovering contour segments from a binarized image by detecting concave points. After this, the contour segments which belong to the same objects are grouped. The grouping is formulated as a combinatorial optimization problem and solved using the branch and bound algorithm. Finally, the full contours of the objects are estimated by a Gaussian process regression method. The experiments on a challenging dataset consisting of nanoparticles demonstrate that the proposed method outperforms three current state-of-art approaches in overlapping convex objects segmentation. The method relies only on edge information and can be applied to any segmentation problems where the objects are partially overlapping and have a convex shape.

MMAug 8, 2013
Semantic Computing of Moods Based on Tags in Social Media of Music

Pasi Saari, Tuomas Eerola

Social tags inherent in online music services such as Last.fm provide a rich source of information on musical moods. The abundance of social tags makes this data highly beneficial for developing techniques to manage and retrieve mood information, and enables study of the relationships between music content and mood representations with data substantially larger than that available for conventional emotion research. However, no systematic assessment has been done on the accuracy of social tags and derived semantic models at capturing mood information in music. We propose a novel technique called Affective Circumplex Transformation (ACT) for representing the moods of music tracks in an interpretable and robust fashion based on semantic computing of social tags and research in emotion modeling. We validate the technique by predicting listener ratings of moods in music tracks, and compare the results to prediction with the Vector Space Model (VSM), Singular Value Decomposition (SVD), Nonnegative Matrix Factorization (NMF), and Probabilistic Latent Semantic Analysis (PLSA). The results show that ACT consistently outperforms the baseline techniques, and its performance is robust against a low number of track-level mood tags. The results give validity and analytical insights for harnessing millions of music tracks and associated mood data available through social tags in application development.