Tobias Olsson

CV
h-index2
3papers
25citations
Novelty42%
AI Score35

3 Papers

CVOct 8, 2023
Persis: A Persian Font Recognition Pipeline Using Convolutional Neural Networks

Mehrdad Mohammadian, Neda Maleki, Tobias Olsson et al.

What happens if we encounter a suitable font for our design work but do not know its name? Visual Font Recognition (VFR) systems are used to identify the font typeface in an image. These systems can assist graphic designers in identifying fonts used in images. A VFR system also aids in improving the speed and accuracy of Optical Character Recognition (OCR) systems. In this paper, we introduce the first publicly available datasets in the field of Persian font recognition and employ Convolutional Neural Networks (CNN) to address this problem. The results show that the proposed pipeline obtained 78.0% top-1 accuracy on our new datasets, 89.1% on the IDPL-PFOD dataset, and 94.5% on the KAFD dataset. Furthermore, the average time spent in the entire pipeline for one sample of our proposed datasets is 0.54 and 0.017 seconds for CPU and GPU, respectively. We conclude that CNN methods can be used to recognize Persian fonts without the need for additional pre-processing steps such as feature extraction, binarization, normalization, etc.

SESep 20, 2021Code
To Automatically Map Source Code Entities to Architectural Modules with Naive Bayes

Tobias Olsson, Morgan Ericsson, Anna Wingkvist

Background: The process of mapping a source code entity onto an architectural module is to a large degree a manual task. Automating this process could increase the use of static architecture conformance checking methods, such as reflexion modeling, in industry. Current techniques rely on user parameterization and a highly cohesive design. A machine learning approach would potentially require fewer parameters and better use of the available information to aid in automatic mapping. Aim: We investigate how a classifier can be trained to map from source code to architecture modules automatically. This classifier is trained with semantic and syntactic dependency information extracted from the source code and from architecture descriptions. The classifier is implemented using multinomial naive Bayes and evaluated. Method: We perform experiments and compare the classifier with three state-of-the-art mapping functions in eight open-source Java systems with known ground-truth-mappings. Results: We find that the classifier outperforms the state-of-the-art in all cases and that it provides a useful baseline for further research in the area of semi-automatic incremental clustering. Conclusions: We conclude that machine learning is a useful approach that performs better and with less need for parameterization compared to other approaches. Future work includes investigating problematic mappings and a more diverse set of subject systems.

LGSep 4, 2025
Privacy Risks in Time Series Forecasting: User- and Record-Level Membership Inference

Nicolas Johansson, Tobias Olsson, Daniel Nilsson et al.

Membership inference attacks (MIAs) aim to determine whether specific data were used to train a model. While extensively studied on classification models, their impact on time series forecasting remains largely unexplored. We address this gap by introducing two new attacks: (i) an adaptation of multivariate LiRA, a state-of-the-art MIA originally developed for classification models, to the time-series forecasting setting, and (ii) a novel end-to-end learning approach called Deep Time Series (DTS) attack. We benchmark these methods against adapted versions of other leading attacks from the classification setting. We evaluate all attacks in realistic settings on the TUH-EEG and ELD datasets, targeting two strong forecasting architectures, LSTM and the state-of-the-art N-HiTS, under both record- and user-level threat models. Our results show that forecasting models are vulnerable, with user-level attacks often achieving perfect detection. The proposed methods achieve the strongest performance in several settings, establishing new baselines for privacy risk assessment in time series forecasting. Furthermore, vulnerability increases with longer prediction horizons and smaller training populations, echoing trends observed in large language models.