Ivica Dimitrovski

CV
h-index16
7papers
135citations
Novelty30%
AI Score40

7 Papers

CVJul 14, 2022Code
Current Trends in Deep Learning for Earth Observation: An Open-source Benchmark Arena for Image Classification

Ivica Dimitrovski, Ivan Kitanovski, Dragi Kocev et al.

We present AiTLAS: Benchmark Arena -- an open-source benchmark suite for evaluating state-of-the-art deep learning approaches for image classification in Earth Observation (EO). To this end, we present a comprehensive comparative analysis of more than 500 models derived from ten different state-of-the-art architectures and compare them to a variety of multi-class and multi-label classification tasks from 22 datasets with different sizes and properties. In addition to models trained entirely on these datasets, we benchmark models trained in the context of transfer learning, leveraging pre-trained model variants, as it is typically performed in practice. All presented approaches are general and can be easily extended to many other remote sensing image classification tasks not considered in this study. To ensure reproducibility and facilitate better usability and further developments, all of the experimental resources including the trained models, model configurations, and processing details of the datasets (with their corresponding splits used for training and evaluating the models) are publicly available on the repository: https://github.com/biasvariancelabs/aitlas-arena

4.0LGMay 6
Event-Based Early Warning of Vineyard Disease Risk from Environmental Time Series

Ivica Dimitrovski, Ivan Kitanovski, Danco Davcev et al.

Accurate early warning of vineyard disease risk from environmental observations is essential for timely intervention and more sustainable crop protection. However, many existing studies formulate disease prediction as daily presence classification, which can favor persistence-driven predictions and provide only limited support for actionable short-horizon warning. In this paper, we present an event-based approach for early warning of vineyard disease risk from environmental time series and evaluate it through a vineyard case study. Rather than predicting daily disease status, the task is reformulated to predict transitions into annotated disease-risk periods within a future window of 3-7 days. To reduce fragmentation caused by short interruptions in the binary labels, new events are defined only after a minimum disease-free gap. This formulation encourages models to capture environmental precursors associated with upcoming risk periods instead of merely reproducing temporal persistence. Using multi-year agro-meteorological data, we construct input representations that capture humidity dynamics, rainfall accumulation, temperature variability, and seasonal structure through cyclic temporal encoding. We evaluate representative methods from classical machine learning and deep learning, including XGBoost, Long Short-Term Memory (LSTM) networks, and Temporal Convolutional Networks (TCNs), using both standard classification metrics and an event-oriented early warning protocol. The results show that the event-based formulation supports practical short-horizon warning, while the compared models exhibit distinct trade-offs between event recall, lead time, and false-alert behavior. Overall, the study underscores the importance of problem formulation in environmental time-series learning and demonstrates the value of event-based prediction for vineyard disease warning systems.

CVJul 4, 2023
In-Domain Self-Supervised Learning Improves Remote Sensing Image Scene Classification

Ivica Dimitrovski, Ivan Kitanovski, Nikola Simidjievski et al.

We investigate the utility of in-domain self-supervised pre-training of vision models in the analysis of remote sensing imagery. Self-supervised learning (SSL) has emerged as a promising approach for remote sensing image classification due to its ability to exploit large amounts of unlabeled data. Unlike traditional supervised learning, SSL aims to learn representations of data without the need for explicit labels. This is achieved by formulating auxiliary tasks that can be used for pre-training models before fine-tuning them on a given downstream task. A common approach in practice to SSL pre-training is utilizing standard pre-training datasets, such as ImageNet. While relevant, such a general approach can have a sub-optimal influence on the downstream performance of models, especially on tasks from challenging domains such as remote sensing. In this paper, we analyze the effectiveness of SSL pre-training by employing the iBOT framework coupled with Vision transformers trained on Million-AID, a large and unlabeled remote sensing dataset. We present a comprehensive study of different self-supervised pre-training strategies and evaluate their effect across 14 downstream datasets with diverse properties. Our results demonstrate that leveraging large in-domain datasets for self-supervised pre-training consistently leads to improved predictive downstream performance, compared to the standard approaches found in practice.

CVAug 5, 2022
Discover the Mysteries of the Maya: Selected Contributions from the Machine Learning Challenge & The Discovery Challenge Workshop at ECML PKDD 2021

Dragi Kocev, Nikola Simidjievski, Ana Kostovska et al.

The volume contains selected contributions from the Machine Learning Challenge "Discover the Mysteries of the Maya", presented at the Discovery Challenge Track of The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2021). Remote sensing has greatly accelerated traditional archaeological landscape surveys in the forested regions of the ancient Maya. Typical exploration and discovery attempts, beside focusing on whole ancient cities, focus also on individual buildings and structures. Recently, there have been several successful attempts of utilizing machine learning for identifying ancient Maya settlements. These attempts, while relevant, focus on narrow areas and rely on high-quality aerial laser scanning (ALS) data which covers only a fraction of the region where ancient Maya were once settled. Satellite image data, on the other hand, produced by the European Space Agency's (ESA) Sentinel missions, is abundant and, more importantly, publicly available. The "Discover the Mysteries of the Maya" challenge aimed at locating and identifying ancient Maya architectures (buildings, aguadas, and platforms) by performing integrated image segmentation of different types of satellite imagery (from Sentinel-1 and Sentinel-2) data and ALS (lidar) data.

CVOct 28, 2025
Few-Shot Remote Sensing Image Scene Classification with CLIP and Prompt Learning

Ivica Dimitrovski, Vlatko Spasev, Ivan Kitanovski

Remote sensing applications increasingly rely on deep learning for scene classification. However, their performance is often constrained by the scarcity of labeled data and the high cost of annotation across diverse geographic and sensor domains. While recent vision-language models like CLIP have shown promise by learning transferable representations at scale by aligning visual and textual modalities, their direct application to remote sensing remains suboptimal due to significant domain gaps and the need for task-specific semantic adaptation. To address this critical challenge, we systematically explore prompt learning as a lightweight and efficient adaptation strategy for few-shot remote sensing image scene classification. We evaluate several representative methods, including Context Optimization, Conditional Context Optimization, Multi-modal Prompt Learning, and Prompting with Self-Regulating Constraints. These approaches reflect complementary design philosophies: from static context optimization to conditional prompts for enhanced generalization, multi-modal prompts for joint vision-language adaptation, and semantically regularized prompts for stable learning without forgetting. We benchmark these prompt-learning methods against two standard baselines: zero-shot CLIP with hand-crafted prompts and a linear probe trained on frozen CLIP features. Through extensive experiments on multiple benchmark remote sensing datasets, including cross-dataset generalization tests, we demonstrate that prompt learning consistently outperforms both baselines in few-shot scenarios. Notably, Prompting with Self-Regulating Constraints achieves the most robust cross-domain performance. Our findings underscore prompt learning as a scalable and efficient solution for bridging the domain gap in satellite and aerial imagery, providing a strong foundation for future research in this field.

CVJan 21, 2022
AiTLAS: Artificial Intelligence Toolbox for Earth Observation

Ivica Dimitrovski, Ivan Kitanovski, Panče Panov et al.

The AiTLAS toolbox (Artificial Intelligence Toolbox for Earth Observation) includes state-of-the-art machine learning methods for exploratory and predictive analysis of satellite imagery as well as repository of AI-ready Earth Observation (EO) datasets. It can be easily applied for a variety of Earth Observation tasks, such as land use and cover classification, crop type prediction, localization of specific objects (semantic segmentation), etc. The main goal of AiTLAS is to facilitate better usability and adoption of novel AI methods (and models) by EO experts, while offering easy access and standardized format of EO datasets to AI experts which further allows benchmarking of various existing and novel AI methods tailored for EO data.

IRJun 18, 2017
Addressing Item-Cold Start Problem in Recommendation Systems using Model Based Approach and Deep Learning

Ivica Obadić, Gjorgji Madjarov, Ivica Dimitrovski et al.

Traditional recommendation systems rely on past usage data in order to generate new recommendations. Those approaches fail to generate sensible recommendations for new users and items into the system due to missing information about their past interactions. In this paper, we propose a solution for successfully addressing item-cold start problem which uses model-based approach and recent advances in deep learning. In particular, we use latent factor model for recommendation, and predict the latent factors from item's descriptions using convolutional neural network when they cannot be obtained from usage data. Latent factors obtained by applying matrix factorization to the available usage data are used as ground truth to train the convolutional neural network. To create latent factor representations for the new items, the convolutional neural network uses their textual description. The results from the experiments reveal that the proposed approach significantly outperforms several baseline estimators.