Dunja Mladenić

AI
h-index15
25papers
488citations
Novelty29%
AI Score33

25 Papers

CLDec 1, 2022Code
A Commonsense-Infused Language-Agnostic Learning Framework for Enhancing Prediction of Political Polarity in Multilingual News Headlines

Swati Swati, Adrian Mladenić Grobelnik, Dunja Mladenić et al.

Predicting the political polarity of news headlines is a challenging task that becomes even more challenging in a multilingual setting with low-resource languages. To deal with this, we propose to utilise the Inferential Commonsense Knowledge via a Translate-Retrieve-Translate strategy to introduce a learning framework. To begin with, we use the method of translation and retrieval to acquire the inferential knowledge in the target language. We then employ an attention mechanism to emphasise important inferences. We finally integrate the attended inferences into a multilingual pre-trained language model for the task of bias prediction. To evaluate the effectiveness of our framework, we present a dataset of over 62.6K multilingual news headlines in five European languages annotated with their respective political polarities. We evaluate several state-of-the-art multilingual pre-trained language models since their performance tends to vary across languages (low/high resource). Evaluation results demonstrate that our proposed framework is effective regardless of the models employed. Overall, the best performing model trained with only headlines show 0.90 accuracy and F1, and 0.83 jaccard score. With attended knowledge in our framework, the same model show an increase in 2.2% accuracy and F1, and 3.6% jaccard score. Extending our experiments to individual languages reveals that the models we analyze for Slovenian perform significantly worse than other languages in our dataset. To investigate this, we assess the effect of translation quality on prediction performance. It indicates that the disparity in performance is most likely due to poor translation quality. We release our dataset and scripts at: https://github.com/Swati17293/KG-Multi-Bias for future research. Our framework has the potential to benefit journalists, social scientists, news producers, and consumers.

AIMar 21, 2022
Human-Centric Artificial Intelligence Architecture for Industry 5.0 Applications

Jože M. Rožanec, Inna Novalija, Patrik Zajec et al.

Human-centricity is the core value behind the evolution of manufacturing towards Industry 5.0. Nevertheless, there is a lack of architecture that considers safety, trustworthiness, and human-centricity at its core. Therefore, we propose an architecture that integrates Artificial Intelligence (Active Learning, Forecasting, Explainable Artificial Intelligence), simulated reality, decision-making, and users' feedback, focusing on synergies between humans and machines. Furthermore, we align the proposed architecture with the Big Data Value Association Reference Architecture Model. Finally, we validate it on three use cases from real-world case studies.

LGSep 12, 2022
Active Learning and Novel Model Calibration Measurements for Automated Visual Inspection in Manufacturing

Jože M. Rožanec, Luka Bizjak, Elena Trajkova et al.

Quality control is a crucial activity performed by manufacturing enterprises to ensure that their products meet quality standards and avoid potential damage to the brand's reputation. The decreased cost of sensors and connectivity enabled increasing digitalization of manufacturing. In addition, artificial intelligence enables higher degrees of automation, reducing overall costs and time required for defect inspection. This research compares three active learning approaches, having single and multiple oracles, to visual inspection. Six new metrics are proposed to assess the quality of calibration without the need for ground truth. Furthermore, this research explores whether existing calibrators can improve their performance by leveraging an approximate ground truth to enlarge the calibration set. The experiments were performed on real-world data provided by Philips Consumer Lifestyle BV. Our results show that the explored active learning settings can reduce the data labeling effort by between three and four percent without detriment to the overall quality goals, considering a threshold of p=0.95. Furthermore, the results show that the proposed calibration metrics successfully capture relevant information otherwise available to metrics used up to date only through ground truth data. Therefore, the proposed metrics can be used to estimate the quality of models' probability calibration without committing to a labeling effort to obtain ground truth data.

CVDec 19, 2022
Synthetic Data Augmentation Using GAN For Improved Automated Visual Inspection

Jože M. Rožanec, Patrik Zajec, Spyros Theodoropoulos et al.

Quality control is a crucial activity performed by manufacturing companies to ensure their products conform to the requirements and specifications. The introduction of artificial intelligence models enables to automate the visual quality inspection, speeding up the inspection process and ensuring all products are evaluated under the same criteria. In this research, we compare supervised and unsupervised defect detection techniques and explore data augmentation techniques to mitigate the data imbalance in the context of automated visual inspection. Furthermore, we use Generative Adversarial Networks for data augmentation to enhance the classifiers' discriminative performance. Our results show that state-of-the-art unsupervised defect detection does not match the performance of supervised models but can be used to reduce the labeling workload by more than 50%. Furthermore, the best classification performance was achieved considering GAN-based data generation with AUC ROC scores equal to or higher than 0,9898, even when increasing the dataset imbalance by leaving only 25\% of the images denoting defective products. We performed the research with real-world data provided by Philips Consumer Lifestyle BV.

AIApr 12, 2022
Enriching Artificial Intelligence Explanations with Knowledge Fragments

Jože M. Rožanec, Elena Trajkova, Inna Novalija et al.

Artificial Intelligence models are increasingly used in manufacturing to inform decision-making. Responsible decision-making requires accurate forecasts and an understanding of the models' behavior. Furthermore, the insights into models' rationale can be enriched with domain knowledge. This research builds explanations considering feature rankings for a particular forecast, enriching them with media news entries, datasets' metadata, and entries from the Google Knowledge Graph. We compare two approaches (embeddings-based and semantic-based) on a real-world use case regarding demand forecasting.

HCJul 3, 2023
Human in the AI loop via xAI and Active Learning for Visual Inspection

Jože M. Rožanec, Elias Montini, Vincenzo Cutrona et al.

Industrial revolutions have historically disrupted manufacturing by introducing automation into production. Increasing automation reshapes the role of the human worker. Advances in robotics and artificial intelligence open new frontiers of human-machine collaboration. Such collaboration can be realized considering two sub-fields of artificial intelligence: active learning and explainable artificial intelligence. Active learning aims to devise strategies that help obtain data that allows machine learning algorithms to learn better. On the other hand, explainable artificial intelligence aims to make the machine learning models intelligible to the human person. The present work first describes Industry 5.0, human-machine collaboration, and state-of-the-art regarding quality inspection, emphasizing visual inspection. Then it outlines how human-machine collaboration could be realized and enhanced in visual inspection. Finally, some of the results obtained in the EU H2020 STAR project regarding visual inspection are shared, considering artificial intelligence, human digital twins, and cybersecurity.

CVDec 19, 2022
Robust Anomaly Map Assisted Multiple Defect Detection with Supervised Classification Techniques

Jože M. Rožanec, Patrik Zajec, Spyros Theodoropoulos et al.

Industry 4.0 aims to optimize the manufacturing environment by leveraging new technological advances, such as new sensing capabilities and artificial intelligence. The DRAEM technique has shown state-of-the-art performance for unsupervised classification. The ability to create anomaly maps highlighting areas where defects probably lie can be leveraged to provide cues to supervised classification models and enhance their performance. Our research shows that the best performance is achieved when training a defect detection model by providing an image and the corresponding anomaly map as input. Furthermore, such a setting provides consistent performance when framing the defect detection as a binary or multiclass classification problem and is not affected by class balancing policies. We performed the experiments on three datasets with real-world data provided by Philips Consumer Lifestyle BV.

LGSep 28, 2022
Machine Beats Machine: Machine Learning Models to Defend Against Adversarial Attacks

Jože M. Rožanec, Dimitrios Papamartzivanos, Entso Veliou et al.

We propose using a two-layered deployment of machine learning models to prevent adversarial attacks. The first layer determines whether the data was tampered, while the second layer solves a domain-specific problem. We explore three sets of features and three dataset variations to train machine learning models. Our results show clustering algorithms achieved promising results. In particular, we consider the best results were obtained by applying the DBSCAN algorithm to the structured structural similarity index measure computed between the images and a white reference image.

AISep 28, 2022
Forecasting Sensor Values in Waste-To-Fuel Plants: a Case Study

Bor Brecelj, Beno Šircelj, Jože M. Rožanec et al.

In this research, we develop machine learning models to predict future sensor readings of a waste-to-fuel plant, which would enable proactive control of the plant's operations. We developed models that predict sensor readings for 30 and 60 minutes into the future. The models were trained using historical data, and predictions were made based on sensor readings taken at a specific time. We compare three types of models: (a) a näive prediction that considers only the last predicted value, (b) neural networks that make predictions based on past sensor data (we consider different time window sizes for making a prediction), and (c) a gradient boosted tree regressor created with a set of features that we developed. We developed and tested our models on a real-world use case at a waste-to-fuel plant in Canada. We found that approach (c) provided the best results, while approach (b) provided mixed results and was not able to outperform the näive consistently.

LGOct 12, 2023
Dealing with zero-inflated data: achieving SOTA with a two-fold machine learning approach

Jože M. Rožanec, Gašper Petelin, João Costa et al.

In many cases, a machine learning model must learn to correctly predict a few data points with particular values of interest in a broader range of data where many target values are zero. Zero-inflated data can be found in diverse scenarios, such as lumpy and intermittent demands, power consumption for home appliances being turned on and off, impurities measurement in distillation processes, and even airport shuttle demand prediction. The presence of zeroes affects the models' learning and may result in poor performance. Furthermore, zeroes also distort the metrics used to compute the model's prediction quality. This paper showcases two real-world use cases (home appliances classification and airport shuttle demand prediction) where a hierarchical model applied in the context of zero-inflated data leads to excellent results. In particular, for home appliances classification, the weighted average of Precision, Recall, F1, and AUC ROC was increased by 27%, 34%, 49%, and 27%, respectively. Furthermore, it is estimated that the proposed approach is also four times more energy efficient than the SOTA approach against which it was compared to. Two-fold models performed best in all cases when predicting airport shuttle demand, and the difference against other models has been proven to be statistically significant.

LGOct 1, 2025
Fiaingen: A financial time series generative method matching real-world data quality

Jože M. Rožanec, Tina Žezlin, Laurentiu Vasiliu et al.

Data is vital in enabling machine learning models to advance research and practical applications in finance, where accurate and robust models are essential for investment and trading decision-making. However, real-world data is limited despite its quantity, quality, and variety. The data shortage of various financial assets directly hinders the performance of machine learning models designed to trade and invest in these assets. Generative methods can mitigate this shortage. In this paper, we introduce a set of novel techniques for time series data generation (we name them Fiaingen) and assess their performance across three criteria: (a) overlap of real-world and synthetic data on a reduced dimensionality space, (b) performance on downstream machine learning tasks, and (c) runtime performance. Our experiments demonstrate that the methods achieve state-of-the-art performance across the three criteria listed above. Synthetic data generated with Fiaingen methods more closely mirrors the original time series data while keeping data generation time close to seconds - ensuring the scalability of the proposed approach. Furthermore, models trained on it achieve performance close to those trained with real-world data.

IRFeb 10, 2025
The 2021 Tokyo Olympics Multilingual News Article Dataset

Erik Novak, Erik Calcina, Dunja Mladenić et al.

In this paper, we introduce a dataset of multilingual news articles covering the 2021 Tokyo Olympics. A total of 10,940 news articles were gathered from 1,918 different publishers, covering 1,350 sub-events of the 2021 Olympics, and published between July 1, 2021, and August 14, 2021. These articles are written in nine languages from different language families and in different scripts. To create the dataset, the raw news articles were first retrieved via a service that collects and analyzes news articles. Then, the articles were grouped using an online clustering algorithm, with each group containing articles reporting on the same sub-event. Finally, the groups were manually annotated and evaluated. The development of this dataset aims to provide a resource for evaluating the performance of multilingual news clustering algorithms, for which limited datasets are available. It can also be used to analyze the dynamics and events of the 2021 Tokyo Olympics from different perspectives. The dataset is available in CSV format and can be accessed from the CLARIN.SI repository.

CVJun 1, 2024
Multimodal Metadata Assignment for Cultural Heritage Artifacts

Luis Rei, Dunja Mladenić, Mareike Dorozynski et al.

We develop a multimodal classifier for the cultural heritage domain using a late fusion approach and introduce a novel dataset. The three modalities are Image, Text, and Tabular data. We based the image classifier on a ResNet convolutional neural network architecture and the text classifier on a multilingual transformer architecture (XML-Roberta). Both are trained as multitask classifiers and use the focal loss to handle class imbalance. Tabular data and late fusion are handled by Gradient Tree Boosting. We also show how we leveraged specific data models and taxonomy in a Knowledge Graph to create the dataset and to store classification results. All individual classifiers accurately predict missing properties in the digitized silk artifacts, with the multimodal approach providing the best results.

CVOct 15, 2021
Streaming Machine Learning and Online Active Learning for Automated Visual Inspection

Jože M. Rožanec, Elena Trajkova, Paulien Dam et al.

Quality control is a key activity performed by manufacturing companies to verify product conformance to the requirements and specifications. Standardized quality control ensures that all the products are evaluated under the same criteria. The decreased cost of sensors and connectivity enabled an increasing digitalization of manufacturing and provided greater data availability. Such data availability has spurred the development of artificial intelligence models, which allow higher degrees of automation and reduced bias when inspecting the products. Furthermore, the increased speed of inspection reduces overall costs and time required for defect inspection. In this research, we compare five streaming machine learning algorithms applied to visual defect inspection with real-world data provided by Philips Consumer Lifestyle BV. Furthermore, we compare them in a streaming active learning context, which reduces the data labeling effort in a real-world context. Our results show that active learning reduces the data labeling effort by almost 15% on average for the worst case, while keeping an acceptable classification performance. The use of machine learning models for automated visual inspection are expected to speed up the quality inspection up to 40%.

LGSep 6, 2021
Active Learning for Automated Visual Inspection of Manufactured Products

Elena Trajkova, Jože M. Rožanec, Paulien Dam et al.

Quality control is a key activity performed by manufacturing enterprises to ensure products meet quality standards and avoid potential damage to the brand's reputation. The decreased cost of sensors and connectivity enabled an increasing digitalization of manufacturing. In addition, artificial intelligence enables higher degrees of automation, reducing overall costs and time required for defect inspection. In this research, we compare three active learning approaches and five machine learning algorithms applied to visual defect inspection with real-world data provided by Philips Consumer Lifestyle BV. Our results show that active learning reduces the data labeling effort without detriment to the models' performance.

AIJul 5, 2021
Knowledge Modelling and Active Learning in Manufacturing

Jože M. Rožanec, Inna Novalija, d Patrik Zajec et al.

The increasing digitalization of the manufacturing domain requires adequate knowledge modeling to capture relevant information. Ontologies and Knowledge Graphs provide means to model and relate a wide range of concepts, problems, and configurations. Both can be used to generate new knowledge through deductive inference and identify missing knowledge. While digitalization increases the amount of data available, much data is not labeled and cannot be directly used to train supervised machine learning models. Active learning can be used to identify the most informative data instances for which to obtain users' feedback, reduce friction, and maximize knowledge acquisition. By combining semantic technologies and active learning, multiple use cases in the manufacturing domain can be addressed taking advantage of the available knowledge and data.

AIJul 5, 2021
A Review of Explainable Artificial Intelligence in Manufacturing

Georgios Sofianidis, Jože M. Rožanec, Dunja Mladenić et al.

The implementation of Artificial Intelligence (AI) systems in the manufacturing domain enables higher production efficiency, outstanding performance, and safer operations, leveraging powerful tools such as deep learning and reinforcement learning techniques. Despite the high accuracy of these models, they are mostly considered black boxes: they are unintelligible to the human. Opaqueness affects trust in the system, a factor that is critical in the context of decision-making. We present an overview of Explainable Artificial Intelligence (XAI) techniques as a means of boosting the transparency of models. We analyze different metrics to evaluate these techniques and describe several application scenarios in the manufacturing domain.

AIMay 5, 2021
XAI-KG: knowledge graph to support XAI and decision-making in manufacturing

Jože M. Rožanec, Patrik Zajec, Klemen Kenda et al.

The increasing adoption of artificial intelligence requires accurate forecasts and means to understand the reasoning of artificial intelligence models behind such a forecast. Explainable Artificial Intelligence (XAI) aims to provide cues for why a model issued a certain prediction. Such cues are of utmost importance to decision-making since they provide insights on the features that influenced most certain forecasts and let the user decide if the forecast can be trusted. Though many techniques were developed to explain black-box models, little research was done on assessing the quality of those explanations and their influence on decision-making. We propose an ontology and knowledge graph to support collecting feedback regarding forecasts, forecast explanations, recommended decision-making options, and user actions. This way, we provide means to improve forecasting models, explanations, and recommendations of decision-making options. We tailor the knowledge graph for the domain of demand forecasting and validate it on real-world data.

AIApr 2, 2021
STARdom: an architecture for trusted and secure human-centered manufacturing systems

Jože M. Rožanec, Patrik Zajec, Klemen Kenda et al.

There is a lack of a single architecture specification that addresses the needs of trusted and secure Artificial Intelligence systems with humans in the loop, such as human-centered manufacturing systems at the core of the evolution towards Industry 5.0. To realize this, we propose an architecture that integrates forecasts, Explainable Artificial Intelligence, supports collecting users' feedback, and uses Active Learning and Simulated Reality to enhance forecasts and provide decision-making recommendations. The architecture security is addressed as a general concern. We align the proposed architecture with the Big Data Value Association Reference Architecture Model. We tailor it for the domain of demand forecasting and validate it on a real-world case study.

AIApr 1, 2021
Semantic XAI for contextualized demand forecasting explanations

Jože M. Rožanec, Dunja Mladenić

The paper proposes a novel architecture for explainable AI based on semantic technologies and AI. We tailor the architecture for the domain of demand forecasting and validate it on a real-world case study. The provided explanations combine concepts describing features relevant to a particular forecast, related media events, and metadata regarding external datasets of interest. The knowledge graph provides concepts that convey feature information at a higher abstraction level. By using them, explanations do not expose sensitive details regarding the demand forecasting models. The explanations also emphasize actionable dimensions where suitable. We link domain knowledge, forecasted values, and forecast explanations in a Knowledge Graph. The ontology and dataset we developed for this use case are publicly available for further research.

AIMar 30, 2021
Towards Active Learning Based Smart Assistant for Manufacturing

Patrik Zajec, Jože M. Rožanec, Inna Novalija et al.

A general approach for building a smart assistant that guides a user from a forecast generated by a machine learning model through a sequence of decision-making steps is presented. We develop a methodology to build such a system. The system is demonstrated on a demand forecasting use case in manufacturing. The methodology can be extended to several use cases in manufacturing. The system provides means for knowledge acquisition, gathering data from users. We envision active learning can be used to get data labels where labeled data is scarce.

AIMar 23, 2021
Actionable Cognitive Twins for Decision Making in Manufacturing

Jože M. Rožanec, Jinzhi Lu, Jan Rupnik et al.

Actionable Cognitive Twins are the next generation Digital Twins enhanced with cognitive capabilities through a knowledge graph and artificial intelligence models that provide insights and decision-making options to the users. The knowledge graph describes the domain-specific knowledge regarding entities and interrelationships related to a manufacturing setting. It also contains information on possible decision-making options that can assist decision-makers, such as planners or logisticians. In this paper, we propose a knowledge graph modeling approach to construct actionable cognitive twins for capturing specific knowledge related to demand forecasting and production planning in a manufacturing plant. The knowledge graph provides semantic descriptions and contextualization of the production lines and processes, including data identification and simulation or artificial intelligence algorithms and forecasts used to support them. Such semantics provide ground for inferencing, relating different knowledge types: creative, deductive, definitional, and inductive. To develop the knowledge graph models for describing the use case completely, systems thinking approach is proposed to design and verify the ontology, develop a knowledge graph and build an actionable cognitive twin. Finally, we evaluate our approach in two use cases developed for a European original equipment manufacturer related to the automotive industry as part of the European Horizon 2020 project FACTLOG.

LGMar 23, 2021
Reframing demand forecasting: a two-fold approach for lumpy and intermittent demand

Jože M. Rožanec, Dunja Mladenić

Demand forecasting is a crucial component of demand management. While shortening the forecasting horizon allows for more recent data and less uncertainty, this frequently means lower data aggregation levels and a more significant data sparsity. Sparse demand data usually results in lumpy or intermittent demand patterns, which have sparse and irregular demand intervals. Usual statistical and machine learning models fail to provide good forecasts in such scenarios. Our research shows that competitive demand forecasts can be obtained through two models: predicting the demand occurrence and estimating the demand size. We analyze the usage of local and global machine learning models for both cases and compare results against baseline methods. Finally, we propose a novel evaluation criterion of lumpy and intermittent demand forecasting models' performance. Our research shows that global classification models are the best choice when predicting demand event occurrence. When predicting demand sizes, we achieved the best results using Simple Exponential Smoothing forecast. We tested our approach on real-world data consisting of 516 three-year-long time series corresponding to European automotive original equipment manufacturers' daily demand.

AIJul 20, 2016
Constructing a Natural Language Inference Dataset using Generative Neural Networks

Janez Starc, Dunja Mladenić

Natural Language Inference is an important task for Natural Language Understanding. It is concerned with classifying the logical relation between two sentences. In this paper, we propose several text generative neural networks for generating text hypothesis, which allows construction of new Natural Language Inference datasets. To evaluate the models, we propose a new metric -- the accuracy of the classifier trained on the generated dataset. The accuracy obtained by our best generative model is only 2.7% lower than the accuracy of the classifier trained on the original, human crafted dataset. Furthermore, the best generated dataset combined with the original dataset achieves the highest accuracy. The best model learns a mapping embedding for each training example. By comparing various metrics we show that datasets that obtain higher ROUGE or METEOR scores do not necessarily yield higher classification accuracies. We also provide analysis of what are the characteristics of a good dataset including the distinguishability of the generated datasets from the original one.

AIJan 5, 2016
Joint learning of ontology and semantic parser from text

Janez Starc, Dunja Mladenić

Semantic parsing methods are used for capturing and representing semantic meaning of text. Meaning representation capturing all the concepts in the text may not always be available or may not be sufficiently complete. Ontologies provide a structured and reasoning-capable way to model the content of a collection of texts. In this work, we present a novel approach to joint learning of ontology and semantic parser from text. The method is based on semi-automatic induction of a context-free grammar from semantically annotated text. The grammar parses the text into semantic trees. Both, the grammar and the semantic trees are used to learn the ontology on several levels -- classes, instances, taxonomic and non-taxonomic relations. The approach was evaluated on the first sentences of Wikipedia pages describing people.