María Óskarsdóttir

SI
h-index29
12papers
362citations
Novelty46%
AI Score44

12 Papers

SINov 15, 2022
Influencer Detection with Dynamic Graph Neural Networks

Elena Tiukhova, Emiliano Penaloza, María Óskarsdóttir et al.

Leveraging network information for prediction tasks has become a common practice in many domains. Being an important part of targeted marketing, influencer detection can potentially benefit from incorporating dynamic network representation. In this work, we investigate different dynamic Graph Neural Networks (GNNs) configurations for influencer detection and evaluate their prediction performance using a unique corporate data set. We show that using deep multi-head attention in GNN and encoding temporal attributes significantly improves performance. Furthermore, our empirical evaluation illustrates that capturing neighborhood representation is more beneficial that using network centrality measures.

SIJul 16, 2023
INFLECT-DGNN: Influencer Prediction with Dynamic Graph Neural Networks

Elena Tiukhova, Emiliano Penaloza, María Óskarsdóttir et al.

Leveraging network information for predictive modeling has become widespread in many domains. Within the realm of referral and targeted marketing, influencer detection stands out as an area that could greatly benefit from the incorporation of dynamic network representation due to the continuous evolution of customer-brand relationships. In this paper, we present INFLECT-DGNN, a new method for profit-driven INFLuencer prEdiCTion with Dynamic Graph Neural Networks that innovatively combines Graph Neural Networks (GNNs) and Recurrent Neural Networks (RNNs) with weighted loss functions, synthetic minority oversampling adapted to graph data, and a carefully crafted rolling-window strategy. We introduce a novel profit-driven framework that supports decision-making based on model predictions. To test the framework, we use a unique corporate dataset with diverse networks, capturing the customer interactions across three cities with different socioeconomic and demographic characteristics. Our results show how using RNNs to encode temporal attributes alongside GNNs significantly improves predictive performance, while the profit-driven framework determines the optimal classification threshold for profit maximization. We compare the results of different models to demonstrate the importance of capturing network representation, temporal dependencies, and using a profit-driven evaluation. Our research has significant implications for the fields of referral and targeted marketing, expanding the technical use of deep graph learning within corporate environments.

19.6SPMay 19
Staging by the Book: Automatic Sleep Stage Classification Using Scoring Rules

Emil Hardarson, Konstantin Popov, Sigridur Sigurdardottir et al.

Automated sleep staging is commonly approached as a supervised machine learning problem, with deep learning methods dominating recent research. While machine learning models achieve near-human level agreement with human-scored reference sleep stages, their decisions are typically opaque and not designed to follow clinical scoring rules. We propose a transparent alternative: a deterministic, rule-based sleep staging method that explicitly operationalizes the American Academy of Sleep Medicine's (AASM) scoring logic as executable code, coupled with epoch-level natural-language justifications derived from an explanation trace. We evaluate the approach on 50 polysomnography recordings with a 10-scorer majority-vote consensus as reference. Across all recordings, the method agreed with the majority-vote reference in 60.5% of epochs ($κ=0.42$), with substantially higher agreement on a dataset used during development (77.1%, $κ=0.61$). Agreement with the reference was highest for sleep stage N2 (recall 83.5%) and moderate for sleep stage R (recall 68.7%), while Wake and N1 recall were low. Despite lower agreement with the reference than contemporary deep learning models, the method provides deterministic decisions and natural language explanations aligned with AASM scoring rules, making it a complementary tool for auditing, debugging, and governing deep learning-based sleep staging.

GNFeb 1, 2024
Attention-based Dynamic Multilayer Graph Neural Networks for Loan Default Prediction

Sahab Zandi, Kamesh Korangi, María Óskarsdóttir et al.

Whereas traditional credit scoring tends to employ only individual borrower- or loan-level predictors, it has been acknowledged for some time that connections between borrowers may result in default risk propagating over a network. In this paper, we present a model for credit risk assessment leveraging a dynamic multilayer network built from a Graph Neural Network and a Recurrent Neural Network, each layer reflecting a different source of network connection. We test our methodology in a behavioural credit scoring context using a dataset provided by U.S. mortgage financier Freddie Mac, in which different types of connections arise from the geographical location of the borrower and their choice of mortgage provider. The proposed model considers both types of connections and the evolution of these connections over time. We enhance the model by using a custom attention mechanism that weights the different time snapshots according to their importance. After testing multiple configurations, a model with GAT, LSTM, and the attention mechanism provides the best results. Empirical results demonstrate that, when it comes to predicting probability of default for the borrowers, our proposed model brings both better results and novel insights for the analysis of the importance of connections and timestamps, compared to traditional methods.

GNOct 10, 2025
A Multimodal Approach to SME Credit Scoring Integrating Transaction and Ownership Networks

Sahab Zandi, Kamesh Korangi, Juan C. Moreno-Paredes et al.

Small and Medium-sized Enterprises (SMEs) are known to play a vital role in economic growth, employment, and innovation. However, they tend to face significant challenges in accessing credit due to limited financial histories, collateral constraints, and exposure to macroeconomic shocks. These challenges make an accurate credit risk assessment by lenders crucial, particularly since SMEs frequently operate within interconnected firm networks through which default risk can propagate. This paper presents and tests a novel approach for modelling the risk of SME credit, using a unique large data set of SME loans provided by a prominent financial institution. Specifically, our approach employs Graph Neural Networks to predict SME default using multilayer network data derived from common ownership and financial transactions between firms. We show that combining this information with traditional structured data not only improves application scoring performance, but also explicitly models contagion risk between companies. Further analysis shows how the directionality and intensity of these connections influence financial risk contagion, offering a deeper understanding of the underlying processes. Our findings highlight the predictive power of network data, as well as the role of supply chain networks in exposing SMEs to correlated default risk.

LGFeb 5, 2025
Chaos into Order: Neural Framework for Expected Value Estimation of Stochastic Partial Differential Equations

Ísak Pétursson, María Óskarsdóttir

Stochastic partial differential equations (SPDEs) describe the evolution of random processes over space and time, but their solutions are often analytically intractable and computationally expensive to estimate. In this paper, we propose the Learned Expectation Collapser (LEC), a physics-informed neural framework designed to approximate the expected value of linear SPDE solutions without requiring domain discretization. By leveraging randomized sampling of both space-time coordinates and noise realizations during training, LEC trains standard feedforward neural networks to minimize residual loss across multiple stochastic samples. We hypothesize and empirically confirm that this training regime drives the network to converge toward the expected value of the solution of the SPDE. Using the stochastic heat equation as a testbed, we evaluate performance across a diverse set of 144 experimental configurations that span multiple spatial dimensions, noise models, and forcing functions. The results show that the model consistently learns accurate approximations of the expected value of the solution in lower dimensions and a predictable decrease in accuracy with increased spatial dimensions, with improved stability and robustness under increased Monte Carlo sampling. Our findings offer new insight into how neural networks implicitly learn statistical structure from stochastic differential operators and suggest a pathway toward scalable, simulator-free SPDE solvers.

HCJan 9, 2025
World of ScoreCraft: Novel Multi Scorer Experiment on the Impact of a Decision Support System in Sleep Staging

Benedikt Holm, Arnar Óskarsson, Björn Elvar Þorleifsson et al.

Manual scoring of polysomnography (PSG) is a time intensive task, prone to inter scorer variability that can impact diagnostic reliability. This study investigates the integration of decision support systems (DSS) into PSG scoring workflows, focusing on their effects on accuracy, scoring time, and potential biases toward recommendations from artificial intelligence (AI) compared to human generated recommendations. Using a novel online scoring platform, we conducted a repeated measures study with sleep technologists, who scored traditional and self applied PSGs. Participants were occasionally presented with recommendations labeled as either human or AI generated. We found that traditional PSGs tended to be scored slightly more accurately than self applied PSGs, but this difference was not statistically significant. Correct recommendations significantly improved scoring accuracy for both PSG types, while incorrect recommendations reduced accuracy. No significant bias was observed toward or against AI generated recommendations compared to human generated recommendations. These findings highlight the potential of AI to enhance PSG scoring reliability. However, ensuring the accuracy of AI outputs is critical to maximizing its benefits. Future research should explore the long term impacts of DSS on scoring workflows and strategies for integrating AI in clinical practice.

STOct 15, 2024
Generalized Distribution Prediction for Asset Returns

Ísak Pétursson, María Óskarsdóttir

We present a novel approach for predicting the distribution of asset returns using a quantile-based method with Long Short-Term Memory (LSTM) networks. Our model is designed in two stages: the first focuses on predicting the quantiles of normalized asset returns using asset-specific features, while the second stage incorporates market data to adjust these predictions for broader economic conditions. This results in a generalized model that can be applied across various asset classes, including commodities, cryptocurrencies, as well as synthetic datasets. The predicted quantiles are then converted into full probability distributions through kernel density estimation, allowing for more precise return distribution predictions and inferencing. The LSTM model significantly outperforms a linear quantile regression baseline by 98% and a dense neural network model by over 50%, showcasing its ability to capture complex patterns in financial return distributions across both synthetic and real-world data. By using exclusively asset-class-neutral features, our model achieves robust, generalizable results.

SIOct 19, 2020
Multilayer Network Analysis for Improved Credit Risk Prediction

María Óskarsdóttir, Cristián Bravo

We present a multilayer network model for credit risk assessment. Our model accounts for multiple connections between borrowers (such as their geographic location and their economic activity) and allows for explicitly modelling the interaction between connected borrowers. We develop a multilayer personalized PageRank algorithm that allows quantifying the strength of the default exposure of any borrower in the network. We test our methodology in an agricultural lending framework, where it has been suspected for a long time default correlates between borrowers when they are subject to the same structural risks. Our results show there are significant predictive gains just by including centrality multilayer network information in the model, and these gains are increased by more complex information such as the multilayer PageRank variables. The results suggest default risk is highest when an individual is connected to many defaulters, but this risk is mitigated by the size of the neighbourhood of the individual, showing both default risk and financial stability propagate throughout the network.

SISep 15, 2020
Social network analytics for supervised fraud detection in insurance

María Óskarsdóttir, Waqas Ahmed, Katrien Antonio et al.

Insurance fraud occurs when policyholders file claims that are exaggerated or based on intentional damages. This contribution develops a fraud detection strategy by extracting insightful information from the social network of a claim. First, we construct a network by linking claims with all their involved parties, including the policyholders, brokers, experts, and garages. Next, we establish fraud as a social phenomenon in the network and use the BiRank algorithm with a fraud specific query vector to compute a fraud score for each claim. From the network, we extract features related to the fraud scores as well as the claims' neighborhood structure. Finally, we combine these network features with the claim-specific features and build a supervised model with fraud in motor insurance as the target variable. Although we build a model for only motor insurance, the network includes claims from all available lines of business. Our results show that models with features derived from the network perform well when detecting fraud and even outperform the models using only the classical claim-specific features. Combining network and claim-specific features further improves the performance of supervised learning models to detect fraud. The resulting model flags highly suspicions claims that need to be further investigated. Our approach provides a guided and intelligent selection of claims and contributes to a more effective fraud investigation process.

SIMay 25, 2020
Evolution of Credit Risk Using a Personalized Pagerank Algorithm for Multilayer Networks

Cristián Bravo, María Óskarsdóttir

In this paper we present a novel algorithm to study the evolution of credit risk across complex multilayer networks. Pagerank-like algorithms allow for the propagation of an influence variable across single networks, and allow quantifying the risk single entities (nodes) are subject to given the connection they have to other nodes in the network. Multilayer networks, on the other hand, are networks where subset of nodes can be associated to a unique set (layer), and where edges connect elements either intra or inter networks. Our personalized PageRank algorithm for multilayer networks allows for quantifying how credit risk evolves across time and propagates through these networks. By using bipartite networks in each layer, we can quantify the risk of various components, not only the loans. We test our method in an agricultural lending dataset, and our results show how default risk is a challenging phenomenon that propagates and evolves through the network across time.

SIFeb 23, 2020
The Value of Big Data for Credit Scoring: Enhancing Financial Inclusion using Mobile Phone Data and Social Network Analytics

María Óskarsdóttir, Cristián Bravo, Carlos Sarraute et al.

Credit scoring is without a doubt one of the oldest applications of analytics. In recent years, a multitude of sophisticated classification techniques have been developed to improve the statistical performance of credit scoring models. Instead of focusing on the techniques themselves, this paper leverages alternative data sources to enhance both statistical and economic model performance. The study demonstrates how including call networks, in the context of positive credit information, as a new Big Data source has added value in terms of profit by applying a profit measure and profit-based feature selection. A unique combination of datasets, including call-detail records, credit and debit account information of customers is used to create scorecards for credit card applicants. Call-detail records are used to build call networks and advanced social network analytics techniques are applied to propagate influence from prior defaulters throughout the network to produce influence scores. The results show that combining call-detail records with traditional data in credit scoring models significantly increases their performance when measured in AUC. In terms of profit, the best model is the one built with only calling behavior features. In addition, the calling behavior features are the most predictive in other models, both in terms of statistical and economic performance. The results have an impact in terms of ethical use of call-detail records, regulatory implications, financial inclusion, as well as data sharing and privacy.