Alexandros Sopasakis

LG
h-index: 13
7 papers
27 citations
Novelty: 31%
AI Score: 30

7 Papers

LG Oct 16, 2024
Improved Anomaly Detection through Conditional Latent Space VAE Ensembles

Oskar Åström, Alexandros Sopasakis

We propose a novel Conditional Latent space Variational Autoencoder (CL-VAE) to perform improved pre-processing for anomaly detection on data with known inlier classes and unknown outlier classes. This proposed variational autoencoder (VAE) improves latent space separation by conditioning on information within the data. The method fits a unique prior distribution to each class in the dataset, effectively expanding the classic prior distribution for VAEs to include a Gaussian mixture model. An ensemble of these VAEs is merged in the latent spaces to form a group consensus that greatly improves the accuracy of anomaly detection across datasets. Our approach is compared against the capabilities of a typical VAE, a CNN, and a PCA, with regard to AUC for anomaly detection. The proposed model shows increased accuracy in anomaly detection, achieving an AUC of 97.4% on the MNIST dataset compared to 95.7% for the second-best model. In addition, the CL-VAE shows increased benefits from ensembling, a more interpretable latent space, and an increased ability to learn patterns in complex data with limited model sizes.
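The idea of fitting one prior per class can be illustrated with the KL term of a VAE loss. The sketch below is not the authors' implementation: it assumes a diagonal Gaussian posterior and unit-variance Gaussian priors with fixed, hypothetical class means (`class_means`), none of which are specified in the abstract.

```python
import math


def kl_to_class_prior(mu, log_var, class_mean):
    """KL( N(mu, diag(exp(log_var))) || N(class_mean, I) ) for one sample.

    Closed form: 0.5 * sum(sigma^2 + (mu - m_c)^2 - 1 - log sigma^2).
    """
    return 0.5 * sum(
        math.exp(lv) + (m - c) ** 2 - 1.0 - lv
        for m, lv, c in zip(mu, log_var, class_mean)
    )


# Hypothetical class-specific prior means in a 2-D latent space (assumption).
class_means = {0: [-3.0, 0.0], 1: [3.0, 0.0]}

# Encoder output for one sample labeled as class 1 (made-up numbers).
mu, log_var = [2.5, 0.2], [0.0, 0.0]

kl_own = kl_to_class_prior(mu, log_var, class_means[1])
kl_other = kl_to_class_prior(mu, log_var, class_means[0])
```

Under these assumptions the penalty is small when the encoding sits near its own class mean and large relative to another class's prior, which is what pushes the classes apart in latent space.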

LG Nov 26, 2025
Using Text-Based Life Trajectories from Swedish Register Data to Predict Residential Mobility with Pretrained Transformers

Philipp Stark, Alexandros Sopasakis, Ola Hall et al.

We transform large-scale Swedish register data into textual life trajectories to address two long-standing challenges in data analysis: high cardinality of categorical variables and inconsistencies in coding schemes over time. Leveraging this uniquely comprehensive population register, we convert register data from 6.9 million individuals (2001-2013) into semantically rich texts and predict individuals' residential mobility in later years (2013-2017). These life trajectories combine demographic information with annual changes in residence, work, education, income, and family circumstances, allowing us to assess how effectively such sequences support longitudinal prediction. We compare multiple NLP architectures (including LSTM, DistilBERT, BERT, and Qwen) and find that sequential and transformer-based models capture temporal and semantic structure more effectively than baseline models. The results show that textualized register data preserves meaningful information about individual pathways and supports complex, scalable modeling. Because few countries maintain longitudinal microdata with comparable coverage and precision, this dataset enables analyses and methodological tests that would be difficult or impossible elsewhere, offering a rigorous testbed for developing and evaluating new sequence-modeling approaches. Overall, our findings demonstrate that combining semantically rich register data with modern language models can substantially advance longitudinal analysis in social sciences.
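As a rough illustration of what turning register records into text might look like, the sketch below renders yearly records as sentences and mentions only the fields that changed from the previous year. The field names, sentence template, and sample values are all hypothetical; the abstract does not describe the actual textualization scheme.

```python
def textualize(person, yearly_records):
    """Render static attributes plus year-over-year changes as plain text.

    `person` and `yearly_records` use made-up field names for illustration.
    """
    lines = [f"Person born in {person['birth_year']}, sex {person['sex']}."]
    previous = None
    for year in sorted(yearly_records):
        current = yearly_records[year]
        if previous is None:
            # First observed year: state every field.
            facts = [f"{key} is {value}" for key, value in current.items()]
        else:
            # Later years: state only what changed.
            facts = [
                f"{key} changed from {previous.get(key)} to {value}"
                for key, value in current.items()
                if previous.get(key) != value
            ]
        if facts:
            lines.append(f"In {year}: " + "; ".join(facts) + ".")
        previous = current
    return " ".join(lines)


person = {"birth_year": 1980, "sex": "F"}
records = {
    2001: {"municipality": "Lund", "occupation": "nurse"},
    2002: {"municipality": "Uppsala", "occupation": "nurse"},
    2003: {"municipality": "Uppsala", "occupation": "nurse"},
}
trajectory = textualize(person, records)
```

Writing category values out as words rather than integer codes is one way such a text format could sidestep high-cardinality encodings and coding schemes that change over time.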

LG Feb 3, 2025
Learning Traffic Anomalies from Generative Models on Real-Time Observations

Fotis I. Giasemis, Alexandros Sopasakis

Accurate detection of traffic anomalies is crucial for effective urban traffic management and congestion mitigation. We use the Spatiotemporal Generative Adversarial Network (STGAN) framework combining Graph Neural Networks and Long Short-Term Memory networks to capture complex spatial and temporal dependencies in traffic data. We apply STGAN to real-time, minute-by-minute observations from 42 traffic cameras across Gothenburg, Sweden, collected over several months in 2020. The images are processed to compute a flow metric representing vehicle density, which serves as input for the model. Training is conducted on data from April to November 2020, and validation is performed on a separate dataset from November 14 to 23, 2020. Our results demonstrate that the model effectively detects traffic anomalies with high precision and low false positive rates. The detected anomalies include camera signal interruptions, visual artifacts, and extreme weather conditions affecting traffic flow.

LG Nov 14, 2024
Early-Scheduled Handover Preparation in 5G NR Millimeter-Wave Systems

Dino Pjanić, Alexandros Sopasakis, Andres Reial et al.

The handover (HO) procedure is one of the most critical functions in a cellular network, driven by measurements of the user channel of the serving and neighboring cells. The success rate of the entire HO procedure is significantly affected by the preparation stage. As massive Multiple-Input Multiple-Output (MIMO) systems with large antenna arrays allow resolving finer details of channel behavior, we investigate how machine learning can be applied to time series data of beam measurements in the Fifth Generation (5G) New Radio (NR) system to improve the HO procedure. This paper introduces the Early-Scheduled Handover Preparation scheme designed to enhance the robustness and efficiency of the HO procedure, particularly in scenarios involving high mobility and dense small cell deployments. Early-Scheduled Handover Preparation focuses on optimizing the timing of the HO preparation phase by leveraging machine learning techniques to predict the earliest possible trigger points for HO events. We identify a new early trigger for HO preparation and demonstrate how it can reduce the required time for HO execution, thereby reducing channel quality degradation. These insights enable a new HO preparation scheme that offers novel, user-aware, and proactive HO decision making in MIMO scenarios incorporating mobility.

IT Sep 13, 2021
Learning-Based UE Classification in Millimeter-Wave Cellular Systems With Mobility

Dino Pjanić, Alexandros Sopasakis, Harsh Tataria et al.

Millimeter-wave cellular communication requires beamforming procedures that enable alignment of the transmitter and receiver beams as the user equipment (UE) moves. For efficient beam tracking it is advantageous to classify users according to their traffic and mobility patterns. Research to date has demonstrated efficient ways of machine-learning-based UE classification. Although different machine learning approaches have shown success, most of them are based on physical layer attributes of the received signal. This, however, imposes additional complexity and requires access to those lower layer signals. In this paper, we show that traditional supervised and even unsupervised machine learning methods can successfully be applied to higher layer channel measurement reports in order to perform UE classification, thereby reducing the complexity of the classification process.

LG Nov 24, 2019
Latent space conditioning for improved classification and anomaly detection

Erik Norlander, Alexandros Sopasakis

We propose a new type of variational autoencoder to perform improved pre-processing for clustering and anomaly detection on data with a given label. Anomalies, however, are not known or labeled. We call our method conditional latent space variational autoencoder since it separates the latent space by conditioning on information within the data. The method fits one prior distribution to each class in the dataset, effectively expanding the prior distribution to include a Gaussian mixture model. Our approach is compared against the capabilities of a typical variational autoencoder by measuring their V-score during cluster formation with respect to the k-means and EM algorithms. For anomaly detection, we use a new metric composed of the mass-volume and excess-mass curves which can work in an unsupervised setting. We compare the results between established methods such as isolation forest, local outlier factor and one-class support vector machine.

NA Sep 9, 2005
Error analysis of coarse-grained kinetic Monte Carlo method

Markos A Katsoulakis, Petr Plechac, Alexandros Sopasakis

In this paper we investigate the approximation properties of the coarse-graining procedure applied to kinetic Monte Carlo simulations of lattice stochastic dynamics. We provide both analytical and numerical evidence that the hierarchy of the coarse models is built in a systematic way that allows for error control in both transient and long-time simulations. We demonstrate that the numerical accuracy of the CGMC algorithm as an approximation of stochastic lattice spin flip dynamics is of order two in terms of the coarse-graining ratio and that the natural small parameter is the coarse-graining ratio over the range of particle/particle interactions. The error estimate is shown to hold in the weak convergence sense. We employ the derived analytical results to guide CGMC algorithms and we demonstrate a CPU speed-up in demanding computational regimes that involve nucleation, phase transitions and metastability.
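For readers unfamiliar with the coarse-graining step, the sketch below shows one common way to map a microscopic lattice configuration onto coarse variables: summing the occupancies inside consecutive cells of `q` sites. It assumes a one-dimensional lattice with 0/1 occupancies and covers only the projection of a configuration, not the coarse-grained transition rates or the error analysis discussed in the abstract.

```python
def coarse_grain(sigma, q):
    """Map site occupancies (0 or 1) to per-cell particle counts.

    `sigma` is a 1-D lattice configuration; `q` is the number of
    microscopic sites per coarse cell (the coarse-graining ratio).
    """
    if q <= 0 or len(sigma) % q != 0:
        raise ValueError("lattice size must be a positive multiple of q")
    return [sum(sigma[i:i + q]) for i in range(0, len(sigma), q)]


# Eight microscopic sites grouped into two coarse cells of four sites each.
sigma = [1, 0, 1, 1, 0, 0, 1, 1]
eta = coarse_grain(sigma, 4)
```

Each coarse variable then takes values between 0 and `q`, and the total particle number is unchanged by the projection.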