AI | Sep 26, 2024 | Code
DREAMS: A Python framework for Training Deep Learning Models on EEG Data with Model Card Reporting for Medical Applications
Rabindra Khadka, Pedro G Lind, Anis Yazidi et al.
Electroencephalography (EEG) provides a non-invasive way to observe brain activity in real time. Deep learning has enhanced EEG analysis, enabling meaningful pattern detection for clinical and research purposes. However, most existing frameworks for EEG data analysis focus either on preprocessing techniques or on deep learning model development, often overlooking the crucial need for structured documentation and model interpretability. In this paper, we introduce DREAMS (Deep REport for AI ModelS), a Python-based framework designed to generate automated model cards for deep learning models applied to EEG data. Unlike generic model reporting tools, DREAMS is specifically tailored for EEG-based deep learning applications, incorporating domain-specific metadata, preprocessing details, performance metrics, and uncertainty quantification. The framework integrates with deep learning pipelines and provides structured YAML-based documentation. We evaluate DREAMS through two case studies: an EEG emotion classification task using the FACED dataset and an abnormal EEG classification task using the Temple University Hospital (TUH) Abnormal dataset. These evaluations demonstrate how the generated model card enhances transparency by documenting model performance, dataset biases, and interpretability limitations. Unlike existing model documentation approaches, DREAMS provides visualized performance metrics, dataset alignment details, and model uncertainty estimations, making it a valuable tool for researchers and clinicians working with EEG-based AI. DREAMS is open-source, facilitating broad adoption in healthcare AI, research, and ethical AI development.
ML | Oct 11, 2022
Combining datasets to increase the number of samples and improve model fitting
Thu Nguyen, Rabindra Khadka, Nhan Phan et al.
For many use cases, combining information from different datasets can be of interest to improve a machine learning model's performance, especially when the number of samples from at least one of the datasets is small. However, a potential challenge in such cases is that the features from these datasets are not identical, even though some features are shared among the datasets. To tackle this challenge, we propose a novel framework called Combine datasets based on Imputation (ComImp). In addition, we propose a variant of ComImp that uses Principal Component Analysis (PCA), PCA-ComImp, to reduce dimensionality before combining datasets. This is useful when the datasets have a large number of features that are not shared between them. Furthermore, our framework can also be utilized for data preprocessing by imputing missing data, i.e., filling in the missing entries while combining different datasets. To illustrate the power of the proposed methods and their potential uses, we conduct experiments on various tasks (regression and classification) and data types (tabular data and time series data), including cases where the datasets to be combined have missing data. We also investigate how the devised methods can be used with transfer learning to provide even further model training improvement. Our results indicate that the proposed methods are somewhat similar to transfer learning in that the merge can significantly improve the accuracy of a prediction model on smaller datasets. In addition, the methods can boost performance by a significant margin when combining small datasets together and can provide extra improvement when used with transfer learning.
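The combine-then-impute idea described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `combine_by_imputation`, the toy feature names, and the use of column-mean imputation are all assumptions made for the example (the abstract does not say which imputer ComImp uses).

```python
from statistics import mean


def combine_by_imputation(rows_a, rows_b):
    """Stack two datasets on the union of their features, then impute the gaps.

    Hypothetical helper: column-mean imputation is a stand-in imputer.
    """
    all_rows = rows_a + rows_b

    # Union of feature names, in first-seen order.
    features = []
    for row in all_rows:
        for name in row:
            if name not in features:
                features.append(name)

    # Stack the rows; a feature that a dataset lacks becomes None (missing).
    stacked = [{name: row.get(name) for name in features} for row in all_rows]

    # Fill each missing entry with the mean of the observed values in that column.
    for name in features:
        fill = mean(r[name] for r in stacked if r[name] is not None)
        for r in stacked:
            if r[name] is None:
                r[name] = fill
    return features, stacked


# Toy data: the two datasets share "age" but differ in their other feature.
small = [{"age": 30, "bmi": 22.0}, {"age": 50, "bmi": 28.0}]
large = [
    {"age": 40, "glucose": 5.0},
    {"age": 60, "glucose": 7.0},
    {"age": 20, "glucose": 6.0},
]
features, combined = combine_by_imputation(small, large)
```

After the call, `combined` has five rows over the three features, so a model can be trained on more samples than either dataset offers alone.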
AI | Jul 27, 2022
Towards the Neuroevolution of Low-level Artificial General Intelligence
Sidney Pontes-Filho, Kristoffer Olsen, Anis Yazidi et al.
In this work, we argue that the search for Artificial General Intelligence (AGI) should start from a much lower level than human-level intelligence. The circumstances of intelligent behavior in nature resulted from an organism interacting with its surrounding environment, which could change over time and exert pressure on the organism to allow for learning of new behaviors or environment models. Our hypothesis is that learning occurs through interpreting sensory feedback when an agent acts in an environment. For that to happen, a body and a reactive environment are needed. We evaluate Neuroevolution of Artificial General Intelligence (NAGI), a framework for low-level AGI that evolves a biologically-inspired artificial neural network which learns from environment reactions. This method allows the evolutionary complexification of a randomly-initialized spiking neural network with adaptive synapses, which controls agents instantiated in mutable environments. Such a configuration allows us to benchmark the adaptivity and generality of the controllers. The chosen tasks in the mutable environments are food foraging, emulation of logic gates, and cart-pole balancing. The three tasks are successfully solved with rather small network topologies, which opens up the possibility of experimenting with more complex tasks and scenarios where curriculum learning is beneficial.
AI | Apr 23, 2023
A Conceptual Algorithm for Applying Ethical Principles of AI to Medical Practice
Debesh Jha, Gorkem Durak, Vanshali Sharma et al.
Artificial Intelligence (AI) is poised to transform healthcare delivery through revolutionary advances in clinical decision support and diagnostic capabilities. While human expertise remains foundational to medical practice, AI-powered tools are increasingly matching or exceeding specialist-level performance across multiple domains, paving the way for a new era of democratized healthcare access. These systems promise to reduce disparities in care delivery across demographic, racial, and socioeconomic boundaries by providing high-quality diagnostic support at scale. As a result, advanced healthcare services can be affordable to all populations, irrespective of demographics, race, or socioeconomic background. The democratization of such AI tools can reduce the cost of care, optimize resource allocation, and improve the quality of care. In contrast to humans, AI can potentially uncover complex relationships in the data from a large set of inputs and lead to new evidence-based knowledge in medicine. However, integrating AI into healthcare raises several ethical and philosophical concerns, such as bias, transparency, autonomy, responsibility, and accountability. In this study, we examine recent advances in AI-enabled medical image analysis, current regulatory frameworks, and emerging best practices for clinical integration. We analyze both technical and ethical challenges inherent in deploying AI systems across healthcare institutions, with particular attention to data privacy, algorithmic fairness, and system transparency. Furthermore, we propose practical solutions to address key challenges, including data scarcity, racial bias in training datasets, limited model interpretability, and systematic algorithmic biases. Finally, we outline a conceptual algorithm for responsible AI implementations and identify promising future research and development directions.
LG | Jun 30, 2022
Deep Reinforcement Learning with Swin Transformers
Li Meng, Morten Goodwin, Anis Yazidi et al.
Transformers are neural network models that utilize multiple layers of self-attention heads and have exhibited enormous potential in natural language processing tasks. Meanwhile, there have been efforts to adapt transformers to visual tasks of machine learning, including Vision Transformers and Swin Transformers. Although some researchers use Vision Transformers for reinforcement learning tasks, their experiments remain at a small scale due to the high computational cost. This article presents the first online reinforcement learning scheme that is based on Swin Transformers: Swin DQN. In contrast to existing research, our novel approach demonstrates superior performance in experiments on 49 games in the Arcade Learning Environment. The results show that our approach achieves significantly higher maximal evaluation scores than the baseline method in 45 of the 49 games (92%), and higher mean evaluation scores than the baseline method in 40 of the 49 games (82%).
LG | Mar 2, 2022
Improving the Diversity of Bootstrapped DQN by Replacing Priors With Noise
Li Meng, Morten Goodwin, Anis Yazidi et al.
Q-learning is one of the most well-known Reinforcement Learning algorithms. There have been tremendous efforts to develop this algorithm using neural networks. Bootstrapped Deep Q-Learning Network is amongst them. It utilizes multiple neural network heads to introduce diversity into Q-learning. Diversity can sometimes be viewed as the number of reasonable moves an agent can take at a given state, analogous to the definition of the exploration ratio in RL. Thus, the performance of Bootstrapped Deep Q-Learning Network is deeply connected with the level of diversity within the algorithm. In the original research, it was pointed out that a random prior could improve the performance of the model. In this article, we further explore the possibility of replacing priors with noise, sampling the noise from a Gaussian distribution to introduce more diversity into this algorithm. We conduct our experiment on the Atari benchmark and compare our algorithm to both the original and other related algorithms. The results show that our modification of the Bootstrapped Deep Q-Learning algorithm achieves significantly higher evaluation scores across different types of Atari games. Thus, we conclude that replacing priors with noise can improve Bootstrapped Deep Q-Learning's performance by ensuring the integrity of diversities.
GT | Mar 28, 2022
Adaptive Learning with Artificial Barriers Yielding Nash Equilibria in General Games
Ismail Hassan, B. John Oommen, Anis Yazidi
Artificial barriers in Learning Automata (LA) are a powerful yet under-explored concept, although they were first proposed in the 1980s. Introducing artificial non-absorbing barriers makes LA schemes resilient to being trapped in absorbing barriers, a phenomenon often referred to as lock-in probability, which leads to an exclusive choice of one action after convergence. Within the field of LA, and reinforcement learning in general, there is a scarcity of theoretical works and applications of schemes with artificial barriers. In this paper, we devise an LA with artificial barriers for solving a general form of stochastic bimatrix game. Classical LA systems possess absorbing barriers; they are a powerful tool in game theory and were shown to converge to the game's Nash equilibrium under limited information. However, the stream of works in LA for solving game theoretical problems can merely solve the case where the Saddle Point of the game exists in a pure strategy, and fails to reach a mixed Nash equilibrium when no Saddle Point exists for a pure strategy. In this paper, by resorting to the powerful concept of artificial barriers, we suggest an LA that converges to an optimal mixed Nash equilibrium even though there may be no Saddle Point when a pure strategy is invoked. Our scheme is of the Linear Reward-Inaction ($L_{R-I}$) flavor, which is originally an absorbing LA scheme; however, we render it non-absorbing by introducing artificial barriers in an elegant and natural manner, in the sense that the well-known legacy $L_{R-I}$ scheme can be seen as an instance of our proposed algorithm for a particular choice of the barrier. Furthermore, we present an $S$-Learning version of our LA with absorbing barriers that is able to handle $S$-Learning environments, in which the feedback is continuous and not binary as in the case of $L_{R-I}$.
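To make the reward-inaction update concrete, here is a small two-action sketch. The classic $L_{R-I}$ rule (move the chosen action's probability toward 1 on reward, do nothing on penalty) is standard; the way the barrier enters below, as a cap `p_max` that replaces the target 1, is an assumption chosen only because it recovers the legacy scheme at `p_max = 1`, as the abstract describes. The names `lri_barrier_update`, `simulate`, `theta`, and `p_max`, and the stationary environment in `simulate`, are hypothetical and may differ from the paper's game setting.

```python
import random


def lri_barrier_update(p1, action, rewarded, theta=0.1, p_max=1.0):
    """One reward-inaction step for a two-action automaton.

    p1 is the probability of action 0; the new value of p1 is returned.
    Assumption: the barrier caps either action probability at p_max.
    """
    if not rewarded:
        return p1  # "inaction": a penalty leaves the probabilities unchanged
    if action == 0:
        return p1 + theta * (p_max - p1)  # move p1 toward the barrier
    p2 = 1.0 - p1
    p2 = p2 + theta * (p_max - p2)  # move p2 toward the barrier
    return 1.0 - p2


def simulate(reward_probs, steps=5000, theta=0.05, p_max=0.99, seed=0):
    """Run the automaton in a toy stationary environment (not a game)."""
    rng = random.Random(seed)
    p1 = 0.5
    for _ in range(steps):
        action = 0 if rng.random() < p1 else 1
        rewarded = rng.random() < reward_probs[action]
        p1 = lri_barrier_update(p1, action, rewarded, theta, p_max)
    return p1


final_p1 = simulate(reward_probs=(0.8, 0.4))
```

With `p_max < 1` neither action probability can reach 0 or 1, so the automaton in this sketch never locks in to a single action.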
LG | Mar 13, 2023
Unsupervised Representation Learning in Partially Observable Atari Games
Li Meng, Morten Goodwin, Anis Yazidi et al.
State representation learning aims to capture latent factors of an environment. Contrastive methods have performed better than generative models in previous state representation learning research. Although some researchers realize the connections between masked image modeling and contrastive representation learning, the effort is focused on using masks as an augmentation technique to represent the latent generative factors better. Partially observable environments in reinforcement learning have not yet been carefully studied using unsupervised state representation learning methods. In this article, we create an unsupervised state representation learning scheme for partially observable states. We conducted our experiment on a previous Atari 2600 framework designed to evaluate representation learning models. A contrastive method called Spatiotemporal DeepInfomax (ST-DIM) has shown state-of-the-art performance on this benchmark but remains inferior to its supervised counterpart. Our approach improves ST-DIM when the environment is not fully observable and achieves higher F1 scores and accuracy scores than the supervised learning counterpart. The mean accuracy score averaged over categories of our approach is ~66%, compared to ~38% for supervised learning. The mean F1 score is ~64%, compared to ~33%.
AI | Oct 3, 2023
Generalized Convergence Analysis of Tsetlin Machines: A Probabilistic Approach to Concept Learning
Mohamed-Bachir Belaid, Jivitesh Sharma, Lei Jiao et al.
Tsetlin Machines (TMs) have garnered increasing interest for their ability to learn concepts via propositional formulas and their proven efficiency across various application domains. Despite this, the convergence proof for the TMs, particularly for the AND operator (conjunction of literals), in the generalized case (inputs greater than two bits) remains an open problem. This paper aims to fill this gap by presenting a comprehensive convergence analysis of Tsetlin automaton-based Machine Learning algorithms. We introduce a novel framework, referred to as Probabilistic Concept Learning (PCL), which simplifies the TM structure while incorporating dedicated feedback mechanisms and dedicated inclusion/exclusion probabilities for literals. Given $n$ features, PCL aims to learn a set of conjunction clauses $C_i$ each associated with a distinct inclusion probability $p_i$. Most importantly, we establish a theoretical proof confirming that, for any clause $C_k$, PCL converges to a conjunction of literals when $0.5<p_k<1$. This result serves as a stepping stone for future research on the convergence properties of Tsetlin automaton-based learning algorithms. Our findings not only contribute to the theoretical understanding of Tsetlin Machines but also have implications for their practical application, potentially leading to more robust and interpretable machine learning models.
CL | Oct 31, 2025
BiSparse-AAS: Bilinear Sparse Attention and Adaptive Spans Framework for Scalable and Efficient Text Summarization
Desta Haileselassie Hagos, Legand L. Burge, Anietie Andy et al.
Transformer-based architectures have advanced text summarization, yet their quadratic complexity limits scalability on long documents. This paper introduces BiSparse-AAS (Bilinear Sparse Attention with Adaptive Spans), a novel framework that combines sparse attention, adaptive spans, and bilinear attention to address these limitations. Sparse attention reduces computational costs by focusing on the most relevant parts of the input, while adaptive spans dynamically adjust the attention ranges. Bilinear attention complements both by modeling complex token interactions within this refined context. BiSparse-AAS consistently outperforms state-of-the-art baselines in both extractive and abstractive summarization tasks, achieving average ROUGE improvements of about 68.1% on CNN/DailyMail and 52.6% on XSum, while maintaining strong performance on OpenWebText and Gigaword datasets. By addressing efficiency, scalability, and long-sequence modeling, BiSparse-AAS provides a unified, practical solution for real-world text summarization applications.
LG | May 10, 2025 | Code
Learning Graph Representation of Agent Diffusers
Youcef Djenouri, Nassim Belmecheri, Tomasz Michalak et al.
Diffusion-based generative models have significantly advanced text-to-image synthesis, demonstrating impressive text comprehension and zero-shot generalization. These models refine images from random noise based on textual prompts, with initial reliance on text input shifting towards enhanced visual fidelity over time. This transition suggests that static model parameters might not optimally address the distinct phases of generation. We introduce LGR-AD (Learning Graph Representation of Agent Diffusers), a novel multi-agent system designed to improve adaptability in dynamic computer vision tasks. LGR-AD models the generation process as a distributed system of interacting agents, each representing an expert sub-model. These agents dynamically adapt to varying conditions and collaborate through a graph neural network that encodes their relationships and performance metrics. Our approach employs a coordination mechanism based on top-$k$ maximum spanning trees, optimizing the generation process. Each agent's decision-making is guided by a meta-model that minimizes a novel loss function, balancing accuracy and diversity. Theoretical analysis and extensive empirical evaluations show that LGR-AD outperforms traditional diffusion models across various benchmarks, highlighting its potential for scalable and flexible solutions in complex image generation tasks. Code is available at: https://github.com/YousIA/LGR_AD
CV | Feb 1, 2024
A Manifold Representation of the Key in Vision Transformers
Li Meng, Morten Goodwin, Anis Yazidi et al.
Vision Transformers implement multi-head self-attention via stacking multiple attention blocks. The query, key, and value are often intertwined and generated within those blocks via a single, shared linear transformation. This paper explores the concept of disentangling the key from the query and value, and adopting a manifold representation for the key. Our experiments reveal that decoupling and endowing the key with a manifold structure can enhance the model's performance. Specifically, ViT-B exhibits a 0.87% increase in top-1 accuracy, while Swin-T sees a boost of 0.52% in top-1 accuracy on the ImageNet-1K dataset, with eight charts in the manifold key. Our approach also yields positive results in object detection and instance segmentation tasks on the COCO dataset. We establish that these performance gains are not merely due to the simplicity of adding more parameters and computations. Future research may investigate strategies for cutting the budget of such representations and aim for further performance improvements based on our findings.
NC | Aug 25, 2025
Saccade crossing avoidance as a visual search strategy
Alex Szorkovszky, Rujeena Mathema, Pedro Lencastre et al.
Although visual search appears largely random, several oculomotor biases exist such that the likelihoods of saccade directions and lengths depend on the previous scan path. Compared to the most recent fixations, the impact of the longer path history is more difficult to quantify. Using the step-selection framework commonly used in movement ecology, and analyzing data from 45-second viewings of "Where's Waldo?", we report a new memory-dependent effect that also varies significantly between individuals, which we term self-crossing avoidance. This is a tendency for saccades to avoid crossing those earlier in the scan path, and is most evident when both have small amplitudes. We show this by comparing real data to synthetic data generated from a memoryless approximation of the spatial statistics (i.e. a Markovian nonparametric model with a matching distribution of saccade lengths over time). Maximum likelihood fitting indicates that this effect is strongest when including the last $\approx 7$ seconds of a scan path. The effect size is comparable to well-known forms of history dependence such as inhibition of return. A parametric probabilistic model including a self-crossing penalty term was able to reproduce joint statistics of saccade lengths and self-crossings. We also quantified individual strategic differences, and their consistency over the six images viewed per participant, using mixed-effect regressions. Participants with a higher tendency to avoid crossings displayed smaller saccade lengths and shorter fixation durations on average, but did not display more horizontal, vertical, forward or reverse saccades. Together, these results indicate that the avoidance of crossings is a local orienting strategy that facilitates and complements inhibition of return, and hence exploration of visual scenes.
SP | Aug 18, 2025
EEG-MSAF: An Interpretable Microstate Framework uncovers Default-Mode Decoherence in Early Neurodegeneration
Mohammad Mehedi Hasan, Pedro G. Lind, Hernando Ombao et al.
Dementia (DEM) is a growing global health challenge, underscoring the need for early and accurate diagnosis. Electroencephalography (EEG) provides a non-invasive window into brain activity, but conventional methods struggle to capture its transient complexity. We present the EEG Microstate Analysis Framework (EEG-MSAF), an end-to-end pipeline that leverages EEG microstates (discrete, quasi-stable topographies) to identify DEM-related biomarkers and distinguish DEM, mild cognitive impairment (MCI), and normal cognition (NC). EEG-MSAF comprises three stages: (1) automated microstate feature extraction, (2) classification with machine learning (ML), and (3) feature ranking using Shapley Additive Explanations (SHAP) to highlight key biomarkers. We evaluate on two EEG datasets: the public Chung-Ang University EEG (CAUEEG) dataset and a clinical cohort from Thessaloniki Hospital. Our framework demonstrates strong performance and generalizability. On CAUEEG, EEG-MSAF-SVM achieves 89% $\pm$ 0.01 accuracy, surpassing the deep learning baseline CEEDNET by 19.3%. On the Thessaloniki dataset, it reaches 95% $\pm$ 0.01 accuracy, comparable to EEGConvNeXt. SHAP analysis identifies mean correlation and occurrence as the most informative metrics: disruption of microstate C (salience/attention network) dominates DEM prediction, while microstate F, a novel default-mode pattern, emerges as a key early biomarker for both MCI and DEM. By combining accuracy, generalizability, and interpretability, EEG-MSAF advances EEG-based dementia diagnosis and sheds light on brain dynamics across the cognitive spectrum.
CV | Jul 4, 2025
From Video to EEG: Adapting Joint Embedding Predictive Architecture to Uncover Visual Concepts in Brain Signal Analysis
Amirabbas Hojjati, Lu Li, Ibrahim Hameed et al.
EEG signals capture brain activity with high temporal and low spatial resolution, supporting applications such as neurological diagnosis, cognitive monitoring, and brain-computer interfaces. However, effective analysis is hindered by limited labeled data, high dimensionality, and the absence of scalable models that fully capture spatiotemporal dependencies. Existing self-supervised learning (SSL) methods often focus on either spatial or temporal features, leading to suboptimal representations. To this end, we propose EEG-VJEPA, a novel adaptation of the Video Joint Embedding Predictive Architecture (V-JEPA) for EEG classification. By treating EEG as video-like sequences, EEG-VJEPA learns semantically meaningful spatiotemporal representations using joint embeddings and adaptive masking. To our knowledge, this is the first work that exploits V-JEPA for EEG classification and explores the visual concepts learned by the model. Evaluations on the publicly available Temple University Hospital (TUH) Abnormal EEG dataset show that EEG-VJEPA outperforms existing state-of-the-art models in classification accuracy. Beyond classification accuracy, EEG-VJEPA captures physiologically relevant spatial and temporal signal patterns, offering interpretable embeddings that may support human-AI collaboration in diagnostic workflows. These findings position EEG-VJEPA as a promising framework for scalable, trustworthy EEG analysis in real-world clinical settings.
LG | Jun 30, 2025
Examining Reject Relations in Stimulus Equivalence Simulations
Alexis Carrillo, Asieh Abolpour Mofrad, Anis Yazidi et al.
Simulations offer a valuable tool for exploring stimulus equivalence (SE), yet the potential of reject relations to disrupt the assessment of equivalence class formation is contentious. This study investigates the role of reject relations in the acquisition of stimulus equivalence using computational models. We examined feedforward neural networks (FFNs), bidirectional encoder representations from transformers (BERT), and generative pre-trained transformers (GPT) across 18 conditions in matching-to-sample (MTS) simulations. Conditions varied in training structure (linear series, one-to-many, and many-to-one), relation type (select-only, reject-only, and select-reject), and negative comparison selection (standard and biased). A probabilistic agent served as a benchmark, embodying purely associative learning. The primary goal was to determine whether artificial neural networks could demonstrate equivalence class formation or whether their performance reflected associative learning. Results showed that reject relations influenced agent performance. While some agents achieved high accuracy on equivalence tests, particularly with reject relations and biased negative comparisons, this performance was comparable to the probabilistic agent. These findings suggest that artificial neural networks, including transformer models, may rely on associative strategies rather than SE. This underscores the need for careful consideration of reject relations and more stringent criteria in computational models of equivalence.
NE | Dec 5, 2024
Modeling Eye Gaze Velocity Trajectories using GANs with Spectral Loss for Enhanced Fidelity
Shailendra Bhandari, Pedro Lencastre, Rujeena Mathema et al.
Accurate modeling of eye gaze dynamics is essential for advancement in human-computer interaction, neurological diagnostics, and cognitive research. Traditional generative models like Markov models often fail to capture the complex temporal dependencies and distributional nuance inherent in eye gaze trajectory data. This study introduces a GAN framework employing LSTM and CNN generators and discriminators to generate high-fidelity synthetic eye gaze velocity trajectories. We conducted a comprehensive evaluation of four GAN architectures (CNN-CNN, LSTM-CNN, CNN-LSTM, and LSTM-LSTM), each trained under two conditions: using only adversarial loss and using a weighted combination of adversarial and spectral losses. Our findings reveal that the LSTM-CNN architecture trained with this new loss function exhibits the closest alignment to the real data distribution, effectively capturing both the distribution tails and the intricate temporal dependencies. The inclusion of spectral regularization significantly enhances the GAN's ability to replicate the spectral characteristics of eye gaze movements, leading to a more stable learning process and improved data fidelity. Comparative analysis with an HMM optimized to four hidden states further highlights the advantages of the LSTM-CNN GAN. Statistical metrics show that the HMM-generated data significantly diverges from the real data in terms of mean, standard deviation, skewness, and kurtosis. In contrast, the LSTM-CNN model closely matches the real data across these statistics, affirming its capacity to model the complexity of eye gaze dynamics effectively. These results position the spectrally regularized LSTM-CNN GAN as a robust tool for generating synthetic eye gaze velocity data with high fidelity.
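The "weighted combination of adversarial and spectral losses" can be illustrated with a short sketch. The abstract does not give the exact form of the spectral term, so the version below (mean squared difference of log power spectra) is an assumption, as are the names `power_spectrum`, `spectral_loss`, `generator_loss`, and the weight value. A plain DFT is used to keep the example dependency-free; both signals are assumed to have the same length.

```python
import cmath
import math


def power_spectrum(signal):
    """Power at each non-negative frequency bin, via a plain O(n^2) DFT."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def spectral_loss(real, fake, eps=1e-8):
    """Assumed form: mean squared difference between log power spectra."""
    ps_real, ps_fake = power_spectrum(real), power_spectrum(fake)
    diffs = [
        (math.log(a + eps) - math.log(b + eps)) ** 2
        for a, b in zip(ps_real, ps_fake)
    ]
    return sum(diffs) / len(diffs)


def generator_loss(adversarial_loss, real, fake, weight=0.1):
    """Weighted sum of the adversarial term and the spectral term."""
    return adversarial_loss + weight * spectral_loss(real, fake)


# Toy velocity traces: same amplitude, different dominant frequency.
slow = [math.sin(2 * math.pi * t / 16) for t in range(32)]
fast = [math.sin(2 * math.pi * 4 * t / 16) for t in range(32)]
mismatch = spectral_loss(slow, fast)
```

The spectral term is zero when the generated trace has the same power spectrum as the real one and grows as their frequency content diverges, which is the property the regularizer relies on.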
LG | May 22, 2024
Maximum Manifold Capacity Representations in State Representation Learning
Li Meng, Morten Goodwin, Anis Yazidi et al.
The expanding research on manifold-based self-supervised learning (SSL) builds on the manifold hypothesis, which suggests that the inherent complexity of high-dimensional data can be unraveled through lower-dimensional manifold embeddings. Capitalizing on this, DeepInfomax with an unbalanced atlas (DIM-UA) has emerged as a powerful tool and yielded impressive results for state representations in reinforcement learning. Meanwhile, Maximum Manifold Capacity Representation (MMCR) presents a new frontier for SSL by optimizing class separability via manifold compression. However, MMCR demands extensive input views, resulting in significant computational costs and protracted pre-training durations. Bridging this gap, we present an innovative integration of MMCR into existing SSL methods, incorporating a discerning regularization strategy that enhances the lower bound of mutual information. We also propose a novel state representation learning method extending DIM-UA, embedding a nuclear norm loss to enforce manifold consistency robustly. On experimentation with the Atari Annotated RAM Interface, our method improves DIM-UA significantly with the same number of target encoding dimensions. The mean F1 score averaged over categories is 78% compared to 75% of DIM-UA. There are also compelling gains when implementing SimCLR and Barlow Twins. This supports our SSL innovation as a paradigm shift, enabling more nuanced high-dimensional data representations.
LG | May 17, 2023
State Representation Learning Using an Unbalanced Atlas
Li Meng, Morten Goodwin, Anis Yazidi et al.
The manifold hypothesis posits that high-dimensional data often lies on a lower-dimensional manifold and that utilizing this manifold as the target space yields more efficient representations. While numerous traditional manifold-based techniques exist for dimensionality reduction, their application in self-supervised learning has witnessed slow progress. The recent MSimCLR method combines manifold encoding with SimCLR but requires extremely low target encoding dimensions to outperform SimCLR, limiting its applicability. This paper introduces a novel learning paradigm using an unbalanced atlas (UA), capable of surpassing state-of-the-art self-supervised learning approaches. We investigated and engineered the DeepInfomax with an unbalanced atlas (DIM-UA) method by adapting the Spatiotemporal DeepInfomax (ST-DIM) framework to align with our proposed UA paradigm. The efficacy of DIM-UA is demonstrated through training and evaluation on the Atari Annotated RAM Interface (AtariARI) benchmark, a modified version of the Atari 2600 framework that produces annotated image samples for representation learning. The UA paradigm improves existing algorithms significantly as the number of target encoding dimensions grows. For instance, the mean F1 score averaged over categories of DIM-UA is ~75% compared to ~70% of ST-DIM when using 16384 hidden units.
LG | Sep 2, 2021
Artificial Intelligence in Dry Eye Disease
Andrea M. Storås, Inga Strümke, Michael A. Riegler et al.
Dry eye disease (DED) has a prevalence of between 5 and 50%, depending on the diagnostic criteria used and population under study. However, it remains one of the most underdiagnosed and undertreated conditions in ophthalmology. Many tests used in the diagnosis of DED rely on an experienced observer for image interpretation, which may be considered subjective and result in variation in diagnosis. Since artificial intelligence (AI) systems are capable of advanced problem solving, use of such techniques could lead to more objective diagnosis. Although the term 'AI' is commonly used, recent success in its applications to medicine is mainly due to advancements in the sub-field of machine learning, which has been used to automatically classify images and predict medical outcomes. Powerful machine learning techniques have been harnessed to understand nuances in patient data and medical images, aiming for consistent diagnosis and stratification of disease severity. This is the first literature review on the use of AI in DED. We provide a brief introduction to AI, report its current use in DED research and its potential for application in the clinic. Our review found that AI has been employed in a wide range of DED clinical tests and research applications, primarily for interpretation of interferometry, slit-lamp and meibography images. While initial results are promising, much work is still needed on model development, clinical testing and standardisation.
LG Jun 28, 2021
Expert Q-learning: Deep Reinforcement Learning with Coarse State Values from Offline Expert Examples
Li Meng, Anis Yazidi, Morten Goodwin et al.
In this article, we propose a novel algorithm for deep reinforcement learning named Expert Q-learning. Expert Q-learning is inspired by Dueling Q-learning and aims at incorporating semi-supervised learning into reinforcement learning by splitting Q-values into state values and action advantages. We require that an offline expert assesses the value of a state in a coarse manner using three discrete values. An expert network is designed in addition to the Q-network and is updated after each regular offline minibatch update whenever the expert example buffer is not empty. Using the board game Othello, we compare our algorithm with the baseline Q-learning algorithm, which is a combination of Double Q-learning and Dueling Q-learning. Our results show that Expert Q-learning is indeed useful and more resistant to the overestimation bias. The baseline Q-learning algorithm exhibits unstable and suboptimal behavior in non-deterministic settings, whereas Expert Q-learning demonstrates more robust performance with higher scores, illustrating that our algorithm is indeed suitable for integrating state values from expert examples into Q-learning.
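The split of Q-values into a state value and action advantages can be illustrated with a small sketch. It uses the mean-subtracted aggregation common in dueling architectures, which is an assumption here since the abstract does not give the exact formula; the function name, the advantage numbers, and the coarse expert labels (-1, 0, +1) are all hypothetical.

```python
from statistics import mean


def dueling_q_values(state_value, advantages):
    """Combine a scalar state value V(s) with per-action advantages A(s, a).

    Assumed aggregation (standard in dueling networks, not confirmed for
    Expert Q-learning): Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    """
    baseline = mean(advantages)
    return [state_value + a - baseline for a in advantages]


# Hypothetical coarse expert label: -1 (losing), 0 (even), +1 (winning).
expert_value = 1.0
# Hypothetical advantages for three legal moves.
advantages = [0.5, -0.5, 0.0]

q_values = dueling_q_values(expert_value, advantages)
print(q_values)
```

With this aggregation the mean of the Q-values equals the state value, so a coarse state label constrains the overall level of the Q-values while the advantages only rank the actions.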
LG May 14, 2021
DoS and DDoS Mitigation Using Variational Autoencoders
Eirik Molde Bårli, Anis Yazidi, Enrique Herrera Viedma et al.
DoS and DDoS attacks have been growing in size and number over the last decade, and existing solutions to mitigate these attacks are in general inefficient. Compared to other types of malicious cyber attacks, DoS and DDoS attacks are particularly challenging to combat. Because they can mask themselves as legitimate traffic, developing methods to detect these types of attacks on a packet or flow level has proven to be a difficult task. In this paper, we explore the potential of Variational Autoencoders to serve as a component within an intelligent security solution that differentiates between normal and malicious traffic. Two methods based on the ability of Variational Autoencoders to learn latent representations from network traffic flows are proposed. The first method resorts to a classifier based on the latent encodings obtained from Variational Autoencoders learned from traffic traces. The second method is an anomaly detection method in which the Variational Autoencoder is used to learn the abstract feature representations of exclusively legitimate traffic. Anomalies are then filtered out by relying on the reconstruction loss of the Variational Autoencoder. Both of the proposed methods have been thoroughly tested on two separate datasets with a similar feature space. The results show that both methods are promising, with a slight superiority of the classifier-based method over the anomaly-based one.
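The second method's filtering step, flagging flows by reconstruction loss, can be sketched as a simple threshold rule. This is only an illustration under stated assumptions: the quantile-based threshold, the function names, and the error values are invented for the example, and no actual Variational Autoencoder is trained here.

```python
def fit_threshold(benign_errors, quantile=0.9):
    """Pick a threshold as an empirical quantile of reconstruction errors
    measured on legitimate traffic (assumed thresholding strategy)."""
    ordered = sorted(benign_errors)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def flag_anomalies(errors, threshold):
    """Mark a flow as anomalous when its reconstruction error exceeds the threshold."""
    return [error > threshold for error in errors]


# Made-up reconstruction errors for flows known to be legitimate.
benign_errors = [0.10, 0.12, 0.09, 0.11, 0.13, 0.10, 0.08, 0.12, 0.11, 0.10]
threshold = fit_threshold(benign_errors)

# Made-up reconstruction errors for unseen flows.
new_errors = [0.10, 0.45, 0.12, 0.90]
flags = flag_anomalies(new_errors, threshold)
print(threshold, flags)
```

The idea is that a model trained only on legitimate traffic reconstructs such traffic well, so unusually large errors suggest flows unlike anything seen during training.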
CV Jan 6, 2021
LightLayers: Parameter Efficient Dense and Convolutional Layers for Image Classification
Debesh Jha, Anis Yazidi, Michael A. Riegler et al.
Deep Neural Networks (DNNs) have become the de facto standard in computer vision, as well as in many other pattern recognition tasks. A key drawback of DNNs is that the training phase can be very computationally expensive. Organizations or individuals that cannot afford to purchase state-of-the-art hardware or tap into cloud-hosted infrastructures may face a long waiting time before the training completes, or might not be able to train a model at all. Investigating novel ways to reduce the training time could be a potential solution to alleviate this drawback, thus enabling more rapid development of new algorithms and models. In this paper, we propose LightLayers, a method for reducing the number of trainable parameters in DNNs. The proposed LightLayers consists of LightDense and LightConv2D layers that are as efficient as regular Conv2D and Dense layers, but use fewer parameters. We resort to Matrix Factorization to reduce the complexity of the DNN models, resulting in lightweight DNN models that require less computational power, without much loss in accuracy. We have tested LightLayers on the MNIST, Fashion MNIST, CIFAR-10, and CIFAR-100 datasets. Promising results are obtained for the MNIST, Fashion MNIST, and CIFAR-10 datasets, whereas CIFAR-100 shows acceptable performance while using fewer parameters.
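To see why matrix factorization reduces the parameter count, consider replacing a dense weight matrix of shape (n_in, n_out) with the product of two smaller matrices of shapes (n_in, rank) and (rank, n_out). The sketch below only does the arithmetic; the low-rank form, the function names, and the layer sizes are assumptions for illustration and may differ from the factorization actually used in LightLayers. Biases are ignored.

```python
def dense_params(n_in, n_out):
    """Weights in a regular dense layer (bias excluded)."""
    return n_in * n_out


def factorized_params(n_in, n_out, rank):
    """Weights when W is replaced by U @ V with U: (n_in, rank), V: (rank, n_out)."""
    return rank * (n_in + n_out)


# Hypothetical layer: 784 inputs (a flattened 28x28 image), 128 units, rank 8.
full = dense_params(784, 128)
light = factorized_params(784, 128, rank=8)
print(full, light, round(full / light, 1))
```

The saving grows as the rank shrinks relative to the layer dimensions, at the cost of restricting the weight matrix to low rank.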
ME Apr 27, 2020
Efficient Quantile Tracking Using an Oracle
Hugo L. Hammer, Anis Yazidi, Michael A. Riegler et al.
For incremental quantile estimators, the step size and possibly other tuning parameters must be carefully set. However, little attention has been given to how to set these values in an online manner. In this article we suggest two novel procedures that address this issue. The core part of the procedures is to estimate the current tracking mean squared error (MSE). The MSE is decomposed into tracking variance and bias, and novel and efficient procedures to estimate these quantities are presented. It is shown that estimation bias can be tracked by associating it with the portion of observations below the quantile estimates. The first procedure runs an ensemble of $L$ quantile estimators for a wide range of values of the tuning parameters, typically with around $L = 100$. In each iteration an oracle selects the best estimate, guided by the estimated MSEs. The second method only runs an ensemble of $L = 3$ estimators, and thus the values of the tuning parameters of the running estimators need to be adjusted from time to time. The procedures have a low memory footprint of $8L$ and a computational complexity of $8L$ per iteration. The experiments show that the procedures are highly efficient and track quantiles with an error close to the theoretical optimum. The Oracle approach performs best, but comes with a higher computational cost. The procedures were further applied to a massive real-life data stream of tweets, which demonstrated their real-world applicability.
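For readers unfamiliar with incremental quantile estimators, the sketch below shows a generic stochastic-approximation update with a fixed step size, together with the running portion of observations that fall below the estimate. It is not the oracle procedure from the article; the update rule, function names, step size, and synthetic uniform stream are assumptions chosen for illustration.

```python
import random


def update_quantile(estimate, observation, tau, step):
    """One incremental update: move up by step * tau when the observation lies
    above the estimate, and down by step * (1 - tau) otherwise."""
    indicator = 1.0 if observation <= estimate else 0.0
    return estimate + step * (tau - indicator)


def track_quantile(stream, tau, step, start=0.0):
    """Track the tau-quantile of a stream and report the portion of
    observations that fell at or below the running estimate."""
    estimate = start
    below = 0
    count = 0
    for x in stream:
        below += x <= estimate
        count += 1
        estimate = update_quantile(estimate, x, tau, step)
    return estimate, below / count


# Synthetic stream: uniform on [0, 1), whose true median is 0.5.
rng = random.Random(0)
stream = [rng.random() for _ in range(20000)]
median_estimate, fraction_below = track_quantile(stream, tau=0.5, step=0.01)
print(median_estimate, fraction_below)
```

A larger step reacts faster to changes in the stream but fluctuates more around the target, which is the trade-off that motivates tuning the step size online.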
SP Jan 22, 2020
A hemodynamic decomposition model for detecting cognitive load using functional near-infrared spectroscopy
Marco A. Pinto-Orellana, Diego C. Nascimento, Peyman Mirtaheri et al.
In the current paper, we introduce a parametric data-driven model for functional near-infrared spectroscopy (fNIRS) that decomposes a signal into a series of independent, rescaled, time-shifted, hemodynamic basis functions. Each decomposed waveform retains relevant biological information about the expected hemodynamic behavior. The model is presented along with an efficient iterative estimation method to improve the computational speed. Our hemodynamic decomposition model (HDM) extends the canonical model for instances when a) the external stimuli are unknown, or b) the assumption of a direct relationship between the experimental stimuli and the hemodynamic responses cannot hold. We also argue that the proposed approach can potentially be adopted as a feature transformation method for machine learning purposes. By applying our HDM to a cognitive load classification task on fNIRS signals, we have achieved an accuracy of 86.20% ± 2.56% using six channels in the frontal cortex, and 86.34% ± 2.81% using only the AFpz channel, also located in the frontal area. In comparison, state-of-the-art time-spectral transformations only yield 64.61% ± 3.03% and 37.8% ± 2.96% under identical experimental settings.
NE Jul 3, 2019
A general representation of dynamical systems for reservoir computing
Sidney Pontes-Filho, Anis Yazidi, Jianhua Zhang et al.
Dynamical systems are capable of performing computation in a reservoir computing paradigm. This paper presents a general representation of these systems as an artificial neural network (ANN). Initially, we implement the simplest dynamical system, a cellular automaton. The mathematical fundamentals behind an ANN are maintained, but the weights of the connections and the activation function are adjusted to work as an update rule in the context of cellular automata. The advantages of such an implementation are that it can run on specialized and optimized deep learning libraries, that it can be generalized to other types of networks, and that cellular automata and other dynamical systems can be evolved in terms of connectivity, update and learning rules. Our implementation of cellular automata constitutes an initial step towards a general framework for dynamical systems. It aims to evolve such systems to optimize their usage in reservoir computing and to model physical computing substrates.
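One way to picture a cellular automaton expressed as a neural-network-style layer is sketched below for an elementary (one-dimensional, binary) automaton. The weights (4, 2, 1) turn each three-cell neighbourhood into an index from 0 to 7, and the "activation" looks up the corresponding bit of the rule number. This encoding, the function name, and the periodic boundary are assumptions for illustration and may differ from the representation used in the paper.

```python
def ca_step(cells, rule=110):
    """Apply one update of an elementary cellular automaton.

    Each cell computes a weighted sum of (left, centre, right) with weights
    (4, 2, 1), like a tiny 1-D convolution, and the activation function is a
    lookup of that sum in the 8-bit rule table. Boundaries wrap around.
    """
    weights = (4, 2, 1)
    n = len(cells)
    next_cells = []
    for i in range(n):
        pre_activation = (
            weights[0] * cells[(i - 1) % n]
            + weights[1] * cells[i]
            + weights[2] * cells[(i + 1) % n]
        )
        # Activation: bit number `pre_activation` of the rule.
        next_cells.append((rule >> pre_activation) & 1)
    return next_cells


state = [0, 0, 0, 1, 0]
print(ca_step(state, rule=110))
print(ca_step([0, 0, 1, 0, 0], rule=90))
```

Changing the rule number changes only the activation lookup, while changing the weights or the neighbourhood changes the connectivity, which hints at how both could be subject to evolution.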
NE Aug 31, 2018
Autonomous Configuration of Network Parameters in Operating Systems using Evolutionary Algorithms
Bartosz Gembala, Anis Yazidi, Hårek Haugerud et al.
By default, the Linux network stack is not configured for high-speed large file transfer, in order to save memory resources. It is possible to tune the Linux network stack by increasing the network buffer sizes for high-speed networks that connect server systems, so that more network packets can be handled. However, there are also several other TCP/IP parameters that can be tuned in an Operating System (OS). In this paper, we leverage Genetic Algorithms (GAs) to devise a system which learns from the history of the network traffic and uses this knowledge to optimize the current performance by adjusting the parameters. This can be done for a standard Linux kernel using sysctl or /proc. For a Virtual Machine (VM), virtually any type of OS can be installed and an image can swiftly be compiled and deployed. Because a VM is a sandboxed environment, risky configurations can be tested without the danger of harming the system. Different scenarios for network parameter configurations are thoroughly tested, and a throughput increase of up to 65% is achieved compared to the default Linux configuration.
NE Jul 12, 2018
Achieving Connectivity Between Wide Areas Through Self-Organising Robot Swarm Using Embodied Evolution
Erik Aaron Hansen, Stefano Nichele, Anis Yazidi et al.
Disruptions to the communication infrastructure happen occasionally, and dedicated personnel must then go out to fix them manually and restore communication. However, this can sometimes be dangerous for the personnel carrying out the task, for example in war situations, in environmental disasters such as earthquakes or toxic spills, or during fires. Human casualties can therefore be minimised if autonomous robots are deployed that can achieve the same outcome: establishing a communication link between two distant sites that were previously connected. In this paper we investigate the deployment of mobile ad hoc robots which relay traffic between them. In order to get the robots to position themselves appropriately, we take inspiration from self-organisation and emergence in artificial life, where a common overall goal may be achieved if the correct local rules are invoked on the agents in the system. We integrate the aspect of connectivity between two sites into the multirobot simulation platform known as JBotEvolver. The robot swarm is composed of Thymio II robots. In addition, we compare three heuristics, one of which uses neuroevolution (evolution of neural networks), to show how self-organisation and embodied evolution can be used within the integration. Our use of embodiment in robotic controllers shows promising results and provides solid knowledge and guidelines for further investigations.
AI Jun 23, 2016
Adaptive Task Assignment in Online Learning Environments
Per-Arne Andersen, Christian Kråkevik, Morten Goodwin et al.
With the increasing popularity of online learning, intelligent tutoring systems are regaining attention. In this paper, we introduce adaptive algorithms for the personalized assignment of learning tasks to students so as to improve their performance in online learning environments. As the main contribution of this paper, we propose a novel Skill-Based Task Selector (SBTS) algorithm which is able to approximate a student's skill level based on their performance and consequently suggest adequate assignments. The SBTS is inspired by the class of multi-armed bandit algorithms. However, in contrast to standard multi-armed bandit approaches, the SBTS aims at determining two criteria related to student learning, namely which topics the student should work on and what level of difficulty the task should have. The SBTS centers on innovative reward and punishment schemes in a task and skill matrix based on the student's behaviour. To verify the algorithm, the complex student behaviour is modelled using a neighbour node selection approach based on empirical estimations of a student's learning curve. The algorithm is evaluated with a practical scenario from a basic Java programming course. The SBTS is able to quickly and accurately adapt to the composite student competency, even with a multitude of student models.