LGMay 18, 2017
Online learnability of Statistical Relational Learning in anomaly detectionMagnus Jändel, Pontus Svenson, Niclas Wadströmer
Statistical Relational Learning (SRL) methods for anomaly detection are introduced via a security-related application. Operational requirements for online learning stability are outlined and compared to mathematical definitions as applied to the learning process of a representative SRL method - Bayesian Logic Programs (BLP). Since a formal proof of online stability appears to be impossible, tentative common sense requirements are formulated and tested by theoretical and experimental analysis of a simple and analytically tractable BLP model. It is found that learning algorithms in initial stages of online learning can lock on unstable false predictors that nevertheless comply with our tentative stability requirements and thus masquerade as bona fide solutions. The very expressiveness of SRL seems to cause significant stability issues in settings with many variables and scarce data. We conclude that reliable anomaly detection with SRL-methods requires monitoring by an overarching framework that may involve a comprehensive context knowledge base or human supervision.
CRMay 18, 2017
Fusing restricted informationMagnus Jändel, Pontus Svenson, Ronnie Johansson
Information fusion deals with the integration and merging of data and information from multiple (heterogeneous) sources. In many cases, the information that needs to be fused has security classification. The result of the fusion process is then by necessity restricted with the strictest information security classification of the inputs. This has severe drawbacks and limits the possible dissemination of the fusion results. It leads to decreased situational awareness: the organization knows information that would enable a better situation picture, but since parts of the information is restricted, it is not possible to distribute the most correct situational information. In this paper, we take steps towards defining fusion and data mining processes that can be used even when all the underlying data that was used cannot be disseminated. The method we propose here could be used to produce a classifier where all the sensitive information has been removed and where it can be shown that an antagonist cannot even in principle obtain knowledge about the classified information by using the classifier or situation picture.
SOC-PHSep 1, 2013
Ensemble approaches for improving community detection methodsJohan Dahlin, Pontus Svenson
Statistical estimates can often be improved by fusion of data from several different sources. One example is so-called ensemble methods which have been successfully applied in areas such as machine learning for classification and clustering. In this paper, we present an ensemble method to improve community detection by aggregating the information found in an ensemble of community structures. This ensemble can found by re-sampling methods, multiple runs of a stochastic community detection method, or by several different community detection algorithms applied to the same network. The proposed method is evaluated using random networks with community structures and compared with two commonly used community detection methods. The proposed method when applied on a stochastic community detection algorithm performs well with low computational complexity, thus offering both a new approach to community detection and an additional community detection method.