Andrei Pătraşcu

h-index 14

10 papers

23 citations

Novelty 40%

AI Score 27

Ranked #155,705 of 194,257 authors (top 80%); #34,164 in LG (top 85%)

10 Papers

1.8 LG May 14, 2022

Unsupervised Abnormal Traffic Detection through Topological Flow Analysis

Paul Irofti, Andrei Pătraşcu, Andrei Iulian Hîji

Cyberthreats are a permanent concern in our modern technological world. In recent years, sophisticated traffic analysis techniques and anomaly detection (AD) algorithms have been employed to face increasingly subversive adversarial attacks. A malicious intrusion, defined as an invasive action intending to illegally exploit private resources, manifests through unusual data traffic and/or abnormal connectivity patterns. Despite the plethora of statistical or signature-based detectors currently provided in the literature, the topological connectivity component of a malicious flow is less exploited. Furthermore, a great proportion of the existing statistical intrusion detectors are based on supervised learning, which relies on labeled data. By viewing network flows as weighted directed interactions between pairs of nodes, in this paper we present a simple method that facilitates the use of connectivity graph features in unsupervised anomaly detection algorithms. We test our methodology on real network traffic datasets and observe several improvements over standard AD.
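As a rough illustration of the idea of treating flows as weighted directed edges, the sketch below derives simple per-node connectivity features and appends them to each flow record. The feature choice (degree counts and weighted degrees), the function names, and the `(src, dst, weight)` tuple layout are assumptions made for illustration only; they are not taken from the paper.

```python
from collections import defaultdict


def node_connectivity_features(flows):
    """Return {node: (out_degree, in_degree, out_weight, in_weight)}.

    `flows` is a list of (src, dst, weight) tuples; the weight could be,
    for example, the number of bytes sent (hypothetical choice).
    """
    out_w = defaultdict(float)
    in_w = defaultdict(float)
    out_nbrs = defaultdict(set)
    in_nbrs = defaultdict(set)
    for src, dst, w in flows:
        out_w[src] += w          # total weight leaving src
        in_w[dst] += w           # total weight entering dst
        out_nbrs[src].add(dst)   # distinct destinations of src
        in_nbrs[dst].add(src)    # distinct sources of dst
    nodes = set(out_nbrs) | set(in_nbrs)
    return {
        n: (len(out_nbrs[n]), len(in_nbrs[n]), out_w[n], in_w[n])
        for n in nodes
    }


def augment_flows(flows):
    """Append source-node and destination-node features to each flow."""
    flows = list(flows)
    feats = node_connectivity_features(flows)
    return [(w,) + feats[s] + feats[d] for s, d, w in flows]


flows = [("a", "b", 10.0), ("a", "c", 5.0), ("b", "c", 1.0)]
rows = augment_flows(flows)
```

The resulting rows are plain numeric vectors, so they could in principle be passed to any unsupervised detector that accepts tabular input.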

2.6 LG Apr 5, 2024 Code

Fusing Dictionary Learning and Support Vector Machines for Unsupervised Anomaly Detection

Paul Irofti, Iulian-Andrei Hîji, Andrei Pătraşcu et al.

We study in this paper the improvement of one-class support vector machines (OC-SVM) through sparse representation techniques for unsupervised anomaly detection. As Dictionary Learning (DL) has recently become a common analysis technique that reveals hidden sparse patterns of data, our approach uses this insight to endow unsupervised detection with more control over pattern finding and dimensions. We introduce a new anomaly detection model that unifies the OC-SVM and DL residual functions into a single composite objective, subsequently solved through K-SVD-type iterative algorithms. A closed form of the alternating K-SVD iteration is explicitly derived for the new composite model, and practically implementable schemes are discussed. The standard DL model is adapted for the Dictionary Pair Learning (DPL) context, where the usual sparsity constraints are naturally eliminated. Finally, we extend both objectives to the more general setting that allows the use of kernel functions. The empirical convergence properties of the resulting algorithms are provided and an in-depth analysis of their parametrization is performed, while also demonstrating their numerical performance in comparison with existing methods.

1.2 NA Mar 5, 2024 Code

Learning Explicitly Conditioned Sparsifying Transforms

Andrei Pătraşcu, Cristian Rusu, Paul Irofti

In recent decades, sparsifying transforms have become widely known tools for finding structured sparse representations of signals in certain transform domains. Despite the popularity of classical transforms such as DCT and Wavelet, learning optimal transforms that guarantee good representations of data in the sparse domain has recently been analyzed in a series of papers. Typically, the condition number and representation ability are complementary key features of learning square transforms that may not be explicitly controlled in a given optimization model. Unlike the existing approaches in the literature, in this paper we consider a new sparsifying transform model that enforces explicit control over the data representation quality and the condition number of the learned transforms. We confirm through numerical experiments that our model presents better numerical behavior than the state-of-the-art.

3.3 LG Jan 11, 2022 Code

Dictionary Learning with Uniform Sparse Representations for Anomaly Detection

Paul Irofti, Cristian Rusu, Andrei Pătraşcu

Many applications, such as audio and image processing, show that sparse representations are a powerful and efficient signal modeling technique. Finding an optimal dictionary that generates at the same time the sparsest representations of data and the smallest approximation error is a hard problem approached by dictionary learning (DL). We study how DL performs in detecting abnormal samples in a dataset of signals. In this paper we use a particular DL formulation that seeks uniform sparse representations to detect the underlying subspace of the majority of samples in a dataset, using a K-SVD-type algorithm. Numerical simulations show that one can efficiently use the resulting subspace to discriminate the anomalies from the regular data points.

1.6 LG Aug 10, 2021

Complexity of Inexact Proximal Point Algorithm for minimizing convex functions with Holderian Growth

Andrei Pătraşcu, Paul Irofti

Several decades ago the Proximal Point Algorithm (PPA) started to attract long-lasting interest from both the abstract operator theory and the numerical optimization communities. Even in modern applications, researchers still use proximal minimization theory to design scalable algorithms that overcome nonsmoothness. Remarkable works such as \cite{Fer:91,Ber:82constrained,Ber:89parallel,Tom:11} established tight relations between the convergence behaviour of PPA and the regularity of the objective function. In this manuscript we derive the nonasymptotic iteration complexity of exact and inexact PPA for minimizing convex functions under $γ$-Holderian growth: $\mathcal{O}\left(\log(1/ε)\right)$ (for $γ\in [1,2]$) and $\mathcal{O}\left(1/ε^{γ-2}\right)$ (for $γ> 2$). In particular, we recover well-known results on PPA: finite convergence for sharp minima and linear convergence for quadratic growth, even in the presence of deterministic noise. Moreover, when a simple Proximal Subgradient Method is recurrently called as an inner routine for computing each IPPA iterate, novel computational complexity bounds are obtained for Restarting Inexact PPA. Our numerical tests show improvements over existing restarting versions of the Subgradient Method.
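The two classical behaviours mentioned in this abstract (finite convergence for sharp minima, linear convergence under quadratic growth) can be seen on one-dimensional toy problems. The sketch below runs the exact proximal point iteration $x_{k+1} = \mathrm{prox}_{μ f}(x_k)$ on $f(x) = |x|$ and $f(x) = x^2/2$, whose proximal maps have well-known closed forms. The function names, step size, and starting points are illustrative assumptions; this is not the inexact or restarted scheme analyzed in the paper.

```python
def prox_abs(x, mu):
    """Proximal map of f(x) = |x| (soft-thresholding), a sharp minimum at 0."""
    if x > mu:
        return x - mu
    if x < -mu:
        return x + mu
    return 0.0


def prox_quadratic(x, mu):
    """Proximal map of f(x) = x**2 / 2, which has quadratic growth."""
    return x / (1.0 + mu)


def proximal_point(prox, x0, mu, iters):
    """Exact proximal point iteration; returns the list of all iterates."""
    xs = [x0]
    for _ in range(iters):
        xs.append(prox(xs[-1], mu))
    return xs


# Sharp minimum: the iterates hit the minimizer exactly after finitely many steps.
sharp = proximal_point(prox_abs, 5.0, 1.0, 6)

# Quadratic growth: the distance to the minimizer shrinks by 1 / (1 + mu) per step.
quad = proximal_point(prox_quadratic, 8.0, 1.0, 3)
```

With step size 1, the first run decreases by one unit per step until it lands on 0 and stays there, while the second halves the iterate at every step and never reaches 0 exactly.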

2.3 LG Mar 30, 2020

Stochastic Proximal Gradient Algorithm with Minibatches. Application to Large Scale Learning Models

Andrei Patrascu, Ciprian Paduraru, Paul Irofti

Stochastic optimization lies at the core of most statistical learning models. The recent development of stochastic algorithmic tools has focused significantly on proximal gradient iterations, in order to find an efficient approach for nonsmooth (composite) population risk functions. The complexity of finding optimal predictors by minimizing regularized risk is largely understood for simple regularizations such as $\ell_1/\ell_2$ norms. However, more complex properties desired for the predictor necessitate highly difficult regularizers, as used in grouped lasso or graph trend filtering. In this chapter we develop and analyze minibatch variants of the stochastic proximal gradient algorithm for general composite objective functions with stochastic nonsmooth components. We provide iteration complexity for constant and variable stepsize policies, obtaining that, for minibatch size $N$, after $\mathcal{O}(\frac{1}{Nε})$ iterations $ε$-suboptimality is attained in expected quadratic distance to the optimal solution. The numerical tests on $\ell_2$-regularized SVMs and parametric sparse representation problems confirm the theoretical behaviour and show that the method surpasses minibatch SGD performance.

9.6 OC Dec 4, 2019 Code

Stochastic proximal splitting algorithm for composite minimization

Andrei Patrascu, Paul Irofti

Supported by recent contributions in multiple branches, first-order splitting algorithms have become central to structured nonsmooth optimization. In large-scale or noisy contexts, when only stochastic information on the smooth part of the objective function is available, the extension of proximal gradient schemes to stochastic oracles is based on proximal tractability of the nonsmooth component and has been deeply analyzed in the literature. However, gaps remain, illustrated by composite models where the nonsmooth term is no longer proximally tractable. In this note we tackle composite optimization problems where only access to stochastic information on both the smooth and nonsmooth components is assumed, using a stochastic proximal first-order scheme with stochastic proximal updates. We provide an $\mathcal{O}\left( \frac{1}{k} \right)$ iteration complexity (in expectation of the squared distance to the optimal set) under the strong convexity assumption on the objective function. Empirical behavior is illustrated by numerical tests on parametric sparse representation models.

3.4 LG Oct 24, 2019

Community-Level Anomaly Detection for Anti-Money Laundering

Andra Baltoiu, Andrei Patrascu, Paul Irofti

Anomaly detection in networks often boils down to identifying an underlying graph structure on which the abnormal occurrence rests. Financial fraud schemes are one such example, where more or less intricate schemes are employed in order to elude transaction security protocols. We investigate the problem of learning graph structure representations using adaptations of dictionary learning aimed at encoding connectivity patterns. In particular, we adapt dictionary learning strategies to the specificity of network topologies and propose new methods that impose Laplacian structure on the dictionaries themselves. In one adaptation we focus on classifying topologies by working directly on the graph Laplacian and cast the learning problem to accommodate its 2D structure. We tackle the same problem by learning dictionaries which consist of vectorized atomic Laplacians, and provide a block coordinate descent scheme to solve the new dictionary learning formulation. Imposing Laplacian structure on the dictionaries is also proposed in an adaptation of the Single Block Orthogonal learning method. Results on synthetic graph datasets comprising different graph topologies confirm the potential of dictionaries to directly represent graph structure information.

4.8 LG Oct 24, 2019

Quick survey of graph-based fraud detection methods

Paul Irofti, Andrei Patrascu, Andra Baltoiu

In general, anomaly detection is the problem of distinguishing between normal data samples with well-defined patterns or signatures and those that do not conform to the expected profiles. Financial transactions, customer reviews, and social media posts are all characterized by relational information. In these networks, fraudulent behaviour may appear as a distinctive graph edge, such as a spam message, a node, or a larger subgraph structure, such as when a group of clients engages in money laundering schemes. Most commonly, these networks are represented as attributed graphs, with numerical features complementing relational information. We present a survey of anomaly detection techniques used for fraud detection that exploit both the graph structure underlying the data and the contextual information contained in the attributes.

2.0 OC Jan 22, 2019

New nonasymptotic convergence rates of stochastic proximal point algorithm for convex optimization problems

Andrei Patrascu

Large parts of the optimization literature have focused in the last decade on the development of optimal stochastic first-order schemes for constrained convex models under progressively relaxed assumptions. Stochastic proximal point (SPP) is an iterative scheme born from the adaptation of the proximal point algorithm to noisy stochastic optimization, with a resulting iteration related to stochastic alternating projections. Inspired by the scalability of alternating projection methods, we start from the (linear) regularity assumption, typically used in convex feasibility problems to guarantee the linear convergence of stochastic alternating projection methods, and analyze a general weak linear regularity condition which facilitates convergence rate boosts in stochastic proximal point schemes. Our applications include many non-strongly convex function classes often used in machine learning and statistics. Moreover, under the weak linear regularity assumption we guarantee an $\mathcal{O}\left(\frac{1}{k}\right)$ convergence rate for SPP, in terms of the distance to the optimal set, using only projections onto a simple component set. Linear convergence is obtained in the interpolation setting, when the optimal set of the expected cost is included in the optimal sets of each functional component.