SPDec 7, 2022
A Frequency-Structure Approach for Link Stream AnalysisEsteban Bautista, Matthieu Latapy
A link stream is a set of triplets $(t, u, v)$ indicating that $u$ and $v$ interacted at time $t$. Link streams model numerous datasets and their proper study is crucial in many applications. In practice, raw link streams are often aggregated or transformed into time series or graphs where decisions are made. Yet, it remains unclear how the dynamical and structural information of a raw link stream carries into the transformed object. This work shows that it is possible to shed light into this question by studying link streams via algebraically linear graph and signal operators, for which we introduce a novel linear matrix framework for the analysis of link streams. We show that, due to their linearity, most methods in signal processing can be easily adopted by our framework to analyze the time/frequency information of link streams. However, the availability of linear graph methods to analyze relational/structural information is limited. We address this limitation by developing (i) a new basis for graphs that allow us to decompose them into structures at different resolution levels; and (ii) filters for graphs that allow us to change their structural information in a controlled manner. By plugging-in these developments and their time-domain counterpart into our framework, we are able to (i) obtain a new basis for link streams that allow us to represent them in a frequency-structure domain; and (ii) show that many interesting transformations to link streams, like the aggregation of interactions or their embedding into a euclidean space, can be seen as simple filters in our frequency-structure domain.
LGMar 2
Trivial Graph Features and Classical Learning are Enough to Detect Random AnomaliesMatthieu Latapy, Stephany Rajeh
Detecting anomalies in link streams that represent various kinds of interactions is an important research topic with crucial applications. Because of the lack of ground truth data, proposed methods are mostly evaluated through their ability to detect randomly injected links. In contrast with most proposed methods, that rely on complex approaches raising computational and/or interpretability issues, we show here that trivial graph features and classical learning techniques are sufficient to detect such anomalies extremely well. This basic approach has very low computational costs and it leads to easily interpretable results. It also has many other desirable properties that we study through an extensive set of experiments. We conclude that detection methods should now target more complex kinds of anomalies.
SIJun 15, 2021
Full Bitcoin Blockchain Data Made EasyJules Azad Emery, Matthieu Latapy
Despite the fact that it is publicly available, collecting and processing the full bitcoin blockchain data is not trivial. Its mere size, history, and other features indeed raise quite specific challenges, that we address in this paper. The strengths of our approach are the following: it relies on very basic and standard tools, which makes the procedure reliable and easily reproducible; it is a purely lossless procedure ensuring that we catch and preserve all existing data; it provides additional indexing that makes it easy to further process the whole data and select appropriate subsets of it. We present our procedure in details and illustrate its added value on large-scale use cases, like address clustering. We provide an implementation online, as well as the obtained dataset.
DSFeb 12, 2021
Computing Betweenness Centrality in Link StreamsFrédéric Simard, Clémence Magnien, Matthieu Latapy
Betweeness centrality is one of the most important concepts in graph analysis. It was recently extended to link streams, a graph generalization where links arrive over time. However, its computation raises non-trivial issues, due in particular to the fact that time is considered as continuous. We provide here the first algorithms to compute this generalized betweenness centrality, as well as several companion algorithms that have their own interest. They work in polynomial time and space, we illustrate them on typical examples, and we provide an implementation.
IRMay 6, 2019
A general graph-based framework for top-N recommendation using content, temporal and trust informationArmel Jacques Nzekon Nzeko'o, Maurice Tchuente, Matthieu Latapy
Recommending appropriate items to users is crucial in many e-commerce platforms that contain implicit data as users' browsing, purchasing and streaming history. One common approach consists in selecting the N most relevant items to each user, for a given N, which is called top-N recommendation. To do so, recommender systems rely on various kinds of information, like item and user features, past interest of users for items, browsing history and trust between users. However, they often use only one or two such pieces of information, which limits their performance. In this paper, we design and implement GraFC2T2, a general graph-based framework to easily combine and compare various kinds of side information for top-N recommendation. It encodes content-based features, temporal and trust information into a complex graph, and uses personalized PageRank on this graph to perform recommendation. We conduct experiments on Epinions and Ciao datasets, and compare obtained performances using F1-score, Hit ratio and MAP evaluation metrics, to systems based on matrix factorization and deep learning. This shows that our framework is convenient for such explorations, and that combining different kinds of information indeed improves recommendation in general.
IRMar 27, 2019
Link Stream Graph for Temporal RecommendationsArmel Jacques Nzekon Nzeko'o, Maurice Tchuente, Matthieu Latapy
Several researches on recommender systems are based on explicit rating data, but in many real world e-commerce platforms, ratings are not always available, and in those situations, recommender systems have to deal with implicit data such as users' purchase history, browsing history and streaming history. In this context, classical bipartite user-item graphs (BIP) are widely used to compute top-N recommendations. However, these graphs have some limitations, particularly in terms of taking temporal dynamic into account. This is not good because users' preference change over time. To overcome this limit, the Session-based Temporal Graph (STG) was proposed by Xiang et al. to combine long- and short-term preferences in a graph-based recommender system. But in the STG, time is divided into slices and therefore considered discontinuously. This approach loses details of the real temporal dynamics of user actions. To address this challenge, we propose the Link Stream Graph (LSG) which is an extension of link stream representation proposed by Latapy et al. and which allows to model interactions between users and items by considering time continuously. Experiments conducted on four real world implicit datasets for temporal recommendation, with 3 evaluation metrics, show that LSG is the best in 9 out of 12 cases compared to BIP and STG which are the most used state-of-the-art recommender graphs.
SIOct 11, 2017
Stream Graphs and Link Streams for the Modeling of Interactions over TimeMatthieu Latapy, Tiphaine Viard, Clémence Magnien
Graph theory provides a language for studying the structure of relations, and it is often used to study interactions over time too. However, it poorly captures the both temporal and structural nature of interactions, that calls for a dedicated formalism. In this paper, we generalize graph concepts in order to cope with both aspects in a consistent way. We start with elementary concepts like density, clusters, or paths, and derive from them more advanced concepts like cliques, degrees, clustering coefficients, or connected components. We obtain a language to directly deal with interactions over time, similar to the language provided by graphs to deal with relations. This formalism is self-consistent: usual relations between different concepts are preserved. It is also consistent with graph theory: graph concepts are special cases of the ones we introduce. This makes it easy to generalize higher-level objects such as quotient graphs, line graphs, k-cores, and centralities. This paper also considers discrete versus continuous time assumptions, instantaneous links, and extensions to more complex cases.