Farzana Nasrin

ML
h-index: 33
Papers: 6
Citations: 137
Novelty: 41%
AI Score: 35

6 Papers

LG · Feb 14, 2024
Position: Topological Deep Learning is the New Frontier for Relational Learning

Theodore Papamarkou, Tolga Birdal, Michael Bronstein et al.

Topological deep learning (TDL) is a rapidly evolving field that uses topological features to understand and design deep learning models. This paper posits that TDL is the new frontier for relational learning. TDL may complement graph representation learning and geometric deep learning by incorporating topological concepts, and can thus provide a natural choice for various machine learning settings. To this end, this paper discusses open problems in TDL, ranging from practical benefits to theoretical foundations. For each problem, it outlines potential solutions and future research opportunities. At the same time, this paper serves as an invitation to the scientific community to actively participate in TDL research to unlock the potential of this emerging field.

ME · Jun 21, 2025
Bayesian Inference for Left-Truncated Log-Logistic Distributions for Time-to-event Data Analysis

Fahad Mostafa, Md Rejuan Haque, Md Mostafijur Rahman et al.

Parameter estimation is a foundational step in statistical modeling, enabling us to extract knowledge from data and apply it effectively. Bayesian estimation combines prior beliefs with observed data to infer distribution parameters probabilistically and robustly. Moreover, it provides full posterior distributions, allowing uncertainty quantification and regularization, which is especially useful in small or truncated samples. The left-truncated log-logistic (LTLL) distribution is particularly well suited for modeling time-to-event data where observations are subject to a known lower bound, such as precipitation data and cancer survival times. In this paper, we propose a Bayesian approach for estimating the parameters of the LTLL distribution with a fixed truncation point \( x_L > 0 \). Given a random variable \( X \sim \mathrm{LL}(\alpha, \beta; x_L) \), where \( \alpha > 0 \) is the scale parameter and \( \beta > 0 \) is the shape parameter, the likelihood function is derived based on a truncated sample \( X_1, X_2, \dots, X_N \) with \( X_i > x_L \). We assume independent prior distributions for the parameters, and the posterior inference is conducted via Markov chain Monte Carlo sampling, specifically using the Metropolis-Hastings algorithm to obtain posterior estimates \( \hat{\alpha} \) and \( \hat{\beta} \). Through simulation studies and real-world applications, we demonstrate that Bayesian estimation provides more stable and reliable parameter estimates, particularly when the likelihood surface is irregular due to left truncation. The results highlight the advantages of Bayesian inference for estimating parameter uncertainty in truncated distributions for time-to-event data analysis.
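The estimation scheme described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard log-logistic density f(x) = (β/α)(x/α)^(β-1) / (1 + (x/α)^β)^2, divided by the survival probability at x_L to account for left truncation. The abstract does not specify the priors, so the independent log-normal priors used here (normal priors on log α and log β) are an assumption, as are the function names, the step size, and the simulated data.

```python
import numpy as np

def ltll_loglik(alpha, beta, x, x_L):
    """Log-likelihood of a log-logistic sample left-truncated at x_L."""
    z = beta * (np.log(x) - np.log(alpha))      # log of (x / alpha)^beta
    z_L = beta * (np.log(x_L) - np.log(alpha))  # same quantity at the truncation point
    log_pdf = np.log(beta) - np.log(x) + z - 2.0 * np.logaddexp(0.0, z)
    log_surv_L = -np.logaddexp(0.0, z_L)        # log P(X > x_L) before truncation
    return np.sum(log_pdf) - x.size * log_surv_L

def log_post(theta, x, x_L, prior_sd=2.0):
    """Unnormalized log-posterior over theta = (log alpha, log beta).

    Assumption: independent N(0, prior_sd^2) priors on log alpha and log beta.
    """
    alpha, beta = np.exp(theta)
    log_prior = -0.5 * np.sum((theta / prior_sd) ** 2)
    return ltll_loglik(alpha, beta, x, x_L) + log_prior

def metropolis_hastings(x, x_L, n_iter=6000, step=0.08, seed=0):
    """Random-walk Metropolis-Hastings on the log-parameters."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)                         # start at alpha = beta = 1
    current = log_post(theta, x, x_L)
    chain = np.empty((n_iter, 2))
    for i in range(n_iter):
        proposal = theta + step * rng.standard_normal(2)
        candidate = log_post(proposal, x, x_L)
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        if np.log(rng.uniform()) < candidate - current:
            theta, current = proposal, candidate
        chain[i] = theta
    return np.exp(chain)                        # back to the (alpha, beta) scale

def sample_ltll(alpha, beta, x_L, n, rng):
    """Draw from the truncated distribution by inverting the CDF on (F(x_L), 1)."""
    F_L = 1.0 / (1.0 + (x_L / alpha) ** (-beta))
    u = rng.uniform(F_L, 1.0, size=n)
    return alpha * (u / (1.0 - u)) ** (1.0 / beta)

rng = np.random.default_rng(1)
x_L = 1.0
data = sample_ltll(alpha=2.0, beta=3.0, x_L=x_L, n=500, rng=rng)
chain = metropolis_hastings(data, x_L)
alpha_hat, beta_hat = chain[2000:].mean(axis=0)  # posterior means after burn-in
print(f"alpha_hat={alpha_hat:.2f}, beta_hat={beta_hat:.2f}")
```

Sampling on the log scale keeps both parameters positive without rejecting proposals at the boundary. The retained draws in `chain[2000:]` can also be summarized with quantiles to obtain credible intervals.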

ML · Apr 15, 2021
A Random Persistence Diagram Generator

Theodore Papamarkou, Farzana Nasrin, Austin Lawson et al.

Topological data analysis (TDA) studies the shape patterns of data. Persistent homology is a widely used method in TDA that summarizes homological features of data at multiple scales and stores them in persistence diagrams (PDs). In this paper, we propose a random persistence diagram generator (RPDG) method that generates a sequence of random PDs from the ones produced by the data. RPDG is underpinned by a model based on pairwise interacting point processes, and a reversible jump Markov chain Monte Carlo (RJ-MCMC) algorithm. A first example, which is based on a synthetic dataset, demonstrates the efficacy of RPDG and provides a comparison with another method for sampling PDs. A second example demonstrates the utility of RPDG to solve a materials science problem given a real dataset of small sample size.
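To make the notion of a persistence diagram concrete, the sketch below computes the 0-dimensional diagram (connected components) of a small point cloud. This is not the RPDG method, only the kind of input diagram it would start from. It assumes a Vietoris-Rips filtration indexed by edge length, so every component is born at 0 and dies at the length of the edge that merges it into another component. The function name and the example points are hypothetical.

```python
import itertools
import math

def h0_persistence(points):
    """0-dimensional persistence pairs of a point cloud (Rips filtration by edge length)."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process all pairwise edges from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )
    diagram = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj                  # two components merge: one of them dies
            diagram.append((0.0, length))
    diagram.append((0.0, math.inf))          # the last component never dies
    return diagram

# Two well-separated clusters: three short-lived components and one long-lived one.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
diagram = h0_persistence(points)
for birth, death in diagram:
    print(birth, death)
```

Points far from the diagonal, such as the pair that dies only when the two clusters join, correspond to features that persist across many scales.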

MTRL-SCI · Jan 14, 2021
Materials Fingerprinting Classification

Adam Spannaus, Kody J. H. Law, Piotr Luszczek et al.

Significant progress in many classes of materials could be made with the availability of experimentally derived large datasets composed of atomic identities and three-dimensional coordinates. Methods for visualizing the local atomic structure, such as atom probe tomography (APT), which routinely generate datasets composed of millions of atoms, are an important step in realizing this goal. However, state-of-the-art APT instruments generate noisy and sparse datasets that provide information about elemental type but obscure atomic structures, thus limiting their subsequent value for materials discovery. The application of a materials fingerprinting process, a machine learning algorithm coupled with topological data analysis, provides an avenue by which heretofore unprecedented structural information can be extracted from an APT dataset. As a proof of concept, the material fingerprint is applied to high-entropy alloy APT datasets containing body-centered cubic (BCC) and face-centered cubic (FCC) crystal structures. A local atomic configuration centered on an arbitrary atom is assigned a topological descriptor, with which it can be characterized as a BCC or FCC lattice with near-perfect accuracy, despite the inherent noise in the dataset. This successful identification of a fingerprint is a crucial first step in the development of algorithms that can extract more nuanced information, such as chemical ordering, from existing datasets of complex materials.

ML · Sep 24, 2020
Bayesian Topological Learning for Classifying the Structure of Biological Networks

Vasileios Maroulas, Cassie Putman Micucci, Farzana Nasrin

Actin cytoskeleton networks generate local topological signatures due to the natural variations in the number, size, and shape of holes in the networks. Persistent homology is a method that explores these topological properties of data and summarizes them as persistence diagrams. In this work, we analyze and classify these filament networks by transforming them into persistence diagrams whose variability is quantified via a Bayesian framework on the space of persistence diagrams. The proposed generalized Bayesian framework adopts an independent and identically distributed cluster point process characterization of persistence diagrams and relies on a substitution likelihood argument. This framework provides the flexibility to estimate the posterior cardinality distribution of points in a persistence diagram and the posterior spatial distribution simultaneously. We present a closed form of the posteriors under the assumption of Gaussian mixtures and binomials for prior intensity and cardinality, respectively. Using this posterior calculation, we implement a Bayes factor algorithm to classify the actin filament networks and benchmark it against several state-of-the-art classification methods.

ML · Dec 18, 2019
Bayesian Topological Learning for Brain State Classification

Farzana Nasrin, Christopher Oballe, David L. Boothe et al.

Investigation of human brain states through electroencephalogram (EEG) signals is a crucial step in human-machine communications. However, classifying and analyzing EEG signals is challenging due to their noisy, nonlinear and nonstationary nature. Current methodologies for analyzing these signals often fall short because they have several regularity assumptions baked in. This work provides an effective, flexible and noise-resilient scheme to analyze EEG by extracting pertinent information while abiding by the 3N (noisy, nonlinear and nonstationary) nature of the data. We implement a topological tool, namely persistent homology, that tracks the evolution of topological features over time intervals and incorporates an individual's expectations as prior knowledge by means of a Bayesian framework to compute posterior distributions. Relying on these posterior distributions, we apply Bayes factor classification to noisy EEG measurements. The performance of this Bayesian classification scheme is then compared with other existing methods for EEG signals.
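One common way to apply persistent homology to a one-dimensional signal is the sublevel-set filtration, sketched below. The abstract does not state which filtration the authors use, so this choice is an assumption, and the function name and toy signal are hypothetical. Each local minimum gives birth to a component as a threshold sweeps upward. When two components meet at a local maximum, the one with the higher birth value dies (the elder rule).

```python
import math

def sublevel_persistence(signal):
    """0-dimensional sublevel-set persistence pairs of a 1-D signal."""
    n = len(signal)
    order = sorted(range(n), key=lambda i: signal[i])
    parent = [None] * n          # None marks samples not yet below the threshold
    birth = {}                   # birth value of each component, keyed by its root

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in order:              # raise the threshold one sample at a time
        parent[i] = i
        birth[i] = signal[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component born later (higher value) dies here.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if birth[young] < signal[i]:   # skip zero-persistence pairs
                    pairs.append((birth[young], signal[i]))
                parent[young] = old
    pairs.append((min(signal), math.inf))      # the global minimum never dies
    return pairs

signal = [0.0, 2.0, 1.0, 3.0, 0.5, 4.0]
pairs = sublevel_persistence(signal)
print(pairs)  # [(1.0, 2.0), (0.5, 3.0), (0.0, inf)]
```

Small oscillations produce pairs close to the diagonal (short lifetimes), which is one reason this kind of summary is often described as robust to noise.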