Florian Becker

CR
4papers
625citations
Novelty44%
AI Score29

4 Papers

LGSep 1, 2022
Unsupervised EHR-based Phenotyping via Matrix and Tensor Decompositions

Florian Becker, Age K. Smilde, Evrim Acar

Computational phenotyping allows for unsupervised discovery of subgroups of patients as well as corresponding co-occurring medical conditions from electronic health records (EHR). Typically, EHR data contains demographic information, diagnoses and laboratory results. Discovering (novel) phenotypes has the potential to be of prognostic and therapeutic value. Providing medical practitioners with transparent and interpretable results is an important requirement and an essential part for advancing precision medicine. Low-rank data approximation methods such as matrix (e.g., non-negative matrix factorization) and tensor decompositions (e.g., CANDECOMP/PARAFAC) have demonstrated that they can provide such transparent and interpretable insights. Recent developments have adapted low-rank data approximation methods by incorporating different constraints and regularizations that facilitate interpretability further. In addition, they offer solutions for common challenges within EHR data such as high dimensionality, data sparsity and incompleteness. Especially extracting temporal phenotypes from longitudinal EHR has received much attention in recent years. In this paper, we provide a comprehensive review of low-rank approximation-based approaches for computational phenotyping. The existing literature is categorized into temporal vs. static phenotyping approaches based on matrix vs. tensor decompositions. Furthermore, we outline different approaches for the validation of phenotypes, i.e., the assessment of clinical significance.

CRAug 21, 2023
A Modular and Adaptive System for Business Email Compromise Detection

Jan Brabec, Filip Šrajer, Radek Starosta et al.

The growing sophistication of Business Email Compromise (BEC) and spear phishing attacks poses significant challenges to organizations worldwide. The techniques featured in traditional spam and phishing detection are insufficient due to the tailored nature of modern BEC attacks as they often blend in with the regular benign traffic. Recent advances in machine learning, particularly in Natural Language Understanding (NLU), offer a promising avenue for combating such attacks but in a practical system, due to limitations such as data availability, operational costs, verdict explainability requirements or a need to robustly evolve the system, it is essential to combine multiple approaches together. We present CAPE, a comprehensive and efficient system for BEC detection that has been proven in a production environment for a period of over two years. Rather than being a single model, CAPE is a system that combines independent ML models and algorithms detecting BEC-related behaviors across various email modalities such as text, images, metadata and the email's communication context. This decomposition makes CAPE's verdicts naturally explainable. In the paper, we describe the design principles and constraints behind its architecture, as well as the challenges of model design, evaluation and adapting the system continuously through a Bayesian approach that combines limited data with domain knowledge. Furthermore, we elaborate on several specific behavioral detectors, such as those based on Transformer neural architectures.

COMP-PHJun 2, 2021Code
GemNet: Universal Directional Graph Neural Networks for Molecules

Johannes Gasteiger, Florian Becker, Stephan Günnemann

Effectively predicting molecular interactions has the potential to accelerate molecular dynamics by multiple orders of magnitude and thus revolutionize chemical simulations. Graph neural networks (GNNs) have recently shown great successes for this task, overtaking classical methods based on fixed molecular kernels. However, they still appear very limited from a theoretical perspective, since regular GNNs cannot distinguish certain types of graphs. In this work we close this gap between theory and practice. We show that GNNs with spherical representations are indeed universal approximators for predictions that are invariant to translation, and equivariant to permutation and rotation. We then discretize such GNNs via directed edge embeddings and two-hop message passing, and incorporate multiple structural improvements to arrive at the geometric message passing neural network (GemNet). We demonstrate the benefits of the proposed changes in multiple ablation studies. GemNet outperforms previous models on the COLL, MD17, and OC20 datasets by 34%, 41%, and 20%, respectively, and performs especially well on the most challenging molecules. Our implementation is available online.

OCJul 3, 2014
Solving QVIs for Image Restoration with Adaptive Constraint Sets

Frank Lenzen, Jan Lellmann, Florian Becker et al.

We consider a class of quasi-variational inequalities (QVIs) for adaptive image restoration, where the adaptivity is described via solution-dependent constraint sets. In previous work we studied both theoretical and numerical issues. While we were able to show the existence of solutions for a relatively broad class of problems, we encountered problems concerning uniqueness of the solution as well as convergence of existing algorithms for solving QVIs. In particular, it seemed that with increasing image size the growing condition number of the involved differential operator poses severe problems. In the present paper we prove uniqueness for a larger class of problems and in particular independent of the image size. Moreover, we provide a numerical algorithm with proved convergence. Experimental results support our theoretical findings.