CVJul 19, 2024Code
Memory-Efficient Pseudo-Labeling for Online Source-Free Universal Domain Adaptation using a Gaussian Mixture ModelPascal Schlachter, Simon Wagner, Bin Yang
In practice, domain shifts are likely to occur between training and test data, necessitating domain adaptation (DA) to adjust the pre-trained source model to the target domain. Recently, universal domain adaptation (UniDA) has gained attention for addressing the possibility of an additional category (label) shift between the source and target domain. This means new classes can appear in the target data, some source classes may no longer be present, or both at the same time. For practical applicability, UniDA methods must handle both source-free and online scenarios, enabling adaptation without access to the source data and performing batch-wise updates in parallel with prediction. In an online setting, preserving knowledge across batches is crucial. However, existing methods often require substantial memory, which is impractical because memory is limited and valuable, in particular on embedded systems. Therefore, we consider memory-efficiency as an additional constraint. To achieve memory-efficient online source-free universal domain adaptation (SF-UniDA), we propose a novel method that continuously captures the distribution of known classes in the feature space using a Gaussian mixture model (GMM). This approach, combined with entropy-based out-of-distribution detection, allows for the generation of reliable pseudo-labels. Finally, we combine a contrastive loss with a KL divergence loss to perform the adaptation. Our approach not only achieves state-of-the-art results in all experiments on the DomainNet and Office-Home datasets but also significantly outperforms the existing methods on the challenging VisDA-C dataset, setting a new benchmark for online SF-UniDA. Our code is available at https://github.com/pascalschlachter/GMM.
BMFeb 19, 2025Code
Learning conformational ensembles of proteins based on backbone geometryNicolas Wolf, Leif Seute, Vsevolod Viliuga et al.
Deep generative models have recently been proposed for sampling protein conformations from the Boltzmann distribution, as an alternative to often prohibitively expensive Molecular Dynamics simulations. However, current state-of-the-art approaches rely on fine-tuning pre-trained folding models and evolutionary sequence information, limiting their applicability and efficiency, and introducing potential biases. In this work, we propose a flow matching model for sampling protein conformations based solely on backbone geometry - BBFlow. We introduce a geometric encoding of the backbone equilibrium structure as input and propose to condition not only the flow but also the prior distribution on the respective equilibrium structure, eliminating the need for evolutionary information. The resulting model is orders of magnitudes faster than current state-of-the-art approaches at comparable accuracy, is transferable to multi-chain proteins, and can be trained from scratch in a few GPU days. In our experiments, we demonstrate that the proposed model achieves competitive performance with reduced inference time, across not only an established benchmark of naturally occurring proteins but also de novo proteins, for which evolutionary information is scarce or absent. BBFlow is available at https://github.com/graeter-group/bbflow.
BMJun 29, 2025Code
Flexibility-Conditioned Protein Structure Design with Flow MatchingVsevolod Viliuga, Leif Seute, Nicolas Wolf et al.
Recent advances in geometric deep learning and generative modeling have enabled the design of novel proteins with a wide range of desired properties. However, current state-of-the-art approaches are typically restricted to generating proteins with only static target properties, such as motifs and symmetries. In this work, we take a step towards overcoming this limitation by proposing a framework to condition structure generation on flexibility, which is crucial for key functionalities such as catalysis or molecular recognition. We first introduce BackFlip, an equivariant neural network for predicting per-residue flexibility from an input backbone structure. Relying on BackFlip, we propose FliPS, an SE(3)-equivariant conditional flow matching model that solves the inverse problem, that is, generating backbones that display a target flexibility profile. In our experiments, we show that FliPS is able to generate novel and diverse protein backbones with the desired flexibility, verified by Molecular Dynamics (MD) simulations. FliPS and BackFlip are available at https://github.com/graeter-group/flips .
CVMay 13, 2024
SAR Image Synthesis with Diffusion ModelsDenisa Qosja, Simon Wagner, Daniel O'Hagan
In recent years, diffusion models (DMs) have become a popular method for generating synthetic data. By achieving samples of higher quality, they quickly became superior to generative adversarial networks (GANs) and the current state-of-the-art method in generative modeling. However, their potential has not yet been exploited in radar, where the lack of available training data is a long-standing problem. In this work, a specific type of DMs, namely denoising diffusion probabilistic model (DDPM) is adapted to the SAR domain. We investigate the network choice and specific diffusion parameters for conditional and unconditional SAR image generation. In our experiments, we show that DDPM qualitatively and quantitatively outperforms state-of-the-art GAN-based methods for SAR image generation. Finally, we show that DDPM profits from pretraining on largescale clutter data, generating SAR images of even higher quality.
LGNov 7, 2024
Generating Highly Designable Proteins with Geometric Algebra Flow MatchingSimon Wagner, Leif Seute, Vsevolod Viliuga et al.
We introduce a generative model for protein backbone design utilizing geometric products and higher order message passing. In particular, we propose Clifford Frame Attention (CFA), an extension of the invariant point attention (IPA) architecture from AlphaFold2, in which the backbone residue frames and geometric features are represented in the projective geometric algebra. This enables to construct geometrically expressive messages between residues, including higher order terms, using the bilinear operations of the algebra. We evaluate our architecture by incorporating it into the framework of FrameFlow, a state-of-the-art flow matching model for protein backbone generation. The proposed model achieves high designability, diversity and novelty, while also sampling protein backbones that follow the statistical distribution of secondary structure elements found in naturally occurring proteins, a property so far only insufficiently achieved by many state-of-the-art generative models.
CHEM-PHMar 1, 2025
Stable and Accurate Orbital-Free DFT Powered by Machine LearningRoman Remme, Tobias Kaczun, Tim Ebert et al.
Hohenberg and Kohn have proven that the electronic energy and the one-particle electron density can, in principle, be obtained by minimizing an energy functional with respect to the density. While decades of theoretical work have produced increasingly faithful approximations to this elusive exact energy functional, their accuracy is still insufficient for many applications, making it reasonable to try and learn it empirically. Using rotationally equivariant atomistic machine learning, we obtain for the first time a density functional that, when applied to the organic molecules in QM9, yields energies with chemical accuracy relative to the Kohn-Sham reference while also converging to meaningful electron densities. Augmenting the training data with densities obtained from perturbed potentials proved key to these advances. This work demonstrates that machine learning can play a crucial role in narrowing the gap between theory and the practical realization of Hohenberg and Kohn's vision, paving the way for more efficient calculations in large molecular systems.
CVMay 23, 2025
Enhancing Fourier-based Doppler Resolution with Diffusion ModelsDenisa Qosja, Kilian Barth, Simon Wagner
In radar systems, high resolution in the Doppler dimension is important for detecting slow-moving targets as it allows for more distinct separation between these targets and clutter, or stationary objects. However, achieving sufficient resolution is constrained by hardware capabilities and physical factors, leading to the development of processing techniques to enhance the resolution after acquisition. In this work, we leverage artificial intelligence to increase the Doppler resolution in range-Doppler maps. Based on a zero-padded FFT, a refinement via the generative neural networks of diffusion models is achieved. We demonstrate that our method overcomes the limitations of traditional FFT, generating data where closely spaced targets are effectively separated.