Janis Fluri

CO
h-index19
4papers
183citations
Novelty53%
AI Score45

4 Papers

COMar 17, 2022
DeepLSS: breaking parameter degeneracies in large scale structure with deep learning analysis of combined probes

Tomasz Kacprzak, Janis Fluri

In classical cosmological analysis of large scale structure surveys with 2-pt functions, the parameter measurement precision is limited by several key degeneracies within the cosmology and astrophysics sectors. For cosmic shear, clustering amplitude $σ_8$ and matter density $Ω_m$ roughly follow the $S_8=σ_8(Ω_m/0.3)^{0.5}$ relation. In turn, $S_8$ is highly correlated with the intrinsic galaxy alignment amplitude $A_{\rm{IA}}$. For galaxy clustering, the bias $b_g$ is degenerate with both $σ_8$ and $Ω_m$, as well as the stochasticity $r_g$. Moreover, the redshift evolution of IA and bias can cause further parameter confusion. A tomographic 2-pt probe combination can partially lift these degeneracies. In this work we demonstrate that a deep learning analysis of combined probes of weak gravitational lensing and galaxy clustering, which we call DeepLSS, can effectively break these degeneracies and yield significantly more precise constraints on $σ_8$, $Ω_m$, $A_{\rm{IA}}$, $b_g$, $r_g$, and IA redshift evolution parameter $η_{\rm{IA}}$. The most significant gains are in the IA sector: the precision of $A_{\rm{IA}}$ is increased by approximately 8x and is almost perfectly decorrelated from $S_8$. Galaxy bias $b_g$ is improved by 1.5x, stochasticity $r_g$ by 3x, and the redshift evolution $η_{\rm{IA}}$ and $η_b$ by 1.6x. Breaking these degeneracies leads to a significant gain in constraining power for $σ_8$ and $Ω_m$, with the figure of merit improved by 15x. We give an intuitive explanation for the origin of this information gain using sensitivity maps. These results indicate that the fully numerical, map-based forward modeling approach to cosmological inference with machine learning may play an important role in upcoming LSS surveys. We discuss perspectives and challenges in its practical deployment for a full survey analysis.

CLMar 6, 2025Code
Generalized Interpolating Discrete Diffusion

Dimitri von Rütte, Janis Fluri, Yuhui Ding et al.

While state-of-the-art language models achieve impressive results through next-token prediction, they have inherent limitations such as the inability to revise already generated tokens. This has prompted exploration of alternative approaches such as discrete diffusion. However, masked diffusion, which has emerged as a popular choice due to its simplicity and effectiveness, reintroduces this inability to revise words. To overcome this, we generalize masked diffusion, deriving a new family of general interpolating discrete diffusion (GIDD) which offers greater flexibility in the design of the noising processes. Leveraging a novel diffusion ELBO, we achieve compute-matched state-of-the-art performance in diffusion language modeling. Exploiting GIDD's flexibility, we explore a hybrid approach combining masking and uniform noise, leading to improved sample quality and unlocking the ability for the model to correct its own mistakes, an area where autoregressive models notoriously have struggled. Code: https://github.com/dvruette/gidd/

LGDec 11, 2025
Scaling Behavior of Discrete Diffusion Language Models

Dimitri von Rütte, Janis Fluri, Omead Pooladzandi et al.

Modern LLM pre-training consumes vast amounts of compute and training data, making the scaling behavior, or scaling laws, of different models a key distinguishing factor. Discrete diffusion language models (DLMs) have been proposed as an alternative to autoregressive language models (ALMs). However, their scaling behavior has not yet been fully explored, with prior work suggesting that they require more data and compute to match the performance of ALMs. We study the scaling behavior of DLMs on different noise types by smoothly interpolating between masked and uniform diffusion while paying close attention to crucial hyperparameters such as batch size and learning rate. Our experiments reveal that the scaling behavior of DLMs strongly depends on the noise type and is considerably different from ALMs. While all noise types converge to similar loss values in compute-bound scaling, we find that uniform diffusion requires more parameters and less data for compute-efficient training compared to masked diffusion, making them a promising candidate in data-bound settings. We scale our uniform diffusion model up to 10B parameters trained for $10^{22}$ FLOPs, confirming the predicted scaling behavior and making it the largest publicly known uniform diffusion model to date.

COJan 27, 2018
Fast cosmic web simulations with generative adversarial networks

Andres C. Rodriguez, Tomasz Kacprzak, Aurelien Lucchi et al.

Dark matter in the universe evolves through gravity to form a complex network of halos, filaments, sheets and voids, that is known as the cosmic web. Computational models of the underlying physical processes, such as classical N-body simulations, are extremely resource intensive, as they track the action of gravity in an expanding universe using billions of particles as tracers of the cosmic matter distribution. Therefore, upcoming cosmology experiments will face a computational bottleneck that may limit the exploitation of their full scientific potential. To address this challenge, we demonstrate the application of a machine learning technique called Generative Adversarial Networks (GAN) to learn models that can efficiently generate new, physically realistic realizations of the cosmic web. Our training set is a small, representative sample of 2D image snapshots from N-body simulations of size 500 and 100 Mpc. We show that the GAN-generated samples are qualitatively and quantitatively very similar to the originals. For the larger boxes of size 500 Mpc, it is very difficult to distinguish them visually. The agreement of the power spectrum $P_k$ is 1-2\% for most of the range, between $k=0.06$ and $k=0.4$. An important advantage of generating cosmic web realizations with a GAN is the considerable gains in terms of computation time. Each new sample generated by a GAN takes a fraction of a second, compared to the many hours needed by traditional N-body techniques. We anticipate that the use of generative models such as GANs will therefore play an important role in providing extremely fast and precise simulations of cosmic web in the era of large cosmological surveys, such as Euclid and Large Synoptic Survey Telescope (LSST).