LG Feb 23
Addressing Instrument-Outcome Confounding in Mendelian Randomization through Representation Learning
Shimeng Huang, Matthew Robinson, Francesco Locatello
Mendelian Randomization (MR) is a prominent observational epidemiological research method designed to address unobserved confounding when estimating causal effects. However, core assumptions -- particularly the independence between instruments and unobserved confounders -- are often violated due to population stratification or assortative mating. Leveraging the increasing availability of multi-environment data, we propose a representation learning framework that exploits cross-environment invariance to recover latent exogenous components of genetic instruments. We provide theoretical guarantees for identifying these latent instruments under various mixing mechanisms and demonstrate the effectiveness of our approach through simulations and semi-synthetic experiments using data from the All of Us Research Hub.
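As a minimal illustration of why the instrument-confounder independence assumption matters, the sketch below simulates a single instrument and compares the classical Wald ratio estimate with and without leakage from the confounder into the instrument. It is not the representation learning method from the abstract; the data-generating model, the `leak` parameter, and the function names are assumptions made for this example.

```python
import random


def wald_ratio(z, x, y):
    """Wald ratio estimate cov(z, y) / cov(z, x) for a single instrument z."""
    n = len(z)
    mz, mx, my = sum(z) / n, sum(x) / n, sum(y) / n
    cov_zy = sum((a - mz) * (b - my) for a, b in zip(z, y)) / n
    cov_zx = sum((a - mz) * (b - mx) for a, b in zip(z, x)) / n
    return cov_zy / cov_zx


def simulate(n, leak, beta=0.5, seed=0):
    """Draw (instrument, exposure, outcome) samples from a toy linear model.

    leak = 0 gives a valid instrument; leak > 0 makes the instrument depend
    on the unobserved confounder u (hypothetical stand-in for, e.g.,
    population stratification).
    """
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)                    # unobserved confounder
        zi = rng.gauss(0.0, 1.0) + leak * u        # instrument
        xi = zi + u + rng.gauss(0.0, 1.0)          # exposure
        yi = beta * xi + u + rng.gauss(0.0, 1.0)   # outcome, true effect beta
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


valid = wald_ratio(*simulate(20000, leak=0.0))
biased = wald_ratio(*simulate(20000, leak=1.0))
print(f"valid instrument:      {valid:.3f}")
print(f"confounded instrument: {biased:.3f}")
```

With `leak=0.0` the estimate should land close to the true effect `beta`; with `leak=1.0` it drifts upward, because the instrument now carries part of the confounder's influence on the outcome.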
55.3 IT Apr 7
Multilevel Coset Codes on Lattices
Leopold Bertholet, Chloe Makdad, Stephen Mackes et al.
This work introduces coset Bombe codes, a novel class of multilevel coset codes that generalize polar codes to dense lattice structures. By combining multilevel coding, using non-binary codes designed for the lattice modulations, with Voronoi shaping, Bombe codes integrate the geometric strengths of dense lattices such as $D_4$ with the capacity-approaching properties of polar codes. Experimental results on additive white Gaussian noise (AWGN) channels demonstrate that coset Bombe codes significantly outperform state-of-the-art BICM and MLC schemes on 16-QAM. In AWGN simulations, the proposed scheme achieves up to 0.8 dB of gain and reduces block-size latency by half while maintaining superior bit and block error rate (BER/BLER) performance on codewords of 256 and 1024 bits.
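For readers unfamiliar with $D_4$, the sketch below shows the standard nearest-point quantizer for the lattice (integer vectors whose coordinates sum to an even number), a common building block for lattice decoding and Voronoi shaping. It is a generic textbook routine, not the Bombe code construction; the function name is an assumption made for this example.

```python
import math


def nearest_d4_point(x):
    """Return the point of D4 closest to the real 4-vector x.

    D4 is the set of integer vectors with an even coordinate sum.
    Round every coordinate to the nearest integer; if the sum is odd,
    re-round the coordinate with the largest rounding error the other way.
    """
    r = [math.floor(v + 0.5) for v in x]
    if sum(r) % 2 == 0:
        return r
    # Coordinate where rounding was least certain.
    k = max(range(len(x)), key=lambda i: abs(x[i] - r[i]))
    # Move that coordinate to its second-nearest integer.
    r[k] += 1 if x[k] > r[k] else -1
    return r


print(nearest_d4_point([0.6, 0.6, 0.1, 0.1]))
print(nearest_d4_point([0.1, 0.2, 0.1, 0.9]))
```

In the second call, plain rounding gives an odd coordinate sum, so the coordinate with the largest rounding error is moved to its second-nearest integer to restore even parity.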
LG Oct 3, 2025
Learning Explicit Single-Cell Dynamics Using ODE Representations
Jan-Philipp von Bassewitz, Adeel Pervez, Marco Fumero et al.
Modeling the dynamics of cellular differentiation is fundamental to advancing the understanding and treatment of diseases associated with this process, such as cancer. With the rapid growth of single-cell datasets, this has also become a particularly promising and active domain for machine learning. Current state-of-the-art models, however, rely on computationally expensive optimal transport preprocessing and multi-stage training, and they do not discover explicit gene interactions. To address these challenges, we propose Cell-Mechanistic Neural Networks (Cell-MNN), an encoder-decoder architecture whose latent representation is a locally linearized ODE governing the dynamics of cellular evolution from stem to tissue cells. Cell-MNN is fully end-to-end (apart from a standard PCA preprocessing step), and its ODE representation explicitly learns biologically consistent and interpretable gene interactions. Empirically, we show that Cell-MNN achieves competitive performance on single-cell benchmarks, surpasses state-of-the-art baselines in scaling to larger datasets and in joint training across multiple datasets, and learns interpretable gene interactions, which we validate against the TRRUST database of gene interactions.