Joakim Jaldén

LG
h-index: 2
8 papers
81 citations
Novelty: 53%
AI Score: 45

8 Papers

52.6 ML May 19
Density-Ratio Losses for Post-Hoc Learning to Defer

Alexander Soen, Ragnar Thobaben, Joakim Jaldén et al.

We study post-hoc Learning to Defer (L2D) through the lens of ideal distributions: divergence-regularized reweightings of the data distribution under which a model attains low loss. We define deferral via the density ratio between a model's and an expert's ideals. Using the reduction from density-ratio estimation to class-probability estimation, we derive the DR CPE losses for post-hoc L2D scorers. Deferral decisions are then made by thresholding the scorer, allowing deferral rates to be adjusted without retraining. For KL-based ideal distributions, our deferral rules recover Chow's rule under the original distribution and a connection to an expert-tilted Bayes posterior -- which incorporates the expert's performance -- depending on whether the ideal distributions are joint or marginal distributions. Experimentally, our approach is competitive with common baselines and more robust across dataset settings. More broadly, our results cast post-hoc L2D as density-ratio learning between ideal distributions, bridging Chow-style rules and expert comparison, and elucidating connections to related learning settings including anomaly detection.
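The thresholding step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a scorer has already been trained and that a higher score means a stronger case for deferring to the expert; the function names `threshold_for_rate` and `defer_mask` are hypothetical and not taken from the paper.

```python
def threshold_for_rate(scores, target_rate):
    """Pick a threshold so that about `target_rate` of the scores lie strictly above it.

    Assumed convention: higher score = stronger case for deferral.
    """
    if not 0.0 <= target_rate <= 1.0:
        raise ValueError("target_rate must be in [0, 1]")
    ranked = sorted(scores, reverse=True)
    k = int(round(target_rate * len(ranked)))  # number of inputs to defer
    if k == 0:
        return float("inf")   # defer nothing
    if k >= len(ranked):
        return float("-inf")  # defer everything
    # Scores strictly above the (k+1)-th largest value are deferred.
    return ranked[k]


def defer_mask(scores, threshold):
    """Return True for each input that is handed to the expert."""
    return [s > threshold for s in scores]


# Hypothetical scorer outputs on five held-out inputs.
scores = [0.1, 0.9, 0.4, 0.7, 0.3]
mask_40 = defer_mask(scores, threshold_for_rate(scores, 0.4))
mask_20 = defer_mask(scores, threshold_for_rate(scores, 0.2))
```

Only the threshold changes between the two calls; the scores, and hence the scorer, are reused as they are. With tied scores at the cut-off, fewer than the requested fraction may be deferred.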

AP Dec 16, 2025
Restless Multi-Process Multi-Armed Bandits with Applications to Self-Driving Microscopies

Jaume Anguera Peris, Songtao Cheng, Hanzhao Zhang et al.

High-content screening microscopy generates large amounts of live-cell imaging data, yet its potential remains constrained by the inability to determine when and where to image most effectively. Optimally balancing acquisition time, computational capacity, and photobleaching budgets across thousands of dynamically evolving regions of interest remains an open challenge, further complicated by limited field-of-view adjustments and sensor sensitivity. Existing approaches either rely on static sampling or heuristics that neglect the dynamic evolution of biological processes, leading to inefficiencies and missed events. Here, we introduce the restless multi-process multi-armed bandit (RMPMAB), a new decision-theoretic framework in which each experimental region is modeled not as a single process but as an ensemble of Markov chains, thereby capturing the inherent heterogeneity of biological systems such as asynchronous cell cycles and heterogeneous drug responses. Building upon this foundation, we derive closed-form expressions for transient and asymptotic behaviors of aggregated processes, and design scalable Whittle index policies with sub-linear complexity in the number of imaging regions. Through both simulations and a real biological live-cell imaging dataset, we show that our approach achieves substantial improvements in throughput under resource constraints. Notably, our algorithm outperforms Thompson Sampling, Bayesian UCB, epsilon-Greedy, and Round Robin by reducing cumulative regret by more than 37% in simulations and capturing 93% more biologically relevant events in live imaging experiments, underscoring its potential for transformative smart microscopy. Beyond improving experimental efficiency, the RMPMAB framework unifies stochastic decision theory with optimal autonomous microscopy control, offering a principled approach to accelerate discovery across multidisciplinary sciences.

LG May 19, 2023
Marginalized Beam Search Algorithms for Hierarchical HMMs

Xuechun Xu, Joakim Jaldén

Inferring a state sequence from a sequence of measurements is a fundamental problem in bioinformatics and natural language processing. The Viterbi and the Beam Search (BS) algorithms are popular inference methods, but they have limitations when applied to Hierarchical Hidden Markov Models (HHMMs), where the interest lies in the outer state sequence. The Viterbi algorithm cannot infer outer states without inner states, while the BS algorithm requires marginalization over prohibitively large state spaces. We propose two new algorithms to overcome these limitations: the greedy marginalized BS algorithm and the local focus BS algorithm. We show that they approximate the most likely outer state sequence with higher performance than the Viterbi algorithm, and we evaluate the performance of these algorithms on an explicit duration HMM with simulation and nanopore base calling data.
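For context, the following is a minimal sketch of standard beam search on a flat (non-hierarchical) HMM. It is not the greedy marginalized or local focus variant proposed in the paper, and the two-state model and its probabilities are made-up toy values.

```python
import math


def beam_search(obs, states, log_init, log_trans, log_emit, beam_width):
    """Return (path, log_prob) of the best state path found while keeping
    at most `beam_width` partial paths after each observation."""
    # Score every possible first state, then prune to the beam width.
    beam = [((s,), log_init[s] + log_emit[s][obs[0]]) for s in states]
    beam = sorted(beam, key=lambda c: c[1], reverse=True)[:beam_width]
    for o in obs[1:]:
        candidates = []
        for path, lp in beam:
            prev = path[-1]
            for s in states:
                # Extend each surviving path by one state.
                score = lp + log_trans[prev][s] + log_emit[s][o]
                candidates.append((path + (s,), score))
        beam = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beam[0]


# Toy two-state HMM with binary observations (hypothetical values).
states = ("A", "B")
log_init = {"A": math.log(0.6), "B": math.log(0.4)}
log_trans = {
    "A": {"A": math.log(0.7), "B": math.log(0.3)},
    "B": {"A": math.log(0.4), "B": math.log(0.6)},
}
log_emit = {
    "A": {0: math.log(0.9), 1: math.log(0.1)},
    "B": {0: math.log(0.2), 1: math.log(0.8)},
}
path, log_prob = beam_search([0, 0, 1], states, log_init, log_trans, log_emit, beam_width=2)
```

With a beam width at least as large as the number of possible paths the search becomes exhaustive; smaller widths trade accuracy for speed.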

SP Jan 28, 2022
Inertial Navigation Using an Inertial Sensor Array

Håkan Carlsson, Isaac Skog, Gustaf Hendeby et al.

We present a comprehensive framework for fusing measurements from multiple and generally placed accelerometers and gyroscopes to perform inertial navigation. Using the angular acceleration provided by the accelerometer array, we show that the numerical integration of the orientation can be done with second-order accuracy, which improves on the traditional first-order accuracy achievable when using only the gyroscopes. Since orientation errors are the most significant error source in inertial navigation, improving the orientation estimation reduces the overall navigation error. The practical performance benefit depends on prior knowledge of the inertial sensor array, and therefore we present four different state-space models using different underlying assumptions regarding the orientation modeling. The models are evaluated using a Lie Group Extended Kalman filter through simulations and real-world experiments. We also show how individual accelerometer biases are unobservable and can be replaced by a six-dimensional bias term whose dimension is fixed and independent of the number of accelerometers.

SP Oct 16, 2020
Reinforcement Learning for Efficient and Tuning-Free Link Adaptation

Vidit Saxena, Hugo Tullberg, Joakim Jaldén

Wireless links adapt the data transmission parameters to the dynamic channel state -- this is called link adaptation. Classical link adaptation relies on tuning parameters that are challenging to configure for optimal link performance. Recently, reinforcement learning has been proposed to automate link adaptation, where the transmission parameters are modeled as discrete arms of a multi-armed bandit. In this context, we propose a latent learning model for link adaptation that exploits the correlation between data transmission parameters. Further, motivated by the recent success of Thompson sampling for multi-armed bandit problems, we propose a latent Thompson sampling (LTS) algorithm that quickly learns the optimal parameters for a given channel state. We extend LTS to fading wireless channels through a tuning-free mechanism that automatically tracks the channel dynamics. In numerical evaluations with fading wireless channels, LTS improves the link throughput by up to 100% compared to the state-of-the-art link adaptation algorithms.

SP Jun 15, 2020
Deep unfolding of the weighted MMSE beamforming algorithm

Lissy Pellaco, Mats Bengtsson, Joakim Jaldén

Downlink beamforming is a key technology for cellular networks. However, computing the transmit beamformer that maximizes the weighted sum rate subject to a power constraint is an NP-hard problem. As a result, iterative algorithms that converge to a local optimum are used in practice. Among them, the weighted minimum mean square error (WMMSE) algorithm has gained popularity, but its computational complexity and consequent latency have motivated the need for lower-complexity approximations at the expense of performance. Motivated by the recent success of deep unfolding in the trade-off between complexity and performance, we propose the novel application of deep unfolding to the WMMSE algorithm for a MISO downlink channel. The main idea consists of mapping a fixed number of iterations of the WMMSE algorithm into trainable neural network layers, whose architecture reflects the structure of the original algorithm. With respect to traditional end-to-end learning, deep unfolding naturally incorporates expert knowledge, with the benefits of immediate and well-grounded architecture selection, fewer trainable parameters, and better explainability. However, the formulation of the WMMSE algorithm, as described in Shi et al., is not amenable to unfolding due to a matrix inversion, an eigendecomposition, and a bisection search performed at each iteration. Therefore, we present an alternative formulation that circumvents these operations by resorting to projected gradient descent. By means of simulations, we show that, in most of the settings, the unfolded WMMSE outperforms or performs equally to the WMMSE for a fixed number of iterations, with the advantage of a lower computational load.
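To illustrate the generic building block named in the abstract, here is a minimal sketch of projected gradient descent under a total power constraint. It is not the paper's WMMSE reformulation: the objective is a toy quadratic, the vector is real-valued rather than a complex beamformer, and the function names are hypothetical.

```python
import math


def project_to_power_ball(v, power):
    """Euclidean projection onto the set {v : ||v||^2 <= power}."""
    norm = math.sqrt(sum(x * x for x in v))
    limit = math.sqrt(power)
    if norm <= limit:
        return list(v)  # already feasible
    return [x * limit / norm for x in v]  # rescale onto the boundary


def projected_gradient_descent(grad, v0, power, step, iters):
    """Run a fixed number of gradient steps, each followed by a projection."""
    v = project_to_power_ball(v0, power)
    for _ in range(iters):
        g = grad(v)
        v = project_to_power_ball([x - step * gx for x, gx in zip(v, g)], power)
    return v


# Toy objective: minimize ||v - target||^2 subject to ||v||^2 <= 1.
target = [3.0, 4.0]


def grad(v):
    return [2.0 * (x - t) for x, t in zip(v, target)]


v_opt = projected_gradient_descent(grad, [0.0, 0.0], power=1.0, step=0.25, iters=50)
```

The projection is a simple rescaling, so each iteration needs no matrix inversion, eigendecomposition, or bisection search. In an unfolded network, each loop iteration would presumably correspond to one layer, with quantities such as the step size made trainable.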

LG Apr 20, 2020
Thompson Sampling for Linearly Constrained Bandits

Vidit Saxena, Joseph E. Gonzalez, Joakim Jaldén

We address multi-armed bandits (MAB) where the objective is to maximize the cumulative reward under a probabilistic linear constraint. For a few real-world instances of this problem, constrained extensions of the well-known Thompson Sampling (TS) heuristic have recently been proposed. However, finite-time analysis of constrained TS is challenging; as a result, only O(\sqrt{T}) bounds on the cumulative reward loss (i.e., the regret) are available. In this paper, we describe LinConTS, a TS-based algorithm for bandits that place a linear constraint on the probability of earning a reward in every round. We show that for LinConTS, the regret as well as the cumulative constraint violations are upper bounded by O(\log T) for the suboptimal arms. We develop a proof technique that relies on careful analysis of the dual problem and combine it with recent theoretical work on unconstrained TS. Through numerical experiments on two real-world datasets, we demonstrate that LinConTS outperforms an asymptotically optimal upper confidence bound (UCB) scheme in terms of simultaneously minimizing the regret and the violation.

LG Feb 28, 2019
Constrained Thompson Sampling for Wireless Link Optimization

Vidit Saxena, Joseph E. Gonzalez, Ion Stoica et al.

Wireless communication systems operate in complex time-varying environments. Therefore, selecting the optimal configuration parameters in these systems is a challenging problem. For wireless links, rate selection is used to select the optimal data transmission rate that maximizes the link throughput subject to an application-defined latency constraint. We model rate selection as a stochastic multi-armed bandit (MAB) problem, where a finite set of transmission rates are modeled as independent bandit arms. For this setup, we propose Con-TS, a novel constrained version of the Thompson sampling algorithm, where the latency requirement is modeled by a high-probability linear constraint. We show that for Con-TS, the expected number of constraint violations over T transmission intervals is upper bounded by O(\sqrt{KT}), where K is the number of available rates. Further, the expected loss in cumulative throughput compared to the optimal rate selection scheme (i.e., the regret) is also upper bounded by O(\sqrt{KT \log K}). Through numerical simulations, we demonstrate that Con-TS significantly outperforms state-of-the-art bandit schemes for rate selection.
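As background for the bandit formulation, the sketch below shows plain (unconstrained) Beta-Bernoulli Thompson sampling applied to rate selection. It omits the latency constraint entirely, so it is not Con-TS. The rates, success probabilities, and function name are hypothetical.

```python
import random


def thompson_rate_selection(rates, success_prob, horizon, seed=0):
    """Simulate unconstrained Thompson sampling over transmission rates.

    Each rate is an arm whose transmission succeeds with an unknown
    probability; the reward of a success is the rate itself.
    Returns the number of times each rate was selected.
    """
    rng = random.Random(seed)
    n = len(rates)
    alpha = [1.0] * n  # Beta posterior parameter: 1 + observed successes
    beta = [1.0] * n   # Beta posterior parameter: 1 + observed failures
    pulls = [0] * n
    for _ in range(horizon):
        # Sample a plausible success probability for every rate.
        samples = [rng.betavariate(alpha[k], beta[k]) for k in range(n)]
        # Pick the rate with the highest sampled expected throughput.
        k = max(range(n), key=lambda i: rates[i] * samples[i])
        # Simulated channel feedback (ACK or NACK).
        if rng.random() < success_prob[k]:
            alpha[k] += 1.0
        else:
            beta[k] += 1.0
        pulls[k] += 1
    return pulls


# Three hypothetical rates; the middle one has the best expected throughput.
pulls = thompson_rate_selection([1.0, 2.0, 4.0], [0.95, 0.8, 0.1], horizon=2000)
```

A constrained variant would additionally have to restrict the choice in each round so that the latency requirement holds with high probability, which this sketch does not attempt.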