Javier Zazo

LG
h-index14
9papers
108citations
Novelty61%
AI Score50

9 Papers

SYDec 28, 2015
Dynamic Potential Games in Communications: Fundamentals and Applications

Santiago Zazo, Sergio Valcarcel Macua, Matilde Sánchez-Fernández et al. · microsoft-research

In a noncooperative dynamic game, multiple agents operating in a changing environment aim to optimize their utilities over an infinite time horizon. Time-varying environments allow to model more realistic scenarios (e.g., mobile devices equipped with batteries, wireless communications over a fading channel, etc.). However, solving a dynamic game is a difficult task that requires dealing with multiple coupled optimal control problems. We focus our analysis on a class of problems, named \textit{dynamic potential games}, whose solution can be found through a single multivariate optimal control problem. Our analysis generalizes previous studies by considering that the set of environment's states and the set of players' actions are constrained, as it is required by most of the applications. And the theoretical results are the natural extension of the analysis for static potential games. We apply the analysis and provide numerical methods to solve four key example problems, with different features each: energy demand control in a smart-grid network, network flow optimization in which the relays have bounded link capacity and limited battery life, uplink multiple access communication with users that have to optimize the use of their batteries, and two optimal scheduling games with nonstationary channels.

QMApr 26, 2023
AIRIVA: A Deep Generative Model of Adaptive Immune Repertoires

Melanie F. Pradier, Niranjani Prasad, Paidamoyo Chapfuwa et al. · microsoft-research

Recent advances in immunomics have shown that T-cell receptor (TCR) signatures can accurately predict active or recent infection by leveraging the high specificity of TCR binding to disease antigens. However, the extreme diversity of the adaptive immune repertoire presents challenges in reliably identifying disease-specific TCRs. Population genetics and sequencing depth can also have strong systematic effects on repertoires, which requires careful consideration when developing diagnostic models. We present an Adaptive Immune Repertoire-Invariant Variational Autoencoder (AIRIVA), a generative model that learns a low-dimensional, interpretable, and compositional representation of TCR repertoires to disentangle such systematic effects in repertoires. We apply AIRIVA to two infectious disease case-studies: COVID-19 (natural infection and vaccination) and the Herpes Simplex Virus (HSV-1 and HSV-2), and empirically show that we can disentangle the individual disease signals. We further demonstrate AIRIVA's capability to: learn from unlabelled samples; generate in-silico TCR repertoires by intervening on the latent factors; and identify disease-associated TCRs validated using TCR annotations from external assay data.

OCMay 5, 2016
Robust Worst-Case Analysis of Demand-Side Management in Smart Grids

Javier Zazo, Santiago Zazo, Sergio Valcarcel Macua

Demand-side management presents significant benefits in reducing the energy load in smart grids by balancing consumption demands or including energy generation and/or storage devices in the user's side. These techniques coordinate the energy load so that users minimize their monetary expenditure. However, these methods require accurate predictions in the energy consumption profiles, which make them inflexible to real demand variations. In this paper we propose a realistic model that accounts for uncertainty in these variations and calculates a robust price for all users in the smart grid. We analyze the existence of solutions for this novel scenario, propose convergent distributed algorithms to find them, and perform simulations considering energy expenditure. We show that this model can effectively reduce the monetary expenses for all users in a real-time market, while at the same time it provides a reliable production cost estimate to the energy supplier.

MLMay 18
Forward-Learned Discrete Diffusion: Learning how to noise to denoise faster

Grigory Bartosh, Teodora Pandeva, Sushrut Karmalkar et al.

Discrete diffusion models are a powerful class of generative models with strong performance across many domains. For efficiency, however, discrete diffusion typically parameterizes the generative (reverse) process with factorized distributions, which makes it difficult for the model to learn the target process in a small number of steps and necessitates a long, computationally expensive sampling procedure. To reduce the gap between the target and model distributions and enable few-step generation, we propose Forward-Learned Discrete Diffusion (FLDD), which introduces discrete diffusion with a learnable forward (noising) process. Rather than fixing a Markovian forward chain, we adopt a non-Markovian formulation with learnable marginal and posterior distributions. This allows the generative process to remain factorized while matching the target defined by the noising process. We train all parameters end-to-end under the standard variational objective. Experiments on various benchmarks show that, for a given number of sampling steps, our approach produces a higher quality samples than conventional discrete diffusion models using the same reverse parameterization.

LGFeb 9
ARO: A New Lens On Matrix Optimization For Large Models

Wenbo Gong, Javier Zazo, Qijun Luo et al.

Matrix-based optimizers have attracted growing interest for improving LLM training efficiency, with significant progress centered on orthogonalization/whitening based methods. While yielding substantial performance gains, a fundamental question arises: can we develop new paradigms beyond orthogonalization, pushing the efficiency frontier further? We present \textbf{Adaptively Rotated Optimization (ARO}, a new matrix optimization framework that treats gradient rotation as a first class design principle. ARO accelerates LLM training by performing normed steepest descent in a rotated coordinate system, where the rotation is determined by a novel norm-informed policy. This perspective yields update rules that go beyond existing orthogonalization and whitening optimizers, improving sample efficiency in practice. To make comparisons reliable, we propose a rigorously controlled benchmarking protocol that reduces confounding and bias. Under this protocol, ARO consistently outperforms AdamW (by 1.3 $\sim$1.35$\times$) and orthogonalization methods (by 1.1$\sim$1.15$\times$) in LLM pretraining at up to 8B activated parameters, and up to $8\times$ overtrain budget, without evidence of diminishing returns. Finally, we discuss how ARO can be reformulated as a symmetry-aware optimizer grounded in rotational symmetries of residual streams, motivating advanced designs that enable computationally efficient exploitation of cross-layer/cross module couplings.

LGOct 24, 2025
Parallel Sampling from Masked Diffusion Models via Conditional Independence Testing

Iskander Azangulov, Teodora Pandeva, Niranjani Prasad et al.

Masked diffusion models (MDMs) offer a compelling alternative to autoregressive models (ARMs) for discrete text generation because they enable parallel token sampling, rather than sequential, left-to-right generation. This means potentially much faster inference. However, effective parallel sampling faces two competing requirements: (i) simultaneously updated tokens must be conditionally independent, and (ii) updates should prioritise high-confidence predictions. These goals conflict because high-confidence predictions often cluster and depend on each other, opportunities for parallel updates. We present PUNT, a model-agnostic sampler that reconciles this trade-off. Our method identifies token dependencies and removes lower-confidence tokens from conflicting groups. This produces sets of indices for unmasking that satisfy both independence and confidence criteria. Our approach ensures improved parallel unmasking through approximate conditional independence testing. Our experiments show that PUNT delivers a superior trade-off between accuracy and compute when compared to other strong training-free baselines, especially for generation of longer sequences. On the IFEval benchmark, it achieves up to 16\% higher accuracy over baseline methods, including sequential generation (one-by-one). These gains hold across different values of hyperparameters, mitigating the need for brittle hyperparameter tuning. Moreover, we observe that PUNT induces an emergent hierarchical generation strategy, where the model first establishes high-level paragraph structure before local refinement, suggesting a planning-like generation process that contributes to strong alignment performance.

LGJan 13, 2021
Preferential Mixture-of-Experts: Interpretable Models that Rely on Human Expertise as much as Possible

Melanie F. Pradier, Javier Zazo, Sonali Parbhoo et al.

We propose Preferential MoE, a novel human-ML mixture-of-experts model that augments human expertise in decision making with a data-based classifier only when necessary for predictive performance. Our model exhibits an interpretable gating function that provides information on when human rules should be followed or avoided. The gating function is maximized for using human-based rules, and classification errors are minimized. We propose solving a coupled multi-objective problem with convex subproblems. We develop approximate algorithms and study their performance and convergence. Finally, we demonstrate the utility of Preferential MoE on two clinical applications for the treatment of Human Immunodeficiency Virus (HIV) and management of Major Depressive Disorder (MDD).

LGJul 23, 2019
Convolutional Dictionary Learning in Hierarchical Networks

Javier Zazo, Bahareh Tolooshams, Demba Ba

Filter banks are a popular tool for the analysis of piecewise smooth signals such as natural images. Motivated by the empirically observed properties of scale and detail coefficients of images in the wavelet domain, we propose a hierarchical deep generative model of piecewise smooth signals that is a recursion across scales: the low pass scale coefficients at one layer are obtained by filtering the scale coefficients at the next layer, and adding a high pass detail innovation obtained by filtering a sparse vector. This recursion describes a linear dynamic system that is a non-Gaussian Markov process across scales and is closely related to multilayer-convolutional sparse coding (ML-CSC) generative model for deep networks, except that our model allows for deeper architectures, and combines sparse and non-sparse signal representations. We propose an alternating minimization algorithm for learning the filters in this hierarchical model given observations at layer zero, e.g., natural images. The algorithm alternates between a coefficient-estimation step and a filter update step. The coefficient update step performs sparse (detail) and smooth (scale) coding and, when unfolded, leads to a deep neural network. We use MNIST to demonstrate the representation capabilities of the model, and its derived features (coefficients) for classification.

MAFeb 3, 2018
Learning Parametric Closed-Loop Policies for Markov Potential Games

Sergio Valcarcel Macua, Javier Zazo, Santiago Zazo

Multiagent systems where agents interact among themselves and with a stochastic environment can be formalized as stochastic games. We study a subclass named Markov potential games (MPGs) that appear often in economic and engineering applications when the agents share a common resource. We consider MPGs with continuous state-action variables, coupled constraints and nonconvex rewards. Previous analysis followed a variational approach that is only valid for very simple cases (convex rewards, invertible dynamics, and no coupled constraints); or considered deterministic dynamics and provided open-loop (OL) analysis, studying strategies that consist in predefined action sequences, which are not optimal for stochastic environments. We present a closed-loop (CL) analysis for MPGs and consider parametric policies that depend on the current state. We provide easily verifiable, sufficient and necessary conditions for a stochastic game to be an MPG, even for complex parametric functions (e.g., deep neural networks); and show that a closed-loop Nash equilibrium (NE) can be found (or at least approximated) by solving a related optimal control problem (OCP). This is useful since solving an OCP--which is a single-objective problem--is usually much simpler than solving the original set of coupled OCPs that form the game--which is a multiobjective control problem. This is a considerable improvement over the previously standard approach for the CL analysis of MPGs, which gives no approximate solution if no NE belongs to the chosen parametric family, and which is practical only for simple parametric forms. We illustrate the theoretical contributions with an example by applying our approach to a noncooperative communications engineering game. We then solve the game with a deep reinforcement learning algorithm that learns policies that closely approximates an exact variational NE of the game.