PR · Aug 19, 2010
Spectral measure and approximation of homogenized coefficients
Antoine Gloria, Jean-Christophe Mourrat
This article deals with the numerical approximation of effective coefficients in stochastic homogenization of discrete linear elliptic equations. The originality of this work is the use of a well-known abstract spectral representation formula to design and analyze effective and computable approximations of the homogenized coefficients. In particular, we show that information on the edge of the spectrum of the generator of the environment viewed by the particle projected on the local drift yields bounds on the approximation error, and conversely. Combined with results by Otto and the first author in low dimension, and results by the second author in high dimension, this allows us to prove that for any dimension, there exists an explicit numerical strategy to approximate homogenized coefficients which converges at the rate of the central limit theorem.
NA · Oct 31, 2017
Efficient methods for the estimation of homogenized coefficients
Jean-Christophe Mourrat
The main goal of this paper is to define and study new methods for the computation of effective coefficients in the homogenization of divergence-form operators with random coefficients. The methods introduced here are proved to have optimal computational complexity, and are shown numerically to display small constant prefactors. In the spirit of multiscale methods, the main idea is to rely on a progressive coarsening of the problem, which we implement via a generalization of the Green-Kubo formula. The technique can be applied more generally to compute the effective diffusivity of any additive functional of a Markov process. In this broader context, we also discuss the alternative possibility of using Monte-Carlo sampling, and show how a simple one-step extrapolation can considerably improve the performance of this alternative method.
PR · Jul 22, 2013
Quantitative version of the Kipnis-Varadhan theorem and Monte Carlo approximation of homogenized coefficients
Antoine Gloria, Jean-Christophe Mourrat
This article is devoted to the analysis of a Monte Carlo method to approximate effective coefficients in stochastic homogenization of discrete elliptic equations. We consider the case of independent and identically distributed coefficients, and adopt the point of view of the random walk in a random environment. Given some final time t>0, a natural approximation of the homogenized coefficients is given by the empirical average of the final squared positions re-scaled by t of n independent random walks in n independent environments. Relying on a quantitative version of the Kipnis-Varadhan theorem combined with estimates of spectral exponents obtained by an original combination of PDE arguments and spectral theory, we first give a sharp estimate of the error between the homogenized coefficients and the expectation of the re-scaled final position of the random walk in terms of t. We then complete the error analysis by quantifying the fluctuations of the empirical average in terms of n and t, and prove a large-deviation estimate, as well as a central limit theorem. Our estimates are optimal, up to a logarithmic correction in dimension 2.
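The Monte Carlo estimator described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the walk is one-dimensional and runs in continuous time with jump rates equal to the edge conductances, the conductances are i.i.d. uniform on [1, 2], and the squared final position is divided by 2t so that the limit is the homogenized coefficient itself. The function names are hypothetical. In one dimension the homogenized coefficient is the harmonic mean of the conductances, which gives an exact value to compare against.

```python
import math
import random

# Exact 1D value for conductances uniform on [1, 2]:
# harmonic mean 1 / E[1/a] = 1 / ln 2.
EXACT_1D = 1.0 / math.log(2.0)


def simulate_walk(t_final, rng):
    """Final position of one walk on Z in a fresh i.i.d. environment.

    Assumption: continuous-time walk whose jump rate across an edge
    equals the conductance of that edge.
    """
    conductances = {}  # edge k joins sites k and k + 1; sampled lazily

    def a(k):
        if k not in conductances:
            conductances[k] = rng.uniform(1.0, 2.0)
        return conductances[k]

    x, clock = 0, 0.0
    while True:
        left, right = a(x - 1), a(x)
        # Holding time is exponential with rate equal to the total jump rate.
        clock += rng.expovariate(left + right)
        if clock > t_final:
            return x
        # Jump right with probability proportional to the right conductance.
        x += 1 if rng.random() < right / (left + right) else -1


def estimate_homogenized(n_walks, t_final, seed=0):
    """Empirical average of X_t^2 / (2 t) over n independent walks,
    each run in its own independent environment."""
    rng = random.Random(seed)
    total = sum(simulate_walk(t_final, rng) ** 2 for _ in range(n_walks))
    return total / (2.0 * t_final * n_walks)


if __name__ == "__main__":
    print("estimate:", estimate_homogenized(2000, 50.0))
    print("exact 1D value:", EXACT_1D)
```

The two error sources analyzed in the abstract are both visible in this sketch: the systematic error depends only on the final time t, while the statistical fluctuation of the empirical average shrinks as the number of walks n grows.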
LG · Jan 24, 2025
Humanity's Last Exam
Long Phan, Alice Gatti, Ziwen Han et al. · amazon-science, apple-ml
Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. HLE consists of 2,500 questions across dozens of subjects, including mathematics, humanities, and the natural sciences. HLE is developed globally by subject-matter experts and consists of multiple-choice and short-answer questions suitable for automated grading. Each question has a known solution that is unambiguous and easily verifiable, but cannot be quickly answered via internet retrieval. State-of-the-art LLMs demonstrate low accuracy and calibration on HLE, highlighting a significant gap between current LLM capabilities and the expert human frontier on closed-ended academic questions. To inform research and policymaking upon a clear understanding of model capabilities, we publicly release HLE at https://lastexam.ai.
LG · Sep 20, 2021
Local versions of sum-of-norms clustering
Alexander Dunlap, Jean-Christophe Mourrat
Sum-of-norms clustering is a convex optimization problem whose solution can be used for the clustering of multivariate data. We propose and study a localized version of this method, and show in particular that it can separate arbitrarily close balls in the stochastic ball model. More precisely, we prove a quantitative bound on the error incurred in the clustering of disjoint connected sets. Our bound is expressed in terms of the number of datapoints and the localization length of the functional.
LG · Apr 28, 2021
Sum-of-norms clustering does not separate nearby balls
Alexander Dunlap, Jean-Christophe Mourrat
Sum-of-norms clustering is a popular convexification of $K$-means clustering. We show that, if the dataset is made of a large number of independent random variables distributed according to the uniform measure on the union of two disjoint balls of unit radius, and if the balls are sufficiently close to one another, then sum-of-norms clustering will typically fail to recover the decomposition of the dataset into two clusters. As the dimension tends to infinity, this happens even when the distance between the centers of the two balls is taken to be as large as $2\sqrt{2}$. In order to show this, we introduce and analyze a continuous version of sum-of-norms clustering, where the dataset is replaced by a general measure. In particular, we state and prove a local-global characterization of the clustering that seems to be new even in the case of discrete datapoints.
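For orientation, the discrete sum-of-norms clustering problem is usually written in the following form. This is the common textbook formulation and not necessarily the normalization used in the paper; the symbols $a_i$ (datapoints), $x_i$ (representatives), $N$ (number of datapoints), and $\lambda$ (fusion parameter) are notation assumed here.

$$
\min_{x_1, \dots, x_N \in \mathbb{R}^d} \; \frac{1}{2} \sum_{i=1}^{N} |x_i - a_i|^2 \;+\; \lambda \sum_{1 \le i < j \le N} |x_i - x_j|, \qquad \lambda > 0 .
$$

Two datapoints $a_i$ and $a_j$ are assigned to the same cluster when the minimizer satisfies $x_i = x_j$. The objective is strictly convex, so the minimizer is unique, and larger values of $\lambda$ yield coarser clusterings.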