Javier M. Romero

h-index41

11papers

2,057citations

Novelty52%

AI Score34

Ranked #113,910 of 194,257 authors (top 59%)#38,028 in CV (top 64%)

11 Papers

3.7CVOct 11, 2022

Human Body Measurement Estimation with Adversarial Augmentation

Nataniel Ruiz, Miriam Bellver, Timo Bolkart et al. · amazon-science

We present a Body Measurement network (BMnet) for estimating 3D anthropomorphic measurements of the human body shape from silhouette images. Training of BMnet is performed on data from real human subjects, and augmented with a novel adversarial body simulator (ABS) that finds and synthesizes challenging body shapes. ABS is based on the skinned multiperson linear (SMPL) body model, and aims to maximize BMnet measurement prediction error with respect to latent SMPL shape parameters. ABS is fully differentiable with respect to these parameters, and trained end-to-end via backpropagation with BMnet in the loop. Experiments show that ABS effectively discovers adversarial examples, such as bodies with extreme body mass indices (BMI), consistent with the rarity of extreme-BMI bodies in BMnet's training set. Thus ABS is able to reveal gaps in training data and potential failures in predicting under-represented body shapes. Results show that training BMnet with ABS improves measurement prediction accuracy on real bodies by up to 10%, when compared to no augmentation or random body shape sampling. Furthermore, our method significantly outperforms SOTA measurement estimation methods by as much as 3x. Finally, we release BodyM, the first challenging, large-scale dataset of photo silhouettes and body measurements of real human subjects, to further promote research in this area. Project website: https://adversarialbodysim.github.io

18.3GRJun 30, 2022

Dressing Avatars: Deep Photorealistic Appearance for Physically Simulated Clothing

Donglai Xiang, Timur Bagautdinov, Tuur Stuyck et al.

Despite recent progress in developing animatable full-body avatars, realistic modeling of clothing - one of the core aspects of human self-expression - remains an open challenge. State-of-the-art physical simulation methods can generate realistically behaving clothing geometry at interactive rates. Modeling photorealistic appearance, however, usually requires physically-based rendering which is too expensive for interactive applications. On the other hand, data-driven deep appearance models are capable of efficiently producing realistic appearance, but struggle at synthesizing geometry of highly dynamic clothing and handling challenging body-clothing configurations. To this end, we introduce pose-driven avatars with explicit modeling of clothing that exhibit both photorealistic appearance learned from real-world data and realistic clothing dynamics. The key idea is to introduce a neural clothing appearance model that operates on top of explicit geometry: at training time we use high-fidelity tracking, whereas at animation time we rely on physically simulated geometry. Our core contribution is a physically-inspired appearance network, capable of generating photorealistic appearance with view-dependent and dynamic shadowing effects even for unseen body-clothing configurations. We conduct a thorough evaluation of our model and demonstrate diverse animation results on several subjects and different types of clothing. Unlike previous work on photorealistic full-body avatars, our approach can produce much richer dynamics and more realistic deformations even for many examples of loose clothing. We also demonstrate that our formulation naturally allows clothing to be used with avatars of different people while staying fully animatable, thus enabling, for the first time, photorealistic avatars with novel clothing.

6.5CVMar 25, 2022Code

AutoAvatar: Autoregressive Neural Fields for Dynamic Avatar Modeling

Ziqian Bai, Timur Bagautdinov, Javier Romero et al.

Neural fields such as implicit surfaces have recently enabled avatar modeling from raw scans without explicit temporal correspondences. In this work, we exploit autoregressive modeling to further extend this notion to capture dynamic effects, such as soft-tissue deformations. Although autoregressive models are naturally capable of handling dynamics, it is non-trivial to apply them to implicit representations, as explicit state decoding is infeasible due to prohibitive memory requirements. In this work, for the first time, we enable autoregressive modeling of implicit avatars. To reduce the memory bottleneck and efficiently model dynamic implicit surfaces, we introduce the notion of articulated observer points, which relate implicit states to the explicit surface of a parametric human body model. We demonstrate that encoding implicit surfaces as a set of height fields defined on articulated observer points leads to significantly better generalization compared to a latent representation. The experiments show that our approach outperforms the state of the art, achieving plausible dynamic deformations even for unseen motions. https://zqbai-jeremy.github.io/autoavatar

9.1CVNov 10, 2023

Diffusion Shape Prior for Wrinkle-Accurate Cloth Registration

Jingfan Guo, Fabian Prada, Donglai Xiang et al.

Registering clothes from 4D scans with vertex-accurate correspondence is challenging, yet important for dynamic appearance modeling and physics parameter estimation from real-world data. However, previous methods either rely on texture information, which is not always reliable, or achieve only coarse-level alignment. In this work, we present a novel approach to enabling accurate surface registration of texture-less clothes with large deformation. Our key idea is to effectively leverage a shape prior learned from pre-captured clothing using diffusion models. We also propose a multi-stage guidance scheme based on learned functional maps, which stabilizes registration for large-scale deformation even when they vary significantly from training data. Using high-fidelity real captured clothes, our experiments show that the proposed approach based on diffusion models generalizes better than surface registration with VAE or PCA-based priors, outperforming both optimization-based and learning-based non-rigid registration methods for both interpolation and extrapolation tests.

6.2AIJun 23, 2022Code

plingo: A system for probabilistic reasoning in clingo based on lpmln

Susana Hahn, Tomi Janhunen, Roland Kaminski et al.

We present plingo, an extension of the ASP system clingo with various probabilistic reasoning modes. Plingo is centered upon LP^MLN, a probabilistic extension of ASP based on a weight scheme from Markov Logic. This choice is motivated by the fact that the core probabilistic reasoning modes can be mapped onto optimization problems and that LP^MLN may serve as a middle-ground formalism connecting to other probabilistic approaches. As a result, plingo offers three alternative frontends, for LP^MLN, P-log, and ProbLog. The corresponding input languages and reasoning modes are implemented by means of clingo's multi-shot and theory solving capabilities. The core of plingo amounts to a re-implementation of LP^MLN in terms of modern ASP technology, extended by an approximation technique based on a new method for answer set enumeration in the order of optimality. We evaluate plingo's performance empirically by comparing it to other probabilistic systems.

6.1AINov 11, 2021

Answer Set Programming Made Easy

Jorge Fandinno, Seemran Mishra, Javier Romero et al.

We take up an idea from the folklore of Answer Set Programming, namely that choices, integrity constraints along with a restricted rule format is sufficient for Answer Set Programming. We elaborate upon the foundations of this idea in the context of the logic of Here-and-There and show how it can be derived from the logical principle of extension by definition. We then provide an austere form of logic programs that may serve as a normalform for logic programs similar to conjunctive normalform in classical logic. Finally, we take the key ideas and propose a modeling methodology for ASP beginners and illustrate how it can be used.

16.0AIAug 13, 2021Code

Planning with Incomplete Information in Quantified Answer Set Programming

Jorge Fandinno, François Laferrière, Javier Romero et al.

We present a general approach to planning with incomplete information in Answer Set Programming (ASP). More precisely, we consider the problems of conformant and conditional planning with sensing actions and assumptions. We represent planning problems using a simple formalism where logic programs describe the transition function between states, the initial states and the goal states. For solving planning problems, we use Quantified Answer Set Programming (QASP), an extension of ASP with existential and universal quantifiers over atoms that is analogous to Quantified Boolean Formulas (QBFs). We define the language of quantified logic programs and use it to represent the solutions to different variants of conformant and conditional planning. On the practical side, we present a translation-based QASP solver that converts quantified logic programs into QBFs and then executes a QBF solver, and we evaluate experimentally the approach on conformant and conditional planning benchmarks. Under consideration for acceptance in TPLP.

19.3AIMay 23, 2021

Learning First-Order Representations for Planning from Black-Box States: New Results

Ivan D. Rodriguez, Blai Bonet, Javier Romero et al.

Recently Bonet and Geffner have shown that first-order representations for planning domains can be learned from the structure of the state space without any prior knowledge about the action schemas or domain predicates. For this, the learning problem is formulated as the search for a simplest first-order domain description D that along with information about instances I_i (number of objects and initial state) determine state space graphs G(P_i) that match the observed state graphs G_i where P_i = (D, I_i). The search is cast and solved approximately by means of a SAT solver that is called over a large family of propositional theories that differ just in the parameters encoding the possible number of action schemas and domain predicates, their arities, and the number of objects. In this work, we push the limits of these learners by moving to an answer set programming (ASP) encoding using the CLINGO system. The new encodings are more transparent and concise, extending the range of possible models while facilitating their exploration. We show that the domains introduced by Bonet and Geffner can be solved more efficiently in the new approach, often optimally, and furthermore, that the approach can be easily extended to handle partial information about the state graphs as well as noise that prevents some states from being distinguished.

2.0IVMay 22, 2020

Deep Learning Based Detection and Localization of Intracranial Aneurysms in Computed Tomography Angiography

Dufan Wu, Daniel Montes, Ziheng Duan et al.

Purpose: To develop CADIA, a supervised deep learning model based on a region proposal network coupled with a false-positive reduction module for the detection and localization of intracranial aneurysms (IA) from computed tomography angiography (CTA), and to assess our model's performance to a similar detection network. Methods: In this retrospective study, we evaluated 1,216 patients from two separate institutions who underwent CT for the presence of saccular IA>=2.5 mm. A two-step model was implemented: a 3D region proposal network for initial aneurysm detection and 3D DenseNetsfor false-positive reduction and further determination of suspicious IA. Free-response receiver operative characteristics (FROC) curve and lesion-/patient-level performance at established false positive per volume (FPPV) were also performed. Fisher's exact test was used to compare with a similar available model. Results: CADIA's sensitivities at 0.25 and 1 FPPV were 63.9% and 77.5%, respectively. Our model's performance varied with size and location, and the best performance was achieved in IA between 5-10 mm and in those at anterior communicating artery, with sensitivities at 1 FPPV of 95.8% and 94%, respectively. Our model showed statistically higher patient-level accuracy, sensitivity, and specificity when compared to the available model at 0.25 FPPV and the best F-1 score (P<=0.001). At 1 FPPV threshold, our model showed better accuracy and specificity (P<=0.001) and equivalent sensitivity. Conclusions: CADIA outperformed a comparable network in the detection task of IA. The addition of a false-positive reduction module is a feasible step to improve the IA detection models.

24.2CVAug 24, 2019Code

Efficient Learning on Point Clouds with Basis Point Sets

Sergey Prokudin, Christoph Lassner, Javier Romero

With the increased availability of 3D scanning technology, point clouds are moving into the focus of computer vision as a rich representation of everyday scenes. However, they are hard to handle for machine learning algorithms due to their unordered structure. One common approach is to apply occupancy grid mapping, which dramatically increases the amount of data stored and at the same time loses details through discretization. Recently, deep learning models were proposed to handle point clouds directly and achieve input permutation invariance. However, these architectures often use an increased number of parameters and are computationally inefficient. In this work, we propose basis point sets (BPS) as a highly efficient and fully general way to process point clouds with machine learning algorithms. The basis point set representation is a residual representation that can be computed efficiently and can be used with standard neural network architectures and other machine learning algorithms. Using the proposed representation as the input to a simple fully connected network allows us to match the performance of PointNet on a shape classification task while using three orders of magnitude less floating-point operations. In a second experiment, we show how the proposed representation can be used for registering high-resolution meshes to noisy 3D scans. Here, we present the first method for single-pass high-resolution mesh registration, avoiding time-consuming per-scan optimization and allowing real-time execution.

45.5CVJul 27, 2016

Keep it SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image

Federica Bogo, Angjoo Kanazawa, Christoph Lassner et al.

We describe the first method to automatically estimate the 3D pose of the human body as well as its 3D shape from a single unconstrained image. We estimate a full 3D mesh and show that 2D joints alone carry a surprising amount of information about body shape. The problem is challenging because of the complexity of the human body, articulation, occlusion, clothing, lighting, and the inherent ambiguity in inferring 3D from 2D. To solve this, we first use a recently published CNN-based method, DeepCut, to predict (bottom-up) the 2D body joint locations. We then fit (top-down) a recently published statistical body shape model, called SMPL, to the 2D joints. We do so by minimizing an objective function that penalizes the error between the projected 3D model joints and detected 2D joints. Because SMPL captures correlations in human shape across the population, we are able to robustly fit it to very little data. We further leverage the 3D model to prevent solutions that cause interpenetration. We evaluate our method, SMPLify, on the Leeds Sports, HumanEva, and Human3.6M datasets, showing superior pose accuracy with respect to the state of the art.