Xiaohe Ma

CV
h-index: 17
Papers: 4
Citations: 15
Novelty: 65%
AI Score: 41

4 Papers

CV Mar 29, 2022
Efficient Reflectance Capture with a Deep Gated Mixture-of-Experts

Xiaohe Ma, Yaxin Yu, Hongzhi Wu et al.

We present a novel framework to efficiently acquire near-planar anisotropic reflectance in a pixel-independent fashion, using a deep gated mixture-of-experts. While existing work employs a unified network to handle all possible inputs, our network automatically learns to condition on the input for enhanced reconstruction. We train a gating module to select one out of a number of specialized decoders for reflectance reconstruction, based on photometric measurements, essentially trading generality for quality. A common, pre-trained latent transform module is also appended to each decoder, to offset the burden of the increased number of decoders. In addition, the illumination conditions during acquisition can be jointly optimized. The effectiveness of our framework is validated on a wide variety of challenging samples using a near-field lightstage. Compared with the state-of-the-art technique, our results are improved at the same input bandwidth, and our bandwidth can be reduced to about 1/3 for equal-quality results.
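The gating idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `GatedMoE`, the single linear layers, the random weights, and all dimensions are assumptions made for illustration. It only shows the control flow described above, where a gate picks exactly one specialized decoder per pixel and a shared latent transform follows whichever decoder was chosen.

```python
import random

def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def make_layer(rng, n_out, n_in):
    """Create a randomly initialised (untrained) linear layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, bias

class GatedMoE:
    """Hypothetical hard-gated mixture-of-experts for one pixel's measurements."""

    def __init__(self, n_meas, n_experts, n_latent, n_params, seed=0):
        rng = random.Random(seed)
        # Gate: one score per expert, computed from the photometric measurements.
        self.gate = make_layer(rng, n_experts, n_meas)
        # Specialized decoders: each maps measurements to a latent code.
        self.experts = [make_layer(rng, n_latent, n_meas) for _ in range(n_experts)]
        # Shared module appended to every decoder (pre-trained in the paper,
        # random here for illustration).
        self.latent_transform = make_layer(rng, n_params, n_latent)

    def select(self, measurements):
        """Return the index of the single expert with the highest gate score."""
        scores = linear(*self.gate, measurements)
        return max(range(len(scores)), key=scores.__getitem__)

    def reconstruct(self, measurements):
        """Run only the selected decoder, then the shared latent transform."""
        k = self.select(measurements)
        latent = linear(*self.experts[k], measurements)
        return k, linear(*self.latent_transform, latent)

model = GatedMoE(n_meas=8, n_experts=4, n_latent=5, n_params=3)
expert_index, reflectance_params = model.reconstruct([0.1] * 8)
```

Because only one decoder runs per pixel, adding experts increases the number of stored weights but not the per-pixel evaluation cost in this sketch.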

GR Apr 27
Neural Enhancement of Analytical Appearance Models

Xuanzhe Shen, Xiaohe Ma, Kun Zhou et al.

Traditional analytical reflectance models, while compact and interpretable, lack the capacity to accurately represent physical measurements. Recent neural models, which closely fit input data, are less generalizable and often more expensive to store and evaluate. To combine the strengths and overcome the limitations of these two classes of models, we present neural enhancement, a novel framework to boost an input analytical appearance model, by identifying and replacing its key computational nodes/operators with small-scale multi-layer perceptrons. This allows us to leverage the computational graph structure of the original model, while improving its expressiveness at a modest cost. To make the enhancement computationally tractable, we propose a hypercube-based search to automatically and efficiently identify the node(s) and/or operator(s) to be replaced towards maximal gain in a differentiable fashion. We enhance a number of common analytical BRDF models. The results are at once accurate, compact and efficient, and compare favorably with state-of-the-art work on fitting measured reflectance and bidirectional texture functions. Finally, our models are fully compatible with any standard rasterization or ray-tracing pipeline.
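As a rough illustration of replacing one operator in an analytical model with a small MLP, the sketch below uses a simplified, unnormalized Phong-style BRDF whose specular lobe is a pluggable callable. The function names, the choice of which operator to swap, and the untrained `TinyMLP` are all assumptions for illustration; the hypercube-based search that selects the operator in the paper is not shown.

```python
import math
import random

def analytic_lobe(cos_h, shininess):
    """Original operator: a power-cosine specular lobe."""
    return max(cos_h, 0.0) ** shininess

class TinyMLP:
    """Hypothetical one-hidden-layer MLP with untrained random weights."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def __call__(self, *inputs):
        hidden = [
            math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(self.w1, self.b1)
        ]
        return sum(w * h for w, h in zip(self.w2, hidden)) + self.b2

def brdf(cos_h, kd, ks, shininess, lobe=analytic_lobe):
    """Diffuse term plus a specular term whose lobe operator can be swapped."""
    return kd / math.pi + ks * lobe(cos_h, shininess)

# Same computational graph, two choices for the lobe node.
baseline = brdf(0.9, kd=0.5, ks=0.5, shininess=10.0)
enhanced = brdf(0.9, kd=0.5, ks=0.5, shininess=10.0, lobe=TinyMLP(n_in=2, n_hidden=4))
```

The rest of the graph (the diffuse term and the weighting by `ks`) is untouched, which is what keeps the enhanced model close in size and cost to the analytical one in this toy setting.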

CV Dec 4, 2024
MaterialPicker: Multi-Modal DiT-Based Material Generation

Xiaohe Ma, Valentin Deschaintre, Miloš Hašan et al.

High-quality material generation is key for virtual environment authoring and inverse rendering. We propose MaterialPicker, a multi-modal material generator leveraging a Diffusion Transformer (DiT) architecture, improving and simplifying the creation of high-quality materials from text prompts and/or photographs. Our method can generate a material based on an image crop of a material sample, even if the captured surface is distorted, viewed at an angle or partially occluded, as is often the case in photographs of natural scenes. We further allow the user to specify a text prompt to provide additional guidance for the generation. We finetune a pre-trained DiT-based video generator into a material generator, where each material map is treated as a frame in a video sequence. We evaluate our approach both quantitatively and qualitatively and show that it enables more diverse material generation and better distortion correction than previous work.
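The idea of treating each material map as a frame in a video sequence can be sketched as a simple packing step. The map names in `MAP_ORDER`, their ordering, and the nested-list image representation are assumptions for illustration only; the abstract does not specify which maps are generated or how they are ordered.

```python
# Hypothetical, fixed ordering of material maps along the "time" axis.
MAP_ORDER = ["albedo", "normal", "roughness"]

def maps_to_frames(material):
    """Stack named material maps into an ordered list of frames."""
    return [material[name] for name in MAP_ORDER]

def frames_to_maps(frames):
    """Inverse of maps_to_frames: recover the named maps from a frame list."""
    if len(frames) != len(MAP_ORDER):
        raise ValueError("expected one frame per material map")
    return dict(zip(MAP_ORDER, frames))

# Tiny 2x2 single-channel maps standing in for full-resolution textures.
material = {
    "albedo": [[0.8, 0.7], [0.6, 0.5]],
    "normal": [[0.5, 0.5], [0.5, 0.5]],
    "roughness": [[0.2, 0.3], [0.4, 0.5]],
}
frames = maps_to_frames(material)
restored = frames_to_maps(frames)
```

With a fixed ordering like this, a model that outputs a frame sequence can have its output read back as a set of named maps.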

CV Mar 27, 2021
Learning Efficient Photometric Feature Transform for Multi-view Stereo

Kaizhang Kang, Cihui Xie, Ruisheng Zhu et al.

We present a novel framework to learn to convert the per-pixel photometric information at each view into spatially distinctive and view-invariant low-level features, which can be plugged into an existing multi-view stereo pipeline for enhanced 3D reconstruction. Both the illumination conditions during acquisition and the subsequent per-pixel feature transform can be jointly optimized in a differentiable fashion. Our framework automatically adapts to and makes efficient use of the geometric information available in different forms of input data. High-quality 3D reconstructions of a variety of challenging objects are demonstrated on the data captured with an illumination multiplexing device, as well as a point light. Our results compare favorably with state-of-the-art techniques.
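A per-pixel feature transform of the kind described here can be sketched as a function that maps one pixel's photometric measurements to a unit-length feature vector, which a multi-view stereo matcher could then compare across views. The linear transform, the hand-picked weights, and the dot-product similarity are assumptions for illustration; in the paper the transform is learned.

```python
import math

def feature_transform(weights, measurements):
    """Map one pixel's measurements to a unit-length feature vector."""
    raw = [sum(w * m for w, m in zip(row, measurements)) for row in weights]
    norm = math.sqrt(sum(v * v for v in raw)) or 1.0  # avoid division by zero
    return [v / norm for v in raw]

def similarity(feature_a, feature_b):
    """Cosine similarity between two unit-length features."""
    return sum(a * b for a, b in zip(feature_a, feature_b))

# Hypothetical 2x3 transform: 3 measurements per pixel, 2 feature channels.
weights = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]

pixel_view_1 = feature_transform(weights, [2.0, 0.0, 5.0])
pixel_view_2 = feature_transform(weights, [4.0, 0.0, 1.0])
other_pixel = feature_transform(weights, [0.0, 3.0, 0.0])

match_score = similarity(pixel_view_1, pixel_view_2)
mismatch_score = similarity(pixel_view_1, other_pixel)
```

In this toy setup the two views of the same pixel map to the same feature even though their raw measurements differ, while the unrelated pixel maps to an orthogonal one.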