Jinwei Ye

CV
h-index14
10papers
58citations
Novelty53%
AI Score46

10 Papers

CVJun 1
Pixel Cube: Diffusion-based Portrait Video Relighting Through Realistic Lighting Reproduction

Yufan Zhang, Yu Ji, Ayo Ajiboye et al.

We present a diffusion-based method for relighting dynamic portrait videos with photorealism and temporal consistency. Our method is fueled by a hybrid training dataset that consists of real-captured and rendered dynamic portrait videos with diverse subject appearances, facial motions, head poses, and known lighting conditions. Specifically, we construct an LED-based lighting system for realistic lighting emulation and high-speed video relighting data acquisition. By leveraging the image priors embedded in pre-trained video diffusion models, and using per-frame high dynamic range (HDR) environment map as lighting control, we train a high-performance generative model for realistic and identity-preserving dynamic portrait video relighting. In addition to the environment map control, our model uses a synthesized background image to enable control on the camera's exposure level and color tone. Our model can produce temporally consistent relit portrait video that looks realistic and harmonious under a provided new environment and faithfully preserve the subject's expression and fine facial features, including skin tone, wrinkles, and facial hair. Our model generalizes well to unseen data, in terms of the subject appearance, motion, and lighting condition. We perform extensive experiments on relighting in-the-wild videos with various environment maps and demonstrate practical applications on portrait photography. Results show that our method achieves state-of-the-art performance in photorealism, lighting harmony, and temporal consistency.

CVAug 25, 2023
Textureless Deformable Surface Reconstruction with Invisible Markers

Xinyuan Li, Yu Ji, Yanchen Liu et al.

Reconstructing and tracking deformable surface with little or no texture has posed long-standing challenges. Fundamentally, the challenges stem from textureless surfaces lacking features for establishing cross-image correspondences. In this work, we present a novel type of markers to proactively enrich the object's surface features, and thereby ease the 3D surface reconstruction and correspondence tracking. Our markers are made of fluorescent dyes, visible only under the ultraviolet (UV) light and invisible under regular lighting condition. Leveraging the markers, we design a multi-camera system that captures surface deformation under the UV light and the visible light in a time multiplexing fashion. Under the UV light, markers on the object emerge to enrich its surface texture, allowing high-quality 3D shape reconstruction and tracking. Under the visible light, markers become invisible, allowing us to capture the object's original untouched appearance. We perform experiments on various challenging scenes, including hand gestures, facial expressions, waving cloth, and hand-object interaction. In all these cases, we demonstrate that our system is able to produce robust, high-quality 3D reconstruction and tracking.

CVApr 21, 2024Code
Turb-Seg-Res: A Segment-then-Restore Pipeline for Dynamic Videos with Atmospheric Turbulence

Ripon Kumar Saha, Dehao Qin, Nianyi Li et al.

Tackling image degradation due to atmospheric turbulence, particularly in dynamic environment, remains a challenge for long-range imaging systems. Existing techniques have been primarily designed for static scenes or scenes with small motion. This paper presents the first segment-then-restore pipeline for restoring the videos of dynamic scenes in turbulent environment. We leverage mean optical flow with an unsupervised motion segmentation method to separate dynamic and static scene components prior to restoration. After camera shake compensation and segmentation, we introduce foreground/background enhancement leveraging the statistics of turbulence strength and a transformer model trained on a novel noise-based procedural turbulence generator for fast dataset augmentation. Benchmarked against existing restoration methods, our approach restores most of the geometric distortion and enhances sharpness for videos. We make our code, simulator, and data publicly available to advance the field of video restoration from turbulence: riponcs.github.io/TurbSegRes

CVNov 6, 2023
Unsupervised Region-Growing Network for Object Segmentation in Atmospheric Turbulence

Dehao Qin, Ripon Saha, Suren Jayasuriya et al.

Moving object segmentation in the presence of atmospheric turbulence is highly challenging due to turbulence-induced irregular and time-varying distortions. In this paper, we present an unsupervised approach for segmenting moving objects in videos downgraded by atmospheric turbulence. Our key approach is a detect-then-grow scheme: we first identify a small set of moving object pixels with high confidence, then gradually grow a foreground mask from those seeds to segment all moving objects. This method leverages rigid geometric consistency among video frames to disentangle different types of motions, and then uses the Sampson distance to initialize the seedling pixels. After growing per-frame foreground masks, we use spatial grouping loss and temporal consistency loss to further refine the masks in order to ensure their spatio-temporal consistency. Our method is unsupervised and does not require training on labeled data. For validation, we collect and release the first real-captured long-range turbulent video dataset with ground truth masks for moving objects. Results show that our method achieves good accuracy in segmenting moving objects and is robust for long-range videos with various turbulence strengths.

CVMar 1, 2025
Seeing A 3D World in A Grain of Sand

Yufan Zhang, Yu Ji, Yu Guo et al.

We present a snapshot imaging technique for recovering 3D surrounding views of miniature scenes. Due to their intricacy, miniature scenes with objects sized in millimeters are difficult to reconstruct, yet miniatures are common in life and their 3D digitalization is desirable. We design a catadioptric imaging system with a single camera and eight pairs of planar mirrors for snapshot 3D reconstruction from a dollhouse perspective. We place paired mirrors on nested pyramid surfaces for capturing surrounding multi-view images in a single shot. Our mirror design is customizable based on the size of the scene for optimized view coverage. We use the 3D Gaussian Splatting (3DGS) representation for scene reconstruction and novel view synthesis. We overcome the challenge posed by our sparse view input by integrating visual hull-derived depth constraint. Our method demonstrates state-of-the-art performance on a variety of synthetic and real miniature scenes.

CVSep 5, 2021
Light Field-Based Underwater 3D Reconstruction Via Angular Resampling

Yuqi Ding, Zhang Chen, Yu Ji et al.

Recovering 3D geometry of underwater scenes is challenging because of non-linear refraction of light at the water-air interface caused by the camera housing. We present a light field-based approach that leverages properties of angular samples for high-quality underwater 3D reconstruction from a single viewpoint. Specifically, we resample the light field image to angular patches. As underwater scenes exhibit weak view-dependent specularity, an angular patch tends to have uniform intensity when sampled at the correct depth. We thus impose this angular uniformity as a constraint for depth estimation. For efficient angular resampling, we design a fast approximation algorithm based on multivariate polynomial regression to approximate nonlinear refraction paths. We further develop a light field calibration algorithm that estimates the water-air interface geometry along with the camera parameters. Comprehensive experiments on synthetic and real data show our method produces state-of-the-art reconstruction on static and dynamic underwater scenes.

ROAug 24, 2021
Next-generation perception system for automated defects detection in composite laminates via polarized computational imaging

Yuqi Ding, Jinwei Ye, Corina Barbalata et al.

Finishing operations on large-scale composite components like wind turbine blades, including trimming and sanding, often require multiple workers and part repositioning. In the composites manufacturing industry, automation of such processes is challenging, as manufactured part geometry may be inconsistent and task completion is based on human judgment and experience. Implementing a mobile, collaborative robotic system capable of performing finishing tasks in dynamic and uncertain environments would improve quality and lower manufacturing costs. To complete the given tasks, the collaborative robotic team must properly understand the environment and detect irregularities in the manufactured parts. In this paper, we describe the initial implementation and demonstration of a polarized computational imaging system to identify defects in composite laminates. As the polarimetric images are highly relevant to the surface micro-geometry, they can be used to detect surface defects that are not visible in conventional color images. The proposed vision system successfully identifies defect types and surface characteristics (e.g., pinholes, voids, scratches, resin flash) for different glass fiber and carbon fiber laminates.

CVApr 15, 2019
PIV-Based 3D Fluid Flow Reconstruction Using Light Field Camera

Zhong Li, Jinwei Ye, Yu Ji et al.

Particle Imaging Velocimetry (PIV) estimates the flow of fluid by analyzing the motion of injected particles. The problem is challenging as the particles lie at different depths but have similar appearance and tracking a large number of particles is particularly difficult. In this paper, we present a PIV solution that uses densely sampled light field to reconstruct and track 3D particles. We exploit the refocusing capability and focal symmetry constraint of the light field for reliable particle depth estimation. We further propose a new motion-constrained optical flow estimation scheme by enforcing local motion rigidity and the Navier-Stoke constraint. Comprehensive experiments on synthetic and real experiments show that using a single light field camera, our technique can recover dense and accurate 3D fluid flows in small to medium volumes.

CVApr 9, 2019
Non-Lambertian Surface Shape and Reflectance Reconstruction Using Concentric Multi-Spectral Light Field

Mingyuan Zhou, Yu Ji, Yuqi Ding et al.

Recovering the shape and reflectance of non-Lambertian surfaces remains a challenging problem in computer vision since the view-dependent appearance invalidates traditional photo-consistency constraint. In this paper, we introduce a novel concentric multi-spectral light field (CMSLF) design that is able to recover the shape and reflectance of surfaces with arbitrary material in one shot. Our CMSLF system consists of an array of cameras arranged on concentric circles where each ring captures a specific spectrum. Coupled with a multi-spectral ring light, we are able to sample viewpoint and lighting variations in a single shot via spectral multiplexing. We further show that such concentric camera/light setting results in a unique pattern of specular changes across views that enables robust depth estimation. We formulate a physical-based reflectance model on CMSLF to estimate depth and multi-spectral reflectance map without imposing any surface prior. Extensive synthetic and real experiments show that our method outperforms state-of-the-art light field-based techniques, especially in non-Lambertian scenes.

CVJan 31, 2018
Robust 3D Human Motion Reconstruction Via Dynamic Template Construction

Zhong Li, Yu Ji, Wei Yang et al.

In multi-view human body capture systems, the recovered 3D geometry or even the acquired imagery data can be heavily corrupted due to occlusions, noise, limited field of- view, etc. Direct estimation of 3D pose, body shape or motion on these low-quality data has been traditionally challenging.In this paper, we present a graph-based non-rigid shape registration framework that can simultaneously recover 3D human body geometry and estimate pose/motion at high fidelity.Our approach first generates a global full-body template by registering all poses in the acquired motion sequence.We then construct a deformable graph by utilizing the rigid components in the global template. We directly warp the global template graph back to each motion frame in order to fill in missing geometry. Specifically, we combine local rigidity and temporal coherence constraints to maintain geometry and motion consistencies. Comprehensive experiments on various scenes show that our method is accurate and robust even in the presence of drastic motions.