Ofer Kruzel

1.4CVMar 10, 2022

Crowd Source Scene Change Detection and Local Map Update

Itzik Wilf, Nati Daniel, Lin Manqing et al.

As scene changes with time map descriptors become outdated, affecting VPS localization accuracy. In this work, we propose an approach to detect structural and texture scene changes to be followed by map update. In our method - map includes 3D points with descriptors generated either via LiDAR or SFM. Common approaches suffer from shortcomings: 1) Direct comparison of the two point-clouds for change detection is slow due to the need to build new point-cloud every time we want to compare; 2) Image based comparison requires to keep the map images adding substantial storage overhead. To circumvent this problems, we propose an approach based on point-clouds descriptors comparison: 1) Based on VPS poses select close query and map images pairs, 2) Registration of query images to map image descriptors, 3) Use segmentation to filter out dynamic or short term temporal changes, 4) Compare the descriptors between corresponding segments.

2.3GRJun 21, 2023

Facial Expression Re-targeting from a Single Character

Ariel Larey, Omri Asraf, Adam Kelder et al.

Video retargeting for digital face animation is used in virtual reality, social media, gaming, movies, and video conference, aiming to animate avatars' facial expressions based on videos of human faces. The standard method to represent facial expressions for 3D characters is by blendshapes, a vector of weights representing the avatar's neutral shape and its variations under facial expressions, e.g., smile, puff, blinking. Datasets of paired frames with blendshape vectors are rare, and labeling can be laborious, time-consuming, and subjective. In this work, we developed an approach that handles the lack of appropriate datasets. Instead, we used a synthetic dataset of only one character. To generalize various characters, we re-represented each frame to face landmarks. We developed a unique deep-learning architecture that groups landmarks for each facial organ and connects them to relevant blendshape weights. Additionally, we incorporated complementary methods for facial expressions that landmarks did not represent well and gave special attention to eye expressions. We have demonstrated the superiority of our approach to previous research in qualitative and quantitative metrics. Our approach achieved a higher MOS of 68% and a lower MSE of 44.2% when tested on videos with various users and expressions.

Ofer Kruzel

2 Papers