Ori Malca

h-index18
2papers

2 Papers

CVDec 29, 2024
Bringing Objects to Life: training-free 4D generation from 3D objects through view consistent noise

Ohad Rahamim, Ori Malca, Dvir Samuel et al.

Recent advancements in generative models have enabled the creation of dynamic 4D content - 3D objects in motion - based on text prompts, which holds potential for applications in virtual worlds, media, and gaming. Existing methods provide control over the appearance of generated content, including the ability to animate 3D objects. However, their ability to generate dynamics is limited to the mesh datasets they were trained on, lacking any growth or structural development capability. In this work, we introduce a training-free method for animating 3D objects by conditioning on textual prompts to guide 4D generation, enabling custom general scenes while maintaining the original object's identity. We first convert a 3D mesh into a static 4D Neural Radiance Field (NeRF) that preserves the object's visual attributes. Then, we animate the object using an Image-to-Video diffusion model driven by text. To improve motion realism, we introduce a view-consistent noising protocol that aligns object perspectives with the noising process to promote lifelike movement, and a masked Score Distillation Sampling (SDS) loss that leverages attention maps to focus optimization on relevant regions, better preserving the original object. We evaluate our model on two different 3D object datasets for temporal coherence, prompt adherence, and visual fidelity, and find that our method outperforms the baseline based on multiview training, achieving better consistency with the textual prompt in hard scenarios.

CVAug 12, 2025
Per-Query Visual Concept Learning

Ori Malca, Dvir Samuel, Gal Chechik

Visual concept learning, also known as Text-to-image personalization, is the process of teaching new concepts to a pretrained model. This has numerous applications from product placement to entertainment and personalized design. Here we show that many existing methods can be substantially augmented by adding a personalization step that is (1) specific to the prompt and noise seed, and (2) using two loss terms based on the self- and cross- attention, capturing the identity of the personalized concept. Specifically, we leverage PDM features -- previously designed to capture identity -- and show how they can be used to improve personalized semantic similarity. We evaluate the benefit that our method gains on top of six different personalization methods, and several base text-to-image models (both UNet- and DiT-based). We find significant improvements even over previous per-query personalization methods.