С.Г. Елизаров

h-index 8

3 papers

36 citations

Novelty 48%

AI Score 27

Ranked #157,416 of 194,257 authors (top 81%); #51,060 in CV (top 86%)

3 Papers

16.8 · CV · Aug 6, 2024

IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts

Ciara Rowles, Shimon Vainer, Dante De Nigris et al.

Diffusion models continuously push the boundary of state-of-the-art image generation, but the process is hard to control with any nuance: practice proves that textual prompts are inadequate for accurately describing image style or fine structural details (such as faces). ControlNet and IPAdapter address this shortcoming by conditioning the generative process on imagery instead, but each individual instance is limited to modeling a single conditional posterior: for practical use-cases, where multiple different posteriors are desired within the same workflow, training and using multiple adapters is cumbersome. We propose IPAdapter-Instruct, which combines natural-image conditioning with "Instruct" prompts to swap between interpretations for the same conditioning image: style transfer, object extraction, both, or something else still? IPAdapter-Instruct efficiently learns multiple tasks with minimal loss in quality compared to dedicated per-task models.

15.3 · CV · Sep 5, 2024

Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation

Slava Elizarov, Ciara Rowles, Simon Donné

Generating high-quality 3D objects from textual descriptions remains a challenging problem due to computational cost, the scarcity of 3D data, and complex 3D representations. We introduce Geometry Image Diffusion (GIMDiffusion), a novel Text-to-3D model that utilizes geometry images to efficiently represent 3D shapes using 2D images, thereby avoiding the need for complex 3D-aware architectures. By integrating a Collaborative Control mechanism, we exploit the rich 2D priors of existing Text-to-Image models such as Stable Diffusion. This enables strong generalization even with limited 3D training data (allowing us to use only high-quality training data) as well as retaining compatibility with guidance techniques such as IPAdapter. In short, GIMDiffusion enables the generation of 3D assets at speeds comparable to current Text-to-Image models. The generated objects consist of semantically meaningful, separate parts and include internal structures, enhancing both usability and versatility.

1.2 · AO-PH · Nov 9, 2020

Machine learning methods for the detection of polar lows in satellite mosaics: major issues and their solutions

Mikhail Krinitskiy, Polina Verezemskaya, Svyatoslav Elizarov et al.

Polar mesocyclones (PMCs) and their intense subclass polar lows (PLs) are relatively small atmospheric vortices that form mostly over the ocean in high latitudes. PLs can strongly influence deep ocean water formation since they are associated with strong surface winds and heat fluxes. Detection and tracking of PLs are crucial for understanding the climatological dynamics of PLs and for the analysis of their impacts on other components of the climatic system. At the same time, visual tracking of PLs is a highly time-consuming procedure that requires expert knowledge and extensive examination of source data. There are known procedures involving deep convolutional neural networks (DCNNs) for the detection of large-scale atmospheric phenomena in reanalysis data that demonstrate a high quality of detection. However, one cannot apply these procedures to satellite data directly since, unlike reanalyses, satellite products register all the scales of atmospheric vortices. It is also known that DCNNs were originally designed to be scale-invariant. This leads to the problem of filtering the scale of detected phenomena. There are other problems to be solved, such as a low signal-to-noise ratio of satellite data and an unbalanced number of negative (without PLs) and positive (where a PL is present) classes in a satellite dataset. In our study, we propose a deep learning approach for the detection of PLs and PMCs in remote sensing data, which addresses class imbalance and scale filtering problems. We also outline potential solutions for other problems, along with promising improvements to the presented approach.