Xiaoting Zhang

CV
h-index26
4papers
156citations
Novelty48%
AI Score34

4 Papers

CVFeb 8, 2024
MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis

Dewei Zhou, You Li, Fan Ma et al.

We present a Multi-Instance Generation (MIG) task, simultaneously generating multiple instances with diverse controls in one image. Given a set of predefined coordinates and their corresponding descriptions, the task is to ensure that generated instances are accurately at the designated locations and that all instances' attributes adhere to their corresponding description. This broadens the scope of current research on Single-instance generation, elevating it to a more versatile and practical dimension. Inspired by the idea of divide and conquer, we introduce an innovative approach named Multi-Instance Generation Controller (MIGC) to address the challenges of the MIG task. Initially, we break down the MIG task into several subtasks, each involving the shading of a single instance. To ensure precise shading for each instance, we introduce an instance enhancement attention mechanism. Lastly, we aggregate all the shaded instances to provide the necessary information for accurately generating multiple instances in stable diffusion (SD). To evaluate how well generation models perform on the MIG task, we provide a COCO-MIG benchmark along with an evaluation pipeline. Extensive experiments were conducted on the proposed COCO-MIG benchmark, as well as on various commonly used benchmarks. The evaluation results illustrate the exceptional control capabilities of our model in terms of quantity, position, attribute, and interaction. Code and demos will be released at https://migcproject.github.io/.

CVDec 20, 2024
SemDP: Semantic-level Differential Privacy Protection for Face Datasets

Xiaoting Zhang, Tao Wang, Junhao Ji

While large-scale face datasets have advanced deep learning-based face analysis, they also raise privacy concerns due to the sensitive personal information they contain. Recent schemes have implemented differential privacy to protect face datasets. However, these schemes generally treat each image as a separate database, which does not fully meet the core requirements of differential privacy. In this paper, we propose a semantic-level differential privacy protection scheme that applies to the entire face dataset. Unlike pixel-level differential privacy approaches, our scheme guarantees that semantic privacy in faces is not compromised. The key idea is to convert unstructured data into structured data to enable the application of differential privacy. Specifically, we first extract semantic information from the face dataset to build an attribute database, then apply differential perturbations to obscure this attribute data, and finally use an image synthesis model to generate a protected face dataset. Extensive experimental results show that our scheme can maintain visual naturalness and balance the privacy-utility trade-off compared to the mainstream schemes.

CVNov 21, 2019
Furnishing Your Room by What You See: An End-to-End Furniture Set Retrieval Framework with Rich Annotated Benchmark Dataset

Bingyuan Liu, Jiantao Zhang, Xiaoting Zhang et al.

Understanding interior scenes has attracted enormous interest in computer vision community. However, few works focus on the understanding of furniture within the scenes and a large-scale dataset is also lacked to advance the field. In this paper, we first fill the gap by presenting DeepFurniture, a richly annotated large indoor scene dataset, including 24k indoor images, 170k furniture instances and 20k unique furniture identities. On the dataset, we introduce a new benchmark, named furniture set retrieval. Given an indoor photo as input, the task requires to detect all the furniture instances and search a matched set of furniture identities. To address this challenging task, we propose a feature and context embedding based framework. It contains 3 major contributions: (1) An improved Mask-RCNN model with an additional mask-based classifier is introduced for better utilizing the mask information to relieve the occlusion problems in furniture detection context. (2) A multi-task style Siamese network is proposed to train the feature embedding model for retrieval, which is composed of a classification subnet supervised by self-clustered pseudo attributes and a verification subnet to estimate whether the input pair is matched. (3) In order to model the relationship of the furniture entities in an interior design, a context embedding model is employed to re-rank the retrieval results. Extensive experiments demonstrate the effectiveness of each module and the overall system.

NAJul 28, 2017
The Convergence of Least-Squares Progressive Iterative Approximation with Singular Iterative Matrix

Hongwei Lin, Qi Cao, Xiaoting Zhang

Developed in [Deng and Lin, 2014], Least-Squares Progressive Iterative Approximation (LSPIA) is an efficient iterative method for solving B-spline curve and surface least-squares fitting systems. In [Deng and Lin 2014], it was shown that LSPIA is convergent when the iterative matrix is nonsingular. In this paper, we will show that LSPIA is still convergent even the iterative matrix is singular.