Xiying Wang

CV · Feb 3, 2024

Mitigating Prior Shape Bias in Point Clouds via Differentiable Center Learning

Zhe Li, Xiying Wang, Jinglin Zhao et al.

Masked autoencoding and generative pretraining have achieved remarkable success in computer vision and natural language processing, and more recently they have been extended to the point cloud domain. Nevertheless, existing point cloud models suffer from information leakage caused by the pre-sampling of center points, which makes the proxy tasks trivial. These approaches also focus primarily on local feature reconstruction, limiting their ability to capture global patterns within point clouds. In this paper, we argue that the reduced difficulty of pretext tasks hampers the model's capacity to learn expressive representations. To address these limitations, we introduce the Differentiable Center Sampling Network (DCS-Net). It tackles the information leakage problem by incorporating both global and local feature reconstruction as non-trivial proxy tasks, enabling simultaneous learning of the global and local patterns within point clouds. Experimental results demonstrate that our method enhances the expressive capacity of existing point cloud models and effectively addresses the issue of information leakage.

HC · Dec 1, 2018

Conversations for Vision: Remote Sighted Assistants Helping People with Visual Impairments

Sooyeon Lee, Madison Reddie, Krish Gurdasani et al.

People with visual impairments (PVI) must interact with a world they cannot see. Remote sighted assistance has emerged as a conversational/social support system. We interviewed participants who either provide or receive assistance via a conversational/social prosthetic called Aira (https://aira.io/). We identified four types of support provided: scene description, performance, social interaction, and navigation. We found that conversational style is context-dependent. Sighted assistants make intentional efforts to elicit PVI's personal knowledge and leverage it in the guidance they provide. PVI used non-verbal behaviors (e.g., hand gestures) as a parallel communication channel to provide feedback or guidance to sighted assistants. We also discuss implications for design.
