LG HC | Jan 25, 2025

Into the Void: Mapping the Unseen Gaps in High Dimensional Data

Xinyu Zhang, Tyler Estro, Geoff Kuenning, Erez Zadok, Klaus Mueller

arXiv:2501.15273v14.1 | h-index: 44 | IEEE Trans Vis Comput Graph

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of efficiently discovering valuable configurations in high-dimensional datasets, for domains such as parameter optimization and adversarial learning. It represents an incremental improvement over existing methods.

The paper tackles the problem of exploring untapped opportunities in high-dimensional data by developing a pipeline with a visual analytics system and a novel algorithm to identify empty spaces, resulting in substantially superior novel configurations compared to conventional methods.

We present a comprehensive pipeline, augmented by a visual analytics system named "GapMiner", that is aimed at exploring and exploiting untapped opportunities within the empty areas of high-dimensional datasets. Our approach begins with an initial dataset and then uses a novel Empty Space Search Algorithm (ESA) to identify the center points of these uncharted voids, which are regarded as reservoirs containing potentially valuable novel configurations. Initially, this process is guided by user interactions facilitated by GapMiner. GapMiner visualizes the Empty Space Configurations (ESC) identified by the search within the context of the data, enabling domain experts to explore and adjust ESCs using a linked parallel-coordinate display. These interactions enhance the dataset and contribute to the iterative training of a connected deep neural network (DNN). As the DNN trains, it gradually assumes the task of identifying high-potential ESCs, diminishing the need for direct user involvement. Ultimately, once the DNN achieves adequate accuracy, it autonomously guides the exploration of optimal configurations by predicting performance and refining configurations, using a combination of gradient ascent and improved empty-space searches. Domain users were actively engaged throughout the development of our system. Our findings demonstrate that our methodology consistently produces substantially superior novel configurations compared to conventional randomization-based methods. We illustrate the effectiveness of our method through several case studies addressing various objectives, including parameter optimization, adversarial learning, and reinforcement learning.
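
To make the idea of "center points of uncharted voids" concrete, here is a minimal sketch of a naive empty-space search. It is not the paper's ESA, whose details the abstract does not give. It assumes a simple stand-in: sample random points inside the data's bounding box and keep those farthest from any existing data point. The function names, parameters, and toy data are all hypothetical.

```python
import math
import random


def nearest_distance(point, data):
    """Euclidean distance from `point` to its closest neighbor in `data`."""
    return min(math.dist(point, row) for row in data)


def find_empty_space_candidates(data, n_samples=2000, top_k=3, seed=0):
    """Naive stand-in for an empty-space search (an assumption, NOT the paper's ESA).

    Samples uniformly inside the bounding box of `data` and returns the
    `top_k` samples whose nearest data point is farthest away.
    """
    rng = random.Random(seed)
    dims = len(data[0])
    lows = [min(row[d] for row in data) for d in range(dims)]
    highs = [max(row[d] for row in data) for d in range(dims)]
    samples = [
        tuple(rng.uniform(lows[d], highs[d]) for d in range(dims))
        for _ in range(n_samples)
    ]
    # Rank samples by emptiness: larger nearest-neighbor distance first.
    ranked = sorted(samples, key=lambda p: nearest_distance(p, data), reverse=True)
    return ranked[:top_k]


# Toy data (hypothetical): two tight clusters in 4-D, leaving large unoccupied regions.
toy_rng = random.Random(1)
data = [tuple(toy_rng.gauss(0.0, 0.1) for _ in range(4)) for _ in range(50)]
data += [tuple(toy_rng.gauss(1.0, 0.1) for _ in range(4)) for _ in range(50)]

candidates = find_empty_space_candidates(data)
for c in candidates:
    print([round(x, 2) for x in c], round(nearest_distance(c, data), 3))
```

In the pipeline the abstract describes, candidates like these would be reviewed by a domain expert or scored by the trained DNN. Uniform sampling of this kind scales poorly as dimensionality grows, which is presumably part of the motivation for a dedicated search algorithm.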
