Ken Perlin

HC
h-index: 22
17 papers
779 citations
Novelty: 43%
AI Score: 42

17 Papers

CV · Sep 1, 2022
Zero-Shot Multi-Modal Artist-Controlled Retrieval and Exploration of 3D Object Sets

Kristofer Schlachter, Benjamin Ahlbrand, Zhu Wang et al.

When creating 3D content, highly specialized skills are generally needed to design and generate models of objects and other assets by hand. We address this problem through high-quality 3D asset retrieval from multi-modal inputs, including 2D sketches, images and text. We use CLIP as it provides a bridge to higher-level latent features. We use these features to perform a multi-modality fusion to address the lack of artistic control that affects common data-driven approaches. Our approach allows for multi-modal conditional feature-driven retrieval through a 3D asset database, by utilizing a combination of input latent embeddings. We explore the effects of different combinations of feature embeddings across different input types and weighting methods.
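
The fusion-and-retrieval idea in this abstract can be illustrated with a minimal sketch. It assumes each input (sketch, image, text) has already been encoded into a vector of the same dimension, for example by CLIP; the toy 3-dimensional vectors, the function names `fuse` and `retrieve`, and the weighted-sum fusion rule are all illustrative assumptions, not the authors' implementation.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / n for x in v]

def fuse(embeddings, weights):
    """Weighted sum of unit-normalized embeddings, renormalized.

    Assumption: a simple linear blend; the paper explores several
    weighting methods that are not reproduced here.
    """
    dim = len(embeddings[0])
    fused = [0.0] * dim
    for emb, w in zip(embeddings, weights):
        unit = normalize(emb)
        for i in range(dim):
            fused[i] += w * unit[i]
    return normalize(fused)

def retrieve(query, database, top_k=3):
    """Rank database entries by cosine similarity to the query."""
    scored = []
    for name, vec in database.items():
        sim = sum(q * a for q, a in zip(query, normalize(vec)))
        scored.append((sim, name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]

# Hypothetical stand-ins for real encoder outputs.
sketch_emb = [1.0, 0.0, 0.0]
text_emb = [0.0, 1.0, 0.0]
assets = {
    "chair": [1.0, 0.1, 0.0],
    "table": [0.1, 1.0, 0.0],
    "lamp": [0.0, 0.0, 1.0],
}

sketch_heavy = retrieve(fuse([sketch_emb, text_emb], [0.8, 0.2]), assets)
text_heavy = retrieve(fuse([sketch_emb, text_emb], [0.2, 0.8]), assets)
```

Shifting the weights from the sketch to the text changes which asset ranks first, which is the kind of artist control over retrieval that the abstract describes.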

COMP-PH · Nov 25, 2023
A GPU-based Hydrodynamic Simulator with Boid Interactions

Xi Liu, Gizem Kayar, Ken Perlin

We present a hydrodynamic simulation system using the GPU compute shaders of DirectX for simulating virtual agent behaviors and navigation inside a smoothed particle hydrodynamical (SPH) fluid environment with real-time water mesh surface reconstruction. The current SPH literature includes interactions between SPH and heterogeneous meshes but seldom involves interactions between SPH and virtual boid agents. The contribution of the system lies in the combination of the parallel smoothed particle hydrodynamics model with the distributed boid model of virtual agents to enable agents to interact with fluids. The agents based on the boid algorithm influence the motion of SPH fluid particles, and the forces from the SPH algorithm affect the movement of the boids. To enable realistic fluid rendering and simulation in a particle-based system, it is essential to construct a mesh from the particle attributes. Our system also contributes to the surface reconstruction aspect of the pipeline, in which we performed a set of experiments with the parallel marching cubes algorithm per frame for constructing the mesh from the fluid particles in a real-time compute and memory-intensive application, producing a wide range of triangle configurations. We also demonstrate that our system is versatile enough for reinforced robotic agents instead of boid agents to interact with the fluid environment for underwater navigation and remote control engineering purposes.
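
To make the two-way coupling concrete, here is a minimal CPU sketch of one way a boid and nearby SPH particles could exchange forces. The abstract does not specify the coupling law, so the drag-like force, the stiffness `k`, and the function names are assumptions; the kernel is the widely used poly6 smoothing kernel, chosen here only for illustration. The actual system runs in DirectX compute shaders, not Python.

```python
import math

def poly6(r, h):
    """Poly6 smoothing kernel in 3D; zero outside the support radius h."""
    if r < 0.0 or r > h:
        return 0.0
    return 315.0 / (64.0 * math.pi * h ** 9) * (h * h - r * r) ** 3

def coupling_forces(boid_pos, boid_vel, particles, h=1.0, k=1.0):
    """Return (force_on_boid, forces_on_particles).

    particles is a list of (position, velocity) tuples of 3-vectors.
    Assumed law: each particle within h pulls the boid toward the
    particle's velocity, and receives the equal and opposite force.
    """
    f_boid = [0.0, 0.0, 0.0]
    f_particles = []
    for pos, vel in particles:
        weight = k * poly6(math.dist(boid_pos, pos), h)
        f = [weight * (vel[i] - boid_vel[i]) for i in range(3)]
        f_boid = [f_boid[i] + f[i] for i in range(3)]
        f_particles.append([-c for c in f])  # reaction on the fluid
    return f_boid, f_particles
```

Because every force on the boid is paired with an opposite force on a particle, the exchange conserves momentum, and particles outside the kernel radius contribute nothing.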

30.0 · RO · May 20
Flying Together: Human-Guided Immersive Shared Control for Aerial Robot Teams in Unknown Environments

Lou De Bel-Air, Luca Morando, Ruitao Chen et al.

While autonomous multi-robots can achieve safe and coordinated navigation, they often struggle to adapt to unforeseen conditions and to capture operator-driven objectives in unstructured environments. We present a Virtual Reality (VR)-based shared control framework for teams of drones operating in constrained and unknown environments, enabling real-time, user-guided exploration. At the core of our approach is a novel, user-guided motion-primitive-based planner that computes continuous, collision-free trajectories while continuously integrating operator input. This planner is coupled with an admittance controller, allowing the operator to flexibly influence team behavior and guide drones toward regions of interest that autonomous planners may overlook. The system supports mixed-reality operations with both physical and simulated drones, and implements a bilateral VR-based interface, allowing the operator to guide the robot team via migration points while receiving immediate visual feedback of the team state. Experimental results show that shared control improves obstacle avoidance, maintains inter-agent spacing, and reduces operator effort, demonstrating the feasibility and advantages of immersive, human-in-the-loop multi-robot navigation.

HC · Sep 19, 2018 · Code
Chalktalk : A Visualization and Communication Language -- As a Tool in the Domain of Computer Science Education

Ken Perlin, Zhenyi He, Karl Rosenberg

In the context of a classroom lesson, concepts must be visualized and organized in many ways depending on the needs of the teacher and students. Traditional presentation media such as the blackboard or electronic whiteboard allow for static hand-drawn images, and slideshow software may be used to generate linear sequences of text and pre-animated images. However, none of these media support the creation of dynamic visualizations that can be manipulated, combined, or re-animated in real-time, and so demonstrating new concepts or adapting to changes in the requirements of a presentation is a challenge. Thus, we propose Chalktalk as a solution. Chalktalk is an open-source presentation and visualization tool in which the user's drawings are recognized as animated and interactive "sketches," which the user controls via mouse gestures. Sketches help users demonstrate and experiment with complex ideas (e.g. computer graphics, procedural animation, logic) during a live presentation without needing to create and structure all content ahead of time. Because sketches can interoperate and be programmed to represent underlying data in multiple ways, Chalktalk presents the opportunity to visualize key concepts in computer science: especially data structures, whose data and form change over time due to the variety of interactions within a computer system. To show Chalktalk's capabilities, we have prototyped sketch implementations for binary search tree (BST) and stack (LIFO) data structures, which take advantage of sketches' ability to interact and change at run-time. We discuss these prototypes and conclude with considerations for future research using the Chalktalk platform.

HC · Jan 11, 2024
DrawTalking: Building Interactive Worlds by Sketching and Speaking

Karl Toby Rosenberg, Rubaiat Habib Kazi, Li-Yi Wei et al.

We introduce DrawTalking, an approach to building and controlling interactive worlds by sketching and speaking while telling stories. It emphasizes user control and flexibility, and gives programming-like capability without requiring code. An early open-ended study with our prototype shows that the mechanics resonate and are applicable to many creative-exploratory use cases, with the potential to inspire and inform research in future natural interfaces for creative exploration and authoring.

HC · Dec 11, 2021
UrbanRama: Navigating Cities in Virtual Reality

Shaoyu Chen, Fabio Miranda, Nivan Ferreira et al.

Exploring large virtual environments, such as cities, is a central task in several domains, such as gaming and urban planning. VR systems can greatly help this task by providing an immersive experience; however, a common issue with viewing and navigating a city in the traditional sense is that users can either obtain a local or a global view, but not both at the same time, requiring them to continuously switch between perspectives, losing context and distracting them from their analysis. In this paper, our goal is to allow users to navigate to points of interest without changing perspectives. To accomplish this, we design an intuitive navigation interface that takes advantage of the strong sense of spatial presence provided by VR. We supplement this interface with a perspective that warps the environment, called UrbanRama, based on a cylindrical projection, providing a mix of local and global views. The design of this interface was performed as an iterative process in collaboration with architects and urban planners. We conducted a qualitative and a quantitative pilot user study to evaluate UrbanRama and the results indicate the effectiveness of our system in reducing perspective changes, while ensuring that the warping doesn't affect distance and orientation perception.

HC · Jul 13, 2021
LookAtChat: Visualizing Gaze Awareness for Remote Small-Group Conversations

Zhenyi He, Ruofei Du, Ken Perlin

Video conferences play a vital role in our daily lives. However, many nonverbal cues are missing, including gaze and spatial information. We introduce LookAtChat, a web-based video conferencing system, which empowers remote users to identify gaze awareness and spatial relationships in small-group conversations. Leveraging real-time eye-tracking technology available with ordinary webcams, LookAtChat tracks each user's gaze direction, identifies who is looking at whom, and provides corresponding spatial cues. Informed by formative interviews with 5 participants who regularly use videoconferencing software, we explored the design space of gaze visualization in both 2D and 3D layouts. We further conducted an exploratory user study (N=20) to evaluate LookAtChat in three conditions: baseline layout, 2D directional layout, and 3D perspective layout. Our findings demonstrate how LookAtChat engages participants in small-group conversations, how gaze and spatial information improve conversation quality, and the potential benefits and challenges to incorporating gaze awareness visualization into existing videoconferencing systems.

HC · May 31, 2020
A Virtual Obstacle Course within Diverse Sensory Environments

Zhu Wang, Anat Lubetzky, Charles Hendee et al.

We developed a novel assessment platform with untethered virtual reality, 3-dimensional sounds, and a pressure-sensing floor mat to help assess walking balance and negotiation of obstacles under diverse sensory and/or cognitive load. The platform provides an immersive 3D city-like scene with anticipated/unanticipated virtual obstacles. Participants negotiate the obstacles with perturbations of: auditory load by spatial audio, cognitive load by a memory task, and visual flow generated by avatars' movements at various amounts and speeds. A VR headset displays the scenes while providing real-time position and orientation of the participant's head. A pressure-sensing walkway senses foot pressure and visualizes it in a heatmap. The system helps to assess walking balance via pressure dynamics per foot, success rate of crossing obstacles, and available response time, as well as head kinematics in response to obstacles and multitasking. Based on the assessment, a specific balance training and fall prevention program can be prescribed.

HC · Dec 9, 2019
Exploring the Effectiveness of Face-to-face Mixed Reality for Teaching with Chalktalk

Zhenyi He, Ken Perlin

Teaching that uses projected presentation media such as slide-shows lacks support for dynamic content whose form and behaviors require live changes during a lecture. Recent software alternatives such as the Chalktalk software platform allow the creation of interactive simulations in arbitrary sequences and combinations within presentations. These more dynamic solutions, however, do not optimize for face-to-face interactions: eye-contact, gaze direction, and concurrent awareness of another person's movements together with the presented content. To explore the extent to which these face-to-face interactions may improve learning and engagement during a lecture, we propose a Mixed Reality (MR) platform that places Chalktalk's behaviors and simulations within a mirrored virtual world environment designed for face-to-face, one-on-one interactions. We compare our system with projected Chalktalk to evaluate its relative effectiveness for learning, retention, and level of engagement.

HC · Nov 15, 2019
Exploring Configurations for Multi-user Communication in Virtual Reality

Zhenyi He, Karl Rosenberg, Ken Perlin

Virtual Reality (VR) enables users to collaborate while exploring scenarios not realizable in the physical world. We propose CollabVR, a distributed multi-user collaboration environment, to explore how digital content improves expression and understanding of ideas among groups. To achieve this, we designed and examined three possible configurations for participants and shared manipulable objects. In configuration (1), participants stand side-by-side. In (2), participants are positioned across from each other, mirrored face-to-face. In (3), called "eyes-free," participants stand side-by-side looking at a shared display, and draw upon a horizontal surface. We also explored a "telepathy" mode, in which participants could see from each other's point of view. We implemented "3DSketch" visual objects for participants to manipulate and move between virtual content boards in the environment. To evaluate the system, we conducted a study in which four people at a time used each of the three configurations to cooperate and communicate ideas with each other. We have provided experimental results and interview responses.

CV · Sep 4, 2019
Beyond Photo Realism for Domain Adaptation from Synthetic Data

Kristofer Schlachter, Connor DeFanti, Sebastian Herscher et al.

As synthetic imagery is used more frequently in training deep models, it is important to understand how different synthesis techniques impact the performance of such models. In this work, we perform a thorough evaluation of the effectiveness of several different synthesis techniques and their impact on the complexity of classifier domain adaptation to the "real" underlying data distribution that they seek to replicate. In addition, we propose a novel learned synthesis technique to better train classifier models than state-of-the-art offline graphical methods, while using significantly fewer computational resources. We accomplish this by learning a generative model to perform shading of synthetic geometry conditioned on a "g-buffer" representation of the scene to render, as well as a low-sample Monte Carlo rendered image. The major contributions are (i) a dataset that allows comparison of real and synthetic versions of the same scene, (ii) an augmented data representation that boosts the stability of learning and improves the dataset's accuracy, (iii) three different partially differentiable rendering techniques where lighting, denoising and shading are learned, and (iv) an improvement to a state-of-the-art generative adversarial network (GAN) approach that uses an ensemble of trained models to generate datasets that approach the performance of training on real data and surpass the performance of full global illumination rendering.

HC · Feb 8, 2019
Virtual Environments for Rehabilitation of Postural Control Dysfunction

Zhu Wang, Anat Lubetzky, Marta Gospodarek et al.

We developed a novel virtual reality (VR) platform with 3-dimensional sounds to help improve sensory integration and visuomotor processing for postural control and fall prevention in individuals with balance problems related to sensory deficits, such as vestibular dysfunction (disease of the inner ear). The system has scenes that simulate scenario-based environments. We can adjust the intensity of the visual and audio stimuli in the virtual scenes by controlling the user interface (UI) settings. A VR headset (HTC Vive or Oculus Rift) delivers stereo display while providing real-time position and orientation of the participant's head. The 3D game-like scenes make participants feel immersed and gradually expose them to situations that may induce dizziness, anxiety or imbalance in their daily living.

HC · Sep 16, 2018
Manifest the Invisible: Design for Situational Awareness of Physical Environments in Virtual Reality

Zhenyi He, Fengyuan Zhu, Ken Perlin et al.

Virtual Reality (VR) provides immersive experiences in the virtual world, but it may reduce users' awareness of physical surroundings and cause safety concerns and psychological discomfort. Hence, there is a need for an ambient information design to increase users' situational awareness (SA) of physical elements when they are immersed in a VR environment. This is challenging, since there is a tradeoff between awareness of reality and interference with users' experience in virtuality. In this paper, we design five representations (indexical, symbolic, and iconic with three emotions) based on two dimensions (vividness and emotion) to address the problem. We conduct an empirical study to evaluate participants' SA, perceived breaks in presence (BIPs), and perceived engagement through VR tasks that require movement in space. Results show that designs with higher vividness evoke more SA, designs that are more consistent with the virtual environment can mitigate the BIP issue, and emotion-evoking designs are more engaging.

HC · Aug 10, 2017
PhyShare: Sharing Physical Interaction in Virtual Reality

Zhenyi He, Fengyuan Zhu, Ken Perlin

We present PhyShare, a new haptic user interface based on actuated robots. Virtual reality has recently been gaining wide adoption, and effective haptic feedback in these scenarios can strongly support the user's senses in bridging the virtual and physical worlds. Since participants do not directly observe these robotic proxies, we investigate the multiple mappings between physical robots and virtual proxies that can utilize the resources needed to provide a well-rounded VR experience. PhyShare bots can act either as directly touchable objects or invisible carriers of physical objects, depending on different scenarios. They also support distributed collaboration, allowing remotely located VR collaborators to share the same physical feedback.

HC · Jan 31, 2017
Robotic Haptic Proxies for Collaborative Virtual Reality

Zhenyi He, Fengyuan Zhu, Aaron Gaudette et al.

We propose a new approach for interaction in Virtual Reality (VR) using mobile robots as proxies for haptic feedback. This approach allows VR users to have the experience of sharing and manipulating tangible physical objects with remote collaborators. Because participants do not directly observe the robotic proxies, the mapping between them and the virtual objects is not required to be direct. In this paper, we describe our implementation, various scenarios for interaction, and a preliminary user study.

CV · Jul 13, 2016
Accelerating Eulerian Fluid Simulation With Convolutional Networks

Jonathan Tompson, Kristofer Schlachter, Pablo Sprechmann et al.

Efficient simulation of the Navier-Stokes equations for fluid flow is a long-standing problem in applied mathematics, for which state-of-the-art methods require large compute resources. In this work, we propose a data-driven approach that combines the approximation power of deep learning with the precision of standard solvers to obtain fast and highly realistic simulations. Our method solves the incompressible Euler equations using the standard operator splitting method, in which a large sparse linear system with many free parameters must be solved. We use a Convolutional Network with a highly tailored architecture, trained using a novel unsupervised learning framework to solve the linear system. We present real-time 2D and 3D simulations that outperform recently proposed data-driven methods; the obtained results are realistic and show good generalization properties.
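
For readers unfamiliar with operator splitting, the following is the textbook form of the pressure projection step, written here as a sketch under standard assumptions (constant density $\rho$, time step $\Delta t$, external force $f$); the notation is not taken from the paper. It shows where the large sparse linear system comes from.

```latex
% Incompressible Euler equations
\frac{\partial u}{\partial t} = -(u \cdot \nabla)\,u - \frac{1}{\rho}\nabla p + f,
\qquad \nabla \cdot u = 0

% Step 1: advect and apply forces, ignoring pressure, to get an
% intermediate velocity u^* that is generally not divergence-free
u^* = u^n + \Delta t \left[ -(u^n \cdot \nabla)\,u^n + f \right]

% Step 2: solve the pressure Poisson equation
% (discretized, this is the large sparse linear system)
\nabla^2 p = \frac{\rho}{\Delta t}\, \nabla \cdot u^*

% Step 3: project onto the divergence-free space
u^{n+1} = u^* - \frac{\Delta t}{\rho}\, \nabla p

% Check: \nabla \cdot u^{n+1}
%   = \nabla \cdot u^* - \frac{\Delta t}{\rho} \nabla^2 p = 0
```

Step 2 is the part that dominates the cost of a conventional solver, and it is the linear system that the abstract says the Convolutional Network is trained to solve.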

HC · Apr 27, 2016
A Collaborative Untethered Virtual Reality Environment for Interactive Social Network Visualization

Sam Royston, Connor DeFanti, Ken Perlin

The increasing prevalence of Virtual Reality technologies as a platform for gaming and video playback warrants research into how to best apply the current state of the art to challenges in data visualization. Many current VR systems are noncollaborative, while data analysis and visualization is often a multi-person process. Our goal in this paper is to address the technical and user experience challenges that arise when creating VR environments for collaborative data visualization. We focus on the integration of multiple tracking systems and the new interaction paradigms that this integration can enable, along with visual design considerations that apply specifically to collaborative network visualization in virtual reality. We demonstrate a system for collaborative interaction with large 3D layouts of Twitter friend/follow networks. The system is built by combining a 'Holojam' architecture (multiple GearVR Headsets within an OptiTrack motion capture stage) and Perception Neuron motion suits, to offer an untethered, full-room multi-person visualization experience.