Yan Gu

RO
h-index26
24papers
433citations
Novelty49%
AI Score56

24 Papers

DSJun 2
Parallel Metric Skiplists and Nearest Neighbor Search

Xiangyun Ding, Rohin Garg, Yan Gu et al.

The metric skip-list is a data structure designed for efficient nearest and $k$-nearest neighbor search in metric spaces. For many real-world datasets with reasonable distributions - specifically, those with a constant expansion rate - it supports $\tilde{O}(n)$ construction time and $O(k\log n)$ query time, where $n$ is the input size and $k$ is the number of nearest neighbors in queries. Notably, unlike alternative approaches, it does not require a bounded aspect ratio, making it more flexible for input data distributions. However, the inherently sequential nature of its original construction has, to our knowledge, precluded any existing parallel algorithm. In this paper, we present highly parallel and work-efficient algorithms for constructing metric skip lists. Under the assumption of a constant expansion rate, our approach achieves an expected work of $O(n \log n)$ and a polylogarithmic span with high probability. Our design is based on novel algorithmic insights that improves the sequential procedure, enabling a divide-and-conquer strategy that facilitates parallelism while maintaining efficiency. With our algorithms, we can also support improved bounds for relevant applications using nearest neighbor as building blocks, including bichromatic closest pair (BCP), density-based clustering, and $k$-NN graph construction, among others. To our knowledge, many of these results represent the first solutions to achieve both work efficiency and polylogarithmic span, relying solely on the assumption of a constant expansion rate.

DCMay 3
Parallel Point-to-Point Shortest Paths and Batch Queries

Xiaojun Dong, Andy Li, Yan Gu et al.

We propose Orionet, efficient parallel implementations of Point-to-Point Shortest Paths (PPSP) queries using bidirectional search (BiDS) and other heuristics, with an additional focus on batch PPSP queries. We present a framework for parallel PPSP built on existing single-source shortest paths (SSSP) frameworks by incorporating pruning conditions. As a result, we develop efficient parallel PPSP algorithms based on early termination, bidirectional search, A$^*$ search, and bidirectional A$^*$ all with simple and efficient implementations. We extend our idea to batch PPSP queries, which are widely used in real-world scenarios. We first design a simple and flexible abstraction to represent the batch so PPSP can leverage the shared information of the batch. Orionet formalizes the batch as a query graph represented by edges between queried sources and targets. In this way, we directly extended our PPSP framework to batched queries in a simple and efficient way. We evaluate Orionet on both single and batch PPSP queries using various graph types and distance percentiles of queried pairs, and compare it against two baselines, GraphIt and MBQ. Both of them support parallel single PPSP and A$^*$ using unidirectional search. On 14 graphs we tested, on average, our bidirectional search is 2.9$\times$ faster than GraphIt, and 6.8$\times$ faster than MBQ. Our bidirectional A$^*$ is 4.4$\times$ and 6.2$\times$ faster than the A$^*$ in GraphIt and MBQ, respectively. For batched PPSP queries, we also provide in-depth experimental evaluation, and show that Orionet provides strong performance compared to the plain solutions.

CLNov 6, 2023
Findings of the WMT 2023 Shared Task on Discourse-Level Literary Translation: A Fresh Orb in the Cosmos of LLMs

Longyue Wang, Zhaopeng Tu, Yan Gu et al.

Translating literary works has perennially stood as an elusive dream in machine translation (MT), a journey steeped in intricate challenges. To foster progress in this domain, we hold a new shared task at WMT 2023, the first edition of the Discourse-Level Literary Translation. First, we (Tencent AI Lab and China Literature Ltd.) release a copyrighted and document-level Chinese-English web novel corpus. Furthermore, we put forth an industry-endorsed criteria to guide human evaluation process. This year, we totally received 14 submissions from 7 academia and industry teams. We employ both automatic and human evaluations to measure the performance of the submitted systems. The official ranking of the systems is based on the overall human judgments. In addition, our extensive analysis reveals a series of interesting findings on literary and discourse-aware MT. We release data, system outputs, and leaderboard at http://www2.statmt.org/wmt23/literary-translation-task.html.

ROMar 18Code
PACE: Physics Augmentation for Coordinated End-to-end Reinforcement Learning toward Versatile Humanoid Table Tennis

Muqun Hu, Wenxi Chen, Wenjing Li et al.

Humanoid table tennis (TT) demands rapid perception, proactive whole-body motion, and agile footwork under strict timing--capabilities that remain difficult for end-to-end control policies. We propose a reinforcement learning (RL) framework that maps ball-position observations directly to whole-body joint commands for both arm striking and leg locomotion, strengthened by predictive signals and dense, physics-guided rewards. A lightweight learned predictor, fed with recent ball positions, estimates future ball states and augments the policy's observations for proactive decision-making. During training, a physics-based predictor supplies precise future states to construct dense, informative rewards that lead to effective exploration. The resulting policy attains strong performance across varied serve ranges (hit rate$\geq$96% and success rate$\geq$92%) in simulations. Ablation studies confirm that both the learned predictor and the predictive reward design are critical for end-to-end learning. Deployed zero-shot on a physical Booster T1 humanoid with 23 revolute joints, the policy produces coordinated lateral and forward-backward footwork with accurate, fast returns, suggesting a practical path toward versatile, competitive humanoid TT. We have open-sourced our RL training code at: https://github.com/purdue-tracelab/TTRL-ICRA2026

ROApr 22
A Survey of Legged Robotics in Non-Inertial Environments: Past, Present, and Future

I-Chia Chang, Xinyan Huang, Tzu-Yuan Lin et al.

Legged robots have demonstrated remarkable agility on rigid, stationary ground, but their locomotion reliability remains limited in non-inertial environments, where the supporting ground moves, tilts, or accelerates. Such conditions arise in ground transportation, maritime platforms, and aerospace settings, and they introduce persistent time-varying disturbances that break the stationary-ground assumptions underlying conventional legged locomotion. This survey reviews the state of the art in modeling, state estimation, and control for legged robots in non-inertial environments. We summarize representative application domains and motion characteristics, analyze the root causes of locomotion performance degradation, and review existing methods together with their key assumptions and limitations. We further identify open problems in robot-environment coupling, observability, robustness, and experimental validation, and discuss future directions in autonomy, system-level design, bio-inspired strategies, safety, and testing. The survey aims to clarify the technical foundations of this emerging area and support the development of reliable legged robots for real-world dynamic environments.

ROApr 14
Vectorizing Projection in Manifold-Constrained Motion Planning for Real-Time Whole-Body Control

Shrutheesh R Iyer, I-Chia Chang, Andrew Z. Liu et al.

Many robot planning tasks require satisfaction of one or more constraints throughout the entire trajectory. For geometric constraints, manifold-constrained motion planning algorithms are capable of planning collision-free path between start and goal configurations on the constraint submanifolds specified by task. Current state-of-the-art methods can take tens of seconds to solve these tasks for complex systems such as humanoid robots, making real-world use impractical, especially in dynamic settings. Inspired by recent advances in hardware accelerated motion planning, we present a CPU SIMD-accelerated manifold-constrained motion planner that revisits projection-based constraint satisfaction through the lens of parallelization. By transforming relevant components into parallelizable structures, we use SIMD parallelism to plan constraint satisfying solutions. Our approach achieves up to 100-1000x speed-ups over the state-of-the-art, making real-time constrained motion planning feasible for the first time. We demonstrate our planner on a real humanoid robot and show real-time whole-body quasi-static plan generation. Our work is available at https://commalab.org/papers/mcvamp/.

CVApr 11, 2025Code
MBE-ARI: A Multimodal Dataset Mapping Bi-directional Engagement in Animal-Robot Interaction

Ian Noronha, Advait Prasad Jawaji, Juan Camilo Soto et al.

Animal-robot interaction (ARI) remains an unexplored challenge in robotics, as robots struggle to interpret the complex, multimodal communication cues of animals, such as body language, movement, and vocalizations. Unlike human-robot interaction, which benefits from established datasets and frameworks, animal-robot interaction lacks the foundational resources needed to facilitate meaningful bidirectional communication. To bridge this gap, we present the MBE-ARI (Multimodal Bidirectional Engagement in Animal-Robot Interaction), a novel multimodal dataset that captures detailed interactions between a legged robot and cows. The dataset includes synchronized RGB-D streams from multiple viewpoints, annotated with body pose and activity labels across interaction phases, offering an unprecedented level of detail for ARI research. Additionally, we introduce a full-body pose estimation model tailored for quadruped animals, capable of tracking 39 keypoints with a mean average precision (mAP) of 92.7%, outperforming existing benchmarks in animal pose estimation. The MBE-ARI dataset and our pose estimation framework lay a robust foundation for advancing research in animal-robot interaction, providing essential tools for developing perception, reasoning, and interaction frameworks needed for effective collaboration between robots and animals. The dataset and resources are publicly available at https://github.com/RISELabPurdue/MBE-ARI/, inviting further exploration and development in this critical area.

DCJan 2, 2025
Deep Reinforcement Learning for Job Scheduling and Resource Management in Cloud Computing: An Algorithm-Level Review

Yan Gu, Zhaoze Liu, Shuhong Dai et al.

Cloud computing has revolutionized the provisioning of computing resources, offering scalable, flexible, and on-demand services to meet the diverse requirements of modern applications. At the heart of efficient cloud operations are job scheduling and resource management, which are critical for optimizing system performance and ensuring timely and cost-effective service delivery. However, the dynamic and heterogeneous nature of cloud environments presents significant challenges for these tasks, as workloads and resource availability can fluctuate unpredictably. Traditional approaches, including heuristic and meta-heuristic algorithms, often struggle to adapt to these real-time changes due to their reliance on static models or predefined rules. Deep Reinforcement Learning (DRL) has emerged as a promising solution to these challenges by enabling systems to learn and adapt policies based on continuous observations of the environment, facilitating intelligent and responsive decision-making. This survey provides a comprehensive review of DRL-based algorithms for job scheduling and resource management in cloud computing, analyzing their methodologies, performance metrics, and practical applications. We also highlight emerging trends and future research directions, offering valuable insights into leveraging DRL to advance both job scheduling and resource management in cloud computing.

CLDec 16, 2024
Findings of the WMT 2024 Shared Task on Discourse-Level Literary Translation

Longyue Wang, Siyou Liu, Chenyang Lyu et al.

Following last year, we have continued to host the WMT translation shared task this year, the second edition of the Discourse-Level Literary Translation. We focus on three language directions: Chinese-English, Chinese-German, and Chinese-Russian, with the latter two ones newly added. This year, we totally received 10 submissions from 5 academia and industry teams. We employ both automatic and human evaluations to measure the performance of the submitted systems. The official ranking of the systems is based on the overall human judgments. We release data, system outputs, and leaderboard at https://www2.statmt.org/wmt24/literary-translation-task.html.

LGOct 30, 2024
GWQ: Gradient-Aware Weight Quantization for Large Language Models

Yihua Shao, Yan Gu, Siyu Chen et al.

Large language models (LLMs) show impressive performance in solving complex language tasks. However, its large number of parameters presents significant challenges for the deployment. So, compressing LLMs to low bits can enable to deploy on resource-constrained devices. To address this problem, we propose gradient-aware weight quantization (GWQ), the first quantization approach for low-bit weight quantization that leverages gradients to localize outliers, requiring only a minimal amount of calibration data for outlier detection. GWQ retains the top 1\% outliers preferentially at FP16 precision, while the remaining non-outlier weights are stored in a low-bit. We widely evaluate GWQ on different task include language modeling, grounding detection, massive multitask language understanding and vision-language question and answering. Results show that models quantified by GWQ performs better than other quantization method. During quantization process, GWQ only need one calibration set to realize effective quant. Also, GWQ achieves 1.2x inference speedup in comparison to the original model and effectively reduces the inference memory.

QMDec 13, 2025
Multiscale Cross-Modal Mapping of Molecular, Pathologic, and Radiologic Phenotypes in Lipid-Deficient Clear Cell Renal CellCarcinoma

Ying Cui, Dongzhe Zheng, Ke Yu et al.

Clear cell renal cell carcinoma (ccRCC) exhibits extensive intratumoral heterogeneity on multiple biological scales, contributing to variable clinical outcomes and limiting the effectiveness of conventional TNM staging, which highlights the urgent need for multiscale integrative analytic frameworks. The lipid-deficient de-clear cell differentiated (DCCD) ccRCC subtype, defined by multi-omics analyses, is associated with adverse outcomes even in early-stage disease. Here, we establish a hierarchical cross-scale framework for the preoperative identification of DCCD-ccRCC. At the highest layer, cross-modal mapping transferred molecular signatures to histological and CT phenotypes, establishing a molecular-to-pathology-to-radiology supervisory bridge. Within this framework, each modality-specific model is designed to mirror the inherent hierarchical structure of tumor biology. PathoDCCD captured multi-scale microscopic features, from cellular morphology and tissue architecture to meso-regional organization. RadioDCCD integrated complementary macroscopic information by combining whole-tumor and its habitat-subregions radiomics with a 2D maximal-section heterogeneity metric. These nested models enabled integrated molecular subtype prediction and clinical risk stratification. Across five cohorts totaling 1,659 patients, PathoDCCD reliably recapitulated molecular subtypes, while RadioDCCD provided reliable preoperative prediction. The consistent predictions identified patients with the poorest clinical outcomes. This cross-scale paradigm unifies molecular biology, computational pathology, and quantitative radiology into a biologically grounded strategy for preoperative noninvasive molecular phenotyping of ccRCC.

CVMar 24, 2025
Leveraging Perturbation Robustness to Enhance Out-of-Distribution Detection

Wenxi Chen, Raymond A. Yeh, Shaoshuai Mou et al.

Out-of-distribution (OOD) detection is the task of identifying inputs that deviate from the training data distribution. This capability is essential for safely deploying deep computer vision models in open-world environments. In this work, we propose a post-hoc method, Perturbation-Rectified OOD detection (PRO), based on the insight that prediction confidence for OOD inputs is more susceptible to reduction under perturbation than in-distribution (IND) inputs. Based on the observation, we propose an adversarial score function that searches for the local minimum scores near the original inputs by applying gradient descent. This procedure enhances the separability between IND and OOD samples. Importantly, the approach improves OOD detection performance without complex modifications to the underlying model architectures. We conduct extensive experiments using the OpenOOD benchmark~\cite{yang2022openood}. Our approach further pushes the limit of softmax-based OOD detection and is the leading post-hoc method for small-scale models. On a CIFAR-10 model with adversarial training, PRO effectively detects near-OOD inputs, achieving a reduction of more than 10\% on FPR@95 compared to state-of-the-art methods.

DSJan 1, 2024
Parallel Integer Sort: Theory and Practice

Xiaojun Dong, Laxman Dhulipala, Yan Gu et al.

Integer sorting is a fundamental problem in computer science. This paper studies parallel integer sort both in theory and in practice. In theory, we show tighter bounds for a class of existing practical integer sort algorithms, which provides a solid theoretical foundation for their widespread usage in practice and strong performance. In practice, we design a new integer sorting algorithm, \textsf{DovetailSort}, that is theoretically-efficient and has good practical performance. In particular, \textsf{DovetailSort} overcomes a common challenge in existing parallel integer sorting algorithms, which is the difficulty of detecting and taking advantage of duplicate keys. The key insight in \textsf{DovetailSort} is to combine algorithmic ideas from both integer- and comparison-sorting algorithms. In our experiments, \textsf{DovetailSort} achieves competitive or better performance than existing state-of-the-art parallel integer and comparison sorting algorithms on various synthetic and real-world datasets.

IRMay 7, 2023
ParlayANN: Scalable and Deterministic Parallel Graph-Based Approximate Nearest Neighbor Search Algorithms

Magdalen Dobson Manohar, Zheqi Shen, Guy E. Blelloch et al.

Approximate nearest-neighbor search (ANNS) algorithms are a key part of the modern deep learning stack due to enabling efficient similarity search over high-dimensional vector space representations (i.e., embeddings) of data. Among various ANNS algorithms, graph-based algorithms are known to achieve the best throughput-recall tradeoffs. Despite the large scale of modern ANNS datasets, existing parallel graph based implementations suffer from significant challenges to scale to large datasets due to heavy use of locks and other sequential bottlenecks, which 1) prevents them from efficiently scaling to a large number of processors, and 2) results in nondeterminism that is undesirable in certain applications. In this paper, we introduce ParlayANN, a library of deterministic and parallel graph-based approximate nearest neighbor search algorithms, along with a set of useful tools for developing such algorithms. In this library, we develop novel parallel implementations for four state-of-the-art graph-based ANNS algorithms that scale to billion-scale datasets. Our algorithms are deterministic and achieve high scalability across a diverse set of challenging datasets. In addition to the new algorithmic ideas, we also conduct a detailed experimental study of our new algorithms as well as two existing non-graph approaches. Our experimental results both validate the effectiveness of our new techniques, and lead to a comprehensive comparison among ANNS algorithms on large scale datasets with a list of interesting findings.

ROJan 31, 2022
DRS-LIP: Linear Inverted Pendulum Model for Legged Locomotion on Dynamic Rigid Surfaces

Amir Iqbal, Sushant Veer, Yan Gu

Legged robot locomotion on a dynamic rigid surface (i.e., a rigid surface moving in the inertial frame) involves complex full-order dynamics that is high-dimensional, nonlinear, and time-varying. Towards deriving an analytically tractable dynamic model, this study theoretically extends the reduced-order linear inverted pendulum (LIP) model from legged locomotion on a stationary surface to locomotion on a dynamic rigid surface (DRS). The resulting model is herein termed as DRS-LIP. Furthermore, this study introduces an approximate analytical solution of the proposed DRS-LIP that is computationally efficient with high accuracy. To illustrate the practical uses of the analytical results, they are used to develop a hierarchical planning framework that efficiently generates physically feasible trajectories for DRS locomotion. The effectiveness of the proposed theoretical results and motion planner is demonstrated both through simulations and experimentally on a Laikago quadrupedal robot that walks on a rocking treadmill.

ROJan 25, 2022
Invariant Filtering for Legged Humanoid Locomotion on Dynamic Rigid Surfaces

Yuan Gao, Chengzhi Yuan, Yan Gu

State estimation for legged locomotion over a dynamic rigid surface (DRS), which is a rigid surface moving in the world frame (e.g., ships, aircraft, and trains), remains an under-explored problem. This paper introduces an invariant extended Kalman filter that estimates the robot's pose and velocity during DRS locomotion by using common sensors of legged robots (e.g., inertial measurement units (IMU), joint encoders, and RDB-D camera). A key feature of the filter lies in that it explicitly addresses the nonstationary surface-foot contact point and the hybrid robot behaviors. Another key feature is that, in the absence of IMU biases, the filter satisfies the attractive group affine and invariant observation conditions, and is thus provably convergent for the deterministic continuous phases. The observability analysis is performed to reveal the effects of DRS movement on the state observability, and the convergence property of the hybrid, deterministic filter system is examined for the observable state variables. Experiments of a Digit humanoid robot walking on a pitching treadmill validate the effectiveness of the proposed filter under large estimation errors and moderate DRS movement. The video of the experiments can be found at: https://youtu.be/ScQIBFUSKzo.

ROSep 10, 2021
Extended Capture Point and Optimization-based Control for Quadrupedal Robot Walking on Dynamic Rigid Surfaces

Amir Iqbal, Yan Gu

Stabilizing legged robot locomotion on a dynamic rigid surface (DRS) (i.e., rigid surface that moves in the inertial frame) is a complex planning and control problem. The complexity arises due to the hybrid nonlinear walking dynamics subject to explicitly time-varying holonomic constraints caused by the surface movement. The first main contribution of this study is the extension of the capture point from walking on a static surface to locomotion on a DRS as well as the use of the resulting capture point for online motion planning. The second main contribution is a quadratic-programming (QP) based feedback controller design that explicitly considers the DRS movement. The stability and robustness of the proposed control approach are validated through simulations of a quadrupedal robot walking on a DRS with a rocking motion. The simulation results also demonstrate the improved walking performance compared with our previous approach based on offline planning and input-output linearizing control that does not explicitly guarantee the feasibility of ground contact constraints.

ROSep 2, 2021
Invariant Filtering for Bipedal Walking on Dynamic Rigid Surfaces with Orientation-based Measurement Model

Yuan Gao, Yan Gu

Real-world applications of bipedal robot walking require accurate, real-time state estimation. State estimation for locomotion over dynamic rigid surfaces (DRS), such as elevators, ships, public transport vehicles, and aircraft, remains under-explored, although state estimator designs for stationary rigid surfaces have been extensively studied. Addressing DRS locomotion in state estimation is a challenging problem mainly due to the nonlinear, hybrid nature of walking dynamics, the nonstationary surface-foot contact points, and hardware imperfections (e.g., limited availability, noise, and drift of onboard sensors). Towards solving this problem, we introduce an Invariant Extended Kalman Filter (InEKF) whose process and measurement models explicitly consider the DRS movement and hybrid walking behaviors while respectively satisfying the group-affine condition and invariant form. Due to these attractive properties, the estimation error convergence of the filter is provably guaranteed for hybrid DRS locomotion. The measurement model of the filter also exploits the holonomic constraint associated with the support-foot and surface orientations, under which the robot's yaw angle in the world becomes observable in the presence of general DRS movement. Experimental results of bipedal walking on a rocking treadmill demonstrate the proposed filter ensures the rapid error convergence and observable base yaw angle.

ROAug 8, 2021
Global-Position Tracking Control of 3-D Bipedal Walking via Virtual Constraint Design and Multiple Lyapunov Analysis

Yan Gu, Yuan Gao, Bin Yao et al.

A safety-critical measure of legged locomotion performance is a robot's ability to track its desired time-varying position trajectory in an environment, which is herein termed as "global-position tracking". This paper introduces a nonlinear control approach that achieves asymptotic global-position tracking for three-dimensional (3-D) bipedal robot walking. Designing a global-position tracking controller presents a challenging problem due to the complex hybrid robot model and the time-varying desired global-position trajectory. Towards tackling this problem, the first main contribution is the construction of impact invariance to ensure all desired trajectories respect the foot-landing impact dynamics, which is a necessary condition for realizing asymptotic tracking of hybrid walking systems. Thanks to their independence of the desired global position, these conditions can be exploited to decouple the higher-level planning of the global position and the lower-level planning of the remaining trajectories, thereby greatly alleviating the computational burden of motion planning. The second main contribution is the Lyapunov-based stability analysis of the hybrid closed-loop system, which produces sufficient conditions to guide the controller design for achieving asymptotic global-position tracking during fully actuated walking. Simulations and experiments on a 3-D bipedal robot with twenty revolute joints confirm the validity of the proposed control approach in guaranteeing accurate tracking.

DSJun 8, 2021
ParChain: A Framework for Parallel Hierarchical Agglomerative Clustering using Nearest-Neighbor Chain

Shangdi Yu, Yiqiu Wang, Yan Gu et al.

This paper studies the hierarchical clustering problem, where the goal is to produce a dendrogram that represents clusters at varying scales of a data set. We propose the ParChain framework for designing parallel hierarchical agglomerative clustering (HAC) algorithms, and using the framework we obtain novel parallel algorithms for the complete linkage, average linkage, and Ward's linkage criteria. Compared to most previous parallel HAC algorithms, which require quadratic memory, our new algorithms require only linear memory, and are scalable to large data sets. ParChain is based on our parallelization of the nearest-neighbor chain algorithm, and enables multiple clusters to be merged on every round. We introduce two key optimizations that are critical for efficiency: a range query optimization that reduces the number of distance computations required when finding nearest neighbors of clusters, and a caching optimization that stores a subset of previously computed distances, which are likely to be reused. Experimentally, we show that our highly-optimized implementations using 48 cores with two-way hyper-threading achieve 5.8--110.1x speedup over state-of-the-art parallel HAC algorithms and achieve 13.75--54.23x self-relative speedup. Compared to state-of-the-art algorithms, our algorithms require up to 237.3x less space. Our algorithms are able to scale to data set sizes with tens of millions of points, which existing algorithms are not able to handle.

DSApr 2, 2021
Fast Parallel Algorithms for Euclidean Minimum Spanning Tree and Hierarchical Spatial Clustering

Yiqiu Wang, Shangdi Yu, Yan Gu et al.

This paper presents new parallel algorithms for generating Euclidean minimum spanning trees and spatial clustering hierarchies (known as HDBSCAN$^*$). Our approach is based on generating a well-separated pair decomposition followed by using Kruskal's minimum spanning tree algorithm and bichromatic closest pair computations. We introduce a new notion of well-separation to reduce the work and space of our algorithm for HDBSCAN$^*$. We also present a parallel approximate algorithm for OPTICS based on a recent sequential algorithm by Gan and Tao. Finally, we give a new parallel divide-and-conquer algorithm for computing the dendrogram and reachability plots, which are used in visualizing clusters of different scale that arise for both EMST and HDBSCAN$^*$. We show that our algorithms are theoretically efficient: they have work (number of operations) matching their sequential counterparts, and polylogarithmic depth (parallel time). We implement our algorithms and propose a memory optimization that requires only a subset of well-separated pairs to be computed and materialized, leading to savings in both space (up to 10x) and time (up to 8x). Our experiments on large real-world and synthetic data sets using a 48-core machine show that our fastest algorithms outperform the best serial algorithms for the problems by 11.13--55.89x, and existing parallel algorithms by at least an order of magnitude.

DSDec 12, 2019
Theoretically-Efficient and Practical Parallel DBSCAN

Yiqiu Wang, Yan Gu, Julian Shun

The DBSCAN method for spatial clustering has received significant attention due to its applicability in a variety of data analysis tasks. There are fast sequential algorithms for DBSCAN in Euclidean space that take $O(n\log n)$ work for two dimensions, sub-quadratic work for three or more dimensions, and can be computed approximately in linear work for any constant number of dimensions. However, existing parallel DBSCAN algorithms require quadratic work in the worst case, making them inefficient for large datasets. This paper bridges the gap between theory and practice of parallel DBSCAN by presenting new parallel algorithms for Euclidean exact DBSCAN and approximate DBSCAN that match the work bounds of their sequential counterparts, and are highly parallel (polylogarithmic depth). We present implementations of our algorithms along with optimizations that improve their practical performance. We perform a comprehensive experimental evaluation of our algorithms on a variety of datasets and parameter settings. Our experiments on a 36-core machine with hyper-threading show that we outperform existing parallel DBSCAN implementations by up to several orders of magnitude, and achieve speedups by up to 33x over the best sequential algorithms.

ROOct 23, 2019
Impact-Aware Online Motion Planning for Fully-Actuated Bipedal Robot Walking

Yuan Gao, Xingye Da, Yan Gu

The ability to track a general walking path with specific timing is crucial to the operational safety and reliability of bipedal robots for avoiding dynamic obstacles, such as pedestrians, in complex environments. This paper introduces an online, full-body motion planner that generates the desired impact-aware motion for fully-actuated bipedal robotic walking. The main novelty of the proposed planner lies in its capability of producing desired motions in real-time that respect the discrete impact dynamics and the desired impact timing. To derive the proposed planner, a full-order hybrid dynamic model of fully-actuated bipedal robotic walking is presented, including both continuous dynamics and discrete lading impacts. Next, the proposed impact-aware online motion planner is introduced. Finally, simulation results of a 3-D bipedal robot are provided to confirm the effectiveness of the proposed online impact-aware planner. The online planner is capable of generating full-body motion of one walking step within 0.6 second, which is shorter than a typical bipedal walking step.

AIJun 4, 2014
Cascading A*: a Parallel Approach to Approximate Heuristic Search

Yan Gu

In this paper, we proposed a new approximate heuristic search algorithm: Cascading A*, which is a two-phrase algorithm combining A* and IDA* by a new concept "envelope ball". The new algorithm CA* is efficient, able to generate approximate solution and any-time solution, and parallel friendly.