CV · Mar 12, 2020
Learning to Segment 3D Point Clouds in 2D Image Space
Yecheng Lyu, Xinming Huang, Ziming Zhang
In contrast to the literature, where local patterns in 3D point clouds are captured by customized convolutional operators, in this paper we study how to project such point clouds into a 2D image space effectively and efficiently so that traditional 2D convolutional neural networks (CNNs) such as U-Net can be applied for segmentation. To this end, motivated by graph drawing, we reformulate the problem as an integer programming problem to learn a topology-preserving graph-to-grid mapping for each individual point cloud. To accelerate the computation in practice, we further propose a novel hierarchical approximate algorithm. Using Delaunay triangulation for graph construction from point clouds and a multi-scale U-Net for segmentation, we demonstrate state-of-the-art performance on ShapeNet and PartNet, with significant improvement over the literature. Code is available at https://github.com/Zhang-VISLab.
CV · Oct 19, 2021
CoFi: Coarse-to-Fine ICP for LiDAR Localization in an Efficient Long-lasting Point Cloud Map
Yecheng Lyu, Xinming Huang, Ziming Zhang
LiDAR odometry and localization have attracted increasing research interest in recent years. In existing works, iterative closest point (ICP) is widely used since it is precise and efficient. Due to its non-convexity and its local iterative strategy, however, ICP-based methods easily fall into local optima, which in turn calls for a precise initialization. In this paper, we propose CoFi, a Coarse-to-Fine ICP algorithm for LiDAR localization. Specifically, the proposed algorithm down-samples the input point sets at multiple voxel resolutions and gradually refines the transformation from the coarse point sets to the fine-grained point sets. In addition, we propose a map-based LiDAR localization algorithm that extracts semantic feature points from the LiDAR frames and applies CoFi to estimate the pose on an efficient point cloud map. With the help of the Cylinder3D algorithm for LiDAR scan semantic segmentation, the proposed CoFi localization algorithm demonstrates state-of-the-art performance on the KITTI odometry benchmark, with significant improvement over the literature.
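The coarse-to-fine refinement described in this abstract can be illustrated with a minimal NumPy sketch. This is not the authors' CoFi implementation: the function names, the voxel sizes, the brute-force nearest-neighbor search, and the point-to-point SVD update are all assumptions chosen for brevity, and the semantic feature extraction and map handling are omitted.

```python
import numpy as np

def voxel_downsample(points, voxel):
    """Replace all points falling in the same cubic voxel by their centroid."""
    keys = np.floor(points / voxel).astype(np.int64)
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    n_voxels = inverse.max() + 1
    sums = np.zeros((n_voxels, 3))
    np.add.at(sums, inverse, points)          # accumulate points per voxel
    counts = np.bincount(inverse, minlength=n_voxels)
    return sums / counts[:, None]

def best_rigid_transform(src, dst):
    """Least-squares R, t with dst ~ src @ R.T + t for paired points (SVD)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    sign = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, sign]) @ U.T  # guard against reflections
    return R, cd - R @ cs

def icp(src, dst, R, t, iters=20):
    """Point-to-point ICP started from the initial guess (R, t)."""
    for _ in range(iters):
        moved = src @ R.T + t
        d2 = ((moved[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
        nearest = dst[d2.argmin(axis=1)]      # brute-force correspondences
        R, t = best_rigid_transform(src, nearest)
    return R, t

def coarse_to_fine_icp(src, dst, voxel_sizes=(0.5, 0.25, 0.05)):
    """Run ICP from the coarsest to the finest voxel size (hypothetical values),
    passing each level's estimate on as the next level's initialization."""
    R, t = np.eye(3), np.zeros(3)
    for voxel in voxel_sizes:
        R, t = icp(voxel_downsample(src, voxel), voxel_downsample(dst, voxel), R, t)
    return R, t
```

Calling `coarse_to_fine_icp(src, dst)` on two `(N, 3)` arrays returns a rotation matrix and translation vector that map `src` toward `dst`. The coarse levels are cheap and give the fine levels a better starting point, which is the role the abstract assigns to the multi-resolution schedule.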
RO · Sep 13, 2021
LiDAR Odometry Methodologies for Autonomous Driving: A Survey
Nikhil Jonnavithula, Yecheng Lyu, Ziming Zhang
Vehicle odometry is an essential component of an automated driving system, as it computes the vehicle's position and orientation. The odometry module is in higher demand and has greater impact in urban areas, where the global navigation satellite system (GNSS) signal is weak and noisy. Traditional visual odometry methods suffer under varying illumination and produce discrepancies during pose estimation, which lead to significant errors as they accumulate. Odometry using light detection and ranging (LiDAR) devices has attracted increasing research interest, as LiDAR devices are robust to illumination variations. In this survey, we examine the existing LiDAR odometry methods, summarize the pipeline, and delineate its intermediate steps. Additionally, the existing LiDAR odometry methods are categorized by their correspondence type, and their advantages, disadvantages, and correlations are analyzed across and within categories at each step. Finally, we compare the accuracy and running speed of these methodologies on the KITTI odometry dataset and outline promising future research directions.
CV · May 23, 2021
Revisiting 2D Convolutional Neural Networks for Graph-based Applications
Yecheng Lyu, Xinming Huang, Ziming Zhang
Graph convolutional networks (GCNs) are widely used in graph-based applications such as graph classification and segmentation. However, current GCNs are limited in implementation aspects such as network architecture because of their irregular inputs. In contrast, convolutional neural networks (CNNs) are capable of extracting rich features from large-scale input data, but they do not support general graph inputs. To bridge the gap between GCNs and CNNs, in this paper we study how to map general graphs effectively and efficiently to 2D grids that CNNs can be directly applied to, while preserving graph topology as much as possible. We therefore propose two novel graph-to-grid mapping schemes, namely graph-preserving grid layout (GPGL) and its extension Hierarchical GPGL (H-GPGL) for computational efficiency. We formulate the GPGL problem as integer programming and further propose an approximate yet efficient solver based on a penalized Kamada-Kawai method, a well-known optimization algorithm in 2D graph drawing. We propose a novel vertex separation penalty that encourages graph vertices to lie on the grid without any overlap. With this image representation, even adding extra 2D max-pooling layers benefits PointNet, a widely applied point-based neural network. We demonstrate the empirical success of GPGL on general graph classification with small graphs and of H-GPGL on 3D point cloud segmentation with large graphs, based on 2D CNNs including VGG16, ResNet50 and multi-scale maxout (MSM) CNN.
CV · Mar 3, 2021
EllipsoidNet: Ellipsoid Representation for Point Cloud Classification and Segmentation
Yecheng Lyu, Xinming Huang, Ziming Zhang
Point cloud patterns are hard to learn because the local geometry features among the unordered points are implicit. In recent years, point cloud representation in 2D space has attracted increasing research interest since it exposes the local geometry features in a 2D space. By projecting those points to a 2D feature map, the relationship between points is inherited in the context between pixels, which is further extracted by a 2D convolutional neural network. However, existing 2D representation methods are either limited in accuracy or time-consuming. In this paper, we propose a novel 2D representation method that projects a point cloud onto an ellipsoid surface space, where local patterns are well exposed at the ellipsoid level and the point level. Additionally, a novel convolutional neural network named EllipsoidNet is proposed to utilize those features for point cloud classification and segmentation applications. The proposed methods are evaluated on the ModelNet40 and ShapeNet benchmarks, where the advantages over existing 2D representation methods are clearly shown.
CV · Sep 1, 2020
LodoNet: A Deep Neural Network with 2D Keypoint Matching for 3D LiDAR Odometry Estimation
Ce Zheng, Yecheng Lyu, Ming Li et al.
Deep learning based LiDAR odometry (LO) estimation has attracted increasing research interest in the fields of autonomous driving and robotics. Existing works feed consecutive LiDAR frames into neural networks as point clouds and match pairs in the learned feature space. In contrast, motivated by the success of image-based feature extractors, we propose to transfer the LiDAR frames to image space and reformulate the problem as image feature extraction. With the help of the scale-invariant feature transform (SIFT) for feature extraction, we are able to generate matched keypoint pairs (MKPs) that can be precisely returned to the 3D space. A convolutional neural network pipeline is designed for LiDAR odometry estimation from the extracted MKPs. The proposed scheme, namely LodoNet, is then evaluated on the KITTI odometry estimation benchmark, achieving results on par with or even better than the state of the art.
CV · Jun 21, 2020
TreeRNN: Topology-Preserving Deep Graph Embedding and Learning
Yecheng Lyu, Ming Li, Xinming Huang et al.
General graphs are difficult to learn from due to their irregular structures. Existing works employ message passing along graph edges to extract local patterns using customized graph kernels, but few of them are effective at integrating such local patterns into global features. In contrast, in this paper we study methods to transfer graphs into trees so that explicit orders are learned to direct the feature integration from local to global. To this end, we apply breadth-first search (BFS) to construct trees from the graphs, which adds direction to the graph edges from the center node to the peripheral nodes. In addition, we propose a novel projection scheme that transfers the trees to image representations, which are suitable for conventional convolutional neural networks (CNNs) and recurrent neural networks (RNNs). To best learn the patterns from the graph-tree-images, we propose TreeRNN, a 2D RNN architecture that recurrently integrates the image pixels by rows and columns to help classify the graph categories. We evaluate the proposed method on several graph classification datasets and demonstrate accuracy comparable with the state of the art on the MUTAG, PTC-MR and NCI1 datasets.
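The BFS step mentioned in this abstract, which turns an undirected graph into a tree with edges directed away from a chosen node, can be sketched in plain Python. This is an illustration only: the adjacency-list input, the function names, and the choice of root are assumptions, and the abstract does not say how the center node is selected.

```python
from collections import deque

def bfs_tree(adjacency, root):
    """Return {node: parent} for the BFS tree rooted at `root`.

    `adjacency` maps each node to a list of its neighbors (undirected graph).
    The root maps to None; nodes unreachable from the root are left out.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in parent:        # first visit fixes the parent
                parent[neighbor] = node
                queue.append(neighbor)
    return parent

def directed_edges(parent):
    """List tree edges as (parent, child), pointing away from the root."""
    return [(p, c) for c, p in parent.items() if p is not None]

# Hypothetical example: a 4-cycle (0-1-2-3) with a tail node 4 attached to 3.
graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2, 4], 4: [3]}
tree = bfs_tree(graph, root=0)
print(directed_edges(tree))  # [(0, 1), (0, 3), (1, 2), (3, 4)]
```

The cycle edge between nodes 2 and 3 is dropped, since node 2 is first reached through node 1. Every remaining edge has a well-defined direction from the root outward.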
IV · Jun 13, 2020
RoadNet-RT: High Throughput CNN Architecture and SoC Design for Real-Time Road Segmentation
Lin Bai, Yecheng Lyu, Xinming Huang
In recent years, convolutional neural networks have gained popularity in many engineering applications, especially in computer vision. To achieve better performance, more complex structures and advanced operations are often incorporated into the neural networks, which results in very long inference times. For time-critical tasks such as autonomous driving and virtual reality, real-time processing is fundamental. To reach real-time processing speed, a light-weight, high-throughput CNN architecture named RoadNet-RT is proposed for road segmentation in this paper. It achieves a 90.33% MaxF score on the test set of the KITTI road segmentation task and 8 ms per frame when running on a GTX 1080 GPU. Compared with the state-of-the-art network, RoadNet-RT speeds up inference by a factor of 20 at the cost of only 6.2% accuracy loss. For hardware design optimization, several techniques such as depthwise separable convolution and non-uniform kernel size convolution are custom-designed to further reduce the processing time. The proposed CNN architecture has been successfully implemented on an FPGA ZCU102 MPSoC platform that achieves a computation capability of 83.05 GOPS. The system throughput reaches 327.9 frames per second with image size 1216x176.
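The abstract reports one result as per-frame latency (GPU) and another as frames per second (FPGA). The short sketch below converts between the two so they can be read side by side. It only restates the reported figures; the helper names are illustrative, and it assumes one frame is processed at a time with no pipelining, which the abstract does not state.

```python
def fps_from_latency_ms(latency_ms):
    """Frames per second if each frame takes `latency_ms` milliseconds."""
    return 1000.0 / latency_ms

def latency_ms_from_fps(fps):
    """Average milliseconds per frame at a throughput of `fps` frames per second."""
    return 1000.0 / fps

# Figures quoted in the abstract.
gpu_latency_ms = 8.0         # GTX 1080, per frame
fpga_fps = 327.9             # ZCU102 system throughput
width, height = 1216, 176    # image size used for the FPGA throughput figure

gpu_fps = fps_from_latency_ms(gpu_latency_ms)        # 125.0 frames per second
fpga_latency_ms = latency_ms_from_fps(fpga_fps)      # roughly 3.05 ms per frame
fpga_pixels_per_s = width * height * fpga_fps        # roughly 70.2 million pixels/s

print(f"GPU:  {gpu_fps:.1f} fps")
print(f"FPGA: {fpga_latency_ms:.2f} ms per frame, {fpga_pixels_per_s / 1e6:.1f} Mpixel/s")
```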
CV · Jun 1, 2020
Automatic Building and Labeling of HD Maps with Deep Learning
Mahdi Elhousni, Yecheng Lyu, Ziming Zhang et al.
In a world where autonomous driving cars are becoming increasingly common, creating an adequate infrastructure for this new technology is essential. This includes building and labeling high-definition (HD) maps accurately and efficiently. Today, the process of creating HD maps requires a lot of human input, which takes time and is prone to errors. In this paper, we propose a novel method capable of generating labeled HD maps from raw sensor data. We implemented and tested our method on several urban scenarios using data collected from our test vehicle. The results show that the proposed deep learning based method can produce highly accurate HD maps. This approach speeds up the process of building and labeling HD maps, which can make a meaningful contribution to the deployment of autonomous vehicles.
LG · Sep 26, 2019
Graph-Preserving Grid Layout: A Simple Graph Drawing Method for Graph Classification using CNNs
Yecheng Lyu, Xinming Huang, Ziming Zhang
Graph convolutional networks (GCNs) suffer from the irregularity of graphs, while the more widely used convolutional neural networks (CNNs) benefit from regular grids. To bridge the gap between GCNs and CNNs, in contrast to previous works on generalizing the basic operations in CNNs to graph data, in this paper we address the problem of how to project undirected graphs onto the grid in a principled way so that CNNs can be used as the backbone for geometric deep learning. To this end, inspired by the literature on graph drawing, we propose a novel graph-preserving grid layout (GPGL), an integer program that minimizes the topological loss on the grid. Technically, we propose solving GPGL approximately using a regularized Kamada-Kawai algorithm, a well-known nonconvex optimization technique in graph drawing, with a vertex separation penalty that improves the rounding performance on top of the solutions from relaxation. Using GPGL we can easily conduct data augmentation, as every local minimum leads to a grid layout for the same graph. Together with multi-scale maxout CNNs, we demonstrate the empirical success of our method for graph classification.
CV · Aug 10, 2018
Road Segmentation Using CNN and Distributed LSTM
Yecheng Lyu, Lin Bai, Xinming Huang
In automated driving systems (ADS) and advanced driver-assistance systems (ADAS), efficient road segmentation is necessary to perceive the drivable region and build an occupancy map for path planning. The existing algorithms implement gigantic convolutional neural networks (CNNs) that are computationally expensive and time-consuming. In this paper, we introduce distributed LSTM, a neural network widely used in audio and video processing, to process rows and columns in images and feature maps. We then propose a new network combining convolutional and distributed LSTM layers to solve the road segmentation problem. Finally, the network is trained and tested on the KITTI road benchmark. The results show that the combined structure enhances feature extraction and processing while taking less processing time than a pure CNN structure.
CV · Apr 14, 2018
Road Segmentation Using CNN with GRU
Yecheng Lyu, Xinming Huang
This paper presents an accurate and fast algorithm for road segmentation using a convolutional neural network (CNN) and gated recurrent units (GRU). For autonomous vehicles, road segmentation is a fundamental task that can provide the drivable area for path planning. Existing deep neural network based segmentation algorithms usually adopt a very deep encoder-decoder structure to fuse pixels, which requires heavy computation, large memory and long processing time. Here, a CNN-GRU network model is proposed and trained to perform road segmentation using data captured by the front camera of a vehicle. The GRU network handles a long spatial sequence with lower computational complexity than a traditional encoder-decoder architecture. The proposed road detector is evaluated on the KITTI road benchmark and achieves high accuracy for road segmentation at real-time processing speed.
RO · Nov 7, 2017
Real-Time Road Segmentation Using LiDAR Data Processing on an FPGA
Yecheng Lyu, Lin Bai, Xinming Huang
This paper presents the FPGA design of a convolutional neural network (CNN) based road segmentation algorithm for real-time processing of LiDAR data. For autonomous vehicles, it is important to perform road segmentation and obstacle detection so that the drivable region can be identified for path planning. Traditional road segmentation algorithms are mainly based on image data from cameras, which is subject to lighting conditions as well as the quality of road markings. A LiDAR sensor can obtain the 3D geometry information of the vehicle surroundings with very high accuracy. However, it is a computational challenge to process a large amount of LiDAR data in real time. In this work, a convolutional neural network model is proposed and trained to perform semantic segmentation using the LiDAR sensor data. Furthermore, an efficient hardware design is implemented on the FPGA that can process each LiDAR scan in 16.9 ms, which is much faster than previous works. Evaluated on the KITTI road benchmark, the proposed solution achieves high road segmentation accuracy.