CV | Mar 28, 2023 | Code
Large-scale Training Data Search for Object Re-identification
Yue Yao, Huan Lei, Tom Gedeon et al.
We consider a scenario where we have access to the target domain, but cannot afford on-the-fly training data annotation, and instead would like to construct an alternative training set from a large-scale data pool such that a competitive model can be obtained. We propose a search and pruning (SnP) solution to this training data search problem, tailored to object re-identification (re-ID), an application aiming to match the same object captured by different cameras. Specifically, the search stage identifies and merges clusters of source identities whose distributions are similar to that of the target domain. The second stage, subject to a budget, then selects identities and their images from the Stage I output, to control the size of the resulting training set for efficient training. The two steps provide us with training sets 80% smaller than the source pool while achieving a similar or even higher re-ID accuracy. These training sets are also shown to be superior to a few existing search methods such as random sampling and greedy sampling under the same budget on training data size. If we release the budget, training sets resulting from the first stage alone allow even higher re-ID accuracy. We also discuss the specificity of our method to the re-ID problem and particularly its role in bridging the re-ID domain gap. The code is available at https://github.com/yorkeyao/SnP.
CV | Aug 18, 2023 | Code
Training with Product Digital Twins for AutoRetail Checkout
Yue Yao, Xinyu Tian, Zheng Tang et al.
Automating the checkout process is important in smart retail, where users effortlessly pass products by hand through a camera, triggering automatic product detection, tracking, and counting. In this emerging area, due to the lack of annotated training data, we introduce a dataset composed of product 3D models, which allows for fast, flexible, and large-scale training data generation through graphic engine rendering. Within this context, we observe that, because of the user's "hands-on" approach, bias in user behavior leads to distinct patterns in the real checkout process. The existence of such patterns would compromise training effectiveness if the training data fail to reflect them. To address this user bias problem, we propose a training data optimization framework, i.e., training with digital twins (DtTrain). Specifically, we leverage the product 3D models and optimize their rendering viewpoint and illumination to generate "digital twins" that visually resemble representative user images. These digital twins inherit product labels and, when augmented, form the Digital Twin training set (DT set). Because the digital twins individually mimic user bias, the resulting DT training set better reflects the characteristics of the target scenario and allows us to train more effective product detection and tracking models. In our experiment, we show that the DT set outperforms training sets created by existing dataset synthesis methods in terms of counting accuracy. Moreover, by combining the DT set with pseudo-labeled real checkout data, further improvement is observed. The code is available at https://github.com/yorkeyao/Automated-Retail-Checkout.
CV | Jan 23, 2023 | Code
CircNet: Meshing 3D Point Clouds with Circumcenter Detection
Huan Lei, Ruitao Leng, Liang Zheng et al.
Reconstructing 3D point clouds into triangle meshes is a key problem in computational geometry and surface reconstruction. Point cloud triangulation solves this problem by providing edge information to the input points. Since no vertex interpolation is involved, it is beneficial to preserve sharp details on the surface. Taking advantage of learning-based techniques in triangulation, existing methods enumerate the complete combinations of candidate triangles, which is both complex and inefficient. In this paper, we leverage the duality between a triangle and its circumcenter, and introduce a deep neural network that detects the circumcenters to achieve point cloud triangulation. Specifically, we introduce multiple anchor priors to divide the neighborhood space of each point. The neural network then learns to predict the presences and locations of circumcenters under the guidance of those anchors. We extract the triangles dual to the detected circumcenters to form a primitive mesh, from which an edge-manifold mesh is produced via simple post-processing. Unlike existing learning-based triangulation methods, the proposed method bypasses an exhaustive enumeration of triangle combinations and local surface parameterization. We validate the efficiency, generalization, and robustness of our method on prominent datasets of both watertight and open surfaces. The code and trained models are provided at https://github.com/EnyaHermite/CircNet.
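The duality between a triangle and its circumcenter that the abstract relies on can be illustrated with a minimal sketch. This is only the standard closed-form circumcenter of a 3D triangle, not the CircNet detection network or its anchor priors; the function name `circumcenter` and the tuple-based point format are assumptions made for illustration.

```python
def circumcenter(a, b, c):
    """Return the circumcenter of the 3D triangle (a, b, c).

    Each vertex is a length-3 tuple of floats. Illustrative helper only;
    it is not taken from the CircNet code base.
    """
    def sub(u, v):
        return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

    def dot(u, v):
        return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

    def cross(u, v):
        return (u[1] * v[2] - u[2] * v[1],
                u[2] * v[0] - u[0] * v[2],
                u[0] * v[1] - u[1] * v[0])

    # Edge vectors measured from vertex c.
    ea = sub(a, c)
    eb = sub(b, c)
    normal = cross(ea, eb)
    denom = 2.0 * dot(normal, normal)
    if denom == 0.0:
        raise ValueError("degenerate triangle: vertices are collinear")

    # Closed form: c + ((|ea|^2 eb - |eb|^2 ea) x (ea x eb)) / (2 |ea x eb|^2)
    la, lb = dot(ea, ea), dot(eb, eb)
    w = (la * eb[0] - lb * ea[0],
         la * eb[1] - lb * ea[1],
         la * eb[2] - lb * ea[2])
    offset = cross(w, normal)
    return (c[0] + offset[0] / denom,
            c[1] + offset[1] / denom,
            c[2] + offset[2] / denom)


# Right triangle in the z = 0 plane: the circumcenter is the hypotenuse midpoint.
print(circumcenter((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)))
```

The returned point is equidistant from all three vertices, which is the property that lets a detected circumcenter identify the triangle dual to it.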
PLASM-PH | Mar 29
From molecular dynamics to kinetic models: data-driven generalized collision operators in 1D3V plasmas
Yue Zhao, Guosheng Fu, Huan Lei
We present a data-driven approach for constructing generalized collisional kinetic models for inhomogeneous plasmas in one-dimensional physical space and three-dimensional velocity space (1D-3V). The collision operator is directly learned from micro-scale molecular dynamics (MD) and accurately accounts for the unresolved particle interactions over a broad range of plasma conditions. Unlike the standard Landau operator, the present operator takes an anisotropic, non-stationary form that captures the heterogeneous collisional energy transfer arising from the many-body interactions, which is crucial for plasma kinetics beyond the weakly coupled regime. Efficient numerical evaluation is achieved through a low-rank tensor representation with $O(N \log N)$ computational complexity. The constructed kinetic equation strictly preserves conservation laws and physical constraints, and therefore enables us to develop an explicit second-order, energy-conserving scheme that ensures fully discrete conservation of mass and total energy. Numerical results demonstrate that the present model accurately predicts both transport coefficients and several 1D-3V kinetic processes compared with MD simulations across a broad range of densities and temperatures in spatially inhomogeneous settings. This work provides a systematic pathway for bridging micro-scale MD and inhomogeneous plasma kinetic descriptions where empirical models show limitations.
CV | Dec 3, 2021 | Code
Mesh Convolution with Continuous Filters for 3D Surface Parsing
Huan Lei, Naveed Akhtar, Mubarak Shah et al.
Geometric feature learning for 3D surfaces is critical for many applications in computer graphics and 3D vision. However, deep learning currently lags in hierarchical modeling of 3D surfaces due to the lack of required operations and/or their efficient implementations. In this paper, we propose a series of modular operations for effective geometric feature learning from 3D triangle meshes. These operations include novel mesh convolutions, efficient mesh decimation and associated mesh (un)poolings. Our mesh convolutions exploit spherical harmonics as orthonormal bases to create continuous convolutional filters. The mesh decimation module is GPU-accelerated and able to process batched meshes on-the-fly, while the (un)pooling operations compute features for up/down-sampled meshes. We provide open-source implementation of these operations, collectively termed Picasso. Picasso supports heterogeneous mesh batching and processing. Leveraging its modular operations, we further contribute a novel hierarchical neural network for perceptual parsing of 3D surfaces, named PicassoNet++. It achieves highly competitive performance for shape analysis and scene segmentation on prominent 3D benchmarks. The code, data and trained models are available at https://github.com/EnyaHermite/Picasso.
CV | Mar 28, 2021 | Code
Picasso: A CUDA-based Library for Deep Learning over 3D Meshes
Huan Lei, Naveed Akhtar, Ajmal Mian
We present Picasso, a CUDA-based library comprising novel modules for deep learning over complex real-world 3D meshes. Hierarchical neural architectures have proved effective in multi-scale feature extraction, which signifies the need for fast mesh decimation. However, existing methods rely on CPU-based implementations to obtain multi-resolution meshes. We design GPU-accelerated mesh decimation to facilitate network resolution reduction efficiently on-the-fly. Pooling and unpooling modules are defined on the vertex clusters gathered during decimation. For feature learning over meshes, Picasso contains three types of novel convolutions, namely facet2vertex, vertex2facet, and facet2facet convolutions. Hence, it treats a mesh as a geometric structure comprising vertices and facets, rather than a spatial graph with edges as previous methods do. Picasso also incorporates a fuzzy mechanism in its filters for robustness to mesh sampling (vertex density). It exploits Gaussian mixtures to define fuzzy coefficients for the facet2vertex convolution, and barycentric interpolation to define the coefficients for the remaining two convolutions. In this release, we demonstrate the effectiveness of the proposed modules with competitive segmentation results on S3DIS. The library will be made public through https://github.com/hlei-ziyan/Picasso.
CV | Sep 20, 2019 | Code
Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds
Huan Lei, Naveed Akhtar, Ajmal Mian
We propose a spherical kernel for efficient graph convolution of 3D point clouds. Our metric-based kernels systematically quantize the local 3D space to identify distinctive geometric relationships in the data. Similar to the regular grid CNN kernels, the spherical kernel maintains translation-invariance and asymmetry properties, where the former guarantees weight sharing among similar local structures in the data and the latter facilitates fine geometric learning. The proposed kernel is applied to graph neural networks without edge-dependent filter generation, making it computationally attractive for large point clouds. In our graph networks, each vertex is associated with a single point location and edges connect the neighborhood points within a defined range. The graph gets coarsened in the network with farthest point sampling. Analogous to the standard CNNs, we define pooling and unpooling operations for our network. We demonstrate the effectiveness of the proposed spherical kernel with graph neural networks for point cloud classification and semantic segmentation using ModelNet, ShapeNet, RueMonge2014, ScanNet and S3DIS datasets. The source code and the trained models can be downloaded from https://github.com/hlei-ziyan/SPH3D-GCN.
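The abstract mentions that the graph is coarsened with farthest point sampling. A minimal sketch of that generic algorithm follows. It is a plain-Python illustration and not the SPH3D-GCN implementation; the function name `farthest_point_sampling`, the `start` parameter, and the choice of the first point as the seed are assumptions.

```python
def farthest_point_sampling(points, k, start=0):
    """Pick k point indices so that each new pick is as far as possible
    from the points already picked (greedy farthest point sampling).

    points: list of (x, y, z) tuples. Returns a list of k indices.
    Hypothetical helper for illustration only.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    def sq_dist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    selected = [start]
    # Squared distance from every point to its nearest selected point.
    min_d2 = [sq_dist(p, points[start]) for p in points]

    while len(selected) < k:
        # The next sample is the point farthest from the current selection.
        nxt = max(range(len(points)), key=min_d2.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            d2 = sq_dist(p, points[nxt])
            if d2 < min_d2[i]:
                min_d2[i] = d2
    return selected


pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
print(farthest_point_sampling(pts, 3))  # [0, 2, 1]
```

Each coarsening level of a network could keep only the sampled indices as graph vertices; the sketch runs in O(k·n) time, whereas practical implementations typically run this step on the GPU.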
NA | May 4
High-Dimensional Enhanced Sampling via Regularized Path-Dependent McKean--Vlasov Dynamics using Tensor Density Approximation
Liyao Lyu, Siyu Guo, Huan Lei
Sampling from high-dimensional Gibbs measures poses a challenge when the energy landscape consists of multiple metastable states. Enhanced-sampling methods mitigate this difficulty by introducing adaptive biasing potentials to facilitate the exploration along prescribed collective variables (CVs), but their scalability is often limited by the dimension of the CV space. Motivated by the Wasserstein-gradient-flow interpretation of adaptive biasing, we propose a regularized path-dependent McKean--Vlasov formulation for high-dimensional enhanced sampling. The formulation replaces the variational regularization of the Wasserstein functional by a direct regularization of the CV marginal density in the McKean--Vlasov drift, avoiding the outer convolution over the CV domain. Furthermore, it replaces the instantaneous law by a weighted path-history measure to improve statistical stability in the small-replica regime. We establish well-posedness of the resulting regularized and path-dependent stochastic dynamics under suitable assumptions. For numerical realization, the history-averaged CV marginal density is approximated using an optimization-free functional hierarchical tensor representation, leading to a scalable density-based adaptive biasing scheme. Numerical experiments on benchmark potentials and molecular systems demonstrate the effectiveness of the proposed method for sampling problems with CV dimensions up to 64.
AI | Jul 14, 2024
AlphaDou: High-Performance End-to-End Doudizhu AI Integrating Bidding
Chang Lei, Huan Lei
Artificial intelligence for card games has long been a popular topic in AI research. In recent years, complex card games like Mahjong and Texas Hold'em have been solved, with corresponding AI programs reaching the level of human experts. However, the game of Doudizhu presents significant challenges due to its vast state/action space and unique characteristics involving reasoning about competition and cooperation, making the game extremely difficult to solve. The RL model Douzero, trained using the Deep Monte Carlo algorithm framework, has shown excellent performance in Doudizhu. However, there are differences between its simplified game environment and the actual Doudizhu environment, and its performance is still a considerable distance from that of human experts. This paper modifies the Deep Monte Carlo algorithm framework by using reinforcement learning to obtain a neural network that simultaneously estimates win rates and expectations. The action space is pruned using expectations, and strategies are generated based on win rates. The modified algorithm enables the AI to perform the full range of tasks in the Doudizhu game, including bidding and cardplay. The model was trained in an actual Doudizhu environment and achieved state-of-the-art performance among publicly available models. We hope that this new framework will provide valuable insights for AI development in other bidding-based games.
NA | Mar 24
Matrix-Free Stabilized BDF Schemes for Semilinear Parabolic Equations with Unconditional Maximum Bound Principle Preservation and Energy Stability
Haishen Dai, Huan Lei, Bin Zheng
We develop a family of stabilized backward differentiation formula (sBDF) schemes of orders one through four for semilinear parabolic equations. The proposed methods are designed to achieve three properties that are rarely available simultaneously in high-order time discretizations: unconditional preservation of the maximum bound principle (MBP), unconditional discrete energy stability, and practical matrix-free implementation. The construction integrates carefully designed stabilization terms, fixed-point iterations, and a pointwise cut-off strategy. The nonlinear algebraic systems arising from the implicit sBDF discretizations are solved by fixed-point iteration, resulting in fully matrix-free algorithms. This makes the approach particularly attractive for practical computations on general domains and under mixed boundary conditions, where FFT-based exponential time differencing methods are often unavailable or inefficient. We further present a unified analysis for the fully implemented schemes, explicitly incorporating the interplay among time discretization, nonlinear iteration, and cut-off. Unconditional contractivity of the fixed-point iterations and error estimates are established. For the Allen-Cahn equation, we additionally prove an unconditional discrete energy dissipation law. Numerical experiments confirm the theoretical convergence rates and demonstrate the robustness and efficiency of the proposed methods, particularly relative to ETD-based approaches for problems with mixed boundary conditions.
COMP-PH | Mar 31, 2025
Data-driven construction of a generalized kinetic collision operator from molecular dynamics
Yue Zhao, Joshua W. Burby, Andrew Christlieb et al.
We introduce a data-driven approach to learn a generalized kinetic collision operator directly from molecular dynamics. Unlike the conventional (e.g., Landau) models, the present operator takes an anisotropic form that accounts for a second energy transfer arising from the collective interactions between the pair of collision particles and the environment. Numerical results show that preserving the broadly overlooked anisotropic nature of the collision energy transfer is crucial for predicting the plasma kinetics with non-negligible correlations, where the Landau model shows limitations.
CV | Mar 20, 2025
OffsetOPT: Explicit Surface Reconstruction without Normals
Huan Lei
Neural surface reconstruction has been dominated by implicit representations with marching cubes for explicit surface extraction. However, those methods typically require high-quality normals for accurate reconstruction. We propose OffsetOPT, a method that reconstructs explicit surfaces directly from 3D point clouds and eliminates the need for point normals. The approach comprises two stages: first, we train a neural network to predict surface triangles based on local point geometry, given uniformly distributed training point clouds. Next, we apply the frozen network to reconstruct surfaces from unseen point clouds by optimizing a per-point offset to maximize the accuracy of triangle predictions. Compared to state-of-the-art methods, OffsetOPT not only excels at reconstructing overall surfaces but also significantly preserves sharp surface features. We demonstrate its accuracy on popular benchmarks, including small-scale shapes and large-scale open surfaces.
CV | Dec 18, 2024
Level-Set Parameters: Novel Representation for 3D Shape Analysis
Huan Lei, Hongdong Li, Andreas Geiger et al.
3D shape analysis has been largely focused on traditional 3D representations of point clouds and meshes, but the discrete nature of these data makes the analysis susceptible to variations in input resolutions. Recent development of neural fields brings in level-set parameters from signed distance functions as a novel, continuous, and numerical representation of 3D shapes, where the shape surfaces are defined as zero-level-sets of those functions. This motivates us to extend shape analysis from the traditional 3D data to these novel parameter data. Since the level-set parameters are not Euclidean like point clouds, we establish correlations across different shapes by formulating them as a pseudo-normal distribution, and learn the distribution prior from the respective dataset. To further explore the level-set parameters with shape transformations, we propose to condition a subset of these parameters on rotations and translations, and generate them with a hypernetwork. This simplifies the pose-related shape analysis compared to using traditional data. We demonstrate the promise of the novel representations through applications in shape classification (arbitrary poses), retrieval, and 6D object pose estimation.
COMP-PH | Dec 29, 2021
DeePN$^2$: A deep learning-based non-Newtonian hydrodynamic model
Lidong Fang, Pei Ge, Lei Zhang et al.
A long standing problem in the modeling of non-Newtonian hydrodynamics of polymeric flows is the availability of reliable and interpretable hydrodynamic models that faithfully encode the underlying micro-scale polymer dynamics. The main complication arises from the long polymer relaxation time, the complex molecular structure and heterogeneous interaction. DeePN$^2$, a deep learning-based non-Newtonian hydrodynamic model, has been proposed and has shown some success in systematically passing the micro-scale structural mechanics information to the macro-scale hydrodynamics for suspensions with simple polymer conformation and bond potential. The model retains a multi-scaled nature by mapping the polymer configurations into a set of symmetry-preserving macro-scale features. The extended constitutive laws for these macro-scale features can be directly learned from the kinetics of their micro-scale counterparts. In this paper, we develop DeePN$^2$ using more complex micro-structural models. We show that DeePN$^2$ can faithfully capture the broadly overlooked viscoelastic differences arising from the specific molecular structural mechanics without human intervention.
CV | Mar 13, 2021
Potential Escalator-related Injury Identification and Prevention Based on Multi-module Integrated System for Public Health
Zeyu Jiao, Huan Lei, Hengshan Zong et al.
Escalator-related injuries threaten public health with the widespread use of escalators. Existing studies tend to focus on after-the-fact statistics, reflecting on defects in the original design and use to reduce the impact of escalator-related injuries, but little attention has been paid to ongoing and impending injuries. In this study, a multi-module escalator safety monitoring system based on computer vision is designed and proposed to simultaneously monitor and deal with three major injury triggers: losing balance, not holding on to handrails, and carrying large items. The escalator identification module is utilized to determine the escalator region, namely the region of interest. The passenger monitoring module is leveraged to estimate the passengers' pose to recognize unsafe behaviors on the escalator. The dangerous object detection module detects large items that may enter the escalator and raises alarms. The processing results of the above three modules are summarized in the safety assessment module as the basis for the intelligent decision of the system. The experimental results demonstrate that the proposed system has good performance and great application potential.
COMP-PH | Mar 7, 2020
Machine learning based non-Newtonian fluid model with molecular fidelity
Huan Lei, Lei Wu, Weinan E
We introduce a machine-learning-based framework for constructing a continuum non-Newtonian fluid dynamics model directly from a micro-scale description. Dumbbell polymer solutions are used as examples to demonstrate the essential ideas. To faithfully retain molecular fidelity, we establish a micro-macro correspondence via a set of encoders for the micro-scale polymer configurations and their macro-scale counterparts, a set of nonlinear conformation tensors. The dynamics of these conformation tensors can be derived from the micro-scale model and the relevant terms can be parametrized using machine learning. The final model, named the deep non-Newtonian model (DeePN$^2$), takes the form of conventional non-Newtonian fluid dynamics models, with a new form of the objective tensor derivative. Both the formulation of the dynamic equation and the neural network representation rigorously preserve the rotational invariance, which ensures the admissibility of the constructed model. Numerical results demonstrate the accuracy of DeePN$^2$, where models based on empirical closures show limitations.
CV | Feb 28, 2019
Octree guided CNN with Spherical Kernels for 3D Point Clouds
Huan Lei, Naveed Akhtar, Ajmal Mian
We propose an octree guided neural network architecture and spherical convolutional kernel for machine learning from arbitrary 3D point clouds. The network architecture capitalizes on the sparse nature of irregular point clouds, and hierarchically coarsens the data representation with space partitioning. At the same time, the proposed spherical kernels systematically quantize point neighborhoods to identify local geometric structures in the data, while maintaining the properties of translation-invariance and asymmetry. We specify spherical kernels with the help of network neurons that in turn are associated with spatial locations. We exploit this association to avert dynamic kernel generation during network training, which enables efficient learning with high-resolution point clouds. The effectiveness of the proposed technique is established on the benchmark tasks of 3D object classification and segmentation, achieving new state-of-the-art results on the ShapeNet and RueMonge2014 datasets.
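One way to picture how a spherical kernel quantizes a point neighborhood is to map each neighbor to a discrete bin by its azimuth, elevation, and radial distance relative to the center point. The sketch below is a hedged illustration of that idea only. The uniform bin layout, the default bin counts, and the name `spherical_bin` are assumptions and are not taken from the paper or its code.

```python
import math


def spherical_bin(center, neighbor, radius,
                  n_azimuth=8, n_elevation=2, n_radial=3):
    """Return the index of the spherical bin that `neighbor` falls into,
    relative to `center`, or None if it lies outside the sphere or
    coincides with the center.

    Hypothetical uniform partition: bins are equal-width in azimuth,
    elevation and radius. Real kernels may split the sphere differently.
    """
    dx = neighbor[0] - center[0]
    dy = neighbor[1] - center[1]
    dz = neighbor[2] - center[2]
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    if r == 0.0 or r > radius:
        return None

    # Azimuth in [0, 2*pi), elevation in [-pi/2, pi/2].
    theta = math.atan2(dy, dx)
    if theta < 0.0:
        theta += 2.0 * math.pi
    phi = math.asin(dz / r)

    # Clamp with min() so boundary values stay inside the last bin.
    a_idx = min(int(theta / (2.0 * math.pi) * n_azimuth), n_azimuth - 1)
    e_idx = min(int((phi + math.pi / 2.0) / math.pi * n_elevation),
                n_elevation - 1)
    r_idx = min(int(r / radius * n_radial), n_radial - 1)

    # Flatten (radial, elevation, azimuth) into a single weight index.
    return (r_idx * n_elevation + e_idx) * n_azimuth + a_idx


print(spherical_bin((0.0, 0.0, 0.0), (0.5, 0.0, 0.1), radius=1.0))  # 24
```

Because the bin depends only on the offset `neighbor - center`, the same weight index is reused wherever the same local offset occurs (translation invariance), while a neighbor mirrored through the center lands in a different bin (asymmetry).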
CV | May 21, 2018
Spherical Convolutional Neural Network for 3D Point Clouds
Huan Lei, Naveed Akhtar, Ajmal Mian
We propose a neural network for 3D point cloud processing that exploits 'spherical' convolution kernels and octree partitioning of space. The proposed metric-based spherical kernels systematically quantize point neighborhoods to identify local geometric structures in data, while maintaining the properties of translation-invariance and asymmetry. The network architecture itself is guided by octree data structuring that takes full advantage of the sparse nature of irregular point clouds. We specify spherical kernels with the help of neurons in each layer that in turn are associated with spatial locations. We exploit this association to avert dynamic kernel generation during network training, which enables efficient learning with high-resolution point clouds. We demonstrate the utility of the spherical convolutional neural network for 3D object classification on standard benchmark datasets.
CV | Nov 25, 2016
Color Constancy with Derivative Colors
Huan Lei, Guang Jiang, Long Quan
Information about the illuminant color is well contained in both achromatic regions and the specular components of highlight regions. In this paper, we propose a novel way to achieve color constancy by exploiting such clues. The key to our approach lies in the use of suitably extracted derivative colors, which are able to compute the illuminant color robustly with kernel density estimation. While extracting derivative colors from achromatic regions to approximate the illuminant color well is basically straightforward, the success of our extraction in highlight regions is attributed to the different rates of variation of the diffuse and specular magnitudes in the dichromatic reflection model. The proposed approach requires no training phase and is simple to implement. More significantly, it performs quite satisfactorily under inter-database parameter settings. Our experiments on three standard databases demonstrate its effectiveness and fine performance in comparison to state-of-the-art methods.