Renjie Chen

h-index14

15papers

166citations

Novelty55%

AI Score47

Ranked #55,919 of 205,806 authors (top 27%)#20,587 in CV (top 35%)

15 Papers

SEAug 23, 2022

LogLG: Weakly Supervised Log Anomaly Detection via Log-Event Graph Construction

Hongcheng Guo, Yuhui Guo, Renjie Chen et al.

Fully supervised log anomaly detection methods suffer the heavy burden of annotating massive unlabeled log data. Recently, many semi-supervised methods have been proposed to reduce annotation costs with the help of parsed templates. However, these methods consider each keyword independently, which disregards the correlation between keywords and the contextual relationships among log sequences. In this paper, we propose a novel weakly supervised log anomaly detection framework, named LogLG, to explore the semantic connections among keywords from sequences. Specifically, we design an end-to-end iterative process, where the keywords of unlabeled logs are first extracted to construct a log-event graph. Then, we build a subgraph annotator to generate pseudo labels for unlabeled log sequences. To ameliorate the annotation quality, we adopt a self-supervised task to pre-train a subgraph annotator. After that, a detection model is trained with the generated pseudo labels. Conditioned on the classification results, we re-extract the keywords from the log sequences and update the log-event graph for the next iteration. Experiments on five benchmarks validate the effectiveness of LogLG for detecting anomalies on unlabeled log data and demonstrate that LogLG, as the state-of-the-art weakly supervised method, achieves significant performance improvements compared to existing methods.

CVSep 2, 2022

PCDNF: Revisiting Learning-based Point Cloud Denoising via Joint Normal Filtering

Zheng Liu, Yaowu Zhao, Sijing Zhan et al.

Recovering high quality surfaces from noisy point clouds, known as point cloud denoising, is a fundamental yet challenging problem in geometry processing. Most of the existing methods either directly denoise the noisy input or filter raw normals followed by updating point positions. Motivated by the essential interplay between point cloud denoising and normal filtering, we revisit point cloud denoising from a multitask perspective, and propose an end-to-end network, named PCDNF, to denoise point clouds via joint normal filtering. In particular, we introduce an auxiliary normal filtering task to help the overall network remove noise more effectively while preserving geometric features more accurately. In addition to the overall architecture, our network has two novel modules. On one hand, to improve noise removal performance, we design a shape-aware selector to construct the latent tangent space representation of the specific point by comprehensively considering the learned point and normal features and geometry priors. On the other hand, point features are more suitable for describing geometric details, and normal features are more conducive for representing geometric structures (e.g., sharp edges and corners). Combining point and normal features allows us to overcome their weaknesses. Thus, we design a feature refinement module to fuse point and normal features for better recovering geometric information. Extensive evaluations, comparisons, and ablation studies demonstrate that the proposed method outperforms state-of-the-arts for both point cloud denoising and normal filtering.

CVAug 4, 2025Code

Test-Time Model Adaptation for Quantized Neural Networks

Zeshuai Deng, Guohao Chen, Shuaicheng Niu et al.

Quantizing deep models prior to deployment is a widely adopted technique to speed up inference for various real-time applications, such as autonomous driving. However, quantized models often suffer from severe performance degradation in dynamic environments with potential domain shifts and this degradation is significantly more pronounced compared with their full-precision counterparts, as shown by our theoretical and empirical illustrations. To address the domain shift problem, test-time adaptation (TTA) has emerged as an effective solution by enabling models to learn adaptively from test data. Unfortunately, existing TTA methods are often impractical for quantized models as they typically rely on gradient backpropagation--an operation that is unsupported on quantized models due to vanishing gradients, as well as memory and latency constraints. In this paper, we focus on TTA for quantized models to improve their robustness and generalization ability efficiently. We propose a continual zeroth-order adaptation (ZOA) framework that enables efficient model adaptation using only two forward passes, eliminating the computational burden of existing methods. Moreover, we propose a domain knowledge management scheme to store and reuse different domain knowledge with negligible memory consumption, reducing the interference of different domain knowledge and fostering the knowledge accumulation during long-term adaptation. Experimental results on three classical architectures, including quantized transformer-based and CNN-based models, demonstrate the superiority of our methods for quantized model adaptation. On the quantized W6A6 ViT-B model, our ZOA is able to achieve a 5.0\% improvement over the state-of-the-art FOA on ImageNet-C dataset. The source code is available at https://github.com/DengZeshuai/ZOA.

CVJan 23, 2025

EchoVideo: Identity-Preserving Human Video Generation by Multimodal Feature Fusion

Jiangchuan Wei, Shiyue Yan, Wenfeng Lin et al.

Recent advancements in video generation have significantly impacted various downstream applications, particularly in identity-preserving video generation (IPT2V). However, existing methods struggle with "copy-paste" artifacts and low similarity issues, primarily due to their reliance on low-level facial image information. This dependence can result in rigid facial appearances and artifacts reflecting irrelevant details. To address these challenges, we propose EchoVideo, which employs two key strategies: (1) an Identity Image-Text Fusion Module (IITF) that integrates high-level semantic features from text, capturing clean facial identity representations while discarding occlusions, poses, and lighting variations to avoid the introduction of artifacts; (2) a two-stage training strategy, incorporating a stochastic method in the second phase to randomly utilize shallow facial information. The objective is to balance the enhancements in fidelity provided by shallow features while mitigating excessive reliance on them. This strategy encourages the model to utilize high-level features during training, ultimately fostering a more robust representation of facial identities. EchoVideo effectively preserves facial identities and maintains full-body integrity. Extensive experiments demonstrate that it achieves excellent results in generating high-quality, controllability and fidelity videos.

CVDec 5, 2024

HybridGS: Decoupling Transients and Statics with 2D and 3D Gaussian Splatting

Jingyu Lin, Jiaqi Gu, Lubin Fan et al.

Generating high-quality novel view renderings of 3D Gaussian Splatting (3DGS) in scenes featuring transient objects is challenging. We propose a novel hybrid representation, termed as HybridGS, using 2D Gaussians for transient objects per image and maintaining traditional 3D Gaussians for the whole static scenes. Note that, the 3DGS itself is better suited for modeling static scenes that assume multi-view consistency, but the transient objects appear occasionally and do not adhere to the assumption, thus we model them as planar objects from a single view, represented with 2D Gaussians. Our novel representation decomposes the scene from the perspective of fundamental viewpoint consistency, making it more reasonable. Additionally, we present a novel multi-view regulated supervision method for 3DGS that leverages information from co-visible regions, further enhancing the distinctions between the transients and statics. Then, we propose a straightforward yet effective multi-stage training strategy to ensure robust training and high-quality view synthesis across various settings. Experiments on benchmark datasets show our state-of-the-art performance of novel view synthesis in both indoor and outdoor scenes, even in the presence of distracting elements.

CVFeb 27, 2025

TrackGS: Optimizing COLMAP-Free 3D Gaussian Splatting with Global Track Constraints

Dongbo Shi, Shen Cao, Lubin Fan et al.

While 3D Gaussian Splatting (3DGS) has advanced ability on novel view synthesis, it still depends on accurate pre-computaed camera parameters, which are hard to obtain and prone to noise. Previous COLMAP-Free methods optimize camera poses using local constraints, but they often struggle in complex scenarios. To address this, we introduce TrackGS, which incorporates feature tracks to globally constrain multi-view geometry. We select the Gaussians associated with each track, which will be trained and rescaled to an infinitesimally small size to guarantee the spatial accuracy. We also propose minimizing both reprojection and backprojection errors for better geometric consistency. Moreover, by deriving the gradient of intrinsics, we unify camera parameter estimation with 3DGS training into a joint optimization framework, achieving SOTA performance on challenging datasets with severe camera movements.

MLNov 18, 2025

SCOPE: Spectral Concentration by Distributionally Robust Joint Covariance-Precision Estimation

Renjie Chen, Viet Anh Nguyen, Huifu Xu

We propose a distributionally robust formulation for simultaneously estimating the covariance matrix and the precision matrix of a random vector.The proposed model minimizes the worst-case weighted sum of the Frobenius loss of the covariance estimator and Stein's loss of the precision matrix estimator against all distributions from an ambiguity set centered at the nominal distribution. The radius of the ambiguity set is measured via convex spectral divergence. We demonstrate that the proposed distributionally robust estimation model can be reduced to a convex optimization problem, thereby yielding quasi-analytical estimators. The joint estimators are shown to be nonlinear shrinkage estimators. The eigenvalues of the estimators are shrunk nonlinearly towards a positive scalar, where the scalar is determined by the weight coefficient of the loss terms. By tuning the coefficient carefully, the shrinkage corrects the spectral bias of the empirical covariance/precision matrix estimator. By this property, we call the proposed joint estimator the Spectral concentrated COvariance and Precision matrix Estimator (SCOPE). We demonstrate that the shrinkage effect improves the condition number of the estimator. We provide a parameter-tuning scheme that adjusts the shrinkage target and intensity that is asymptotically optimal. Numerical experiments on synthetic and real data show that our shrinkage estimators perform competitively against state-of-the-art estimators in practical applications.

CVNov 21, 2025

NoPe-NeRF++: Local-to-Global Optimization of NeRF with No Pose Prior

Dongbo Shi, Shen Cao, Bojian Wu et al.

In this paper, we introduce NoPe-NeRF++, a novel local-to-global optimization algorithm for training Neural Radiance Fields (NeRF) without requiring pose priors. Existing methods, particularly NoPe-NeRF, which focus solely on the local relationships within images, often struggle to recover accurate camera poses in complex scenarios. To overcome the challenges, our approach begins with a relative pose initialization with explicit feature matching, followed by a local joint optimization to enhance the pose estimation for training a more robust NeRF representation. This method significantly improves the quality of initial poses. Additionally, we introduce global optimization phase that incorporates geometric consistency constraints through bundle adjustment, which integrates feature trajectories to further refine poses and collectively boost the quality of NeRF. Notably, our method is the first work that seamlessly combines the local and global cues with NeRF, and outperforms state-of-the-art methods in both pose estimation accuracy and novel view synthesis. Extensive evaluations on benchmark datasets demonstrate our superior performance and robustness, even in challenging scenes, thus validating our design choices.

CVJun 5, 2025

ContentV: Efficient Training of Video Generation Models with Limited Compute

Wenfeng Lin, Renjie Chen, Boyuan Liu et al.

Recent advances in video generation demand increasingly efficient training recipes to mitigate escalating computational costs. In this report, we present ContentV, an 8B-parameter text-to-video model that achieves state-of-the-art performance (85.14 on VBench) after training on 256 x 64GB Neural Processing Units (NPUs) for merely four weeks. ContentV generates diverse, high-quality videos across multiple resolutions and durations from text prompts, enabled by three key innovations: (1) A minimalist architecture that maximizes reuse of pre-trained image generation models for video generation; (2) A systematic multi-stage training strategy leveraging flow matching for enhanced efficiency; and (3) A cost-effective reinforcement learning with human feedback framework that improves generation quality without requiring additional human annotations. All the code and models are available at: https://contentv.github.io.

CVMar 19, 2024

Learning Neural Volumetric Pose Features for Camera Localization

Jingyu Lin, Jiaqi Gu, Bojian Wu et al.

We introduce a novel neural volumetric pose feature, termed PoseMap, designed to enhance camera localization by encapsulating the information between images and the associated camera poses. Our framework leverages an Absolute Pose Regression (APR) architecture, together with an augmented NeRF module. This integration not only facilitates the generation of novel views to enrich the training dataset but also enables the learning of effective pose features. Additionally, we extend our architecture for self-supervised online alignment, allowing our method to be used and fine-tuned for unlabelled images within a unified framework. Experiments demonstrate that our method achieves 14.28% and 20.51% performance gain on average in indoor and outdoor benchmark scenes, outperforming existing APR methods with state-of-the-art accuracy.

MLJul 17, 2019

Clustering Activity-Travel Behavior Time Series using Topological Data Analysis

Renjie Chen, Jingyue Zhang, Nalini Ravishanker et al.

Over the last few years, traffic data has been exploding and the transportation discipline has entered the era of big data. It brings out new opportunities for doing data-driven analysis, but it also challenges traditional analytic methods. This paper proposes a new Divide and Combine based approach to do K means clustering on activity-travel behavior time series using features that are derived using tools in Time Series Analysis and Topological Data Analysis. Clustering data from five waves of the National Household Travel Survey ranging from 1990 to 2017 suggests that activity-travel patterns of individuals over the last three decades can be grouped into three clusters. Results also provide evidence in support of recent claims about differences in activity-travel patterns of different survey cohorts. The proposed method is generally applicable and is not limited only to activity-travel behavior analysis in transportation studies. Driving behavior, travel mode choice, household vehicle ownership, when being characterized as categorical time series, can all be analyzed using the proposed method.

DSDec 18, 2018

A Scalable Heuristic for Fastest-Path Computation on Very Large Road Maps

Renjie Chen, Craig Gotsman

Fastest-path queries between two points in a very large road map is an increasingly important primitive in modern transportation and navigation systems, thus very efficient computation of these paths is critical for system performance and throughput. We present a method to compute an effective heuristic for the fastest path travel time between two points on a road map, which can be used to significantly accelerate the classical A* algorithm when computing fastest paths. Our method is based on two hierarchical sets of separators of the map represented by two binary trees. A preprocessing step computes a short vector of values per road junction based on the separator trees, which is then stored with the map and used to efficiently compute the heuristic at the online query stage. We demonstrate experimentally that this method scales well to any map size, providing a better quality heuristic, thus more efficient A* search, for fastest path queries between points at all distances - especially small and medium range - relative to other known heuristics.

DSOct 2, 2018

Efficient Fastest-Path Computations in Road Maps

Renjie Chen, Craig Gotsman

In the age of real-time online traffic information and GPS-enabled devices, fastest-path computations between two points in a road network modeled as a directed graph, where each directed edge is weighted by a "travel time" value, are becoming a standard feature of many navigation-related applications. To support this, very efficient computation of these paths in very large road networks is critical. Fastest paths may be computed as minimal-cost paths in a weighted directed graph, but traditional minimal-cost path algorithms based on variants of the classic Dijkstra algorithm do not scale well, as in the worst case they may traverse the entire graph. A common improvement, which can dramatically reduce the number of traversed graph vertices, is the A* algorithm, which requires a good heuristic lower bound on the minimal cost. We introduce a simple, but very effective, heuristic function based on a small number of values assigned to each graph vertex. The values are based on graph separators and computed efficiently in a preprocessing stage. We present experimental results demonstrating that our heuristic provides estimates of the minimal cost which are superior to those of other heuristics. Our experiments show that when used in the A* algorithm, this heuristic can reduce the number of vertices traversed by an order of magnitude compared to other heuristics.

ROAug 19, 2017

Practical Distance Functions for Path-Planning in Planar Domains

Renjie Chen, Craig Gotsman, Kai Hormann

Path planning is an important problem in robotics. One way to plan a path between two points $x,y$ within a (not necessarily simply-connected) planar domain $Ω$, is to define a non-negative distance function $d(x,y)$ on $Ω\timesΩ$ such that following the (descending) gradient of this distance function traces such a path. This presents two equally important challenges: A mathematical challenge -- to define $d$ such that $d(x,y)$ has a single minimum for any fixed $y$ (and this is when $x=y$), since a local minimum is in effect a "dead end", A computational challenge -- to define $d$ such that it may be computed efficiently. In this paper, given a description of $Ω$, we show how to assign coordinates to each point of $Ω$ and define a family of distance functions between points using these coordinates, such that both the mathematical and the computational challenges are met. This is done using the concepts of \emph{harmonic measure} and \emph{$f$-divergences}. In practice, path planning is done on a discrete network defined on a finite set of \emph{sites} sampled from $Ω$, so any method that works well on the continuous domain must be adapted so that it still works well on the discrete domain. Given a set of sites sampled from $Ω$, we show how to define a network connecting these sites such that a \emph{greedy routing} algorithm (which is the discrete equivalent of continuous gradient descent) based on the distance function mentioned above is guaranteed to generate a path in the network between any two such sites. In many cases, this network is close to a (desirable) planar graph, especially if the set of sites is dense.

ROAug 9, 2017

Path Planning with Divergence-Based Distance Functions

Renjie Chen, Craig Gotsman, Kai Hormann

Distance functions between points in a domain are sometimes used to automatically plan a gradient-descent path towards a given target point in the domain, avoiding obstacles that may be present. A key requirement from such distance functions is the absence of spurious local minima, which may foil such an approach, and this has led to the common use of harmonic potential functions. Based on the planar Laplace operator, the potential function guarantees the absence of spurious minima, but is well known to be slow to numerically compute and prone to numerical precision issues. To alleviate the first of these problems, we propose a family of novel divergence distances. These are based on f-divergence of the Poisson kernel of the domain. We define the divergence distances and compare them to the harmonic potential function and other related distance functions. Our first result is theoretical: We show that the family of divergence distances are equivalent to the harmonic potential function on simply-connected domains, namely generate paths which are identical to those generated by the potential function. The proof is based on the concept of conformal invariance. Our other results are more practical and relate to two special cases of divergence distances, one based on the Kullback-Leibler divergence and one based on the total variation divergence. We show that using divergence distances instead of the potential function and other distances has a significant computational advantage, as, following a pre-processing stage, they may be computed up to an order of magnitude faster than the others when taking advantage of certain sparsity properties of the Poisson kernel. Furthermore, the computation is "embarrassingly parallel", so may be implemented on a GPU with up to three orders of magnitude speedup.