CV · Jan 21, 2023
Dense RGB SLAM with Neural Implicit Maps
Heng Li, Xiaodong Gu, Weihao Yuan et al.
There is an emerging trend of using neural implicit functions for map representation in Simultaneous Localization and Mapping (SLAM). Some pioneering works have achieved encouraging results on RGB-D SLAM. In this paper, we present a dense RGB SLAM method with neural implicit map representation. To reach this challenging goal without depth input, we introduce a hierarchical feature volume to facilitate the implicit map decoder. This design effectively fuses shape cues across different scales to facilitate map reconstruction. Our method simultaneously solves the camera motion and the neural implicit map by matching the rendered and input video frames. To facilitate optimization, we further propose a photometric warping loss in the spirit of multi-view stereo to better constrain the camera pose and scene geometry. We evaluate our method on commonly used benchmarks and compare it with modern RGB and RGB-D SLAM systems. Our method achieves more favorable results than previous methods and even surpasses some recent RGB-D SLAM methods. The code is at poptree.github.io/DIM-SLAM/.
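The idea behind a photometric warping term can be illustrated with a minimal sketch. This is not the authors' loss: it assumes a pinhole camera, a single pixel, and nearest-neighbor lookup, and every name in it (`warp_pixel`, `photometric_residual`, the `K` tuple layout) is hypothetical. A pixel is back-projected with a depth estimate, moved into a neighboring view by a relative pose, re-projected, and compared against the intensity found there. A wrong depth or pose sends the pixel to the wrong place and raises the residual.

```python
# Hypothetical sketch of a multi-view photometric warping residual.
# Assumptions: pinhole intrinsics K = (fx, fy, cx, cy), grayscale images
# stored as lists of rows, relative pose (R, t) from reference to neighbor.

def warp_pixel(u, v, depth, K, R, t):
    """Back-project pixel (u, v) at `depth`, apply (R, t), and re-project."""
    fx, fy, cx, cy = K
    # Back-project to a 3D point in the reference camera frame.
    p = ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)
    # Rigid transform into the neighbor camera frame: p' = R p + t.
    Xn, Yn, Zn = (sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
    # Perspective projection into the neighbor image.
    return fx * Xn / Zn + cx, fy * Yn / Zn + cy


def photometric_residual(ref_img, nbr_img, u, v, depth, K, R, t):
    """Absolute intensity difference between a reference pixel and its warp.

    Returns None when the warped location falls outside the neighbor image.
    """
    un, vn = warp_pixel(u, v, depth, K, R, t)
    col, row = round(un), round(vn)
    if not (0 <= row < len(nbr_img) and 0 <= col < len(nbr_img[0])):
        return None
    return abs(ref_img[v][u] - nbr_img[row][col])
```

In an optimization setting, residuals of this kind would be summed over many pixels (typically patches) and neighboring frames, and minimized with respect to both depth and pose.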
67.7 · SY · May 15
Communication-Efficient Federated Online Decision-Making with Stateful Costs
Yiwei Liu, Luwei Yang, Shunbo Lei
We study dynamic regret in federated online decision-making with stateful incurred costs under block-based synchronization and partial client participation. In this setting, sparse communication affects not only the pointwise update quality but also the realized state trajectory along which costs are incurred. We propose BLADE, a projected blockwise federated online decision method. BLADE uses only \(O(T/K)\) communication and achieves a dynamic-regret bound for the incurred cost against path-length-bounded comparator sequences; under \(K=\lceil\sqrt T\rceil\), the bound is sublinear whenever \(V_T=o(T^{1/4})\). Experiments on a controlled synthetic stable linear system validate the predicted communication-regret, memory, participation, disturbance-variation, and horizon-scaling effects.
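As a quick arithmetic check of the communication cost under the stated block length (the regret bound itself is not given in the abstract, so it is not reproduced here), substituting \(K=\lceil\sqrt T\rceil\) into \(O(T/K)\) gives:

```latex
K=\lceil\sqrt{T}\rceil \ \ge\ \sqrt{T}
\quad\Longrightarrow\quad
\frac{T}{K} \ \le\ \frac{T}{\sqrt{T}} \ =\ \sqrt{T},
\qquad\text{so}\qquad
O\!\left(T/K\right) = O\!\left(\sqrt{T}\right).
```

Communication therefore grows only as the square root of the horizon, while "sublinear" regret means the regret is \(o(T)\), i.e., the average regret per round vanishes as \(T\to\infty\).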
5.2 · SY · May 11
Delay-Robust Secondary Frequency Control via Passive Interconnection and Randomized Block Updates
Yiwei Liu, Luwei Yang, Shunbo Lei
This paper studies secondary frequency control in transmission networks subject to communication delays at the cyber-physical interface and limited per-update computation at the control center. The regulation objective is formulated as a constrained economic dispatch problem incorporating generation capacity constraints, nodal power balance, transmission-flow limits, and scheduled tie-line power exchanges. Based on this formulation, we develop a passivity-based control framework in which an augmented projected primal-dual controller restores nominal frequency and drives the closed-loop system to the solution set of the constrained economic dispatch problem. Two-way communication delays between the physical network and the control center are modeled as scattering-based passive channels for the measurement uplink and the control-command downlink. This construction preserves the target equilibrium and enables a delay-robust passivity analysis of the delayed closed loop. To reduce the computational burden at the control center, we develop a randomized block-coordinate implementation of the augmented projected primal-dual controller. The resulting sampled-data closed loop preserves the target solution set and achieves local mean-square geometric convergence under suitable step-size and regularity conditions. Finally, a multivariable wave-domain interface filter is introduced to inject additional dissipation and improve the damping of the delayed interface without altering the steady-state interconnection. Simulations on the IEEE 14-bus system indicate that the proposed digital implementation accurately reproduces the delayed closed-loop behavior while reducing the per-update computational cost.
CV · Dec 17, 2023
PNeRFLoc: Visual Localization with Point-based Neural Radiance Fields
Boming Zhao, Luwei Yang, Mao Mao et al.
Owing to their ability to synthesize high-quality novel views, Neural Radiance Fields (NeRF) have recently been exploited to improve visual localization in a known environment. However, existing methods mostly use NeRFs for data augmentation to improve regression model training, and their performance on novel viewpoints and appearances is still limited by the lack of geometric constraints. In this paper, we propose a novel visual localization framework, i.e., PNeRFLoc, based on a unified point-based representation. On the one hand, PNeRFLoc supports initial pose estimation by matching 2D and 3D feature points, as in traditional structure-based methods; on the other hand, it also enables pose refinement with novel view synthesis using rendering-based optimization. Specifically, we propose a novel feature adaption module to close the gaps between the features for visual localization and neural rendering. To improve the efficacy and efficiency of neural rendering-based optimization, we also develop an efficient rendering-based framework with a warping loss function. Furthermore, several robustness techniques are developed to handle illumination changes and dynamic objects in outdoor scenarios. Experiments demonstrate that PNeRFLoc performs the best on synthetic data when the NeRF model can be well learned and performs on par with the SOTA method on the visual localization benchmark datasets.
IR · Jan 15, 2024
Deep Evolutional Instant Interest Network for CTR Prediction in Trigger-Induced Recommendation
Zhibo Xiao, Luwei Yang, Tao Zhang et al.
Recommendation plays a key role in many industries, e.g., e-commerce, streaming media, and social media. Recently, a new recommendation scenario called Trigger-Induced Recommendation (TIR), in which users can explicitly express their instant interests via trigger items, has become essential on many e-commerce platforms, e.g., Alibaba.com and Amazon. Without explicitly modeling the user's instant interest, traditional recommendation methods usually obtain sub-optimal results in TIR. Although a few methods consider the trigger and target items simultaneously to solve this problem, they still do not take into account the temporal information of user behaviors, the dynamic change of the user's instant interest as the user scrolls down, or the interactions between the trigger and target items. To tackle these problems, we propose a novel method, Deep Evolutional Instant Interest Network (DEI2N), for click-through rate prediction in TIR scenarios. Specifically, we design a User Instant Interest Modeling Layer to predict the dynamic change of the intensity of instant interest as the user scrolls down. Temporal information is utilized in user behavior modeling. Moreover, an Interaction Layer is introduced to learn better interactions between the trigger and target items. We evaluate our method on several offline and real-world industrial datasets. Experimental results show that our proposed DEI2N outperforms state-of-the-art baselines. In addition, online A/B testing demonstrates its superiority over the existing baseline in real-world production environments.
CV · Dec 19, 2024
GURecon: Learning Detailed 3D Geometric Uncertainties for Neural Surface Reconstruction
Zesong Yang, Ru Zhang, Jiale Shi et al.
Neural surface representation has demonstrated remarkable success in the areas of novel view synthesis and 3D reconstruction. However, assessing the geometric quality of 3D reconstructions in the absence of a ground-truth mesh remains a significant challenge, owing to the rendering-based optimization process and the entangled learning of appearance and geometry with photometric losses. In this paper, we present a novel framework, i.e., GURecon, which establishes a geometric uncertainty field for the neural surface based on geometric consistency. Unlike existing methods that rely on rendering-based measurement, GURecon models a continuous 3D uncertainty field for the reconstructed surface, which is learned by an online distillation approach without introducing real geometric information for supervision. Moreover, to mitigate the interference of illumination on geometric consistency, a decoupled field is learned and exploited to finetune the uncertainty field. Experiments on various datasets demonstrate the superiority of GURecon in modeling 3D geometric uncertainty, as well as its plug-and-play extension to various neural surface representations and its improvement on downstream tasks such as incremental reconstruction. The code and supplementary material are available on the project website: https://zju3dv.github.io/GURecon/.
LG · Feb 12, 2021
Multiplex Bipartite Network Embedding using Dual Hypergraph Convolutional Networks
Hansheng Xue, Luwei Yang, Vaibhav Rajan et al.
A bipartite network is a graph structure where nodes are from two distinct domains and only inter-domain interactions exist as edges. A large number of network embedding methods exist to learn vectorial node representations from general graphs with both homogeneous and heterogeneous node and edge types, including some that can specifically model the distinct properties of bipartite networks. However, these methods are inadequate for modeling multiplex bipartite networks (e.g., in e-commerce) that have multiple types of interactions (e.g., click, inquiry, and buy) and node attributes. Most real-world multiplex bipartite networks are also sparse and have imbalanced node distributions that are challenging to model. In this paper, we develop an unsupervised Dual HyperGraph Convolutional Network (DualHGCN) model that scalably transforms the multiplex bipartite network into two sets of homogeneous hypergraphs and uses spectral hypergraph convolutional operators, along with intra- and inter-message passing strategies to promote information exchange within and across domains, to learn effective node embeddings. We benchmark DualHGCN using four real-world datasets on link prediction and node classification tasks. Our extensive experiments demonstrate that DualHGCN significantly outperforms state-of-the-art methods, and is robust to varying sparsity levels and imbalanced node distributions.
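To make the bipartite-to-hypergraph step concrete, here is a minimal sketch of one common construction, which is an assumption and not necessarily the one DualHGCN uses: for each interaction type, every node on one side induces a hyperedge joining all of its neighbors on the other side. The function name `build_dual_hypergraphs` and the `(u, v, interaction_type)` edge format are hypothetical.

```python
from collections import defaultdict


def build_dual_hypergraphs(edges):
    """Split a multiplex bipartite edge list into two sets of hypergraphs.

    edges: iterable of (u, v, interaction_type) triples, where u and v come
    from the two node domains U and V.

    Returns (u_side, v_side), both keyed by interaction type:
      u_side[kind][v] = set of U-nodes joined by the hyperedge induced by v
      v_side[kind][u] = set of V-nodes joined by the hyperedge induced by u
    """
    u_side = defaultdict(lambda: defaultdict(set))
    v_side = defaultdict(lambda: defaultdict(set))
    for u, v, kind in edges:
        u_side[kind][v].add(u)  # v acts as a hyperedge over U-nodes
        v_side[kind][u].add(v)  # u acts as a hyperedge over V-nodes
    return u_side, v_side


# Example with hypothetical user/item identifiers.
edges = [("u1", "i1", "click"), ("u2", "i1", "click"), ("u1", "i2", "buy")]
u_side, v_side = build_dual_hypergraphs(edges)
```

Each resulting hypergraph contains nodes from a single domain only, so it is homogeneous, and there is one such hypergraph per interaction type on each side.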
SI · Apr 1, 2020
Modeling Dynamic Heterogeneous Network for Link Prediction using Hierarchical Attention with Temporal RNN
Hansheng Xue, Luwei Yang, Wen Jiang et al.
Network embedding aims to learn low-dimensional representations of nodes while capturing the structural information of networks. It has achieved great success on many network analysis tasks such as link prediction and node classification. Most existing network embedding algorithms focus on how to learn static homogeneous networks effectively. However, networks in the real world are more complex, e.g., networks may consist of several types of nodes and edges (called heterogeneous information) and may vary over time in terms of dynamic nodes and edges (called evolutionary patterns). Limited work has been done on network embedding of dynamic heterogeneous networks, as it is challenging to learn both evolutionary and heterogeneous information simultaneously. In this paper, we propose a novel dynamic heterogeneous network embedding method, termed DyHATR, which uses hierarchical attention to learn heterogeneous information and incorporates recurrent neural networks with temporal attention to capture evolutionary patterns. We benchmark our method on four real-world datasets for the task of link prediction. Experimental results show that DyHATR significantly outperforms several state-of-the-art baselines.
CV · Jul 5, 2016
Attribute Recognition from Adaptive Parts
Luwei Yang, Ligen Zhu, Yichen Wei et al.
Previous part-based attribute recognition approaches perform part detection and attribute recognition in separate steps. The parts are not optimized for attribute recognition and therefore could be sub-optimal. We present an end-to-end deep learning approach to overcome this limitation. It generates object parts from key points and performs attribute recognition accordingly, allowing adaptive spatial transformation of the parts. Both key point estimation and attribute recognition are learnt jointly in a multi-task setting. Extensive experiments on two datasets verify the efficacy of the proposed end-to-end approach.