OCFeb 20, 2018
MIMO Transmit Beampattern Matching Under Waveform ConstraintsZiping Zhao, Daniel P. Palomar
In this paper, the multiple-input multiple-output (MIMO) transmit beampattern matching problem is considered. The problem is formulated to approximate a desired transmit beampattern (i.e., an energy distribution in space and frequency) and to minimize the cross-correlation of signals reflected back to the array by considering different practical waveform constraints at the same time. Due to the nonconvexity of the objective function and the waveform constraints, the optimization problem is highly nonconvex. An efficient one-step method is proposed to solve this problem based on the majorization-minimization (MM) method. The performance of the proposed algorithms compared to the state-of-art algorithms is shown through numerical simulations.
CVMar 26Code
AdaSFormer: Adaptive Serialized Transformers for Monocular Semantic Scene Completion from Indoor EnvironmentsXuzhi Wang, Xinran Wu, Song Wang et al.
Indoor monocular semantic scene completion (MSSC) is notably more challenging than its outdoor counterpart due to complex spatial layouts and severe occlusions. While transformers are well suited for modeling global dependencies, their high memory cost and difficulty in reconstructing fine-grained details have limited their use in indoor MSSC. To address these limitations, we introduce AdaSFormer, a serialized transformer framework tailored for indoor MSSC. Our model features three key designs: (1) an Adaptive Serialized Transformer with learnable shifts that dynamically adjust receptive fields; (2) a Center-Relative Positional Encoding that captures spatial information richness; and (3) a Convolution-Modulated Layer Normalization that bridges heterogeneous representations between convolutional and transformer features. Extensive experiments on NYUv2 and Occ-ScanNet demonstrate that AdaSFormer achieves state-of-the-art performance. The code is publicly available at: https://github.com/alanWXZ/AdaSFormer.
LGJun 9, 2022
Towards Understanding Graph Neural Networks: An Algorithm Unrolling PerspectiveZepeng Zhang, Ziping Zhao
The graph neural network (GNN) has demonstrated its superior performance in various applications. The working mechanism behind it, however, remains mysterious. GNN models are designed to learn effective representations for graph-structured data, which intrinsically coincides with the principle of graph signal denoising (GSD). Algorithm unrolling, a "learning to optimize" technique, has gained increasing attention due to its prospects in building efficient and interpretable neural network architectures. In this paper, we introduce a class of unrolled networks built based on truncated optimization algorithms (e.g., gradient descent and proximal gradient descent) for GSD problems. They are shown to be tightly connected to many popular GNN models in that the forward propagations in these GNNs are in fact unrolled networks serving specific GSDs. Besides, the training process of a GNN model can be seen as solving a bilevel optimization problem with a GSD problem at the lower level. Such a connection brings a fresh view of GNNs, as we could try to understand their practical capabilities from their GSD counterparts, and it can also motivate designing new GNN models. Based on the algorithm unrolling perspective, an expressive model named UGDGNN, i.e., unrolled gradient descent GNN, is further proposed which inherits appealing theoretical properties. Extensive numerical simulations on seven benchmark datasets demonstrate that UGDGNN can achieve superior or competitive performance over the state-of-the-art models.
LGOct 3, 2022
ASGNN: Graph Neural Networks with Adaptive StructureZepeng Zhang, Songtao Lu, Zengfeng Huang et al.
The graph neural network (GNN) models have presented impressive achievements in numerous machine learning tasks. However, many existing GNN models are shown to be vulnerable to adversarial attacks, which creates a stringent need to build robust GNN architectures. In this work, we propose a novel interpretable message passing scheme with adaptive structure (ASMP) to defend against adversarial attacks on graph structure. Layers in ASMP are derived based on optimization steps that minimize an objective function that learns the node feature and the graph structure simultaneously. ASMP is adaptive in the sense that the message passing process in different layers is able to be carried out over dynamically adjusted graphs. Such property allows more fine-grained handling of the noisy (or perturbed) graph structure and hence improves the robustness. Convergence properties of the ASMP scheme are theoretically established. Integrating ASMP with neural networks can lead to a new family of GNN models with adaptive structure (ASGNN). Extensive experiments on semi-supervised node classification tasks demonstrate that the proposed ASGNN outperforms the state-of-the-art GNN architectures in terms of classification performance under various adversarial attacks.
CVJul 23, 2025
Monocular Semantic Scene Completion via Masked Recurrent NetworksXuzhi Wang, Xinran Wu, Song Wang et al.
Monocular Semantic Scene Completion (MSSC) aims to predict the voxel-wise occupancy and semantic category from a single-view RGB image. Existing methods adopt a single-stage framework that aims to simultaneously achieve visible region segmentation and occluded region hallucination, while also being affected by inaccurate depth estimation. Such methods often achieve suboptimal performance, especially in complex scenes. We propose a novel two-stage framework that decomposes MSSC into coarse MSSC followed by the Masked Recurrent Network. Specifically, we propose the Masked Sparse Gated Recurrent Unit (MS-GRU) which concentrates on the occupied regions by the proposed mask updating mechanism, and a sparse GRU design is proposed to reduce the computation cost. Additionally, we propose the distance attention projection to reduce projection errors by assigning different attention scores according to the distance to the observed surface. Experimental results demonstrate that our proposed unified framework, MonoMRN, effectively supports both indoor and outdoor scenes and achieves state-of-the-art performance on the NYUv2 and SemanticKITTI datasets. Furthermore, we conduct robustness analysis under various disturbances, highlighting the role of the Masked Recurrent Network in enhancing the model's resilience to such challenges. The source code is publicly available.
HCJul 10, 2019
AVEC 2019 Workshop and Challenge: State-of-Mind, Detecting Depression with AI, and Cross-Cultural Affect RecognitionFabien Ringeval, Björn Schuller, Michel Valstar et al.
The Audio/Visual Emotion Challenge and Workshop (AVEC 2019) "State-of-Mind, Detecting Depression with AI, and Cross-cultural Affect Recognition" is the ninth competition event aimed at the comparison of multimedia processing and machine learning methods for automatic audiovisual health and emotion analysis, with all participants competing strictly under the same conditions. The goal of the Challenge is to provide a common benchmark test set for multimodal information processing and to bring together the health and emotion recognition communities, as well as the audiovisual processing communities, to compare the relative merits of various approaches to health and emotion recognition from real-life data. This paper presents the major novelties introduced this year, the challenge guidelines, the data used, and the performance of the baseline systems on the three proposed tasks: state-of-mind recognition, depression assessment with AI, and cross-cultural affect sensing, respectively.
MLMar 20, 2018
Sparse Reduced Rank Regression With Nonconvex RegularizationZiping Zhao, Daniel P. Palomar
In this paper, the estimation problem for sparse reduced rank regression (SRRR) model is considered. The SRRR model is widely used for dimension reduction and variable selection with applications in signal processing, econometrics, etc. The problem is formulated to minimize the least squares loss with a sparsity-inducing penalty considering an orthogonality constraint. Convex sparsity-inducing functions have been used for SRRR in literature. In this work, a nonconvex function is proposed for better sparsity inducing. An efficient algorithm is developed based on the alternating minimization (or projection) method to solve the nonconvex optimization problem. Numerical simulations show that the proposed algorithm is much more efficient compared to the benchmark methods and the nonconvex function can result in a better estimation accuracy.
MLOct 16, 2017
Robust Maximum Likelihood Estimation of Sparse Vector Error Correction ModelZiping Zhao, Daniel P. Palomar
In econometrics and finance, the vector error correction model (VECM) is an important time series model for cointegration analysis, which is used to estimate the long-run equilibrium variable relationships. The traditional analysis and estimation methodologies assume the underlying Gaussian distribution but, in practice, heavy-tailed data and outliers can lead to the inapplicability of these methods. In this paper, we propose a robust model estimation method based on the Cauchy distribution to tackle this issue. In addition, sparse cointegration relations are considered to realize feature selection and dimension reduction. An efficient algorithm based on the majorization-minimization (MM) method is applied to solve the proposed nonconvex problem. The performance of this algorithm is shown through numerical simulations.