CVJan 7, 2023
A Novel Improved Mask RCNN for Multiple Targets Detection in the Indoor Complex ScenesZongmin Liu, Jirui Wang, Jie Li et al.
With the expansive aging of global population, service robot with living assistance applied in indoor scenes will serve as a crucial role in the field of elderly care and health in the future. Service robots need to detect multiple targets when completing auxiliary tasks. However, indoor scenes are usually complex and there are many types of interference factors, leading to great challenges in the multiple targets detection. To overcome this technical difficulty, a novel improved Mask RCNN method for multiple targets detection in the indoor complex scenes is proposed in this paper. The improved model utilizes Mask RCNN as the network framework. On this basis, Convolutional Block Attention Module (CBAM) with channel mechanism and space mechanism is integrated, and the influence of different background, distance, angle and interference factors are comprehensively considered. Meanwhile, in order to evaluate the detection and identification effects of the established model, a comprehensive evaluation system based on loss function and Mean Average Precision (mAP) is established. For verification, experiments on the detection and identification effects under different distances, backgrounds, angles and interference factors were conducted. The results show that designed model improves the accuracy to a higher level and has a better anti-interference ability than other methods when the detection speed was nearly the same.
LGApr 5, 2022
Multi-task nonparallel support vector machine for classificationZongmin Liu, Yitian Xu
Direct multi-task twin support vector machine (DMTSVM) explores the shared information between multiple correlated tasks, then it produces better generalization performance. However, it contains matrix inversion operation when solving the dual problems, so it costs much running time. Moreover, kernel trick cannot be directly utilized in the nonlinear case. To effectively avoid above problems, a novel multi-task nonparallel support vector machine (MTNPSVM) including linear and nonlinear cases is proposed in this paper. By introducing epsilon-insensitive loss instead of square loss in DMTSVM, MTNPSVM effectively avoids matrix inversion operation and takes full advantage of the kernel trick. Theoretical implication of the model is further discussed. To further improve the computational efficiency, the alternating direction method of multipliers (ADMM) is employed when solving the dual problem. The computational complexity and convergence of the algorithm are provided. In addition, the property and sensitivity of the parameter in model are further explored. The experimental results on fifteen benchmark datasets and twelve image datasets demonstrate the validity of MTNPSVM in comparison with the state-of-the-art algorithms. Finally, it is applied to real Chinese Wine dataset, and also verifies its effectiveness.
ROJul 20, 2023
A novel integrated method of detection-grasping for specific object based on the box coordinate matchingZongmin Liu, Jirui Wang, Jie Li et al.
To better care for the elderly and disabled, it is essential for service robots to have an effective fusion method of object detection and grasp estimation. However, limited research has been observed on the combination of object detection and grasp estimation. To overcome this technical difficulty, a novel integrated method of detection-grasping for specific object based on the box coordinate matching is proposed in this paper. Firstly, the SOLOv2 instance segmentation model is improved by adding channel attention module (CAM) and spatial attention module (SAM). Then, the atrous spatial pyramid pooling (ASPP) and CAM are added to the generative residual convolutional neural network (GR-CNN) model to optimize grasp estimation. Furthermore, a detection-grasping integrated algorithm based on box coordinate matching (DG-BCM) is proposed to obtain the fusion model of object detection and grasp estimation. For verification, experiments on object detection and grasp estimation are conducted separately to verify the superiority of improved models. Additionally, grasping tasks for several specific objects are implemented on a simulation platform, demonstrating the feasibility and effectiveness of DG-BCM algorithm proposed in this paper.
LGJun 7, 2024
From Spikes to Heavy Tails: Unveiling the Spectral Evolution of Neural NetworksVignesh Kothapalli, Tianyu Pang, Shenyang Deng et al.
Training strategies for modern deep neural networks (NNs) tend to induce a heavy-tailed (HT) empirical spectral density (ESD) in the layer weights. While previous efforts have shown that the HT phenomenon correlates with good generalization in large NNs, a theoretical explanation of its occurrence is still lacking. Especially, understanding the conditions which lead to this phenomenon can shed light on the interplay between generalization and weight spectra. Our work aims to bridge this gap by presenting a simple, rich setting to model the emergence of HT ESD. In particular, we present a theory-informed setup for 'crafting' heavy tails in the ESD of two-layer NNs and present a systematic analysis of the HT ESD emergence without any gradient noise. This is the first work to analyze a noise-free setting, and we also incorporate optimizer (GD/Adam) dependent (large) learning rates into the HT ESD analysis. Our results highlight the role of learning rates on the Bulk+Spike and HT shape of the ESDs in the early phase of training, which can facilitate generalization in the two-layer NN. These observations shed light on the behavior of large-scale NNs, albeit in a much simpler setting.
SYNov 30, 2021
Double Fuzzy Probabilistic Interval Linguistic Term Set and a Dynamic Fuzzy Decision Making Model based on Markov Process with tts Application in Multiple Criteria Group Decision MakingZongmin Liu
The probabilistic linguistic term has been proposed to deal with probability distributions in provided linguistic evaluations. However, because it has some fundamental defects, it is often difficult for decision-makers to get reasonable information of linguistic evaluations for group decision making. In addition, weight information plays a significant role in dynamic information fusion and decision making process. However, there are few research methods to determine the dynamic attribute weight with time. In this paper, I propose the concept of double fuzzy probability interval linguistic term set (DFPILTS). Firstly, fuzzy semantic integration, DFPILTS definition, its preference relationship, some basic algorithms and aggregation operators are defined. Then, a fuzzy linguistic Markov matrix with its network is developed. Then, a weight determination method based on distance measure and information entropy to reducing the inconsistency of DFPILPR and obtain collective priority vector based on group consensus is developed. Finally, an aggregation-based approach is developed, and an optimal investment case from a financial risk is used to illustrate the application of DFPILTS and decision method in multi-criteria decision making.