Hossein Mousavi

h-index: 12

Papers: 12

Citations: 156

Novelty: 43%

AI Score: 23

Ranked #176,276 of 194,257 authors (top 91%); #55,810 in CV (top 94%)

12 Papers

1.2 SY Jul 18, 2019

Space-Time Sampling for Network Observability

Hossein K. Mousavi, Qiyu Sun, Nader Motee

Designing sparse sampling strategies is an important component of resilient estimation and control in networked systems: reduced sampling requirements make network design more cost-effective and less fragile to where and when samples are collected. We show under what conditions coarse samples from a network contain the same amount of information as a finer set of samples. Our goal is to estimate the initial condition of linear time-invariant networks using a set of noisy measurements. The observability condition is reformulated as a frame condition, in which one can easily trace the location and time stamp of each sample. We compare the estimation quality of various sampling strategies using estimation measures that depend on the spectrum of the corresponding frame operators. Using properties of the minimal polynomial of the state matrix, deterministic and randomized methods are suggested to construct observability frames. Intrinsic tradeoffs assert that collecting samples from fewer subsystems dictates taking more samples (on average) per subsystem. Three scalable algorithms are developed to generate sparse space-time sampling strategies with explicit error bounds.
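
The rank condition behind the abstract above can be illustrated with a minimal sketch. It assumes noise-free scalar samples of the form y(i, t) = (A^t x0)_i, so a space-time sampling set determines x0 exactly when the corresponding rows of A^t span the whole state space. The function names and the 3-node cyclic-shift network are hypothetical illustrations, not the authors' algorithms.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def sampling_rows(A, samples):
    """For each space-time sample (node i, time t), return row i of A^t."""
    n = len(A)
    max_t = max(t for _, t in samples)
    powers = [[[Fraction(int(i == j)) for j in range(n)] for i in range(n)]]
    for _ in range(max_t):
        powers.append(mat_mul(powers[-1], A))
    return [list(powers[t][i]) for i, t in samples]

def rank(rows):
    """Exact rank via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rk])]
        rk += 1
    return rk

def is_observable(A, samples):
    """True if the sampled rows span the state space (x0 is recoverable)."""
    return rank(sampling_rows(A, samples)) == len(A)

# Hypothetical 3-node network: a cyclic shift of the state.
A = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]

one_node_three_times = [(0, 0), (0, 1), (0, 2)]
one_node_two_times = [(0, 0), (0, 1)]
three_nodes_once = [(0, 0), (1, 0), (2, 0)]
```

In this toy network, sampling all three nodes once and sampling a single node three times are both sufficient, while two samples from one node are not, which mirrors the tradeoff between the number of sampled subsystems and the number of samples per subsystem.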

1.2 SY Dec 21, 2018

Sparse Sensing, Communication, and Actuation via Self-Triggered Control Algorithms

MirSaleh Bahavarnia, Hossein K. Mousavi, Nader Motee

We propose a self-triggered control algorithm to reduce onboard processor usage, communication bandwidth, and energy consumption across a linear time-invariant networked control system. We formulate an optimal control problem by penalizing the l0-measures of the feedback gain and the vector of control inputs and maximizing the dwell time between the consecutive triggering times. It is shown that the corresponding l1-relaxation of the optimal control problem is feasible and results in a stabilizing feedback control law with guaranteed performance bounds, while providing a sparse schedule for collecting samples from sensors, communication with other subsystems, and activating the input actuators.

1.2 SY Sep 26, 2019

Resilient Sparse Controller Design with Guaranteed Disturbance Attenuation

MirSaleh Bahavarnia, Hossein K. Mousavi

We design resilient sparse state-feedback controllers for a linear time-invariant (LTI) control system while attaining a pre-specified guarantee on the ${\mathcal{H}}_\infty$ performance measure. We leverage a technique from non-fragile control theory to identify a region of resilient state-feedback controllers. Afterward, we explore the region to identify a sparse controller. To this end, we use two different techniques: a greedy method of sparsification and re-weighted $\ell_1$ norm minimization. Our approach highlights a tradeoff between the sparsity of the feedback gain, the performance measure, and the fragility of the design. To the best of our knowledge, this work is the first framework providing performance guarantees for sparse feedback gain design.

0.9 CV Oct 22, 2019

Predictive Coding Networks Meet Action Recognition

Xia Huang, Hossein Mousavi, Gemma Roig

Action recognition is a key problem in computer vision that labels videos with a set of predefined actions. Capturing both semantic content and motion across the video frames is key to achieving high accuracy on this task. Most state-of-the-art methods rely on RGB frames for extracting the semantics and on pre-computed optical flow fields as a motion cue; both are then combined using deep neural networks. Yet, it has been argued that such models are not able to leverage the motion information extracted from the optical flow, and that the optical flow instead allows for better recognition of people and objects in the video. This calls for exploring different cues or models that can extract motion in a more informative fashion. To tackle this issue, we propose to explore the predictive coding network, the so-called PredNet, a recurrent neural network that propagates predictive coding errors across layers and time steps. We analyze whether PredNet can better capture motion in videos by estimating over time the representations extracted from pre-trained networks for action recognition. In this way, the model relies only on the video frames and does not need pre-processed optical flow as input. We report the effectiveness of our proposed model on the UCF101 and HMDB51 datasets.

4.8 LG Sep 20, 2019

A Layered Architecture for Active Perception: Image Classification using Deep Reinforcement Learning

Hossein K. Mousavi, Guangyi Liu, Weihang Yuan et al.

We propose a planning and perception mechanism for a robot (agent) that can only partially observe the underlying environment, in order to solve an image classification problem. A three-layer architecture is suggested that consists of a meta-layer that decides the intermediate goals, an action-layer that selects local actions as the agent navigates towards a goal, and a classification-layer that evaluates the reward and makes a prediction. We design and implement these layers using deep reinforcement learning. A generalized policy gradient algorithm is utilized to learn the parameters of these layers to maximize the expected reward. Our proposed methodology is tested on the MNIST dataset of handwritten digits, which provides us with a level of explainability when interpreting the agent's intermediate goals and course of action.

8.1 LG May 13, 2019

Multi-Agent Image Classification via Reinforcement Learning

Hossein K. Mousavi, Mohammadreza Nazari, Martin Takáč et al.

We investigate a classification problem using multiple mobile agents capable of collecting (partial) pose-dependent observations of an unknown environment. The objective is to classify an image over a finite time horizon. We propose a network architecture that specifies how agents should form a local belief, take local actions, and extract relevant features from their raw partial observations. Agents are allowed to exchange information with their neighboring agents to update their own beliefs. It is shown how reinforcement learning techniques can be utilized to achieve a decentralized implementation of the classification problem by running a decentralized consensus protocol. Our experimental results on the MNIST handwritten digit dataset demonstrate the effectiveness of our proposed framework.
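
The belief exchange described in the abstract above can be sketched with a plain averaging consensus. The abstract does not specify the protocol, so the update rule below (each agent replaces its belief with the equal-weight average over itself and its neighbors) is an assumption, and the three-agent line graph and belief values are hypothetical.

```python
def consensus_step(beliefs, neighbors):
    """One round: each agent averages its belief with its neighbors' beliefs."""
    updated = []
    for i in range(len(beliefs)):
        group = [beliefs[i]] + [beliefs[j] for j in neighbors[i]]
        updated.append([sum(vals) / len(group) for vals in zip(*group)])
    return updated

def run_consensus(beliefs, neighbors, steps):
    """Repeat the averaging round a fixed number of times."""
    for _ in range(steps):
        beliefs = consensus_step(beliefs, neighbors)
    return beliefs

# Hypothetical line graph 0 - 1 - 2; each agent holds a belief over 3 classes.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
local_beliefs = [[0.7, 0.2, 0.1],
                 [0.1, 0.8, 0.1],
                 [0.3, 0.3, 0.4]]

final_beliefs = run_consensus(local_beliefs, neighbors, steps=50)
# Each agent can now classify locally from its own (agreed-upon) belief.
predictions = [max(range(3), key=b.__getitem__) for b in final_beliefs]
```

With equal-weight averaging on a connected graph, the agents converge to a common belief that is a degree-weighted mix of the initial beliefs rather than their exact mean; a protocol that needs the exact mean would use different weights.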

1.9 RO Feb 4, 2019

Estimation with Fast Landmark Selection in Robot Visual Navigation

Hossein K. Mousavi, Nader Motee

We consider visual feature selection to improve the estimation quality required for the accurate navigation of a robot. We build upon a key property: the contributions of trackable features (landmarks) appear linearly in the information matrix of the corresponding estimation problem. We utilize standard models for the motion and for a camera-based vision system to formulate the feature selection problem over moving finite time horizons. A scalable randomized sampling algorithm is proposed to select the more informative features (and ignore the rest) to achieve superior position estimation quality. We provide probabilistic performance guarantees for our method. The time complexity of our feature selection algorithm is linear in the number of candidate features, which is practical and outperforms existing greedy methods that scale quadratically with the number of candidate features. Our numerical simulations confirm that the execution time of our proposed method is lower than that of the greedy method, while the resulting estimation quality is very close to that of the greedy method.
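
The linearity property mentioned in the abstract above can be made concrete with a small sketch. It assumes each landmark contributes a rank-one term h hᵀ to the information matrix and uses a single-pass Bernoulli sampler whose inclusion probabilities are proportional to the trace of each contribution. Both the measurement model and the sampling rule are illustrative assumptions, not the authors' algorithm, and all names are hypothetical.

```python
import random

def outer(h):
    """Rank-one information contribution h h^T of a single landmark."""
    return [[a * b for b in h] for a in h]

def add(m1, m2):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def information_matrix(prior, landmarks, selected):
    """Prior information plus the sum of the selected landmarks' contributions."""
    info = [row[:] for row in prior]
    for i in selected:
        info = add(info, outer(landmarks[i]))
    return info

def sample_landmarks(landmarks, budget, rng):
    """One pass over the candidates (linear time); assumes a nonzero total weight.

    Landmark i is kept with probability min(1, budget * w_i / sum(w)),
    where w_i = trace(h_i h_i^T), so about `budget` landmarks are kept on average.
    """
    weights = [sum(x * x for x in h) for h in landmarks]
    total = sum(weights)
    return [i for i, w in enumerate(weights)
            if rng.random() < min(1.0, budget * w / total)]

prior = [[1.0, 0.0], [0.0, 1.0]]
landmarks = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

full_info = information_matrix(prior, landmarks, range(3))
subset = sample_landmarks(landmarks, budget=1, rng=random.Random(0))
subset_info = information_matrix(prior, landmarks, subset)
```

Because the contributions add up, evaluating any candidate subset only requires summing the selected terms, which is what makes a sampling pass over the candidates cheap compared with a greedy search.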

1.2 SY Apr 19, 2019

Koopman Performance Analysis of Nonlinear Consensus Networks

Hossein K. Mousavi, Christoforos Somarakis, Qiyu Sun et al.

Spectral decomposition of dynamical systems is a popular methodology for investigating the fundamental qualitative and quantitative properties of these systems and their solutions. In this chapter, we consider a class of nonlinear cooperative protocols, which consist of multiple agents that are coupled together via an undirected state-dependent graph. We develop a representation of the system solution by decomposing the nonlinear system using ideas from Koopman operator theory and its spectral analysis. We use recent results on extensions of the well-known Hartman theorem for hyperbolic systems to establish a connection between the original nonlinear dynamics and the linearized dynamics in terms of Koopman spectral properties. The expected value of the output energy of the nonlinear protocol, which is related to the notions of coherence and robustness in dynamical networks, is evaluated and characterized in terms of Koopman eigenvalues, eigenfunctions, and modes. The spectral representation of the performance measure enables us to develop algorithmic methods to assess the performance of this class of nonlinear dynamical networks as a function of their graph topology. Finally, we propose a scalable computational method for approximating the components of the Koopman mode decomposition, which is necessary to evaluate the systemic performance measure of the nonlinear dynamic network.
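
For orientation, the Koopman mode decomposition referred to in the abstract above is commonly written in the following form. This is the standard textbook expression, assuming continuous-time dynamics and an observable $g$ that lies in the span of the Koopman eigenfunctions; the notation is an assumption and may differ from the chapter's own.

$$
g\big(x(t)\big) = \sum_{k} \phi_k(x_0)\, e^{\lambda_k t}\, v_k
$$

Here $\lambda_k$ are the Koopman eigenvalues, $\phi_k$ the corresponding eigenfunctions evaluated at the initial condition $x_0$, and $v_k$ the Koopman modes associated with $g$.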

1.1 CV Nov 21, 2016

Efficient Convolutional Neural Network with Binary Quantization Layer

Mahdyar Ravanbakhsh, Hossein Mousavi, Moin Nabi et al.

In this paper we introduce a novel method for segmentation that can benefit from the general semantics of a Convolutional Neural Network (CNN). Our segmentation proposes visually and semantically coherent image segments. We use a binary encoding of CNN features to overcome the difficulty of clustering in the high-dimensional CNN feature space. This binary encoding can be embedded into the CNN as an extra layer at the end of the network, which results in real-time segmentation. To the best of our knowledge, our method is the first attempt at general semantic image segmentation using a CNN. All previous papers were limited to a small number of image categories (e.g. PASCAL VOC). Experiments show that our segmentation algorithm outperforms the state-of-the-art non-semantic segmentation methods by a large margin.

3.8 CV Sep 29, 2016

CNN-aware Binary Map for General Semantic Segmentation

Mahdyar Ravanbakhsh, Hossein Mousavi, Moin Nabi et al.

In this paper we introduce a novel method for general semantic segmentation that can benefit from the general semantics of a Convolutional Neural Network (CNN). Our segmentation proposes visually and semantically coherent image segments. We use a binary encoding of CNN features to overcome the difficulty of clustering in the high-dimensional CNN feature space. These binary codes are very robust against noise and non-semantic changes in the image. The binary encoding can be embedded into the CNN as an extra layer at the end of the network, which results in real-time segmentation. To the best of our knowledge, our method is the first attempt at general semantic image segmentation using a CNN. All previous papers were limited to a small number of image categories (e.g. PASCAL VOC). Experiments show that our segmentation algorithm outperforms the state-of-the-art non-semantic segmentation methods by a large margin.
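
The idea of replacing clustering in a high-dimensional feature space with binary codes can be sketched as follows. The abstract does not say how the codes are computed, so thresholding each feature dimension at its mean is an assumption, and the per-pixel feature vectors are hypothetical toy values.

```python
from collections import defaultdict

def binary_codes(features):
    """Binarize each feature dimension by thresholding at its mean value."""
    dims = len(features[0])
    means = [sum(f[d] for f in features) / len(features) for d in range(dims)]
    return [tuple(int(f[d] > means[d]) for d in range(dims)) for f in features]

def group_by_code(codes):
    """Pixels that share a binary code fall into the same segment."""
    segments = defaultdict(list)
    for pixel_index, code in enumerate(codes):
        segments[code].append(pixel_index)
    return dict(segments)

# Hypothetical 2-dimensional features for four pixels.
pixel_features = [[0.9, 0.1],
                  [0.8, 0.2],
                  [0.1, 0.9],
                  [0.2, 0.7]]

codes = binary_codes(pixel_features)
segments = group_by_code(codes)
```

Grouping by exact code match is a single pass over the pixels, which is one way to see why a binary layer can make segmentation fast; small perturbations of the features that do not cross a threshold leave the code unchanged.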

5.3 CV Jul 26, 2016

Emotion-Based Crowd Representation for Abnormality Detection

Hamidreza Rabiee, Javad Haddadnia, Hossein Mousavi et al.

In crowd behavior understanding, a model of crowd behavior needs to be trained using the information extracted from video sequences. Since no ground truth is available in crowd datasets except the crowd behavior labels, most of the methods proposed so far are based only on low-level visual features. However, there is a huge semantic gap between low-level motion/appearance features and the high-level concept of crowd behaviors. In this paper we propose an attribute-based strategy to alleviate this problem. While similar strategies have recently been adopted for object and action recognition, as far as we know, we are the first to show that crowd emotions can be used as attributes for crowd behavior understanding. The main idea is to train a set of emotion-based classifiers, which can subsequently be used to represent the crowd motion. For this purpose, we collect a large dataset of video clips and annotate them with both "crowd behaviors" and "crowd emotions". We show the results of the proposed method on our dataset, which demonstrate that crowd emotions enable the construction of more descriptive models for crowd behaviors. We aim to publish the dataset with the article, to be used as a benchmark by the community.

7.0 CV Dec 13, 2015

Action Recognition with Image Based CNN Features

Mahdyar Ravanbakhsh, Hossein Mousavi, Mohammad Rastegari et al.

Most human actions consist of complex temporal compositions of simpler actions. Action recognition tasks usually rely on complex handcrafted structures as features to represent the human action model. Convolutional Neural Nets (CNNs) have been shown to be a powerful tool that eliminates the need for designing handcrafted features. Usually, the output of the last layer before the classification layer of a CNN (known as fc7) is used as a generic feature for images. In this paper, we show that fc7 features, per se, cannot achieve good performance on the task of action recognition when the network is trained only on images. We present a feature structure on top of fc7 features that can capture the temporal variation in a video. To represent the temporal components, which are needed to capture motion information, we introduce a hierarchical structure. The hierarchical model makes it possible to capture sub-actions within a complex action. The higher levels of the hierarchy represent a coarse capture of the action sequence, and the lower levels represent fine action elements. Furthermore, we introduce a method for extracting key-frames using a binary coding of each frame in a video, which helps to improve the performance of our hierarchical model. We evaluate our method on several action datasets and show that it achieves superior results compared to other state-of-the-art methods.
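
The coarse-to-fine hierarchy described in the abstract above resembles a temporal pyramid over per-frame features. The sketch below is a generic temporal pyramid with mean pooling, offered under that assumption; the abstract does not specify the pooling operator or the splitting rule, and the one-dimensional "fc7" values are hypothetical stand-ins for real CNN features.

```python
def mean_pool(frames):
    """Average a list of per-frame feature vectors into one vector."""
    return [sum(column) / len(frames) for column in zip(*frames)]

def temporal_pyramid(frames, levels):
    """Concatenate pooled features from coarse (whole video) to fine (segments).

    Level 0 pools the whole sequence, level 1 pools each half,
    level 2 pools each quarter, and so on.
    """
    n = len(frames)
    if n < 2 ** (levels - 1):
        raise ValueError("not enough frames for the requested number of levels")
    descriptor = []
    for level in range(levels):
        parts = 2 ** level
        for p in range(parts):
            start = p * n // parts
            end = (p + 1) * n // parts
            descriptor.extend(mean_pool(frames[start:end]))
    return descriptor

# Hypothetical per-frame features (one dimension per frame for readability).
frame_features = [[0.0], [2.0], [4.0], [6.0]]
video_descriptor = temporal_pyramid(frame_features, levels=2)
```

The top level summarizes the whole action, while the finer levels keep the order of sub-actions that a single average over all frames would discard.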