Fang Zheng

AI
4 papers
388 citations
Novelty 42%
AI Score 24

4 Papers

CV · May 26, 2022
Social Interpretable Tree for Pedestrian Trajectory Prediction

Liushuai Shi, Le Wang, Chengjiang Long et al.

Understanding the multiple socially-acceptable future behaviors is an essential task for many vision applications. In this paper, we propose a tree-based method, termed Social Interpretable Tree (SIT), to address this multi-modal prediction task, where a hand-crafted tree is built from the prior information of the observed trajectory to model multiple future trajectories. Specifically, a path in the tree from the root to a leaf represents an individual possible future trajectory. SIT employs a coarse-to-fine optimization strategy, in which the tree is first built by high-order velocity to balance the complexity and coverage of the tree and then optimized greedily to encourage multimodality. Finally, a teacher-forcing refining operation is used to predict the final fine trajectory. Compared with prior methods which leverage implicit latent variables to represent possible future trajectories, a path in the tree can explicitly explain the rough moving behaviors (e.g., go straight and then turn right), and thus provides better interpretability. Despite the hand-crafted tree, the experimental results on the ETH-UCY and Stanford Drone datasets demonstrate that our method is capable of matching or exceeding the performance of state-of-the-art methods. Interestingly, the experiments show that the raw built tree without training outperforms many prior deep neural network based approaches. Meanwhile, our method presents sufficient flexibility in long-term prediction and different best-of-$K$ predictions.
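
The best-of-$K$ evaluation mentioned in this abstract is commonly reported as the minimum average and final displacement errors over K candidate trajectories. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name `best_of_k_errors`, the data layout (lists of `(x, y)` tuples), and the choice to take the two minima independently are assumptions made for illustration.

```python
import math


def best_of_k_errors(predictions, ground_truth):
    """Return (min ADE, min FDE) over K candidate trajectories.

    predictions: list of K trajectories, each a list of (x, y) points.
    ground_truth: a single trajectory with the same number of points.
    Assumption: the two minima are taken independently, so they may
    come from different candidates.
    """
    best_ade = math.inf
    best_fde = math.inf
    for traj in predictions:
        if len(traj) != len(ground_truth):
            raise ValueError("trajectory lengths differ")
        # Per-step Euclidean distance between prediction and ground truth.
        dists = [math.dist(p, g) for p, g in zip(traj, ground_truth)]
        best_ade = min(best_ade, sum(dists) / len(dists))  # average error
        best_fde = min(best_fde, dists[-1])                # final-step error
    return best_ade, best_fde


# Hypothetical toy data: three candidates (K = 3) for a 3-step future.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
candidates = [
    [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)],
    [(0.0, 0.0), (1.0, 0.0), (2.0, 3.0)],
    [(0.0, 0.0), (1.0, 0.5), (2.0, 0.0)],
]
min_ade, min_fde = best_of_k_errors(candidates, truth)
```

Increasing K can only lower or preserve both values, which is why results are usually compared at a fixed K.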

NA · Jan 18, 2014
Coset Sum: an alternative to the tensor product in wavelet construction

Youngmi Hur, Fang Zheng

A multivariate biorthogonal wavelet system can be obtained from a pair of multivariate biorthogonal refinement masks in a Multiresolution Analysis setup. Some multivariate refinement masks may be decomposed into lower dimensional refinement masks. The tensor product is a popular way to construct a decomposable multivariate refinement mask from lower dimensional refinement masks. We present an alternative method, which we call coset sum, for constructing multivariate refinement masks from univariate refinement masks. The coset sum shares many essential features of the tensor product that make it attractive in practice: (1) it preserves the biorthogonality of univariate refinement masks, (2) it preserves the accuracy number of the univariate refinement mask, and (3) the wavelet system associated with it has fast algorithms for computing and inverting the wavelet coefficients. In certain cases, the coset sum can even provide a wavelet system with faster algorithms than the tensor product. These features of the coset sum suggest that it is worthwhile to develop and practice alternative methods to the tensor product for constructing multivariate wavelet systems. Some experimental results using 2-D images are presented to illustrate our findings.

AI · Jul 31, 2021
Unlimited Neighborhood Interaction for Heterogeneous Trajectory Prediction

Fang Zheng, Le Wang, Sanping Zhou et al.

Understanding complex social interactions among agents is a key challenge for trajectory prediction. Most existing methods consider the interactions between pairwise traffic agents or in a local area, while the nature of interactions is unlimited, involving an uncertain number of agents and non-local areas simultaneously. Besides, they treat interactions among heterogeneous traffic agents, namely agents of different categories, the same, neglecting people's diverse reaction patterns toward traffic agents in different categories. To address these problems, we propose a simple yet effective Unlimited Neighborhood Interaction Network (UNIN), which predicts trajectories of heterogeneous agents in multiple categories. Specifically, the proposed unlimited neighborhood interaction module generates the fused features of all agents involved in an interaction simultaneously, which is adaptive to any number of agents and any range of interaction area. Meanwhile, a hierarchical graph attention module is proposed to obtain category-to-category interaction and agent-to-agent interaction. Finally, parameters of a Gaussian Mixture Model are estimated for generating the future trajectories. Extensive experimental results on benchmark datasets demonstrate a significant performance improvement of our method over the state-of-the-art methods.
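
The final step of this abstract, generating future positions from estimated Gaussian Mixture Model parameters, can be illustrated with a small sampling routine. This is only a sketch under stated assumptions, not the UNIN implementation: it assumes a 2-D mixture with independent x and y coordinates (diagonal covariance), and the names `sample_gmm_2d`, `weights`, `means`, and `stds` are hypothetical.

```python
import random


def sample_gmm_2d(weights, means, stds, n_samples, rng=None):
    """Draw n_samples (x, y) points from a 2-D Gaussian mixture.

    weights: mixture weights, one per component (need not be normalized).
    means:   list of (mu_x, mu_y) per component.
    stds:    list of (sigma_x, sigma_y) per component.
    Assumption: x and y are independent within each component.
    """
    rng = rng or random.Random()
    samples = []
    for _ in range(n_samples):
        # Pick a component index with probability proportional to its weight.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        x = rng.gauss(means[k][0], stds[k][0])
        y = rng.gauss(means[k][1], stds[k][1])
        samples.append((x, y))
    return samples


# Hypothetical parameters for one future time step of one agent:
# a "go straight" mode and a less likely "turn right" mode.
points = sample_gmm_2d(
    weights=[0.7, 0.3],
    means=[(1.0, 0.0), (0.7, -0.7)],
    stds=[(0.1, 0.1), (0.2, 0.2)],
    n_samples=20,
    rng=random.Random(0),
)
```

In a full predictor, one such set of mixture parameters would typically be estimated per agent and per future time step; drawing several samples yields multiple candidate trajectories.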

DL · Feb 20, 2020
MODMA dataset: a Multi-modal Open Dataset for Mental-disorder Analysis

Hanshu Cai, Yiwen Gao, Shuting Sun et al.

According to the World Health Organization, the number of mental disorder patients, especially depression patients, has grown rapidly and become a leading contributor to the global burden of disease. However, the present common practice of depression diagnosis is based on interviews and clinical scales carried out by doctors, which is both labor-intensive and time-consuming. One important reason is the lack of physiological indicators for mental disorders. With the rise of tools such as data mining and artificial intelligence, using physiological data to explore new possible physiological indicators of mental disorders and creating new applications for mental disorder diagnosis has become an active research topic. However, good quality physiological data for mental disorder patients are hard to acquire. We present a multi-modal open dataset for mental-disorder analysis. The dataset includes EEG and audio data from clinically depressed patients and matched normal controls. All our patients were carefully diagnosed and selected by professional psychiatrists in hospitals. The EEG dataset includes not only data collected using a traditional 128-electrode elastic cap, but also data from a novel wearable 3-electrode EEG collector for pervasive applications. The 128-electrode EEG signals of 53 subjects were recorded both in resting state and under stimulation; the 3-electrode EEG signals of 55 subjects were recorded in resting state; the audio data of 52 subjects were recorded during interviewing, reading, and picture description. We encourage other researchers in the field to use it for testing their methods of mental-disorder analysis.