Junfeng Hu

h-index 11

12 papers

1,489 citations

Novelty 48%

AI Score 34

Ranked #113,471 of 194,257 authors (top 58%); #24,970 in LG (top 62%)

12 Papers

33.5 · LG · Jun 14, 2023 · Code

LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting

Xu Liu, Yutong Xia, Yuxuan Liang et al.

Road traffic forecasting plays a critical role in smart city initiatives and has experienced significant advancements thanks to the power of deep learning in capturing non-linear patterns of traffic data. However, the promising results achieved on current public datasets may not be applicable to practical scenarios due to limitations within these datasets. First, their limited size may not reflect the real-world scale of traffic networks. Second, the temporal coverage of these datasets is typically short, posing hurdles in studying long-term patterns and acquiring sufficient samples for training deep models. Third, these datasets often lack adequate metadata for sensors, which compromises the reliability and interpretability of the data. To mitigate these limitations, we introduce the LargeST benchmark dataset. It encompasses a total of 8,600 sensors in California with a 5-year time coverage and includes comprehensive metadata. Using LargeST, we perform in-depth data analysis to extract data insights, benchmark well-known baselines in terms of their performance and efficiency, and identify challenges as well as opportunities for future research. We release the datasets and baseline implementations at: https://github.com/liuxu77/LargeST.

37.4 · LG · Oct 15, 2023 · Code

UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting

Xu Liu, Junfeng Hu, Yuan Li et al.

Multivariate time series forecasting plays a pivotal role in contemporary web technologies. In contrast to conventional methods that involve creating dedicated models for specific time series application domains, this research advocates for a unified model paradigm that transcends domain boundaries. However, learning an effective cross-domain model presents the following challenges. First, various domains exhibit disparities in data characteristics, e.g., the number of variables, posing hurdles for existing models that impose inflexible constraints on these factors. Second, the model may encounter difficulties in distinguishing data from various domains, leading to suboptimal performance in our assessments. Third, the diverse convergence rates of time series domains can also result in compromised empirical performance. To address these issues, we propose UniTime for effective cross-domain time series learning. Concretely, UniTime can flexibly adapt to data with varying characteristics. It also uses domain instructions and a Language-TS Transformer to offer identification information and align two modalities. In addition, UniTime employs masking to alleviate domain convergence speed imbalance issues. Our extensive experiments demonstrate the effectiveness of UniTime in advancing state-of-the-art forecasting performance and zero-shot transferability.

17.5 · LG · Oct 26, 2023

Towards Unifying Diffusion Models for Probabilistic Spatio-Temporal Graph Learning

Junfeng Hu, Xu Liu, Zhencheng Fan et al.

Spatio-temporal graph learning is a fundamental problem in modern urban systems. Existing approaches tackle different tasks independently, tailoring their models to unique task characteristics. These methods, however, fall short of modeling intrinsic uncertainties in the spatio-temporal data. Meanwhile, their specialized designs misalign with the current research efforts toward unifying spatio-temporal graph learning solutions. In this paper, we propose to model these tasks in a unified probabilistic perspective, viewing them as predictions based on conditional information with shared dependencies. Based on this proposal, we introduce Unified Spatio-Temporal Diffusion Models (USTD) to address the tasks uniformly under the uncertainty-aware diffusion framework. USTD is holistically designed, comprising a shared spatio-temporal encoder and attention-based denoising decoders that are task-specific. The encoder, optimized by pre-training strategies, effectively captures conditional spatio-temporal patterns. The decoders, utilizing attention mechanisms, generate predictions by leveraging learned patterns. Opting for forecasting and kriging, the decoders are designed as Spatial Gated Attention (SGA) and Temporal Gated Attention (TGA) for each task, with different emphases on the spatial and temporal dimensions. Combining the advantages of deterministic encoders and probabilistic decoders, USTD achieves state-of-the-art performances compared to both deterministic and probabilistic baselines, while also providing valuable uncertainty estimates.

12.5 · LG · May 21, 2024 · Code

Prompt-Based Spatio-Temporal Graph Transfer Learning

Junfeng Hu, Xu Liu, Zhencheng Fan et al.

Spatio-temporal graph neural networks have proven effective in capturing complex dependencies for urban computing tasks such as forecasting and kriging. Yet, their performance is constrained by the reliance on extensive data for training on a specific task, thereby limiting their adaptability to new urban domains with varied task demands. Although transfer learning has been proposed to remedy this problem by leveraging knowledge across domains, cross-task generalization remains under-explored in spatio-temporal graph transfer learning due to the lack of a unified framework. To bridge the gap, we propose Spatio-Temporal Graph Prompting (STGP), a prompt-based framework capable of adapting to diverse tasks in a data-scarce domain. Specifically, we first unify different tasks into a single template and introduce a task-agnostic network architecture that aligns with this template. This approach enables capturing dependencies shared across tasks. Furthermore, we employ learnable prompts to achieve domain and task transfer in a two-stage prompting pipeline, enabling the prompts to effectively capture domain knowledge and task-specific properties. Our extensive experiments demonstrate that STGP outperforms state-of-the-art baselines in three tasks (forecasting, kriging, and extrapolation), achieving an improvement of up to 10.7%.

16.5 · LG · May 30, 2023 · Code

Graph Neural Processes for Spatio-Temporal Extrapolation

Junfeng Hu, Yuxuan Liang, Zhencheng Fan et al.

We study the task of spatio-temporal extrapolation that generates data at target locations from surrounding contexts in a graph. This task is crucial as sensors that collect data are sparsely deployed, resulting in a lack of fine-grained information due to high deployment and maintenance costs. Existing methods either use learning-based models like Neural Networks or statistical approaches like Gaussian Processes for this task. However, the former lacks uncertainty estimates and the latter fails to capture complex spatial and temporal correlations effectively. To address these issues, we propose Spatio-Temporal Graph Neural Processes (STGNP), a neural latent variable model which commands these capabilities simultaneously. Specifically, we first learn deterministic spatio-temporal representations by stacking layers of causal convolutions and cross-set graph neural networks. Then, we learn latent variables for target locations through vertical latent state transitions along layers and obtain extrapolations. Importantly, during the transitions, we propose Graph Bayesian Aggregation (GBA), a Bayesian graph aggregator that aggregates contexts considering uncertainties in context data and graph structure. Extensive experiments show that STGNP has desirable properties such as uncertainty estimates and strong learning capabilities, and achieves state-of-the-art results by a clear margin.

12.5 · LG · Sep 16, 2021

Decoupling Long- and Short-Term Patterns in Spatiotemporal Inference

Junfeng Hu, Yuxuan Liang, Zhencheng Fan et al.

Sensors are the key to environmental monitoring, which impart benefits to smart cities in many aspects, such as providing real-time air quality information to assist human decision-making. However, it is impractical to deploy massive sensors due to the expensive costs, resulting in sparse data collection. Therefore, how to get fine-grained data measurement has long been a pressing issue. In this paper, we aim to infer values at non-sensor locations based on observations from available sensors (termed spatiotemporal inference), where capturing spatiotemporal relationships among the data plays a critical role. Our investigations reveal two significant insights that have not been explored by previous works. Firstly, data exhibits distinct patterns at both long- and short-term temporal scales, which should be analyzed separately. Secondly, short-term patterns contain more delicate relations including those across spatial and temporal dimensions simultaneously, while long-term patterns involve high-level temporal trends. Based on these observations, we propose to decouple the modeling of short-term and long-term patterns. Specifically, we introduce a joint spatiotemporal graph attention network to learn the relations across space and time for short-term patterns. Furthermore, we propose a graph recurrent network with a time skip strategy to alleviate the gradient vanishing problem and model the long-term dependencies. Experimental results on four public real-world datasets demonstrate that our method effectively captures both long- and short-term relations, achieving state-of-the-art performance against existing methods.

31.0 · CL · Oct 17, 2020

Incorporate Semantic Structures into Machine Translation Evaluation via UCCA

Jin Xu, Yinuo Guo, Junfeng Hu

The copying mechanism has been commonly used in neural paraphrasing networks and other text generation tasks, in which some important words in the input sequence are preserved in the output sequence. Similarly, in machine translation, we notice that there are certain words or phrases appearing in all good translations of one source text, and these words tend to convey important semantic information. Therefore, in this work, we define words carrying important semantic meanings in sentences as semantic core words. Moreover, we propose an MT evaluation approach named Semantically Weighted Sentence Similarity (SWSS). It leverages the power of UCCA to identify semantic core words, and then calculates sentence similarity scores on the overlap of semantic core words. Experimental results show that SWSS can consistently improve the performance of popular MT evaluation metrics which are based on lexical similarity.

1.2 · ST · Sep 30, 2020

Evaluation of company investment value based on machine learning

Junfeng Hu, Xiaosa Li, Yuru Xu et al.

In this paper, company investment value evaluation models are established based on comprehensive company information. After data mining and extracting a set of 436 feature parameters, an optimal subset of features is obtained by dimension reduction through tree-based feature selection, followed by 5-fold cross-validation using XGBoost and LightGBM models. The results show that the Root-Mean-Square Error (RMSE) reached 3.098 and 3.059, respectively. To further improve stability and generalization capability, Bayesian Ridge Regression is used to train a stacking model based on the XGBoost and LightGBM models. The corresponding RMSE improves to 3.047. Finally, the importance of different features to the LightGBM model is analysed.
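As a rough illustration of the evaluation protocol this abstract mentions (5-fold cross-validation scored by RMSE), here is a minimal standard-library sketch. It is not the authors' pipeline: the helper names `rmse`, `k_fold_indices`, and `cross_validated_rmse` are hypothetical, and a mean predictor stands in for the XGBoost and LightGBM models, which are not reproduced here.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop


def cross_validated_rmse(y, k=5):
    """Average validation RMSE over k folds.

    Assumption: the "model" is a placeholder that predicts the training
    mean; a real study would fit a regressor on the training fold.
    """
    scores = []
    for train_idx, val_idx in k_fold_indices(len(y), k):
        train_mean = sum(y[i] for i in train_idx) / len(train_idx)
        y_val = [y[i] for i in val_idx]
        scores.append(rmse(y_val, [train_mean] * len(y_val)))
    return sum(scores) / len(scores)


# Toy targets, made up for illustration only.
targets = [float(i % 5) for i in range(20)]
average_score = cross_validated_rmse(targets, k=5)
```

Averaging the per-fold scores gives a single number comparable across models, which is how figures such as the RMSE values quoted above are typically reported.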

2.6 · CV · Nov 6, 2019 · Code

Predicting Long-Term Skeletal Motions by a Spatio-Temporal Hierarchical Recurrent Network

Junfeng Hu, Zhencheng Fan, Jun Liao et al.

The primary goal of skeletal motion prediction is to generate future motion by observing a sequence of 3D skeletons. A key challenge in motion prediction is the fact that a motion can often be performed in several different ways, with each consisting of its own configuration of poses and their spatio-temporal dependencies, and as a result, the predicted poses often converge to motionless poses or non-human-like motions in long-term prediction. This leads us to define a hierarchical recurrent network model that explicitly characterizes these internal configurations of poses and their local and global spatio-temporal dependencies. The model introduces a latent vector variable from the Lie algebra to represent spatial and temporal relations simultaneously. Furthermore, a structured stack LSTM-based decoder is devised to decode the predicted poses with a new loss function defined to estimate the quantized weight of each body part in a pose. Empirical evaluations on benchmark datasets suggest our approach significantly outperforms the state-of-the-art methods on both short-term and long-term motion prediction.

0.3 · CL · Mar 3, 2018

Understanding and Improving Multi-Sense Word Embeddings via Extended Robust Principal Component Analysis

Haoyue Shi, Yuqi Sun, Junfeng Hu

Unsupervised learned representations of polysemous words generate a large number of pseudo multi-senses since unsupervised methods are overly sensitive to contextual variations. In this paper, we address pseudo multi-sense detection for word embeddings by dimensionality reduction of sense pairs. We propose a novel principal analysis method, termed Ex-RPCA, designed to detect both pseudo multi-senses and real multi-senses. With Ex-RPCA, we empirically show that pseudo multi-senses are generated systematically by unsupervised methods. Moreover, the multi-sense word embeddings can be improved by a simple linear transformation based on Ex-RPCA. Our improved word embeddings outperform the original ones by 5.6 points on the Stanford contextual word similarity (SCWS) dataset. We hope our simple yet effective approach will help the linguistic analysis of multi-sense word embeddings in the future.

1.0 · CL · Jul 5, 2017

Context Aware Document Embedding

Zhaocheng Zhu, Junfeng Hu

Recently, doc2vec has achieved excellent results in different tasks. In this paper, we present a context aware variant of doc2vec. We introduce a novel weight estimating mechanism that generates weights for each word occurrence according to its contribution in the context, using deep neural networks. Our context aware model can achieve results similar to doc2vec initialized by Wikipedia-trained vectors, while being much more efficient and free from a heavy external corpus. Analysis of context aware weights shows they are a kind of enhanced IDF weights that capture sub-topic level keywords in documents. They might result from deep neural networks that learn hidden representations with the least entropy.

0.3 · CL · May 25, 2017

Max-Cosine Matching Based Neural Models for Recognizing Textual Entailment

Zhipeng Xie, Junfeng Hu

Recognizing textual entailment (RTE) is a fundamental task in a variety of text mining or natural language processing applications. This paper proposes a simple neural model for the RTE problem. It first matches each word in the hypothesis with its most-similar word in the premise, producing an augmented representation of the hypothesis conditioned on the premise as a sequence of word pairs. The LSTM model is then used to model this augmented sequence, and the final output from the LSTM is fed into a softmax layer to make the prediction. Besides the base model, in order to enhance its performance, we also propose three techniques: the integration of multiple word-embedding libraries, bi-way integration, and ensemble based on model averaging. Experimental results on the SNLI dataset show that the three techniques are effective in boosting the predictive accuracy and that our method outperforms several state-of-the-art ones.
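The matching step described in this abstract, pairing each hypothesis word with its most similar premise word, can be sketched as follows. This is a minimal illustration assuming cosine similarity over word vectors (as the paper title suggests); the function names and the toy two-dimensional embeddings are hypothetical, and the LSTM and softmax stages are not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def max_cosine_match(premise, hypothesis, embeddings):
    """Pair each hypothesis word with the premise word of highest cosine similarity."""
    pairs = []
    for h_word in hypothesis:
        best = max(premise, key=lambda p_word: cosine(embeddings[h_word], embeddings[p_word]))
        pairs.append((h_word, best))
    return pairs


# Toy 2-d embeddings, invented for illustration only.
TOY_EMBEDDINGS = {
    "man": [1.0, 0.1],
    "person": [0.9, 0.2],
    "plays": [0.1, 1.0],
    "performs": [0.2, 0.9],
    "guitar": [0.7, 0.7],
    "music": [0.6, 0.8],
}

premise = ["man", "plays", "guitar"]
hypothesis = ["person", "performs", "music"]
pairs = max_cosine_match(premise, hypothesis, TOY_EMBEDDINGS)
# pairs == [("person", "man"), ("performs", "plays"), ("music", "guitar")]
```

The resulting list of word pairs corresponds to the augmented hypothesis sequence that the abstract says is then passed to the LSTM.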