NIApr 6, 2022
Domain Adversarial Graph Convolutional Network Based on RSSI and Crowdsensing for Indoor LocalizationMingxin Zhang, Zipei Fan, Ryosuke Shibasaki et al.
In recent years, the use of WiFi fingerprints for indoor positioning has grown in popularity, largely due to the widespread availability of WiFi and the proliferation of mobile communication devices. However, many existing methods for constructing fingerprint datasets rely on labor-intensive and time-consuming processes of collecting large amounts of data. Additionally, these methods often focus on ideal laboratory environments, rather than considering the practical challenges of large multi-floor buildings. To address these issues, we present a novel WiDAGCN model that can be trained using a small number of labeled site survey data and large amounts of unlabeled crowdsensed WiFi fingerprints. By constructing heterogeneous graphs based on received signal strength indicators (RSSIs) between waypoints and WiFi access points (APs), our model is able to effectively capture the topological structure of the data. We also incorporate graph convolutional networks (GCNs) to extract graph-level embeddings, a feature that has been largely overlooked in previous WiFi indoor localization studies. To deal with the challenges of large amounts of unlabeled data and multiple data domains, we employ a semi-supervised domain adversarial training scheme to effectively utilize unlabeled data and align the data distributions across domains. Our system is evaluated using a public indoor localization dataset that includes multiple buildings, and the results show that it performs competitively in terms of localization accuracy in large buildings.
LGJul 2, 2022
GOF-TTE: Generative Online Federated Learning Framework for Travel Time EstimationZhiwen Zhang, Hongjun Wang, Jiyuan Chen et al.
Estimating the travel time of a path is an essential topic for intelligent transportation systems. It serves as the foundation for real-world applications, such as traffic monitoring, route planning, and taxi dispatching. However, building a model for such a data-driven task requires a large amount of users' travel information, which directly relates to their privacy and thus is less likely to be shared. The non-Independent and Identically Distributed (non-IID) trajectory data across data owners also make a predictive model extremely challenging to be personalized if we directly apply federated learning. Finally, previous work on travel time estimation does not consider the real-time traffic state of roads, which we argue can significantly influence the prediction. To address the above challenges, we introduce GOF-TTE for the mobile user group, Generative Online Federated Learning Framework for Travel Time Estimation, which I) utilizes the federated learning approach, allowing private data to be kept on client devices while training, and designs the global model as an online generative model shared by all clients to infer the real-time road traffic state. II) apart from sharing a base model at the server, adapts a fine-tuned personalized model for every client to study their personal driving habits, making up for the residual error made by localized global model prediction. % III) designs the global model as an online generative model shared by all clients to infer the real-time road traffic state. We also employ a simple privacy attack to our framework and implement the differential privacy mechanism to further guarantee privacy safety. Finally, we conduct experiments on two real-world public taxi datasets of DiDi Chengdu and Xi'an. The experimental results demonstrate the effectiveness of our proposed framework.
SIJun 21, 2022
Online Trajectory Prediction for Metropolitan Scale Mobility Digital TwinZipei Fan, Xiaojie Yang, Wei Yuan et al.
Knowing "what is happening" and "what will happen" of the mobility in a city is the building block of a data-driven smart city system. In recent years, mobility digital twin that makes a virtual replication of human mobility and predicting or simulating the fine-grained movements of the subjects in a virtual space at a metropolitan scale in near real-time has shown its great potential in modern urban intelligent systems. However, few studies have provided practical solutions. The main difficulties are four-folds. 1) The daily variation of human mobility is hard to model and predict; 2) the transportation network enforces a complex constraints on human mobility; 3) generating a rational fine-grained human trajectory is challenging for existing machine learning models; and 4) making a fine-grained prediction incurs high computational costs, which is challenging for an online system. Bearing these difficulties in mind, in this paper we propose a two-stage human mobility predictor that stratifies the coarse and fine-grained level predictions. In the first stage, to encode the daily variation of human mobility at a metropolitan level, we automatically extract citywide mobility trends as crowd contexts and predict long-term and long-distance movements at a coarse level. In the second stage, the coarse predictions are resolved to a fine-grained level via a probabilistic trajectory retrieval method, which offloads most of the heavy computations to the offline phase. We tested our method using a real-world mobile phone GPS dataset in the Kanto area in Japan, and achieved good prediction accuracy and a time efficiency of about 2 min in predicting future 1h movements of about 220K mobile phone users on a single machine to support more higher-level analysis of mobility prediction.
AIJan 13, 2023
Multitask Weakly Supervised Learning for Origin Destination Travel Time EstimationHongjun Wang, Zhiwen Zhang, Zipei Fan et al.
Travel time estimation from GPS trips is of great importance to order duration, ridesharing, taxi dispatching, etc. However, the dense trajectory is not always available due to the limitation of data privacy and acquisition, while the origin destination (OD) type of data, such as NYC taxi data, NYC bike data, and Capital Bikeshare data, is more accessible. To address this issue, this paper starts to estimate the OD trips travel time combined with the road network. Subsequently, a Multitask Weakly Supervised Learning Framework for Travel Time Estimation (MWSL TTE) has been proposed to infer transition probability between roads segments, and the travel time on road segments and intersection simultaneously. Technically, given an OD pair, the transition probability intends to recover the most possible route. And then, the output of travel time is equal to the summation of all segments' and intersections' travel time in this route. A novel route recovery function has been proposed to iteratively maximize the current route's co occurrence probability, and minimize the discrepancy between routes' probability distribution and the inverse distribution of routes' estimation loss. Moreover, the expected log likelihood function based on a weakly supervised framework has been deployed in optimizing the travel time from road segments and intersections concurrently. We conduct experiments on a wide range of real world taxi datasets in Xi'an and Chengdu and demonstrate our method's effectiveness on route recovery and travel time estimation.
LGSep 25, 2023
MemDA: Forecasting Urban Time Series with Memory-based Drift AdaptationZekun Cai, Renhe Jiang, Xinyu Yang et al.
Urban time series data forecasting featuring significant contributions to sustainable development is widely studied as an essential task of the smart city. However, with the dramatic and rapid changes in the world environment, the assumption that data obey Independent Identically Distribution is undermined by the subsequent changes in data distribution, known as concept drift, leading to weak replicability and transferability of the model over unseen data. To address the issue, previous approaches typically retrain the model, forcing it to fit the most recent observed data. However, retraining is problematic in that it leads to model lag, consumption of resources, and model re-invalidation, causing the drift problem to be not well solved in realistic scenarios. In this study, we propose a new urban time series prediction model for the concept drift problem, which encodes the drift by considering the periodicity in the data and makes on-the-fly adjustments to the model based on the drift using a meta-dynamic network. Experiments on real-world datasets show that our design significantly outperforms state-of-the-art methods and can be well generalized to existing prediction backbones by reducing their sensitivity to distribution changes.
LGNov 12, 2025Code
Blurred Encoding for Trajectory Representation LearningSilin Zhou, Yao Chen, Shuo Shang et al.
Trajectory representation learning (TRL) maps trajectories to vector embeddings and facilitates tasks such as trajectory classification and similarity search. State-of-the-art (SOTA) TRL methods transform raw GPS trajectories to grid or road trajectories to capture high-level travel semantics, i.e., regions and roads. However, they lose fine-grained spatial-temporal details as multiple GPS points are grouped into a single grid cell or road segment. To tackle this problem, we propose the BLUrred Encoding method, dubbed BLUE, which gradually reduces the precision of GPS coordinates to create hierarchical patches with multiple levels. The low-level patches are small and preserve fine-grained spatial-temporal details, while the high-level patches are large and capture overall travel patterns. To complement different patch levels with each other, our BLUE is an encoder-decoder model with a pyramid structure. At each patch level, a Transformer is used to learn the trajectory embedding at the current level, while pooling prepares inputs for the higher level in the encoder, and up-resolution provides guidance for the lower level in the decoder. BLUE is trained using the trajectory reconstruction task with the MSE loss. We compare BLUE with 8 SOTA TRL methods for 3 downstream tasks, the results show that BLUE consistently achieves higher accuracy than all baselines, outperforming the best-performing baselines by an average of 30.90%. Our code is available at https://github.com/slzhou-xy/BLUE.
CVJul 20, 2023
Hybrid Feature Embedding For Automatic Building Outline ExtractionWeihang Ran, Wei Yuan, Xiaodan Shi et al.
Building outline extracted from high-resolution aerial images can be used in various application fields such as change detection and disaster assessment. However, traditional CNN model cannot recognize contours very precisely from original images. In this paper, we proposed a CNN and Transformer based model together with active contour model to deal with this problem. We also designed a triple-branch decoder structure to handle different features generated by encoder. Experiment results show that our model outperforms other baseline model on two datasets, achieving 91.1% mIoU on Vaihingen and 83.8% on Bing huts.
CVJul 9, 2023
Enhancing Building Semantic Segmentation Accuracy with Super Resolution and Deep Learning: Investigating the Impact of Spatial Resolution on Various DatasetsZhiling Guo, Xiaodan Shi, Haoran Zhang et al.
The development of remote sensing and deep learning techniques has enabled building semantic segmentation with high accuracy and efficiency. Despite their success in different tasks, the discussions on the impact of spatial resolution on deep learning based building semantic segmentation are quite inadequate, which makes choosing a higher cost-effective data source a big challenge. To address the issue mentioned above, in this study, we create remote sensing images among three study areas into multiple spatial resolutions by super-resolution and down-sampling. After that, two representative deep learning architectures: UNet and FPN, are selected for model training and testing. The experimental results obtained from three cities with two deep learning models indicate that the spatial resolution greatly influences building segmentation results, and with a better cost-effectiveness around 0.3m, which we believe will be an important insight for data selection and preparation.
AIJun 21, 2022
Route to Time and Time to Route: Travel Time Estimation from Sparse TrajectoriesZhiwen Zhang, Hongjun Wang, Zipei Fan et al.
Due to the rapid development of Internet of Things (IoT) technologies, many online web apps (e.g., Google Map and Uber) estimate the travel time of trajectory data collected by mobile devices. However, in reality, complex factors, such as network communication and energy constraints, make multiple trajectories collected at a low sampling rate. In this case, this paper aims to resolve the problem of travel time estimation (TTE) and route recovery in sparse scenarios, which often leads to the uncertain label of travel time and route between continuously sampled GPS points. We formulate this problem as an inexact supervision problem in which the training data has coarsely grained labels and jointly solve the tasks of TTE and route recovery. And we argue that both two tasks are complementary to each other in the model-learning procedure and hold such a relation: more precise travel time can lead to better inference for routes, in turn, resulting in a more accurate time estimation). Based on this assumption, we propose an EM algorithm to alternatively estimate the travel time of inferred route through weak supervision in E step and retrieve the route based on estimated travel time in M step for sparse trajectories. We conducted experiments on three real-world trajectory datasets and demonstrated the effectiveness of the proposed method.
LGFeb 24Code
TrajGPT-R: Generating Urban Mobility Trajectory with Reinforcement Learning-Enhanced Generative Pre-trained TransformerJiawei Wang, Chuang Yang, Jiawei Yong et al.
Mobility trajectories are essential for understanding urban dynamics and enhancing urban planning, yet access to such data is frequently hindered by privacy concerns. This research introduces a transformative framework for generating large-scale urban mobility trajectories, employing a novel application of a transformer-based model pre-trained and fine-tuned through a two-phase process. Initially, trajectory generation is conceptualized as an offline reinforcement learning (RL) problem, with a significant reduction in vocabulary space achieved during tokenization. The integration of Inverse Reinforcement Learning (IRL) allows for the capture of trajectory-wise reward signals, leveraging historical data to infer individual mobility preferences. Subsequently, the pre-trained model is fine-tuned using the constructed reward model, effectively addressing the challenges inherent in traditional RL-based autoregressive methods, such as long-term credit assignment and handling of sparse reward environments. Comprehensive evaluations on multiple datasets illustrate that our framework markedly surpasses existing models in terms of reliability and diversity. Our findings not only advance the field of urban mobility modeling but also provide a robust methodology for simulating urban data, with significant implications for traffic management and urban development planning. The implementation is publicly available at https://github.com/Wangjw6/TrajGPT_R.
CVJun 24, 2023
Real-World Video for Zoom Enhancement based on Spatio-Temporal CouplingZhiling Guo, Yinqiang Zheng, Haoran Zhang et al.
In recent years, single-frame image super-resolution (SR) has become more realistic by considering the zooming effect and using real-world short- and long-focus image pairs. In this paper, we further investigate the feasibility of applying realistic multi-frame clips to enhance zoom quality via spatio-temporal information coupling. Specifically, we first built a real-world video benchmark, VideoRAW, by a synchronized co-axis optical system. The dataset contains paired short-focus raw and long-focus sRGB videos of different dynamic scenes. Based on VideoRAW, we then presented a Spatio-Temporal Coupling Loss, termed as STCL. The proposed STCL is intended for better utilization of information from paired and adjacent frames to align and fuse features both temporally and spatially at the feature level. The outperformed experimental results obtained in different zoom scenarios demonstrate the superiority of integrating real-world video dataset and STCL into existing SR models for zoom quality enhancement, and reveal that the proposed method can serve as an advanced and viable tool for video zoom.
LGFeb 12
Manifold-Aware Temporal Domain Generalization for Large Language ModelsYiheng Yao, Zekun Cai, Xinyuan Song et al.
Temporal distribution shifts are pervasive in real-world deployments of Large Language Models (LLMs), where data evolves continuously over time. While Temporal Domain Generalization (TDG) seeks to model such structured evolution, existing approaches characterize model adaptation in the full parameter space. This formulation becomes computationally infeasible for modern LLMs. This paper introduces a geometric reformulation of TDG under parameter-efficient fine-tuning. We establish that the low-dimensional temporal structure underlying model evolution can be preserved under parameter-efficient reparameterization, enabling temporal modeling without operating in the ambient parameter space. Building on this principle, we propose Manifold-aware Temporal LoRA (MaT-LoRA), which constrains temporal updates to a shared low-dimensional manifold within a low-rank adaptation subspace, and models its evolution through a structured temporal core. This reparameterization dramatically reduces temporal modeling complexity while retaining expressive power. Extensive experiments on synthetic and real-world datasets, including scientific documents, news publishers, and review ratings, demonstrate that MaT-LoRA achieves superior temporal generalization performance with practical scalability for LLMs.
LGDec 14, 2021Code
Event-Aware Multimodal Mobility NowcastingZhaonan Wang, Renhe Jiang, Hao Xue et al.
As a decisive part in the success of Mobility-as-a-Service (MaaS), spatio-temporal predictive modeling for crowd movements is a challenging task particularly considering scenarios where societal events drive mobility behavior deviated from the normality. While tremendous progress has been made to model high-level spatio-temporal regularities with deep learning, most, if not all of the existing methods are neither aware of the dynamic interactions among multiple transport modes nor adaptive to unprecedented volatility brought by potential societal events. In this paper, we are therefore motivated to improve the canonical spatio-temporal network (ST-Net) from two perspectives: (1) design a heterogeneous mobility information network (HMIN) to explicitly represent intermodality in multimodal mobility; (2) propose a memory-augmented dynamic filter generator (MDFG) to generate sequence-specific parameters in an on-the-fly fashion for various scenarios. The enhanced event-aware spatio-temporal network, namely EAST-Net, is evaluated on several real-world datasets with a wide variety and coverage of societal events. Both quantitative and qualitative experimental results verify the superiority of our approach compared with the state-of-the-art baselines. Code and data are published on https://github.com/underdoc-wang/EAST-Net.
LGAug 20, 2021Code
DL-Traff: Survey and Benchmark of Deep Learning Models for Urban Traffic PredictionRenhe Jiang, Du Yin, Zhaonan Wang et al.
Nowadays, with the rapid development of IoT (Internet of Things) and CPS (Cyber-Physical Systems) technologies, big spatiotemporal data are being generated from mobile phones, car navigation systems, and traffic sensors. By leveraging state-of-the-art deep learning technologies on such data, urban traffic prediction has drawn a lot of attention in AI and Intelligent Transportation System community. The problem can be uniformly modeled with a 3D tensor (T, N, C), where T denotes the total time steps, N denotes the size of the spatial domain (i.e., mesh-grids or graph-nodes), and C denotes the channels of information. According to the specific modeling strategy, the state-of-the-art deep learning models can be divided into three categories: grid-based, graph-based, and multivariate time-series models. In this study, we first synthetically review the deep traffic models as well as the widely used datasets, then build a standard benchmark to comprehensively evaluate their performances with the same settings and metrics. Our study named DL-Traff is implemented with two most popular deep learning frameworks, i.e., TensorFlow and PyTorch, which is already publicly available as two GitHub repositories https://github.com/deepkashiwa20/DL-Traff-Grid and https://github.com/deepkashiwa20/DL-Traff-Graph. With DL-Traff, we hope to deliver a useful resource to researchers who are interested in spatiotemporal data analysis.
AIFeb 22, 2024
Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility GenerationJiawei Wang, Renhe Jiang, Chuang Yang et al.
This paper introduces a novel approach using Large Language Models (LLMs) integrated into an agent framework for flexible and effective personal mobility generation. LLMs overcome the limitations of previous models by effectively processing semantic data and offering versatility in modeling various tasks. Our approach addresses three research questions: aligning LLMs with real-world urban mobility data, developing reliable activity generation strategies, and exploring LLM applications in urban mobility. The key technical contribution is a novel LLM agent framework that accounts for individual activity patterns and motivations, including a self-consistency approach to align LLMs with real-world activity data and a retrieval-augmented strategy for interpretable activity generation. We evaluate our LLM agent framework and compare it with state-of-the-art personal mobility generation approaches, demonstrating the effectiveness of our approach and its potential applications in urban mobility. Overall, this study marks the pioneering work of designing an LLM agent framework for activity generation based on real-world human activity data, offering a promising tool for urban mobility analysis.
LGDec 3, 2024
CausalMob: Causal Human Mobility Prediction with LLMs-derived Human Intentions toward Public EventsXiaojie Yang, Hangli Ge, Jiawei Wang et al.
Large-scale human mobility exhibits spatial and temporal patterns that can assist policymakers in decision making. Although traditional prediction models attempt to capture these patterns, they often interfered by non-periodic public events, such as disasters and occasional celebrations. Since regular human mobility patterns are heavily affected by these events, estimating their causal effects is critical to accurate mobility predictions. Although news articles provide unique perspectives on these events in an unstructured format, processing is a challenge. In this study, we propose a causality-augmented prediction model, called CausalMob, to analyze the causal effects of public events. We first utilize large language models (LLMs) to extract human intentions from news articles and transform them into features that act as causal treatments. Next, the model learns representations of spatio-temporal regional covariates from multiple data sources to serve as confounders for causal inference. Finally, we present a causal effect estimation framework to ensure event features remain independent of confounders during prediction. Based on large-scale real-world data, the experimental results show that the proposed model excels in human mobility prediction, outperforming state-of-the-art models.
LGMar 23, 2025
Causality-Aware Next Location Prediction Framework based on Human Mobility StratificationXiaojie Yang, Zipei Fan, Hangli Ge et al.
Human mobility data are fused with multiple travel patterns and hidden spatiotemporal patterns are extracted by integrating user, location, and time information to improve next location prediction accuracy. In existing next location prediction methods, different causal relationships that result from patterns in human mobility data are ignored, which leads to confounding information that can have a negative effect on predictions. Therefore, this study introduces a causality-aware framework for next location prediction, focusing on human mobility stratification for travel patterns. In our research, a novel causal graph is developed that describes the relationships between various input variables. We use counterfactuals to enhance the indirect effects in our causal graph for specific travel patterns: non-anchor targeted travels. The proposed framework is designed as a plug-and-play module that integrates multiple next location prediction paradigms. We tested our proposed framework using several state-of-the-art models and human mobility datasets, and the results reveal that the proposed module improves the prediction performance. In addition, we provide results from the ablation study and quantitative study to demonstrate the soundness of our causal graph and its ability to further enhance the interpretability of the current next location prediction models.
MLMay 17, 2025
Continuous Domain GeneralizationZekun Cai, Yiheng Yao, Guangji Bai et al.
Real-world data distributions often shift continuously across multiple latent factors such as time, geography, and socioeconomic contexts. However, existing domain generalization approaches typically treat domains as discrete or as evolving along a single axis (e.g., time). This oversimplification fails to capture the complex, multidimensional nature of real-world variation. This paper introduces the task of Continuous Domain Generalization (CDG), which aims to generalize predictive models to unseen domains defined by arbitrary combinations of continuous variations. We present a principled framework grounded in geometric and algebraic theories, showing that optimal model parameters across domains lie on a low-dimensional manifold. To model this structure, we propose a Neural Lie Transport Operator (NeuralLio), which enables structure-preserving parameter transitions by enforcing geometric continuity and algebraic consistency. To handle noisy or incomplete domain variation descriptors, we introduce a gating mechanism to suppress irrelevant dimensions and a local chart-based strategy for robust generalization. Extensive experiments on synthetic and real-world datasets, including remote sensing, scientific documents, and traffic forecasting, demonstrate that our method significantly outperforms existing baselines in both generalization accuracy and robustness.
LGNov 21, 2021
Differentiable Projection for Constrained Deep LearningDou Huang, Haoran Zhang, Xuan Song et al.
Deep neural networks (DNNs) have achieved extraordinary performance in solving different tasks in various fields. However, the conventional DNN model is steadily approaching the ground-truth value through loss backpropagation. In some applications, some prior knowledge could be easily obtained, such as constraints which the ground truth observation follows. Here, we try to give a general approach to incorporate information from these constraints to enhance the performance of the DNNs. Theoretically, we could formulate these kinds of problems as constrained optimization problems that KKT conditions could solve. In this paper, we propose to use a differentiable projection layer in DNN instead of directly solving time-consuming KKT conditions. The proposed projection method is differentiable, and no heavy computation is required. Finally, we also conducted some experiments using a randomly generated synthetic dataset and image segmentation task using the PASCAL VOC dataset to evaluate the performance of the proposed projection method. Experimental results show that the projection method is sufficient and outperforms baseline methods.
LGSep 17, 2021
An open GPS trajectory dataset and benchmark for travel mode detectionJinyu Chen, Haoran Zhang, Xuan Song et al.
Travel mode detection has been a hot topic in the field of GPS trajectory-related processing. Former scholars have developed many mathematical methods to improve the accuracy of detection. Among these studies, almost all of the methods require ground truth dataset for training. A large amount of the studies choose to collect the GPS trajectory dataset for training by their customized ways. Currently, there is no open GPS dataset marked with travel mode. If there exists one, it will not only save a lot of efforts in model developing, but also help compare the performance of models. In this study, we propose and open GPS trajectory dataset marked with travel mode and benchmark for the travel mode detection. The dataset is collected by 7 independent volunteers in Japan and covers the time period of a complete month. The travel mode ranges from walking to railway. A part of routines are traveled repeatedly in different time slots to experience different road and travel conditions. We also provide a case study to distinguish the walking and bike trips in a massive GPS trajectory dataset.
CVJul 6, 2021
Adapting Vehicle Detector to Target Domain by Adversarial Prediction AlignmentYohei Koga, Hiroyuki Miyazaki, Ryosuke Shibasaki
While recent advancement of domain adaptation techniques is significant, most of methods only align a feature extractor and do not adapt a classifier to target domain, which would be a cause of performance degradation. We propose novel domain adaptation technique for object detection that aligns prediction output space. In addition to feature alignment, we aligned predictions of locations and class confidences of our vehicle detector for satellite images by adversarial training. The proposed method significantly improved AP score by over 5%, which shows effectivity of our method for object detection tasks in satellite images.
CVNov 25, 2020
Deep-learning coupled with novel classification method to classify the urban environment of the developing worldQianwei Cheng, AKM Mahbubur Rahman, Anis Sarker et al.
Rapid globalization and the interdependence of humanity that engender tremendous in-flow of human migration towards the urban spaces. With advent of high definition satellite images, high resolution data, computational methods such as deep neural network, capable hardware; urban planning is seeing a paradigm shift. Legacy data on urban environments are now being complemented with high-volume, high-frequency data. In this paper we propose a novel classification method that is readily usable for machine analysis and show applicability of the methodology on a developing world setting. The state-of-the-art is mostly dominated by classification of building structures, building types etc. and largely represents the developed world which are insufficient for developing countries such as Bangladesh where the surrounding is crucial for the classification. Moreover, the traditional methods propose small-scale classifications, which give limited information with poor scalability and are slow to compute. We categorize the urban area in terms of informal and formal spaces taking the surroundings into account. 50 km x 50 km Google Earth image of Dhaka, Bangladesh was visually annotated and categorized by an expert. The classification is based broadly on two dimensions: urbanization and the architectural form of urban environment. Consequently, the urban space is divided into four classes: 1) highly informal; 2) moderately informal; 3) moderately formal; and 4) highly formal areas. In total 16 sub-classes were identified. For semantic segmentation, Google's DeeplabV3+ model was used which increases the field of view of the filters to incorporate larger context. Image encompassing 70% of the urban space was used for training and the remaining 30% was used for testing and validation. The model is able to segment with 75% accuracy and 60% Mean IoU.
HCJul 7, 2020
EpiMob: Interactive Visual Analytics of Citywide Human Mobility Restrictions for Epidemic ControlChuang Yang, Zhiwen Zhang, Zipei Fan et al.
The outbreak of coronavirus disease (COVID-19) has swept across more than 180 countries and territories since late January 2020. As a worldwide emergency response, governments have implemented various measures and policies, such as self-quarantine, travel restrictions, work from home, and regional lockdown, to control the spread of the epidemic. These countermeasures seek to restrict human mobility because COVID-19 is a highly contagious disease that is spread by human-to-human transmission. Medical experts and policymakers have expressed the urgency to effectively evaluate the outcome of human restriction policies with the aid of big data and information technology. Thus, based on big human mobility data and city POI data, an interactive visual analytics system called Epidemic Mobility (EpiMob) was designed in this study. The system interactively simulates the changes in human mobility and infection status in response to the implementation of a certain restriction policy or a combination of policies (e.g., regional lockdown, telecommuting, screening). Users can conveniently designate the spatial and temporal ranges for different mobility restriction policies. Then, the results reflecting the infection situation under different policies are dynamically displayed and can be flexibly compared and analyzed in depth. Multiple case studies consisting of interviews with domain experts were conducted in the largest metropolitan area of Japan (i.e., Greater Tokyo Area) to demonstrate that the system can provide insight into the effects of different human mobility restriction policies for epidemic control, through measurements and comparisons.
LGNov 16, 2019
VLUC: An Empirical Benchmark for Video-Like Urban Computing on Citywide Crowd and Traffic PredictionRenhe Jiang, Zekun Cai, Zhaonan Wang et al.
Nowadays, massive urban human mobility data are being generated from mobile phones, car navigation systems, and traffic sensors. Predicting the density and flow of the crowd or traffic at a citywide level becomes possible by using the big data and cutting-edge AI technologies. It has been a very significant research topic with high social impact, which can be widely applied to emergency management, traffic regulation, and urban planning. In particular, by meshing a large urban area to a number of fine-grained mesh-grids, citywide crowd and traffic information in a continuous time period can be represented like a video, where each timestamp can be seen as one video frame. Based on this idea, a series of methods have been proposed to address video-like prediction for citywide crowd and traffic. In this study, we publish a new aggregated human mobility dataset generated from a real-world smartphone application and build a standard benchmark for such kind of video-like urban computing with this new dataset and the existing open datasets. We first comprehensively review the state-of-the-art works of literature and formulate the density and in-out flow prediction problem, then conduct a thorough performance assessment for those methods. With this benchmark, we hope researchers can easily follow up and quickly launch a new solution on this topic.
LGSep 28, 2018
Semantic Segmentation for Urban Planning Maps based on U-NetZhiling Guo, Hiroaki Shengoku, Guangming Wu et al.
The automatic digitizing of paper maps is a significant and challenging task for both academia and industry. As an important procedure of map digitizing, the semantic segmentation section mainly relies on manual visual interpretation with low efficiency. In this study, we select urban planning maps as a representative sample and investigate the feasibility of utilizing U-shape fully convolutional based architecture to perform end-to-end map semantic segmentation. The experimental results obtained from the test area in Shibuya district, Tokyo, demonstrate that our proposed method could achieve a very high Jaccard similarity coefficient of 93.63% and an overall accuracy of 99.36%. For implementation on GPGPU and cuDNN, the required processing time for the whole Shibuya district can be less than three minutes. The results indicate the proposed method can serve as a viable tool for urban planning map semantic segmentation task with high accuracy and efficiency.
CVAug 13, 2017
Visual Graph MiningQuanshi Zhang, Xuan Song, Ryosuke Shibasaki
In this study, we formulate the concept of "mining maximal-size frequent subgraphs" in the challenging domain of visual data (images and videos). In general, visual knowledge can usually be modeled as attributed relational graphs (ARGs) with local attributes representing local parts and pairwise attributes describing the spatial relationship between parts. Thus, from a practical perspective, such mining of maximal-size subgraphs can be regarded as a general platform for discovering and modeling the common objects within cluttered and unlabeled visual data. Then, from a theoretical perspective, visual graph mining should encode and overcome the great fuzziness of messy data collected from complex real-world situations, which conflicts with the conventional theoretical basis of graph mining designed for tabular data. Common subgraphs hidden in these ARGs usually have soft attributes, with considerable inter-graph variation. More importantly, we should also discover the latent pattern space, including similarity metrics for the pattern and hidden node relations, during the mining process. In this study, we redefine the visual subgraph pattern that encodes all of these challenges in a general way, and propose an approximate but efficient solution to graph mining. We conduct five experiments to evaluate our method with different kinds of visual data, including videos and RGB/RGB-D images. These experiments demonstrate the generality of the proposed method.