CV · Apr 18, 2022
Real-World Deep Local Motion Deblurring
Haoying Li, Ziran Zhang, Tingting Jiang et al.
Most existing deblurring methods focus on removing global blur caused by camera shake, and handle local blur caused by object movement poorly. To fill the gap in local deblurring for real scenes, we establish the first real local motion blur dataset (ReLoBlur), which is captured by a synchronized beam-splitting photographing system and corrected by a post-processing pipeline. Based on ReLoBlur, we propose a Local Blur-Aware Gated network (LBAG) and several local blur-aware techniques to bridge the gap between global and local deblurring: 1) a blur detection approach based on background subtraction to localize blurred regions; 2) a gate mechanism to guide our network to focus on blurred regions; and 3) a blur-aware patch cropping strategy to address the data imbalance problem. Extensive experiments demonstrate the reliability of the ReLoBlur dataset and show that LBAG achieves better performance than state-of-the-art global deblurring methods that lack our proposed local blur-aware techniques.
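The abstract does not spell out how the blur-aware patch cropping works, so the following is only a minimal sketch of one plausible reading: given a binary blur mask (such as one produced by a blur detector), training crops are drawn from blur-containing positions with a chosen probability, so that blurred regions are not swamped by sharp background. The names `blurred_fraction`, `candidate_positions`, `sample_patch`, and the parameters `blur_prob` and `min_fraction` are hypothetical and do not come from the paper.

```python
import random


def blurred_fraction(mask, top, left, size):
    """Fraction of pixels flagged as blurred inside a size x size crop."""
    total = sum(
        mask[r][c]
        for r in range(top, top + size)
        for c in range(left, left + size)
    )
    return total / (size * size)


def candidate_positions(mask, size, min_fraction):
    """Split all valid crop origins into blur-containing and sharp lists."""
    height, width = len(mask), len(mask[0])
    blurred, sharp = [], []
    for top in range(height - size + 1):
        for left in range(width - size + 1):
            if blurred_fraction(mask, top, left, size) >= min_fraction:
                blurred.append((top, left))
            else:
                sharp.append((top, left))
    return blurred, sharp


def sample_patch(mask, size, rng, blur_prob=0.5, min_fraction=0.1):
    """Pick a crop origin, favouring blurred crops with probability blur_prob.

    Assumption: blur_prob and min_fraction are illustrative knobs, not
    values or parameters taken from the paper.
    """
    blurred, sharp = candidate_positions(mask, size, min_fraction)
    if blurred and (not sharp or rng.random() < blur_prob):
        return rng.choice(blurred)
    return rng.choice(sharp)


if __name__ == "__main__":
    # Toy 8x8 mask with a 2x2 blurred region in the top-left corner.
    toy_mask = [[0] * 8 for _ in range(8)]
    for r in (0, 1):
        for c in (0, 1):
            toy_mask[r][c] = 1
    print(sample_patch(toy_mask, 4, random.Random(0), blur_prob=0.8))
```

Enumerating every crop origin is only practical for small images; a real pipeline would more likely sample around detected blurred regions directly.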
LG · Apr 20, 2022
An unsupervised approach for semantic place annotation of trajectories based on the prior probability
Junyi Cheng, Xianfeng Zhang, Peng Luo et al.
Semantic place annotation can provide individual semantics, which can be of great help in the field of trajectory data mining. Most existing methods rely on annotated or external data and require retraining following a change of region, which prevents their large-scale application. Herein, we propose an unsupervised method, denoted UPAPP, for the semantic place annotation of trajectories using spatiotemporal information. The Bayesian Criterion is employed to decompose the spatiotemporal probability of the candidate place into spatial probability, duration probability, and visiting time probability. Spatial information in ROI and POI data is then used to calculate the spatial probability. For the temporal probabilities, the Term Frequency-Inverse Document Frequency (TF-IDF) weighting algorithm is used to count potential visits to different place types in the trajectories and to generate the prior probabilities of visiting time and duration. The spatiotemporal probability of the candidate place is then combined with the importance of the place category to annotate the visited places. Validation with a trajectory dataset collected by 709 volunteers in Beijing showed that our method achieved an overall and average accuracy of 0.712 and 0.720, respectively, indicating that the visited places can be annotated accurately without any external data.
LG · Mar 28, 2024 · Code
Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers
Pingcheng Dong, Yonghao Tan, Dong Zhang et al.
Non-linear functions are prevalent in Transformers and their lightweight variants, incurring substantial and frequently underestimated hardware costs. Previous state-of-the-art works optimize these operations by piece-wise linear approximation and store the parameters in look-up tables (LUT), but most of them require unfriendly high-precision arithmetic such as FP/INT 32 and lack consideration of integer-only INT quantization. This paper proposes a genetic LUT-Approximation algorithm, GQA-LUT, that can automatically determine the parameters with quantization awareness. The results demonstrate that GQA-LUT achieves negligible degradation on the challenging semantic segmentation task for both vanilla and linear Transformer models. In addition, GQA-LUT enables INT8-based LUT-Approximation that achieves area savings of 81.3-81.7% and a power reduction of 79.3-80.2% compared to the high-precision FP/INT 32 alternatives. Code is available at https://github.com/PingchengDong/GQA-LUT.
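To make the piece-wise linear LUT idea concrete, here is a minimal floating-point sketch that approximates GELU with uniformly spaced segments whose slopes and intercepts are stored in a table. It illustrates only the baseline LUT approximation the abstract refers to. It is not GQA-LUT: the genetic search for breakpoints and the INT8 quantization awareness are not reproduced, and the function names, the choice of GELU, the range [-4, 4], and the 16 segments are all assumptions made for illustration.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU via the Gaussian error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def build_pwl_lut(fn, lo, hi, segments):
    """Build a table of (slope, intercept) pairs on uniform segments of [lo, hi].

    Each segment interpolates fn linearly between its two endpoints.
    """
    step = (hi - lo) / segments
    slopes, intercepts = [], []
    for i in range(segments):
        left = lo + i * step
        right = left + step
        k = (fn(right) - fn(left)) / step
        slopes.append(k)
        intercepts.append(fn(left) - k * left)
    return {"lo": lo, "step": step, "slopes": slopes, "intercepts": intercepts}


def pwl_eval(x, lut):
    """Evaluate the approximation: one table lookup, one multiply, one add.

    Inputs outside [lo, hi] reuse the first or last segment (linear
    extrapolation), which is an illustrative choice.
    """
    n = len(lut["slopes"])
    idx = int((x - lut["lo"]) / lut["step"])
    idx = max(0, min(idx, n - 1))
    return lut["slopes"][idx] * x + lut["intercepts"][idx]


if __name__ == "__main__":
    table = build_pwl_lut(gelu, -4.0, 4.0, 16)
    for value in (-2.0, -0.3, 0.7, 3.1):
        print(value, gelu(value), pwl_eval(value, table))
```

With uniform breakpoints the error concentrates where the function curves most, which is one reason a search over breakpoint positions can help.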
AR · Apr 10, 2025 · Code
APSQ: Additive Partial Sum Quantization with Algorithm-Hardware Co-Design
Yonghao Tan, Pingcheng Dong, Yongkun Wu et al.
DNN accelerators have advanced considerably through model compression and specialized dataflow techniques. However, the frequent access of high-precision partial sums (PSUMs) leads to excessive memory demands in architectures utilizing input/weight stationary dataflows. Traditional compression strategies have typically overlooked PSUM quantization, which may account for 69% of power consumption. This study introduces a novel Additive Partial Sum Quantization (APSQ) method, seamlessly integrating PSUM accumulation into the quantization framework. A grouping strategy that combines APSQ with PSUM quantization enhanced by a reconfigurable architecture is further proposed. APSQ is nearly lossless on NLP and CV tasks across BERT, Segformer, and EfficientViT models while compressing PSUMs to INT8, which reduces energy costs by 28-87%. Extended experiments on LLaMA2-7B demonstrate the potential of APSQ for large language models. Code is available at https://github.com/Yonghao-Tan/APSQ.
91.3 · AR · May 10
31.1 A 14.08-to-135.69Token/s ReRAM-on-Logic Stacked Outlier-Free Large-Language-Model Accelerator with Block-Clustered Weight-Compression and Adaptive Parallel-Speculative-Decoding
Pingcheng Dong, Yonghao Tan, Xuejiao Liu et al.
This work presents a 55nm speculative decoding-based LLM accelerator with bumping-based face-to-face ReRAM-on-logic stacking technology. It features a local rotation unit for outlier-free low-bit quantization, a stacking-aware PNM architecture co-designed with blockwise vector quantization to reduce weight EMA overheads, and an adaptive parallel speculative decoding scheme with an out-of-order scheduler for high resource and bandwidth utilization. Our chip achieves 14.08-to-135.69 token/s and a 4.46-to-7.17x speedup over vanilla speculative decoding.
37.5 · LG · Apr 7
Weighted Bayesian Conformal Prediction
Xiayin Lou, Peng Luo
Conformal prediction provides distribution-free prediction intervals with finite-sample coverage guarantees, and recent work by Snell & Griffiths reframes it as Bayesian Quadrature (BQ-CP), yielding powerful data-conditional guarantees via Dirichlet posteriors over thresholds. However, BQ-CP fundamentally requires the i.i.d. assumption, a limitation the authors themselves identify. Meanwhile, weighted conformal prediction handles distribution shift via importance weights but remains frequentist, producing only point-estimate thresholds. We propose Weighted Bayesian Conformal Prediction (WBCP), which generalizes BQ-CP to arbitrary importance-weighted settings by replacing the uniform Dirichlet $\mathrm{Dir}(1,\ldots,1)$ with a weighted Dirichlet $\mathrm{Dir}(n_{\mathrm{eff}} \cdot \tilde{w}_1, \ldots, n_{\mathrm{eff}} \cdot \tilde{w}_n)$, where $n_{\mathrm{eff}}$ is Kish's effective sample size. We prove four theoretical results: (1) $n_{\mathrm{eff}}$ is the unique concentration parameter matching frequentist and Bayesian variances; (2) posterior standard deviation decays as $O(1/\sqrt{n_{\mathrm{eff}}})$; (3) BQ-CP's stochastic dominance guarantee extends to per-weight-profile data-conditional guarantees; (4) the HPD threshold provides $O(1/\sqrt{n_{\mathrm{eff}}})$ improvement in conditional coverage. We instantiate WBCP for spatial prediction as Geographical BQ-CP, where kernel-based spatial weights yield per-location posteriors with interpretable diagnostics. Experiments on synthetic and real-world spatial datasets demonstrate that WBCP maintains coverage guarantees while providing substantially richer uncertainty information.
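The weighted Dirichlet described above can be sketched in a few lines. The example computes Kish's effective sample size, n_eff = (sum of w)^2 / (sum of w^2), scales the normalized weights by it to obtain the concentration parameters, and draws one sample from the resulting Dirichlet. With uniform weights the parameters reduce to Dir(1, ..., 1), matching the BQ-CP special case named in the abstract. The function names are hypothetical, and how the paper turns such draws into thresholds is not reproduced here.

```python
import random


def kish_effective_sample_size(weights):
    """Kish's n_eff = (sum w)^2 / sum(w^2) for non-negative weights."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def weighted_dirichlet_params(weights):
    """Concentration parameters n_eff * w_tilde_i with normalized weights."""
    total = sum(weights)
    n_eff = kish_effective_sample_size(weights)
    return [n_eff * w / total for w in weights]


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws.

    Requires every alpha to be strictly positive.
    """
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    norm = sum(draws)
    return [d / norm for d in draws]


if __name__ == "__main__":
    importance_weights = [3.0, 1.0, 1.0, 0.5]  # illustrative values only
    alphas = weighted_dirichlet_params(importance_weights)
    print(kish_effective_sample_size(importance_weights))
    print(alphas)
    print(sample_dirichlet(alphas, random.Random(0)))
```

Because the parameters sum to n_eff rather than n, highly uneven weights give a more diffuse posterior, which is consistent with the stated $O(1/\sqrt{n_{\mathrm{eff}}})$ decay of the posterior standard deviation.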
CE · Feb 22, 2025
Interpreting core forms of urban morphology linked to urban functions with explainable graph neural network
Dongsheng Chen, Yu Feng, Xun Li et al.
Understanding the high-order relationship between urban form and function is essential for modeling the underlying mechanisms of sustainable urban systems. Nevertheless, it is challenging to establish an accurate data representation for complex urban forms that is readily explicable in human terms. This study proposes the concept of core urban morphology representation and develops an explainable deep learning framework, which we call CoMo, for explicably symbolizing complex urban forms into the novel representation. By interpreting the well-trained deep learning model, which has a stable weighted F1-score of 89.14%, CoMo presents a promising approach for revealing links between urban function and urban form in terms of core urban morphology representation. Using Boston as a study area, we analyzed the core urban forms at the individual-building, block, and neighborhood level that are important to corresponding urban functions. The residential core forms follow a gradual morphological pattern along the urban spine, which is consistent with a center-urban-suburban transition. Furthermore, we show that urban morphology directly affects land use efficiency, which has a strong, significant correlation with location (R^2 = 0.721, p < 0.001). Overall, CoMo can explicably symbolize urban forms, provide evidence for the classic urban location theory, and offer mechanistic insights for digital twins.
ML · Dec 5, 2024
GeoConformal prediction: a model-agnostic framework of measuring the uncertainty of spatial prediction
Xiayin Lou, Peng Luo, Liqiu Meng
Spatial prediction is a fundamental task in geography. In recent years, with advances in geospatial artificial intelligence (GeoAI), numerous models have been developed to improve the accuracy of geographic variable predictions. Beyond achieving higher accuracy, it is equally important to obtain predictions with uncertainty measures to enhance model credibility and support responsible spatial prediction. Although geostatistical methods like Kriging offer some level of uncertainty assessment, such as Kriging variance, these measurements are not always accurate and lack general applicability to other spatial models. To address this issue, we propose a model-agnostic uncertainty assessment method called GeoConformal Prediction, which incorporates geographical weighting into conformal prediction. We applied it to two classic spatial prediction cases, spatial regression and spatial interpolation, to evaluate its reliability. First, in the spatial regression case, we used XGBoost to predict housing prices, followed by GeoConformal to calculate uncertainty. Our results show that GeoConformal achieved a coverage rate of 93.67%, while Bootstrap methods only reached a maximum coverage of 81.00% after 2000 runs. Next, we applied GeoConformal to spatial interpolation models. We found that the uncertainty obtained from GeoConformal aligned closely with the variance in Kriging. Finally, using GeoConformal, we analyzed the sources of uncertainty in spatial prediction. We found that explicitly including local features in AI models can significantly reduce prediction uncertainty, especially in areas with strong local dependence. Our findings suggest that GeoConformal holds potential not only for geographic knowledge discovery but also for guiding the design of future GeoAI models, paving the way for more reliable and interpretable spatial prediction frameworks.
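As a rough illustration of what incorporating geographical weighting into conformal prediction can look like, the sketch below weights calibration residuals by a Gaussian kernel of their distance to the target location and uses the weighted (1 - alpha) quantile of the absolute residuals as the interval half-width. This is a simplified reading, not the authors' implementation: the Gaussian kernel, the `bandwidth` parameter, the function names, and the omission of the extra point mass that some weighted conformal formulations place on the test point are all assumptions.

```python
import math


def gaussian_kernel_weights(target, calib_coords, bandwidth):
    """Distance-decay weights between a target location and calibration points."""
    return [
        math.exp(-0.5 * (math.dist(target, coord) / bandwidth) ** 2)
        for coord in calib_coords
    ]


def weighted_quantile(scores, weights, q):
    """Smallest score whose cumulative normalized weight reaches q."""
    pairs = sorted(zip(scores, weights))
    total = sum(weights)
    cumulative = 0.0
    for score, weight in pairs:
        cumulative += weight
        if cumulative / total >= q:
            return score
    return pairs[-1][0]


def geo_conformal_interval(pred, target, calib_coords, calib_residuals,
                           bandwidth, alpha=0.1):
    """Prediction interval whose half-width adapts to nearby calibration errors."""
    weights = gaussian_kernel_weights(target, calib_coords, bandwidth)
    scores = [abs(r) for r in calib_residuals]
    half_width = weighted_quantile(scores, weights, 1.0 - alpha)
    return pred - half_width, pred + half_width


if __name__ == "__main__":
    coords = [(0.0, 0.0), (0.0, 0.1), (10.0, 10.0), (10.0, 10.1)]
    residuals = [0.5, -0.5, 5.0, -5.0]  # toy calibration errors
    print(geo_conformal_interval(10.0, (0.0, 0.0), coords, residuals, bandwidth=1.0))
```

A small bandwidth lets nearby calibration points dominate, so the interval is narrow where local errors are small; a very large bandwidth approaches the ordinary, unweighted split conformal quantile.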
SE · Apr 29, 2025
CoCo-Bench: A Comprehensive Code Benchmark For Multi-task Large Language Model Evaluation
Wenjing Yin, Tianze Sun, Yijiong Yu et al.
Large language models (LLMs) play a crucial role in software engineering, excelling in tasks like code generation and maintenance. However, existing benchmarks are often narrow in scope: they focus on a specific task and lack a comprehensive evaluation framework that reflects real-world applications. To address these gaps, we introduce CoCo-Bench (Comprehensive Code Benchmark), designed to evaluate LLMs across four critical dimensions: code understanding, code generation, code modification, and code review. These dimensions capture essential developer needs, ensuring a more systematic and representative evaluation. CoCo-Bench includes multiple programming languages and varying task difficulties, with rigorous manual review to ensure data quality and accuracy. Empirical results show that CoCo-Bench aligns with existing benchmarks while uncovering significant variations in model performance, effectively highlighting strengths and weaknesses. By offering a holistic and objective evaluation, CoCo-Bench provides valuable insights to guide future research and technological advancements in code-oriented LLMs, establishing a reliable benchmark for the field.
AI · Sep 25, 2025
GeoEvolve: Automating Geospatial Model Discovery via Multi-Agent Large Language Models
Peng Luo, Xiayin Lou, Yu Zheng et al.
Geospatial modeling provides critical solutions for pressing global challenges such as sustainability and climate change. Existing large language model (LLM)-based algorithm discovery frameworks, such as AlphaEvolve, excel at evolving generic code but lack the domain knowledge and multi-step reasoning required for complex geospatial problems. We introduce GeoEvolve, a multi-agent LLM framework that couples evolutionary search with geospatial domain knowledge to automatically design and refine geospatial algorithms. GeoEvolve operates in two nested loops: an inner loop leverages a code evolver to generate and mutate candidate solutions, while an outer agentic controller evaluates global elites and queries a GeoKnowRAG module, a structured geospatial knowledge base that injects theoretical priors from geography. This knowledge-guided evolution steers the search toward theoretically meaningful and computationally efficient algorithms. We evaluate GeoEvolve on two fundamental and classical tasks: spatial interpolation (kriging) and spatial uncertainty quantification (geospatial conformal prediction). Across these benchmarks, GeoEvolve automatically improves and discovers new algorithms, incorporating geospatial theory on top of classical models. It reduces spatial interpolation error (RMSE) by 13-21% and enhances uncertainty estimation performance by 17%. Ablation studies confirm that domain-guided retrieval is essential for stable, high-quality evolution. These results demonstrate that GeoEvolve provides a scalable path toward automated, knowledge-driven geospatial modeling, opening new opportunities for trustworthy and efficient AI-for-Science discovery.
CL · May 20, 2025
Mechanistic Fine-tuning for In-context Learning
Hakaze Cho, Peng Luo, Mariko Kato et al.
In-context Learning (ICL) utilizes structured demonstration-query inputs to induce few-shot learning on Language Models (LMs), which are not originally pre-trained on ICL-style data. To bridge the gap between ICL and pre-training, some approaches fine-tune LMs on large ICL-style datasets in an end-to-end paradigm with massive computational costs. To reduce such costs, in this paper, we propose Attention Behavior Fine-Tuning (ABFT), which draws on previous findings about the inner mechanism of ICL and builds training objectives on the attention scores instead of the final outputs, forcing the attention scores to focus on the correct label tokens presented in the context and suppressing attention to the wrong label tokens. Our experiments on 9 modern LMs and 8 datasets empirically find that ABFT outperforms previous methods in performance, robustness, unbiasedness, and efficiency, at only around 0.01% of their data cost. Moreover, our subsequent analysis finds that the end-to-end training objective contains the ABFT objective, suggesting the implicit bias of ICL-style data to the emergence of induction heads. Our work demonstrates the possibility of controlling specific module sequences within LMs to improve their behavior, opening up the future application of mechanistic interpretability.
SOC-PH · Jan 31, 2024
Uncover the nature of overlapping community in cities
Peng Luo, Di Zhu
Urban spaces, though often perceived as discrete communities, are shared by various functional and social groups. Our study introduces a graph-based physics-aware deep learning framework, illuminating the intricate overlapping nature inherent in urban communities. Through analysis of individual mobile phone positioning data in the Twin Cities metro area (TCMA) in Minnesota, USA, our findings reveal that 95.7% of urban functional complexity stems from the overlapping structure of communities during weekdays. Significantly, our research not only quantifies these overlaps but also reveals their compelling correlations with income and racial indicators, unraveling the complex segregation patterns in U.S. cities. As the first to elucidate the overlapping nature of urban communities, this work offers a unique geospatial perspective on urban structures, highlighting the nuanced interplay of socioeconomic dynamics within cities.
CR · Jun 18, 2019
Recent Advances of Image Steganography with Generative Adversarial Networks
Jia Liu, Yan Ke, Yu Lei et al.
In the past few years, the Generative Adversarial Network (GAN), proposed in 2014, has achieved great success, producing many research results in computer vision and natural language processing. Image steganography is dedicated to hiding secret messages in digital images for the purpose of covert communication. Recently, research on image steganography has demonstrated great potential for using GANs and neural networks. In this paper, we review different strategies for steganography, such as cover modification, cover selection, and cover synthesis by GANs, discuss the characteristics of these methods and their evaluation metrics, and suggest possible future research directions in image steganography.