AIApr 8, 2023
Generative AI Meets Future Cities: Towards an Era of Autonomous Urban IntelligenceDongjie Wang, Chang-Tien Lu, Xinyue Ye et al.
The two fields of urban planning and artificial intelligence (AI) arose and developed separately. However, there is now cross-pollination and increasing interest in both fields to benefit from the advances of the other. In the present paper, we introduce the importance of urban planning from the sustainability, living, economic, disaster, and environmental perspectives. We review the fundamental concepts of urban planning and relate these concepts to crucial open problems of machine learning, including adversarial learning, generative neural networks, deep encoder-decoder networks, conversational AI, and geospatial and temporal machine learning, thereby assaying how AI can contribute to modern urban planning. Thus, a central problem is automated land-use configuration, which is formulated as the generation of land uses and building configuration for a target area from surrounding geospatial, human mobility, social media, environment, and economic activities. Finally, we delineate some implications of AI for urban planning and propose key research areas at the intersection of both topics.
10.5ROJun 2
RSC: Decentralized Rigid Formation Flocking for Large-Scale Swarms via Hybrid Predictive Control and Online ReconfigurationGanyu Zou, Linhan Wang, Chen Dai et al.
Decentralized rigid formation flocking requires a swarm of autonomous agents to maintain a predetermined geometric configuration while moving, relying solely on local sensing and communication. However, existing decentralized control methods struggle to maintain strict inter-agent distance constraints in cluttered environments, often suffering from local minima deadlocks, high frequency control oscillations, or limited flexibility during obstacle navigation, resulting in low success rate. To address these limitations, we propose Rigid Swarm Control (RSC), a decentralized control framework for large-scale rigid formation flocking. To escape local minima via robust long-term planning while ensuring short-term safety, RSC integrates finite-horizon trajectory predictions with a reactive artificial potential field (APF) safety controller within a hybrid architecture. Furthermore, to accelerate formation reassembly after obstacle traversal without interrupting task execution, RSC introduces an online leader-follower reconfiguration mechanism based on stable role exchange. Extensive evaluations in challenging cluttered environments with 25 UAVs demonstrate that RSC reliably unifies rigid formation maintenance, obstacle avoidance, and target tracking. Under strict success criteria - collision-free operation with a maximum relative edge-length error below 10%, RSC achieves an 83% success rate, significantly outperforming existing heuristic and learning-based baselines that fall below 5%.
CLNov 15, 2023Code
Uncertainty Estimation on Sequential Labeling via Uncertainty TransmissionJianfeng He, Linlin Yu, Shuo Lei et al.
Sequential labeling is a task predicting labels for each token in a sequence, such as Named Entity Recognition (NER). NER tasks aim to extract entities and predict their labels given a text, which is important in information extraction. Although previous works have shown great progress in improving NER performance, uncertainty estimation on NER (UE-NER) is still underexplored but essential. This work focuses on UE-NER, which aims to estimate uncertainty scores for the NER predictions. Previous uncertainty estimation models often overlook two unique characteristics of NER: the connection between entities (i.e., one entity embedding is learned based on the other ones) and wrong span cases in the entity extraction subtask. Therefore, we propose a Sequential Labeling Posterior Network (SLPN) to estimate uncertainty scores for the extracted entities, considering uncertainty transmitted from other tokens. Moreover, we have defined an evaluation strategy to address the specificity of wrong-span cases. Our SLPN has achieved significant improvements on three datasets, such as a 5.54-point improvement in AUPR on the MIT-Restaurant dataset. Our code is available at \url{https://github.com/he159ok/UncSeqLabeling_SLPN}.
CVSep 11, 2023Code
Self-Correlation and Cross-Correlation Learning for Few-Shot Remote Sensing Image Semantic SegmentationLinhan Wang, Shuo Lei, Jianfeng He et al.
Remote sensing image semantic segmentation is an important problem for remote sensing image interpretation. Although remarkable progress has been achieved, existing deep neural network methods suffer from the reliance on massive training data. Few-shot remote sensing semantic segmentation aims at learning to segment target objects from a query image using only a few annotated support images of the target class. Most existing few-shot learning methods stem primarily from their sole focus on extracting information from support images, thereby failing to effectively address the large variance in appearance and scales of geographic objects. To tackle these challenges, we propose a Self-Correlation and Cross-Correlation Learning Network for the few-shot remote sensing image semantic segmentation. Our model enhances the generalization by considering both self-correlation and cross-correlation between support and query images to make segmentation predictions. To further explore the self-correlation with the query image, we propose to adopt a classical spectral method to produce a class-agnostic segmentation mask based on the basic visual information of the image. Extensive experiments on two remote sensing image datasets demonstrate the effectiveness and superiority of our model in few-shot remote sensing image semantic segmentation. Code and models will be accessed at https://github.com/linhanwang/SCCNet.
CLJun 3, 2023
TART: Improved Few-shot Text Classification Using Task-Adaptive Reference TransformationShuo Lei, Xuchao Zhang, Jianfeng He et al.
Meta-learning has emerged as a trending technique to tackle few-shot text classification and achieve state-of-the-art performance. However, the performance of existing approaches heavily depends on the inter-class variance of the support set. As a result, it can perform well on tasks when the semantics of sampled classes are distinct while failing to differentiate classes with similar semantics. In this paper, we propose a novel Task-Adaptive Reference Transformation (TART) network, aiming to enhance the generalization by transforming the class prototypes to per-class fixed reference points in task-adaptive metric spaces. To further maximize divergence between transformed prototypes in task-adaptive metric spaces, TART introduces a discriminative reference regularization among transformed prototypes. Extensive experiments are conducted on four benchmark datasets and our method demonstrates clear superiority over the state-of-the-art models in all the datasets. In particular, our model surpasses the state-of-the-art method by 7.4% and 5.4% in 1-shot and 5-shot classification on the 20 Newsgroups dataset, respectively.
OCAug 4, 2022
Memetic algorithms for Spatial Partitioning problemsSubhodip Biswas, Fanglan Chen, Zhiqian Chen et al.
Spatial optimization problems (SOPs) are characterized by spatial relationships governing the decision variables, objectives, and/or constraint functions. In this article, we focus on a specific type of SOP called spatial partitioning, which is a combinatorial problem due to the presence of discrete spatial units. Exact optimization methods do not scale with the size of the problem, especially within practicable time limits. This motivated us to develop population-based metaheuristics for solving such SOPs. However, the search operators employed by these population-based methods are mostly designed for real-parameter continuous optimization problems. For adapting these methods to SOPs, we apply domain knowledge in designing spatially-aware search operators for efficiently searching through the discrete search space while preserving the spatial constraints. To this end, we put forward a simple yet effective algorithm called swarm-based spatial memetic algorithm (SPATIAL) and test it on the school (re)districting problem. Detailed experimental investigations are performed on real-world datasets to evaluate the performance of SPATIAL. Besides, ablation studies are performed to understand the role of the individual components of SPATIAL. Additionally, we discuss how SPATIAL~is helpful in the real-life planning process and its applicability to different scenarios and motivate future research directions.
CLOct 24, 2022
LANS: Large-scale Arabic News Summarization CorpusAbdulaziz Alhamadani, Xuchao Zhang, Jianfeng He et al.
Text summarization has been intensively studied in many languages, and some languages have reached advanced stages. Yet, Arabic Text Summarization (ATS) is still in its developing stages. Existing ATS datasets are either small or lack diversity. We build, LANS, a large-scale and diverse dataset for Arabic Text Summarization task. LANS offers 8.4 million articles and their summaries extracted from newspapers websites metadata between 1999 and 2019. The high-quality and diverse summaries are written by journalists from 22 major Arab newspapers, and include an eclectic mix of at least more than 7 topics from each source. We conduct an intrinsic evaluation on LANS by both automatic and human evaluations. Human evaluation of 1000 random samples reports 95.4% accuracy for our collected summaries, and automatic evaluation quantifies the diversity and abstractness of the summaries. The dataset is publicly available upon request.
AIJun 8, 2022
Sampling-based techniques for designing school boundariesSubhodip Biswas, Fanglan Chen, Zhiqian Chen et al.
Recently, an increasing number of researchers, especially in the realm of political redistricting, have proposed sampling-based techniques to generate a subset of plans from the vast space of districting plans. These techniques have been increasingly adopted by U.S. courts of law and independent commissions as a tool for identifying partisan gerrymanders. Motivated by these recent developments, we develop a set of similar sampling techniques for designing school boundaries based on the flip proposal. Note that the flip proposal here refers to the change in the districting plan by a single assignment. These sampling-based techniques serve a dual purpose. They can be used as a baseline for comparing redistricting algorithms based on local search. Additionally, these techniques can help to infer the problem characteristics that may be further used for developing efficient redistricting methods. We empirically touch on both these aspects in regards to the problem of school redistricting.
DBOct 18, 2023
A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, ChallengeLe Ma, Ran Zhang, Yikun Han et al.
Vector databases (VDBs) have emerged to manage high-dimensional data that exceed the capabilities of traditional database management systems, and are now tightly integrated with large language models as well as widely applied in modern artificial intelligence systems. Although relatively few studies describe existing or introduce new vector database architectures, the core technologies underlying VDBs, such as approximate nearest neighbor search, have been extensively studied and are well documented in the literature. In this work, we present a comprehensive review of the relevant algorithms to provide a general understanding of this booming research area. Specifically, we first provide a review of storage and retrieval techniques in VDBs, with detailed design principles and technological evolution. Then, we conduct an in-depth comparison of several advanced VDB solutions with their strengths, limitations, and typical application scenarios. Finally, we also outline emerging opportunities for coupling VDBs with large language models, including open research problems and trends, such as novel indexing strategies. This survey aims to serve as a practical resource, enabling readers to quickly gain an overall understanding of the current knowledge landscape in this rapidly developing area.
LGOct 28, 2023
ALERTA-Net: A Temporal Distance-Aware Recurrent Networks for Stock Movement and Volatility PredictionShengkun Wang, YangXiao Bai, Kaiqun Fu et al.
For both investors and policymakers, forecasting the stock market is essential as it serves as an indicator of economic well-being. To this end, we harness the power of social media data, a rich source of public sentiment, to enhance the accuracy of stock market predictions. Diverging from conventional methods, we pioneer an approach that integrates sentiment analysis, macroeconomic indicators, search engine data, and historical prices within a multi-attention deep learning model, masterfully decoding the complex patterns inherent in the data. We showcase the state-of-the-art performance of our proposed model using a dataset, specifically curated by us, for predicting stock market movements and volatility.
STAug 25, 2024
StockTime: A Time Series Specialized Large Language Model Architecture for Stock Price PredictionShengkun Wang, Taoran Ji, Linhan Wang et al.
The stock price prediction task holds a significant role in the financial domain and has been studied for a long time. Recently, large language models (LLMs) have brought new ways to improve these predictions. While recent financial large language models (FinLLMs) have shown considerable progress in financial NLP tasks compared to smaller pre-trained language models (PLMs), challenges persist in stock price forecasting. Firstly, effectively integrating the modalities of time series data and natural language to fully leverage these capabilities remains complex. Secondly, FinLLMs focus more on analysis and interpretability, which can overlook the essential features of time series data. Moreover, due to the abundance of false and redundant information in financial markets, models often produce less accurate predictions when faced with such input data. In this paper, we introduce StockTime, a novel LLM-based architecture designed specifically for stock price data. Unlike recent FinLLMs, StockTime is specifically designed for stock price time series data. It leverages the natural ability of LLMs to predict the next token by treating stock prices as consecutive tokens, extracting textual information such as stock correlations, statistical trends and timestamps directly from these stock prices. StockTime then integrates both textual and time series data into the embedding space. By fusing this multimodal data, StockTime effectively predicts stock prices across arbitrary look-back periods. Our experiments demonstrate that StockTime outperforms recent LLMs, as it gives more accurate predictions while reducing memory usage and runtime costs.
LGMar 21, 2023
DG-Trans: Dual-level Graph Transformer for Spatiotemporal Incident Impact Prediction on Traffic NetworksYanshen Sun, Kaiqun Fu, Chang-Tien Lu
The prompt estimation of traffic incident impacts can guide commuters in their trip planning and improve the resilience of transportation agencies' decision-making on resilience. However, it is more challenging than node-level and graph-level forecasting tasks, as it requires extracting the anomaly subgraph or sub-time-series from dynamic graphs. In this paper, we propose DG-Trans, a novel traffic incident impact prediction framework, to foresee the impact of traffic incidents through dynamic graph learning. The proposed framework contains a dual-level spatial transformer and an importance-score-based temporal transformer, and the performance of this framework is justified by two newly constructed benchmark datasets. The dual-level spatial transformer removes unnecessary edges between nodes to isolate the affected subgraph from the other nodes. Meanwhile, the importance-score-based temporal transformer identifies abnormal changes in node features, causing the predictions to rely more on measurement changes after the incident occurs. Therefore, DG-Trans is equipped with dual abilities that extract spatiotemporal dependency and identify anomaly nodes affected by incidents while removing noise introduced by benign nodes. Extensive experiments on real-world datasets verify that DG-Trans outperforms the existing state-of-the-art methods, especially in extracting spatiotemporal dependency patterns and predicting traffic accident impacts. It offers promising potential for traffic incident management systems.
LGJul 3, 2024
AMA-LSTM: Pioneering Robust and Fair Financial Audio Analysis for Stock Volatility PredictionShengkun Wang, Taoran Ji, Jianfeng He et al.
Stock volatility prediction is an important task in the financial industry. Recent advancements in multimodal methodologies, which integrate both textual and auditory data, have demonstrated significant improvements in this domain, such as earnings calls (Earnings calls are public available and often involve the management team of a public company and interested parties to discuss the company's earnings). However, these multimodal methods have faced two drawbacks. First, they often fail to yield reliable models and overfit the data due to their absorption of stochastic information from the stock market. Moreover, using multimodal models to predict stock volatility suffers from gender bias and lacks an efficient way to eliminate such bias. To address these aforementioned problems, we use adversarial training to generate perturbations that simulate the inherent stochasticity and bias, by creating areas resistant to random information around the input space to improve model robustness and fairness. Our comprehensive experiments on two real-world financial audio datasets reveal that this method exceeds the performance of current state-of-the-art solution. This confirms the value of adversarial training in reducing stochasticity and bias for stock volatility prediction tasks.
ROSep 29, 2023
Learning Decentralized Flocking Controllers with Spatio-Temporal Graph Neural NetworkSiji Chen, Yanshen Sun, Peihan Li et al.
Recently a line of researches has delved the use of graph neural networks (GNNs) for decentralized control in swarm robotics. However, it has been observed that relying solely on the states of immediate neighbors is insufficient to imitate a centralized control policy. To address this limitation, prior studies proposed incorporating $L$-hop delayed states into the computation. While this approach shows promise, it can lead to a lack of consensus among distant flock members and the formation of small clusters, consequently resulting in the failure of cohesive flocking behaviors. Instead, our approach leverages spatiotemporal GNN, named STGNN that encompasses both spatial and temporal expansions. The spatial expansion collects delayed states from distant neighbors, while the temporal expansion incorporates previous states from immediate neighbors. The broader and more comprehensive information gathered from both expansions results in more effective and accurate predictions. We develop an expert algorithm for controlling a swarm of robots and employ imitation learning to train our decentralized STGNN model based on the expert algorithm. We simulate the proposed STGNN approach in various settings, demonstrating its decentralized capacity to emulate the global expert algorithm. Further, we implemented our approach to achieve cohesive flocking, leader following and obstacle avoidance by a group of Crazyflie drones. The performance of STGNN underscores its potential as an effective and reliable approach for achieving cohesive flocking, leader following and obstacle avoidance tasks.
CVJan 29
Drive-JEPA: Video JEPA Meets Multimodal Trajectory Distillation for End-to-End DrivingLinhan Wang, Zichong Yang, Chen Bai et al.
End-to-end autonomous driving increasingly leverages self-supervised video pretraining to learn transferable planning representations. However, pretraining video world models for scene understanding has so far brought only limited improvements. This limitation is compounded by the inherent ambiguity of driving: each scene typically provides only a single human trajectory, making it difficult to learn multimodal behaviors. In this work, we propose Drive-JEPA, a framework that integrates Video Joint-Embedding Predictive Architecture (V-JEPA) with multimodal trajectory distillation for end-to-end driving. First, we adapt V-JEPA for end-to-end driving, pretraining a ViT encoder on large-scale driving videos to produce predictive representations aligned with trajectory planning. Second, we introduce a proposal-centric planner that distills diverse simulator-generated trajectories alongside human trajectories, with a momentum-aware selection mechanism to promote stable and safe behavior. When evaluated on NAVSIM, the V-JEPA representation combined with a simple transformer-based decoder outperforms prior methods by 3 PDMS in the perception-free setting. The complete Drive-JEPA framework achieves 93.3 PDMS on v1 and 87.8 EPDMS on v2, setting a new state-of-the-art.
AIMay 26, 2024
Network Interdiction Goes NeuralLei Zhang, Zhiqian Chen, Chang-Tien Lu et al.
Network interdiction problems are combinatorial optimization problems involving two players: one aims to solve an optimization problem on a network, while the other seeks to modify the network to thwart the first player's objectives. Such problems typically emerge in an attacker-defender context, encompassing areas such as military operations, disease spread analysis, and communication network management. The primary bottleneck in network interdiction arises from the high time complexity of using conventional exact solvers and the challenges associated with devising efficient heuristic solvers. GNNs, recognized as a cutting-edge methodology, have shown significant effectiveness in addressing single-level CO problems on graphs, such as the traveling salesman problem, graph matching, and graph edit distance. Nevertheless, network interdiction presents a bi-level optimization challenge, which current GNNs find difficult to manage. To address this gap, we represent network interdiction problems as Mixed-Integer Linear Programming (MILP) instances, then apply a multipartite GNN with sufficient representational capacity to learn these formulations. This approach ensures that our neural network is more compatible with the mathematical algorithms designed to solve network interdiction problems, resulting in improved generalization. Through two distinct tasks, we demonstrate that our proposed method outperforms theoretical baseline models and provides advantages over traditional exact solvers.
LGOct 31, 2025
MVeLMA: Multimodal Vegetation Loss Modeling Architecture for Predicting Post-fire Vegetation LossMeenu Ravi, Shailik Sarkar, Yanshen Sun et al.
Understanding post-wildfire vegetation loss is critical for developing effective ecological recovery strategies and is often challenging due to the extended time and effort required to capture the evolving ecosystem features. Recent works in this area have not fully explored all the contributing factors, their modalities, and interactions with each other. Furthermore, most research in this domain is limited by a lack of interpretability in predictive modeling, making it less useful in real-world settings. In this work, we propose a novel end-to-end ML pipeline called MVeLMA (\textbf{M}ultimodal \textbf{Ve}getation \textbf{L}oss \textbf{M}odeling \textbf{A}rchitecture) to predict county-wise vegetation loss from fire events. MVeLMA uses a multimodal feature integration pipeline and a stacked ensemble-based architecture to capture different modalities while also incorporating uncertainty estimation through probabilistic modeling. Through comprehensive experiments, we show that our model outperforms several state-of-the-art (SOTA) and baseline models in predicting post-wildfire vegetation loss. Furthermore, we generate vegetation loss confidence maps to identify high-risk counties, thereby helping targeted recovery efforts. The findings of this work have the potential to inform future disaster relief planning, ecological policy development, and wildlife recovery management.
CLJun 25, 2024Code
Can We Trust the Performance Evaluation of Uncertainty Estimation Methods in Text Summarization?Jianfeng He, Runing Yang, Linlin Yu et al.
Text summarization, a key natural language generation (NLG) task, is vital in various domains. However, the high cost of inaccurate summaries in risk-critical applications, particularly those involving human-in-the-loop decision-making, raises concerns about the reliability of uncertainty estimation on text summarization (UE-TS) evaluation methods. This concern stems from the dependency of uncertainty model metrics on diverse and potentially conflicting NLG metrics. To address this issue, we introduce a comprehensive UE-TS benchmark incorporating 31 NLG metrics across four dimensions. The benchmark evaluates the uncertainty estimation capabilities of two large language models and one pre-trained language model on three datasets, with human-annotation analysis incorporated where applicable. We also assess the performance of 14 common uncertainty estimation methods within this benchmark. Our findings emphasize the importance of considering multiple uncorrelated NLG metrics and diverse uncertainty estimation methods to ensure reliable and efficient evaluation of UE-TS techniques. Our code and data are available https://github.com/he159ok/Benchmark-of-Uncertainty-Estimation-Methods-in-Text-Summarization.
LGDec 19, 2024Code
Downscaling Precipitation with Bias-informed Conditional Diffusion ModelRan Lyu, Linhan Wang, Yanshen Sun et al.
Climate change is intensifying rainfall extremes, making high-resolution precipitation projections crucial for society to better prepare for impacts such as flooding. However, current Global Climate Models (GCMs) operate at spatial resolutions too coarse for localized analyses. To address this limitation, deep learning-based statistical downscaling methods offer promising solutions, providing high-resolution precipitation projections with a moderate computational cost. In this work, we introduce a bias-informed conditional diffusion model for statistical downscaling of precipitation. Specifically, our model leverages a conditional diffusion approach to learn distribution priors from large-scale, high-resolution precipitation datasets. The long-tail distribution of precipitation poses a unique challenge for training diffusion models; to address this, we apply gamma correction during preprocessing. Additionally, to correct biases in the downscaled results, we employ a guided-sampling strategy to enhance bias correction. Our experiments demonstrate that the proposed model achieves highly accurate results in an 8 times downscaling setting, outperforming previous deterministic methods. The code and dataset are available at https://github.com/RoseLV/research_super-resolution
LGMay 15, 2024
A Comprehensive Survey on Data AugmentationZaitian Wang, Pengfei Wang, Kunpeng Liu et al.
Data augmentation is a series of techniques that generate high-quality artificial data by manipulating existing data samples. By leveraging data augmentation techniques, AI models can achieve significantly improved applicability in tasks involving scarce or imbalanced datasets, thereby substantially enhancing AI models' generalization capabilities. Existing literature surveys only focus on a certain type of specific modality data and categorize these methods from modality-specific and operation-centric perspectives, which lacks a consistent summary of data augmentation methods across multiple modalities and limits the comprehension of how existing data samples serve the data augmentation process. To bridge this gap, this survey proposes a more enlightening taxonomy that encompasses data augmentation techniques for different common data modalities by investigating how to take advantage of the intrinsic relationship between and within instances. Additionally, it categorizes data augmentation methods across five data modalities through a unified inductive approach.
CLMar 27, 2024
Exploring the Deceptive Power of LLM-Generated Fake News: A Study of Real-World Detection ChallengesYanshen Sun, Jianfeng He, Limeng Cui et al.
Recent advancements in Large Language Models (LLMs) have enabled the creation of fake news, particularly in complex fields like healthcare. Studies highlight the gap in the deceptive power of LLM-generated fake news with and without human assistance, yet the potential of prompting techniques has not been fully explored. Thus, this work aims to determine whether prompting strategies can effectively narrow this gap. Current LLM-based fake news attacks require human intervention for information gathering and often miss details and fail to maintain context consistency. Therefore, to better understand threat tactics, we propose a strong fake news attack method called conditional Variational-autoencoder-Like Prompt (VLPrompt). Unlike current methods, VLPrompt eliminates the need for additional data collection while maintaining contextual coherence and preserving the intricacies of the original text. To propel future research on detecting VLPrompt attacks, we created a new dataset named VLPrompt fake news (VLPFN) containing real and fake texts. Our experiments, including various detection methods and novel human study metrics, were conducted to assess their performance on our dataset, yielding numerous findings.
CRNov 3, 2025
Scam Shield: Multi-Model Voting and Fine-Tuned LLMs Against Adversarial AttacksChen-Wei Chang, Shailik Sarkar, Hossein Salemi et al.
Scam detection remains a critical challenge in cybersecurity as adversaries craft messages that evade automated filters. We propose a Hierarchical Scam Detection System (HSDS) that combines a lightweight multi-model voting front end with a fine-tuned LLaMA 3.1 8B Instruct back end to improve accuracy and robustness against adversarial attacks. An ensemble of four classifiers provides preliminary predictions through majority vote, and ambiguous cases are escalated to the fine-tuned model, which is optimized with adversarial training to reduce misclassification. Experiments show that this hierarchical design both improves adversarial scam detection and shortens inference time by routing most cases away from the LLM, outperforming traditional machine-learning baselines and proprietary LLM baselines. The findings highlight the effectiveness of a hybrid voting mechanism and adversarial fine-tuning in fortifying LLMs against evolving scam tactics, enhancing the resilience of automated scam detection systems.
CLFeb 18, 2024
Don't Go To Extremes: Revealing the Excessive Sensitivity and Calibration Limitations of LLMs in Implicit Hate Speech DetectionMin Zhang, Jianfeng He, Taoran Ji et al.
The fairness and trustworthiness of Large Language Models (LLMs) are receiving increasing attention. Implicit hate speech, which employs indirect language to convey hateful intentions, occupies a significant portion of practice. However, the extent to which LLMs effectively address this issue remains insufficiently examined. This paper delves into the capability of LLMs to detect implicit hate speech (Classification Task) and express confidence in their responses (Calibration Task). Our evaluation meticulously considers various prompt patterns and mainstream uncertainty estimation methods. Our findings highlight that LLMs exhibit two extremes: (1) LLMs display excessive sensitivity towards groups or topics that may cause fairness issues, resulting in misclassifying benign statements as hate speech. (2) LLMs' confidence scores for each method excessively concentrate on a fixed range, remaining unchanged regardless of the dataset's complexity. Consequently, the calibration performance is heavily reliant on primary classification accuracy. These discoveries unveil new limitations of LLMs, underscoring the need for caution when optimizing models to ensure they do not veer towards extremes. This serves as a reminder to carefully consider sensitivity and confidence in the pursuit of model fairness.
CLDec 12, 2023
Can LLM find the green circle? Investigation and Human-guided tool manipulation for compositional generalizationMin Zhang, Jianfeng He, Shuo Lei et al.
The meaning of complex phrases in natural language is composed of their individual components. The task of compositional generalization evaluates a model's ability to understand new combinations of components. Previous studies trained smaller, task-specific models, which exhibited poor generalization. While large language models (LLMs) exhibit impressive generalization abilities on many tasks through in-context learning (ICL), their potential for compositional generalization remains unexplored. In this paper, we first empirically investigate prevailing ICL methods in compositional generalization. We find that they struggle with complex compositional questions due to cumulative errors in long reasoning steps and intricate logic required for tool-making. Consequently, we propose a human-guided tool manipulation framework (HTM) that generates tools for sub-questions and integrates multiple tools. Our method enhances the effectiveness of tool creation and usage with minimal human effort. Experiments show that our method achieves state-of-the-art performance on two compositional generalization benchmarks and outperforms existing methods on the most challenging test split by 70%.
AIDec 4, 2023
Stock Movement and Volatility Prediction from Tweets, Macroeconomic Factors and Historical PricesShengkun Wang, YangXiao Bai, Taoran Ji et al.
Predicting stock market is vital for investors and policymakers, acting as a barometer of the economic health. We leverage social media data, a potent source of public sentiment, in tandem with macroeconomic indicators as government-compiled statistics, to refine stock market predictions. However, prior research using tweet data for stock market prediction faces three challenges. First, the quality of tweets varies widely. While many are filled with noise and irrelevant details, only a few genuinely mirror the actual market scenario. Second, solely focusing on the historical data of a particular stock without considering its sector can lead to oversight. Stocks within the same industry often exhibit correlated price behaviors. Lastly, simply forecasting the direction of price movement without assessing its magnitude is of limited value, as the extent of the rise or fall truly determines profitability. In this paper, diverging from the conventional methods, we pioneer an ECON. The framework has following advantages: First, ECON has an adept tweets filter that efficiently extracts and decodes the vast array of tweet data. Second, ECON discerns multi-level relationships among stocks, sectors, and macroeconomic factors through a self-aware mechanism in semantic space. Third, ECON offers enhanced accuracy in predicting substantial stock price fluctuations by capitalizing on stock price movement. We showcase the state-of-the-art performance of our proposed model using a dataset, specifically curated by us, for predicting stock market movements and volatility.
CRDec 1, 2024
Exposing LLM Vulnerabilities: Adversarial Scam Detection and PerformanceChen-Wei Chang, Shailik Sarkar, Shutonu Mitra et al.
Can we trust Large Language Models (LLMs) to accurately predict scam? This paper investigates the vulnerabilities of LLMs when facing adversarial scam messages for the task of scam detection. We addressed this issue by creating a comprehensive dataset with fine-grained labels of scam messages, including both original and adversarial scam messages. The dataset extended traditional binary classes for the scam detection task into more nuanced scam types. Our analysis showed how adversarial examples took advantage of vulnerabilities of a LLM, leading to high misclassification rate. We evaluated the performance of LLMs on these adversarial scam messages and proposed strategies to improve their robustness.
AIOct 26, 2024
Rethinking the Uncertainty: A Critical Review and Analysis in the Era of Large Language ModelsMohammad Beigi, Sijia Wang, Ying Shen et al.
In recent years, Large Language Models (LLMs) have become fundamental to a broad spectrum of artificial intelligence applications. As the use of LLMs expands, precisely estimating the uncertainty in their predictions has become crucial. Current methods often struggle to accurately identify, measure, and address the true uncertainty, with many focusing primarily on estimating model confidence. This discrepancy is largely due to an incomplete understanding of where, when, and how uncertainties are injected into models. This paper introduces a comprehensive framework specifically designed to identify and understand the types and sources of uncertainty, aligned with the unique characteristics of LLMs. Our framework enhances the understanding of the diverse landscape of uncertainties by systematically categorizing and defining each type, establishing a solid foundation for developing targeted methods that can precisely quantify these uncertainties. We also provide a detailed introduction to key related concepts and examine the limitations of current methods in mission-critical and safety-sensitive applications. The paper concludes with a perspective on future directions aimed at enhancing the reliability and practical adoption of these methods in real-world scenarios.
CYOct 29, 2025
RailEstate: An Interactive System for Metro Linked Property TrendsChen-Wei Chang, Yu-Chieh Cheng, Yun-En Tsai et al.
Access to metro systems plays a critical role in shaping urban housing markets by enhancing neighborhood accessibility and driving property demand. We present RailEstate, a novel web based system that integrates spatial analytics, natural language interfaces, and interactive forecasting to analyze how proximity to metro stations influences residential property prices in the Washington metropolitan area. Unlike static mapping tools or generic listing platforms, RailEstate combines 25 years of historical housing data with transit infrastructure to support low latency geospatial queries, time series visualizations, and predictive modeling. Users can interactively explore ZIP code level price patterns, investigate long term trends, and forecast future housing values around any metro station. A key innovation is our natural language chatbot, which translates plain-English questions e.g., What is the highest price in Falls Church in the year 2000? into executable SQL over a spatial database. This unified and interactive platform empowers urban planners, investors, and residents to derive actionable insights from metro linked housing data without requiring technical expertise.
CVOct 25, 2025
SemiETPicker: Fast and Label-Efficient Particle Picking for CryoET Tomography Using Semi-Supervised LearningLinhan Wang, Jianwen Dou, Wang Li et al.
Cryogenic Electron Tomography (CryoET) combined with sub-volume averaging (SVA) is the only imaging modality capable of resolving protein structures inside cells at molecular resolution. Particle picking, the task of localizing and classifying target proteins in 3D CryoET volumes, remains the main bottleneck. Due to the reliance on time-consuming manual labels, the vast reserve of unlabeled tomograms remains underutilized. In this work, we present a fast, label-efficient semi-supervised framework that exploits this untapped data. Our framework consists of two components: (i) an end-to-end heatmap-supervised detection model inspired by keypoint detection, and (ii) a teacher-student co-training mechanism that enhances performance under sparse labeling conditions. Furthermore, we introduce multi-view pseudo-labeling and a CryoET-specific DropBlock augmentation strategy to further boost performance. Extensive evaluations on the large-scale CZII dataset show that our approach improves F1 by 10% over supervised baselines, underscoring the promise of semi-supervised learning for leveraging unlabeled CryoET data.
AIJun 1, 2025
HouseTS: A Large-Scale, Multimodal Spatiotemporal U.S. Housing DatasetShengkun Wang, Yanshen Sun, Fanglan Chen et al.
Accurate house-price forecasting is essential for investors, planners, and researchers. However, reproducible benchmarks with sufficient spatiotemporal depth and contextual richness for long horizon prediction remain scarce. To address this, we introduce HouseTS a large scale, multimodal dataset covering monthly house prices from March 2012 to December 2023 across 6,000 ZIP codes in 30 major U.S. metropolitan areas. The dataset includes over 890K records, enriched with points of Interest (POI), socioeconomic indicators, and detailed real estate metrics. To establish standardized performance baselines, we evaluate 14 models, spanning classical statistical approaches, deep neural networks (DNNs), and pretrained time-series foundation models. We further demonstrate the value of HouseTS in a multimodal case study, where a vision language model extracts structured textual descriptions of geographic change from time stamped satellite imagery. This enables interpretable, grounded insights into urban evolution. HouseTS is hosted on Kaggle, while all preprocessing pipelines, benchmark code, and documentation are openly maintained on GitHub to ensure full reproducibility and easy adoption.
AIApr 1, 2025
Collaborative LLM Numerical Reasoning with Local Data ProtectionMin Zhang, Yuzhe Lu, Yun Zhou et al.
Numerical reasoning over documents, which demands both contextual understanding and logical inference, is challenging for low-capacity local models deployed on computation-constrained devices. Although such complex reasoning queries could be routed to powerful remote models like GPT-4, exposing local data raises significant data leakage concerns. Existing mitigation methods generate problem descriptions or examples for remote assistance. However, the inherent complexity of numerical reasoning hinders the local model from generating logically equivalent queries and accurately inferring answers with remote guidance. In this paper, we present a model collaboration framework with two key innovations: (1) a context-aware synthesis strategy that shifts the query topics while preserving reasoning patterns; and (2) a tool-based answer reconstruction approach that reuses the remote-generated plug-and-play solution with code snippets. Experimental results demonstrate that our method achieves better reasoning accuracy than solely using local models while providing stronger data protection than fully relying on remote models. Furthermore, our method improves accuracy by 16.2% - 43.6% while reducing data leakage by 2.3% - 44.6% compared to existing data protection approaches.
LGFeb 21, 2025
Optimizing Product Provenance Verification using Data Valuation MethodsRaquib Bin Yousuf, Hoang Anh Just, Shengzhe Xu et al.
Determining and verifying product provenance remains a critical challenge in global supply chains, particularly as geopolitical conflicts and shifting borders create new incentives for misrepresentation of commodities, such as hiding the origin of illegally harvested timber or agriculture grown on illegally cleared land. Stable Isotope Ratio Analysis (SIRA), combined with Gaussian process regression-based isoscapes, has emerged as a powerful tool for geographic origin verification. However, the effectiveness of these models is often constrained by data scarcity and suboptimal dataset selection. In this work, we introduce a novel data valuation framework designed to enhance the selection and utilization of training data for machine learning models applied in SIRA. By prioritizing high-informative samples, our approach improves model robustness and predictive accuracy across diverse datasets and geographies. We validate our methodology with extensive experiments, demonstrating its potential to significantly enhance provenance verification, mitigate fraudulent trade practices, and strengthen regulatory enforcement of global supply chains.
LGFeb 19, 2025
Chasing the Timber Trail: Machine Learning to Reveal Harvest Location MisrepresentationShailik Sarkar, Raquib Bin Yousuf, Linhan Wang et al.
Illegal logging poses a significant threat to global biodiversity, climate stability, and depresses international prices for legal wood harvesting and responsible forest products trade, affecting livelihoods and communities across the globe. Stable isotope ratio analysis (SIRA) is rapidly becoming an important tool for determining the harvest location of traded, organic, products. The spatial pattern in stable isotope ratio values depends on factors such as atmospheric and environmental conditions and can thus be used for geographic origin identification. We present here the results of a deployed machine learning pipeline where we leverage both isotope values and atmospheric variables to determine timber harvest location. Additionally, the pipeline incorporates uncertainty estimation to facilitate the interpretation of harvest location determination for analysts. We present our experiments on a collection of oak (Quercus spp.) tree samples from its global range. Our pipeline outperforms comparable state-of-the-art models determining geographic harvest origin of commercially traded wood products, and has been used by European enforcement agencies to identify harvest location misrepresentation. We also identify opportunities for further advancement of our framework and how it can be generalized to help identify the origin of falsely labeled organic products throughout the supply chain.
CLJun 17, 2024
InternalInspector $I^2$: Robust Confidence Estimation in LLMs through Internal StatesMohammad Beigi, Ying Shen, Runing Yang et al.
Despite their vast capabilities, Large Language Models (LLMs) often struggle with generating reliable outputs, frequently producing high-confidence inaccuracies known as hallucinations. Addressing this challenge, our research introduces InternalInspector, a novel framework designed to enhance confidence estimation in LLMs by leveraging contrastive learning on internal states including attention states, feed-forward states, and activation states of all layers. Unlike existing methods that primarily focus on the final activation state, InternalInspector conducts a comprehensive analysis across all internal states of every layer to accurately identify both correct and incorrect prediction processes. By benchmarking InternalInspector against existing confidence estimation methods across various natural language understanding and generation tasks, including factual question answering, commonsense reasoning, and reading comprehension, InternalInspector achieves significantly higher accuracy in aligning the estimated confidence scores with the correctness of the LLM's predictions and lower calibration error. Furthermore, InternalInspector excels at HaluEval, a hallucination detection benchmark, outperforming other internal-based confidence estimation methods in this task.
RODec 26, 2021
Automated Urban Planning for Reimagining City Configuration via Adversarial Learning: Quantification, Generation, and EvaluationDongjie Wang, Yanjie Fu, Kunpeng Liu et al.
Urban planning refers to the efforts of designing land-use configurations given a region. However, to obtain effective urban plans, urban experts have to spend much time and effort analyzing sophisticated planning constraints based on domain knowledge and personal experiences. To alleviate the heavy burden of them and produce consistent urban plans, we want to ask that can AI accelerate the urban planning process, so that human planners only adjust generated configurations for specific needs? The recent advance of deep generative models provides a possible answer, which inspires us to automate urban planning from an adversarial learning perspective. However, three major challenges arise: 1) how to define a quantitative land-use configuration? 2) how to automate configuration planning? 3) how to evaluate the quality of a generated configuration? In this paper, we systematically address the three challenges. Specifically, 1) We define a land-use configuration as a longitude-latitude-channel tensor. 2) We formulate the automated urban planning problem into a task of deep generative learning. The objective is to generate a configuration tensor given the surrounding contexts of a target region. 3) We provide quantitative evaluation metrics and conduct extensive experiments to demonstrate the effectiveness of our framework.
LGNov 9, 2021
Deep diffusion-based forecasting of COVID-19 by incorporating network-level mobility informationPadmaksha Roy, Shailik Sarkar, Subhodip Biswas et al.
Modeling the spatiotemporal nature of the spread of infectious diseases can provide useful intuition in understanding the time-varying aspect of the disease spread and the underlying complex spatial dependency observed in people's mobility patterns. Besides, the county level multiple related time series information can be leveraged to make a forecast on an individual time series. Adding to this challenge is the fact that real-time data often deviates from the unimodal Gaussian distribution assumption and may show some complex mixed patterns. Motivated by this, we develop a deep learning-based time-series model for probabilistic forecasting called Auto-regressive Mixed Density Dynamic Diffusion Network(ARM3Dnet), which considers both people's mobility and disease spread as a diffusion process on a dynamic directed graph. The Gaussian Mixture Model layer is implemented to consider the multimodal nature of the real-time data while learning from multiple related time series. We show that our model, when trained with the best combination of dynamic covariate features and mixture components, can outperform both traditional statistical and deep learning models in forecasting the number of Covid-19 deaths and cases at the county level in the United States.
LGSep 22, 2021
Automated Feature-Topic Pairing: Aligning Semantic and Embedding Spaces in Spatial Representation LearningDongjie Wang, Kunpeng Liu, David Mohaisen et al.
Automated characterization of spatial data is a kind of critical geographical intelligence. As an emerging technique for characterization, Spatial Representation Learning (SRL) uses deep neural networks (DNNs) to learn non-linear embedded features of spatial data for characterization. However, SRL extracts features by internal layers of DNNs, and thus suffers from lacking semantic labels. Texts of spatial entities, on the other hand, provide semantic understanding of latent feature labels, but is insensible to deep SRL models. How can we teach a SRL model to discover appropriate topic labels in texts and pair learned features with the labels? This paper formulates a new problem: feature-topic pairing, and proposes a novel Particle Swarm Optimization (PSO) based deep learning framework. Specifically, we formulate the feature-topic pairing problem into an automated alignment task between 1) a latent embedding feature space and 2) a textual semantic topic space. We decompose the alignment of the two spaces into: 1) point-wise alignment, denoting the correlation between a topic distribution and an embedding vector; 2) pair-wise alignment, denoting the consistency between a feature-feature similarity matrix and a topic-topic similarity matrix. We design a PSO based solver to simultaneously select an optimal set of topics and learn corresponding features based on the selected topics. We develop a closed loop algorithm to iterate between 1) minimizing losses of representation reconstruction and feature-topic alignment and 2) searching the best topics. Finally, we present extensive experiments to demonstrate the enhanced performance of our method.
LGJul 21, 2021
Bridging the Gap between Spatial and Spectral Domains: A Unified Framework for Graph Neural NetworksZhiqian Chen, Fanglan Chen, Lei Zhang et al.
Deep learning's performance has been extensively recognized recently. Graph neural networks (GNNs) are designed to deal with graph-structural data that classical deep learning does not easily manage. Since most GNNs were created using distinct theories, direct comparisons are impossible. Prior research has primarily concentrated on categorizing existing models, with little attention paid to their intrinsic connections. The purpose of this study is to establish a unified framework that integrates GNNs based on spectral graph and approximation theory. The framework incorporates a strong integration between spatial- and spectral-based GNNs while tightly associating approaches that exist within each respective domain.
CVOct 16, 2020
Semantic Editing On Segmentation Map Via Multi-Expansion LossJianfeng He, Xuchao Zhang, Shuo Lei et al.
Semantic editing on segmentation map has been proposed as an intermediate interface for image generation, because it provides flexible and strong assistance in various image generation tasks. This paper aims to improve quality of edited segmentation map conditioned on semantic inputs. Even though recent studies apply global and local adversarial losses extensively to generate images for higher image quality, we find that they suffer from the misalignment of the boundary area in the mask area. To address this, we propose MExGAN for semantic editing on segmentation map, which uses a novel Multi-Expansion (MEx) loss implemented by adversarial losses on MEx areas. Each MEx area has the mask area of the generation as the majority and the boundary of original context as the minority. To boost convenience and stability of MEx loss, we further propose an Approximated MEx (A-MEx) loss. Besides, in contrast to previous model that builds training data for semantic editing on segmentation map with part of the whole image, which leads to model performance degradation, MExGAN applies the whole image to build the training data. Extensive experiments on semantic editing on segmentation map and natural image inpainting show competitive results on four datasets.
AIAug 22, 2020
Reimagining City Configuration: Automated Urban Planning via Adversarial LearningDongjie Wang, Yanjie Fu, Pengyang Wang et al.
Urban planning refers to the efforts of designing land-use configurations. Effective urban planning can help to mitigate the operational and social vulnerability of a urban system, such as high tax, crimes, traffic congestion and accidents, pollution, depression, and anxiety. Due to the high complexity of urban systems, such tasks are mostly completed by professional planners. But, human planners take longer time. The recent advance of deep learning motivates us to ask: can machines learn at a human capability to automatically and quickly calculate land-use configuration, so human planners can finally adjust machine-generated plans for specific needs? To this end, we formulate the automated urban planning problem into a task of learning to configure land-uses, given the surrounding spatial contexts. To set up the task, we define a land-use configuration as a longitude-latitude-channel tensor, where each channel is a category of POIs and the value of an entry is the number of POIs. The objective is then to propose an adversarial learning framework that can automatically generate such tensor for an unplanned area. In particular, we first characterize the contexts of surrounding areas of an unplanned area by learning representations from spatial graphs using geographic and human mobility data. Second, we combine each unplanned area and its surrounding context representation as a tuple, and categorize all the tuples into positive (well-planned areas) and negative samples (poorly-planned areas). Third, we develop an adversarial land-use configuration approach, where the surrounding context representation is fed into a generator to generate a land-use configuration, and a discriminator learns to distinguish among positive and negative samples.
CVJul 3, 2020
Few-Shot Semantic Segmentation Augmented with Image-Level Weak AnnotationsShuo Lei, Xuchao Zhang, Jianfeng He et al.
Despite the great progress made by deep neural networks in the semantic segmentation task, traditional neural-networkbased methods typically suffer from a shortage of large amounts of pixel-level annotations. Recent progress in fewshot semantic segmentation tackles the issue by only a few pixel-level annotated examples. However, these few-shot approaches cannot easily be applied to multi-way or weak annotation settings. In this paper, we advance the few-shot segmentation paradigm towards a scenario where image-level annotations are available to help the training process of a few pixel-level annotations. Our key idea is to learn a better prototype representation of the class by fusing the knowledge from the image-level labeled data. Specifically, we propose a new framework, called PAIA, to learn the class prototype representation in a metric space by integrating image-level annotations. Furthermore, by considering the uncertainty of pseudo-masks, a distilled soft masked average pooling strategy is designed to handle distractions in image-level annotations. Extensive empirical results on two datasets show superior performance of PAIA.
LGFeb 27, 2020
Bridging the Gap between Spatial and Spectral Domains: A Survey on Graph Neural NetworksZhiqian Chen, Fanglan Chen, Lei Zhang et al.
Deep learning's success has been widely recognized in a variety of machine learning tasks, including image classification, audio recognition, and natural language processing. As an extension of deep learning beyond these domains, graph neural networks (GNNs) are designed to handle the non-Euclidean graph-structure which is intractable to previous deep learning techniques. Existing GNNs are presented using various techniques, making direct comparison and cross-reference more complex. Although existing studies categorize GNNs into spatial-based and spectral-based techniques, there hasn't been a thorough examination of their relationship. To close this gap, this study presents a single framework that systematically incorporates most GNNs. We organize existing GNNs into spatial and spectral domains, as well as expose the connections within each domain. A review of spectral graph theory and approximation theory builds a strong relationship across the spatial and spectral domains in further investigation.
LGNov 20, 2019
TITAN: A Spatiotemporal Feature Learning Framework for Traffic Incident Duration PredictionKaiqun Fu, Taoran Ji, Liang Zhao et al.
Critical incident stages identification and reasonable prediction of traffic incident duration are essential in traffic incident management. In this paper, we propose a traffic incident duration prediction model that simultaneously predicts the impact of the traffic incidents and identifies the critical groups of temporal features via a multi-task learning framework. First, we formulate a sparsity optimization problem that extracts low-level temporal features based on traffic speed readings and then generalizes higher level features as phases of traffic incidents. Second, we propose novel constraints on feature similarity exploiting prior knowledge about the spatial connectivity of the road network to predict the incident duration. The proposed problem is challenging to solve due to the orthogonality constraints, non-convexity objective, and non-smoothness penalties. We develop an algorithm based on the alternating direction method of multipliers (ADMM) framework to solve the proposed formulation. Extensive experiments and comparisons to other models on real-world traffic data and traffic incident records justify the efficacy of our model.
LGJul 17, 2019
Mitigating Uncertainty in Document ClassificationXuchao Zhang, Fanglan Chen, Chang-Tien Lu et al.
The uncertainty measurement of classifiers' predictions is especially important in applications such as medical diagnoses that need to ensure limited human resources can focus on the most uncertain predictions returned by machine learning models. However, few existing uncertainty models attempt to improve overall prediction accuracy where human resources are involved in the text classification task. In this paper, we propose a novel neural-network-based model that applies a new dropout-entropy method for uncertainty measurement. We also design a metric learning method on feature representations, which can boost the performance of dropout-based uncertainty methods with smaller prediction variance in accurate prediction trials. Extensive experiments on real-world data sets demonstrate that our method can achieve a considerable improvement in overall prediction accuracy compared to existing approaches. In particular, our model improved the accuracy from 0.78 to 0.92 when 30\% of the most uncertain predictions were handed over to human experts in "20NewsGroup" data.
DLMay 22, 2019
Patent Citation Dynamics Modeling via Multi-Attention Recurrent NetworksTaoran Ji, Zhiqian Chen, Nathan Self et al.
Modeling and forecasting forward citations to a patent is a central task for the discovery of emerging technologies and for measuring the pulse of inventive progress. Conventional methods for forecasting these forward citations cast the problem as analysis of temporal point processes which rely on the conditional intensity of previously received citations. Recent approaches model the conditional intensity as a chain of recurrent neural networks to capture memory dependency in hopes of reducing the restrictions of the parametric form of the intensity function. For the problem of patent citations, we observe that forecasting a patent's chain of citations benefits from not only the patent's history itself but also from the historical citations of assignees and inventors associated with that patent. In this paper, we propose a sequence-to-sequence model which employs an attention-of-attention mechanism to capture the dependencies of these multiple time sequences. Furthermore, the proposed model is able to forecast both the timestamp and the category of a patent's next citation. Extensive experiments on a large patent citation dataset collected from USPTO demonstrate that the proposed model outperforms state-of-the-art models at forward citation forecasting.
CRFeb 14, 2019
Estimating the Circuit Deobfuscating Runtime based on Graph Deep LearningZhiqian Chen, Gaurav Kolhe, Setareh Rafatirad et al.
Circuit obfuscation is a recently proposed defense mechanism to protect digital integrated circuits (ICs) from reverse engineering by using camouflaged gates i.e., logic gates whose functionality cannot be precisely determined by the attacker. There have been effective schemes such as satisfiability-checking (SAT)-based attacks that can potentially decrypt obfuscated circuits, called deobfuscation. Deobfuscation runtime could have a large span ranging from few milliseconds to thousands of years or more, depending on the number and layouts of the ICs and camouflaged gates. And hence accurately pre-estimating the deobfuscation runtime is highly crucial for the defenders to maximize it and optimize their defense. However, estimating the deobfuscation runtime is a challenging task due to 1) the complexity and heterogeneity of graph-structured circuit, 2) the unknown and sophisticated mechanisms of the attackers for deobfuscation. To address the above mentioned challenges, this work proposes the first machine-learning framework that predicts the deobfuscation runtime based on graph deep learning techniques. Specifically, we design a new model, ICNet with new input and convolution layers to characterize and extract graph frequencies from ICs, which are then integrated by heterogeneous deep fully-connected layers to obtain final output. ICNet is an end-to-end framework which can automatically extract the determinant features for deobfuscation runtime. Extensive experiments demonstrate its effectiveness and efficiency.
LGFeb 5, 2019
Robust Regression via Online Feature Selection under Adversarial Data CorruptionXuchao Zhang, Shuo Lei, Liang Zhao et al.
The presence of data corruption in user-generated streaming data, such as social media, motivates a new fundamental problem that learns reliable regression coefficient when features are not accessible entirely at one time. Until now, several important challenges still cannot be handled concurrently: 1) corrupted data estimation when only partial features are accessible; 2) online feature selection when data contains adversarial corruption; and 3) scaling to a massive dataset. This paper proposes a novel RObust regression algorithm via Online Feature Selection (\textit{RoOFS}) that concurrently addresses all the above challenges. Specifically, the algorithm iteratively updates the regression coefficients and the uncorrupted set via a robust online feature substitution method. We also prove that our algorithm has a restricted error bound compared to the optimal solution. Extensive empirical experiments in both synthetic and real-world datasets demonstrated that the effectiveness of our new method is superior to that of existing methods in the recovery of both feature selection and regression coefficients, with very competitive efficiency.
LGAug 30, 2018
Rational Neural Networks for Approximating Jump Discontinuities of Graph Convolution OperatorZhiqian Chen, Feng Chen, Rongjie Lai et al.
For node level graph encoding, a recent important state-of-art method is the graph convolutional networks (GCN), which nicely integrate local vertex features and graph topology in the spectral domain. However, current studies suffer from several drawbacks: (1) graph CNNs relies on Chebyshev polynomial approximation which results in oscillatory approximation at jump discontinuities; (2) Increasing the order of Chebyshev polynomial can reduce the oscillations issue, but also incurs unaffordable computational cost; (3) Chebyshev polynomials require degree $Ω$(poly(1/$ε$)) to approximate a jump signal such as $|x|$, while rational function only needs $\mathcal{O}$(poly log(1/$ε$))\cite{liang2016deep,telgarsky2017neural}. However, it's non-trivial to apply rational approximation without increasing computational complexity due to the denominator. In this paper, the superiority of rational approximation is exploited for graph signal recovering. RatioanlNet is proposed to integrate rational function and neural networks. We show that rational function of eigenvalues can be rewritten as a function of graph Laplacian, which can avoid multiplication by the eigenvector matrix. Focusing on the analysis of approximation on graph convolution operation, a graph signal regression task is formulated. Under graph signal regression task, its time complexity can be significantly reduced by graph Fourier transform. To overcome the local minimum problem of neural networks model, a relaxed Remez algorithm is utilized to initialize the weight parameters. Convergence rate of RatioanlNet and polynomial based methods on jump signal is analyzed for a theoretical guarantee. The extensive experimental results demonstrated that our approach could effectively characterize the jump discontinuities, outperforming competing methods by a substantial margin on both synthetic and real-world graphs.
LGAug 26, 2018
Water Disaggregation via Shape Features based Bayesian Discriminative Sparse CodingBingsheng Wang, Xuchao Zhang, Chang-Tien Lu et al.
As the issue of freshwater shortage is increasing daily, it is critical to take effective measures for water conservation. According to previous studies, device level consumption could lead to significant freshwater conservation. Existing water disaggregation methods focus on learning the signatures for appliances; however, they are lack of the mechanism to accurately discriminate parallel appliances' consumption. In this paper, we propose a Bayesian Discriminative Sparse Coding model using Laplace Prior (BDSC-LP) to extensively enhance the disaggregation performance. To derive discriminative basis functions, shape features are presented to describe the low-sampling-rate water consumption patterns. A Gibbs sampling based inference method is designed to extend the discriminative capability of the disaggregation dictionaries. Extensive experiments were performed to validate the effectiveness of the proposed model using both real-world and synthetic datasets.
LGJul 6, 2018
Distributed Self-Paced Learning in Alternating Direction Method of MultipliersXuchao Zhang, Liang Zhao, Zhiqian Chen et al.
Self-paced learning (SPL) mimics the cognitive process of humans, who generally learn from easy samples to hard ones. One key issue in SPL is the training process required for each instance weight depends on the other samples and thus cannot easily be run in a distributed manner in a large-scale dataset. In this paper, we reformulate the self-paced learning problem into a distributed setting and propose a novel Distributed Self-Paced Learning method (DSPL) to handle large-scale datasets. Specifically, both the model and instance weights can be optimized in parallel for each batch based on a consensus alternating direction method of multipliers. We also prove the convergence of our algorithm under mild conditions. Extensive experiments on both synthetic and real datasets demonstrate that our approach is superior to those of existing methods.