CL Feb 2
Kimi K2.5: Visual Agentic Intelligence
Kimi Team, Tongtong Bai, Yifan Bai et al.
We introduce Kimi K2.5, an open-source multimodal agentic model designed to advance general agentic intelligence. K2.5 emphasizes the joint optimization of text and vision so that the two modalities enhance each other. This includes a series of techniques such as joint text-vision pre-training, zero-vision SFT, and joint text-vision reinforcement learning. Building on this multimodal foundation, K2.5 introduces Agent Swarm, a self-directed parallel agent orchestration framework that dynamically decomposes complex tasks into heterogeneous sub-problems and executes them concurrently. Extensive evaluations show that Kimi K2.5 achieves state-of-the-art results across various domains including coding, vision, reasoning, and agentic tasks. Agent Swarm also reduces latency by up to 4.5x over single-agent baselines. We release the post-trained Kimi K2.5 model checkpoint to facilitate future research and real-world applications of agentic intelligence.
LG Feb 8, 2023
Red Teaming Deep Neural Networks with Feature Synthesis Tools
Stephen Casper, Yuxiao Li, Jiawei Li et al.
Interpretable AI tools are often motivated by the goal of understanding model behavior in out-of-distribution (OOD) contexts. Despite the attention this area of study receives, there are comparatively few cases where these tools have identified previously unknown bugs in models. We argue that this is due, in part, to a common feature of many interpretability methods: they analyze model behavior by using a particular dataset. This only allows for the study of the model in the context of features that the user can sample in advance. To address this, a growing body of research involves interpreting models using feature synthesis methods that do not depend on a dataset. In this paper, we benchmark the usefulness of interpretability tools on debugging tasks. Our key insight is that we can implant human-interpretable trojans into models and then evaluate these tools based on whether they can help humans discover them. This is analogous to finding OOD bugs, except the ground truth is known, allowing us to know when an interpretation is correct. We make four contributions. (1) We propose trojan discovery as an evaluation task for interpretability tools and introduce a benchmark with 12 trojans of 3 different types. (2) We demonstrate the difficulty of this benchmark with a preliminary evaluation of 16 state-of-the-art feature attribution/saliency tools. Even under ideal conditions, given direct access to data with the trojan trigger, these methods still often fail to identify bugs. (3) We evaluate 7 feature-synthesis methods on our benchmark. (4) We introduce and evaluate 2 new variants of the best-performing method from the previous evaluation. A website for this paper and its code is at https://benchmarking-interpretability.csail.mit.edu/
AI Jul 13, 2024
CellAgent: An LLM-driven Multi-Agent Framework for Automated Single-cell Data Analysis
Yihang Xiao, Jinyi Liu, Yan Zheng et al.
Single-cell RNA sequencing (scRNA-seq) data analysis is crucial for biological research, as it enables the precise characterization of cellular heterogeneity. However, manual manipulation of various tools to achieve desired outcomes can be labor-intensive for researchers. To address this, we introduce CellAgent (http://cell.agent4science.cn/), an LLM-driven multi-agent framework, specifically designed for the automatic processing and execution of scRNA-seq data analysis tasks, providing high-quality results with no human intervention. Firstly, to adapt general LLMs to the biological field, CellAgent constructs LLM-driven biological expert roles - planner, executor, and evaluator - each with specific responsibilities. Then, CellAgent introduces a hierarchical decision-making mechanism to coordinate these biological experts, effectively driving the planning and step-by-step execution of complex data analysis tasks. Furthermore, we propose a self-iterative optimization mechanism, enabling CellAgent to autonomously evaluate and optimize solutions, thereby guaranteeing output quality. We evaluate CellAgent on a comprehensive benchmark dataset encompassing dozens of tissues and hundreds of distinct cell types. Evaluation results consistently show that CellAgent effectively identifies the most suitable tools and hyperparameters for single-cell analysis tasks, achieving optimal performance. This automated framework dramatically reduces the workload for science data analyses, bringing us into the "Agent for Science" era.
CR Mar 25
Analysing the Safety Pitfalls of Steering Vectors
Yuxiao Li, Alina Fastowski, Efstratios Zaradoukas et al.
Activation steering has emerged as a powerful tool to shape LLM behavior without the need for weight updates. While its inherent brittleness and unreliability are well-documented, its safety implications remain underexplored. In this work, we present a systematic safety audit of steering vectors obtained with Contrastive Activation Addition (CAA), a widely used steering approach, under a unified evaluation protocol. Using JailbreakBench as benchmark, we show that steering vectors consistently influence the success rate of jailbreak attacks, with stronger amplification under simple template-based attacks. Across LLM families and sizes, steering the model in specific directions can drastically increase (up to 57%) or decrease (up to 50%) its attack success rate (ASR), depending on the targeted behavior. We attribute this phenomenon to the overlap between the steering vectors and the latent directions of refusal behavior. Thus, we offer a traceable explanation for this discovery. Together, our findings reveal the previously unobserved origin of this safety gap in LLMs, highlighting a trade-off between controllability and safety.
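As a rough illustration of the mechanism this abstract refers to, the sketch below builds a steering vector as the difference between mean activations on two contrastive prompt sets, adds it to a hidden state, and measures its overlap with a refusal direction by cosine similarity. The activations, the `refusal_dir` vector, and the scale `alpha` are made-up toy values, and the function names are hypothetical; this is not the paper's code or evaluation protocol.

```python
import math

def mean_vector(rows):
    """Element-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def steering_vector(pos_acts, neg_acts):
    """Mean activation on 'positive' prompts minus mean on 'negative' prompts."""
    mean_pos, mean_neg = mean_vector(pos_acts), mean_vector(neg_acts)
    return [p - q for p, q in zip(mean_pos, mean_neg)]

def steer(hidden, vector, alpha):
    """Add the scaled steering vector to one hidden state."""
    return [h + alpha * v for h, v in zip(hidden, vector)]

def cosine(a, b):
    """Cosine similarity, used here as a simple overlap measure."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional activations (assumed values, for illustration only).
pos_acts = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
neg_acts = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
v = steering_vector(pos_acts, neg_acts)

# Hypothetical refusal direction; in practice it would be estimated from data.
refusal_dir = [-1.0, 0.0, 0.0]
overlap = cosine(v, refusal_dir)

steered = steer([0.5, 0.5, 0.5], v, alpha=2.0)
```

A strongly negative `overlap` in this toy setup means the steering vector points away from the refusal direction, which is the kind of geometric relationship the abstract offers as an explanation for changes in attack success rate.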
CR Nov 8, 2025
Injecting Falsehoods: Adversarial Man-in-the-Middle Attacks Undermining Factual Recall in LLMs
Alina Fastowski, Bardh Prenkaj, Yuxiao Li et al.
LLMs are now an integral part of information retrieval. As such, their role as question answering chatbots raises significant concerns due to their shown vulnerability to adversarial man-in-the-middle (MitM) attacks. Here, we propose the first principled attack evaluation on LLM factual memory under prompt injection via Xmera, our novel, theory-grounded MitM framework. By perturbing the input given to "victim" LLMs in three closed-book and fact-based QA settings, we undermine the correctness of the responses and assess the uncertainty of their generation process. Surprisingly, trivial instruction-based attacks report the highest success rate (up to ~85.3%) while simultaneously having a high uncertainty for incorrectly answered questions. To provide a simple defense mechanism against Xmera, we train Random Forest classifiers on the response uncertainty levels to distinguish between attacked and unattacked queries (average AUC of up to ~96%). We believe that signaling users to be cautious about the answers they receive from black-box and potentially corrupt LLMs is a first checkpoint toward user cyberspace safety.
LG May 20
Graph Navier Stokes Networks
Zexing Zhao, Guangsi Shi, Yu Gong et al.
Graph Neural Networks (GNNs) have emerged as a cornerstone of deep learning, with most existing methods rooted in graph signal processing and diffusion equations to model message passing. However, these approaches inherently suffer from the oversmoothing problem, where node features become indistinguishable as the network depth increases. Inspired by the Navier Stokes equations, we introduce Graph Navier Stokes Networks (GNSN), a novel architecture that transcends conventional diffusion-based message passing by incorporating convection into graph structures. GNSN defines a dynamic velocity field on the graph to govern convection, enabling more efficient and direct message propagation. By adaptively balancing convection and diffusion, GNSN is able to efficiently handle datasets with varying levels of homophily. Extensive evaluations across twelve real-world datasets demonstrate that GNSN consistently outperforms state-of-the-art baselines in classification accuracy. Moreover, experimental results further emphasize its effectiveness in alleviating the oversmoothing problem.
GR Apr 8
Preserving Discrete Morse-Smale Complexes in Error-Bounded Lossy Compression
Yuxiao Li, Mingze Xia, Xin Liang et al.
Scientific applications are generating unprecedented volumes of data that overwhelm storage and transmission systems, posing significant challenges for the design of data management tools and scientific databases. Lossy compression has emerged as a promising strategy to address this problem, but most existing compressors fail to preserve the topology of scientific data, leading to inaccuracies in downstream analyses and potentially erroneous scientific conclusions. In this work, we present a methodology for fully preserving the topology, specifically, Morse-Smale complexes (MSCs), in lossy-compressed 2D and 3D scalar field data from scientific simulations. We generalize the edit-based strategy introduced in MSz (a previous method that preserves only segmentations and cannot preserve saddles or separatrices) by extending the framework to the full MSCs, including all critical points and separatrices. Our approach corrects the MSCs in the decompressed output of any error-bounded lossy compressor (e.g., SZ3 or ZFP), referred to as the base compressor, using an iterative editing strategy that preserves all critical points and their connectivity via separatrices. During compression, we generate a sequence of quantized edits that are applied to the decompressed output, ensuring accurate preservation of topological features while maintaining the error within prescribed bounds. The strategy iteratively fixes critical points and separatrices in alternating steps until convergence is achieved in a finite number of iterations. To meet diverse application needs, our method offers flexible options that balance compression efficiency with feature preservation. To reduce computation time, we leverage GPU parallelism to accelerate each component of the workflow. Experiments on multiple datasets demonstrate that our method achieves 100% preservation of Morse-Smale complexes.
DB Mar 13
Time-varying Vector Field Compression with Preserved Critical Point Trajectories
Mingze Xia, Yuxiao Li, Pu Jiao et al.
Scientific simulations and observations are producing vast amounts of time-varying vector field data, making it hard to store them for archival purposes and transmit them for analysis. Lossy compression is considered a promising approach to reducing these data because lossless compression yields low compression ratios that barely mitigate the problem. However, directly applying existing lossy compression methods to time-varying vector fields may introduce undesired distortions in critical-point trajectories, a crucial feature that encodes key properties of the vector field. In this work, we propose an efficient lossy compression framework that exactly preserves all critical-point trajectories in time-varying vector fields. Our contributions are threefold. First, we extend the theory for preserving critical points in space to preserving critical-point trajectories in space-time, and develop a compression framework to realize the functionality. Second, we propose a semi-Lagrange predictor to exploit the spatiotemporal correlations in advection-dominated regions, and combine it with the traditional Lorenzo predictor for improved compression efficiency. Third, we evaluate our method against state-of-the-art lossy and lossless compressors using four real-world scientific datasets. Experimental results demonstrate that the proposed method delivers up to 124.48X compression ratios while effectively preserving all critical-point trajectories. This compression ratio is up to 56.07X higher than that of the best lossless compressors, and none of the existing lossy compressors can preserve all critical-point trajectories at similar compression ratios.
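For readers unfamiliar with the Lorenzo predictor named in this abstract, here is a minimal 2D sketch: each value is predicted from its already-visited left, upper, and upper-left neighbors, and only the residual would then be quantized and encoded. The sketch predicts from original values and treats out-of-range neighbors as zero, both simplifying assumptions; real compressors predict from reconstructed values, and the paper's semi-Lagrange predictor is not shown.

```python
def lorenzo_predict(field, i, j):
    """2D Lorenzo prediction: f[i-1][j] + f[i][j-1] - f[i-1][j-1]."""
    def at(r, c):
        # Neighbors outside the grid are treated as zero (an assumption).
        return field[r][c] if r >= 0 and c >= 0 else 0.0
    return at(i - 1, j) + at(i, j - 1) - at(i - 1, j - 1)

def residuals(field):
    """Prediction residual at every grid point, in row-major order."""
    rows, cols = len(field), len(field[0])
    return [[field[i][j] - lorenzo_predict(field, i, j) for j in range(cols)]
            for i in range(rows)]

# A planar field 2*i + 3*j + 1 is predicted exactly away from the boundary.
field = [[2.0 * i + 3.0 * j + 1.0 for j in range(4)] for i in range(4)]
res = residuals(field)
interior_ok = all(res[i][j] == 0.0 for i in range(1, 4) for j in range(1, 4))
```

Smooth data gives small residuals, which is why a predictor of this kind improves compression ratios; where the field is advected over time, a purely spatial stencil is less effective, which is the gap a spatiotemporal predictor targets.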
DC Apr 1
EXaCTz: Guaranteed Extremum Graph and Contour Tree Preservation for Distributed- and GPU-Parallel Lossy Compression
Yuxiao Li, Mingze Xia, Xin Liang et al.
This paper introduces EXaCTz, a parallel algorithm that concurrently preserves extremum graphs and contour trees in lossy-compressed scalar field data. While error-bounded lossy compression is essential for large-scale scientific simulations and workflows, existing topology-preserving methods suffer from (1) a significant throughput disparity, where topology correction speeds are on the order of MB/s, lagging orders of magnitude behind compression speeds on the order of GB/s, (2) limited support for diverse topological descriptors, and (3) a lack of theoretical convergence bounds. To address these challenges, EXaCTz introduces a high-performance, bounded-iteration algorithm that enforces topological consistency by deriving targeted edits for decompressed data. Unlike prior methods that rely on explicit topology reconstruction, EXaCTz enforces consistent min/max neighbors of all vertices, along with global ordering among critical points. As such, the algorithm enforces consistent critical-point classification, saddle-extremum connectivity, and the preservation of merge/split events. We theoretically prove the convergence of our algorithm, bounded by the longest path in a vulnerability graph that characterizes potential cascading effects during correction. Experiments on real-world datasets show that EXaCTz achieves a single-GPU throughput of up to 4.52 GB/s, outperforming the state-of-the-art contour-tree-preserving method (Gorski et al.) by up to 213x (with a single-core CPU implementation for fair comparison) and 3,285x (with a single-GPU version). In distributed environments, EXaCTz scales to 128 GPUs with 55.6% efficiency (compared with 6.4% for a naive parallelization), processing datasets of up to 512 GB in under 48 seconds and achieving an aggregate correction throughput of up to 32.69 GB/s.
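The idea of "consistent min/max neighbors" can be illustrated with a small check that flags vertices whose lowest or highest neighbor differs between the original and the decompressed field. The sketch below is only an illustration under stated assumptions: a regular 2D grid with 4-connectivity, the vertex itself included in its own neighborhood, and ties broken by grid index. It detects inconsistencies but does not derive edits, and it is not the EXaCTz algorithm.

```python
def neighbors(i, j, rows, cols):
    """4-connected neighbors inside the grid (connectivity is an assumption)."""
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = i + di, j + dj
        if 0 <= r < rows and 0 <= c < cols:
            yield r, c

def extreme_neighbors(field, i, j):
    """Positions of the smallest and largest value among (i, j) and its neighbors."""
    rows, cols = len(field), len(field[0])
    candidates = [(i, j)] + list(neighbors(i, j, rows, cols))
    # Ties are broken by grid index so the ordering is total.
    key = lambda p: (field[p[0]][p[1]], p)
    return min(candidates, key=key), max(candidates, key=key)

def inconsistent_vertices(original, decompressed):
    """Vertices whose min or max neighbor changed after lossy compression."""
    rows, cols = len(original), len(original[0])
    return [(i, j) for i in range(rows) for j in range(cols)
            if extreme_neighbors(original, i, j) != extreme_neighbors(decompressed, i, j)]

original = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
# One value perturbed from 8.0 to 9.5, standing in for compression error.
decompressed = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 9.5, 9.0]]
bad = inconsistent_vertices(original, decompressed)
```

Here the perturbation turns vertex (2, 1) into a spurious local maximum and changes the max neighbor of (2, 2), so both are flagged; a correction scheme would then edit the decompressed values, within the error bound, until no vertex is flagged.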
LG Jan 15, 2025
CT-PatchTST: Channel-Time Patch Time-Series Transformer for Long-Term Renewable Energy Forecasting
Kuan Lu, Menghao Huo, Yuxiao Li et al.
Accurate forecasting of renewable energy generation is fundamental to enhancing the dynamic performance of modern power grids, especially under high renewable penetration. This paper presents Channel-Time Patch Time-Series Transformer (CT-PatchTST), a novel deep learning model designed to provide long-term, high-fidelity forecasts of wind and solar power. Unlike conventional time-series models, CT-PatchTST captures both temporal dependencies and inter-channel correlations, features that are critical for effective energy storage planning, control, and dispatch. Reliable forecasting enables proactive deployment of energy storage systems (ESSs), helping to mitigate uncertainties in renewable output, reduce system response time, and optimize storage operation based on location-specific flow and voltage conditions. Evaluated on real-world datasets from Denmark's offshore wind, onshore wind, and solar generation, CT-PatchTST outperforms existing methods in both accuracy and robustness. By enabling predictive, data-driven coordination of ESSs across integrated source-grid-load-storage systems, this work contributes to the design of more stable, responsive, and cost-efficient power networks.
CV May 21, 2025
Image-to-Image Translation with Diffusion Transformers and CLIP-Based Image Conditioning
Qiang Zhu, Kuan Lu, Menghao Huo et al.
Image-to-image translation aims to learn a mapping between a source and a target domain, enabling tasks such as style transfer, appearance transformation, and domain adaptation. In this work, we explore a diffusion-based framework for image-to-image translation by adapting Diffusion Transformers (DiT), which combine the denoising capabilities of diffusion models with the global modeling power of transformers. To guide the translation process, we condition the model on image embeddings extracted from a pre-trained CLIP encoder, allowing for fine-grained and structurally consistent translations without relying on text or class labels. We incorporate both a CLIP similarity loss to enforce semantic consistency and an LPIPS perceptual loss to enhance visual fidelity during training. We validate our approach on two benchmark datasets: face2comics, which translates real human faces to comic-style illustrations, and edges2shoes, which translates edge maps to realistic shoe images. Experimental results demonstrate that DiT, combined with CLIP-based conditioning and perceptual similarity objectives, achieves high-quality, semantically faithful translations, offering a promising alternative to GAN-based models for paired image-to-image translation tasks.
CV Dec 17, 2025
A Modular Framework for Single-View 3D Reconstruction of Indoor Environments
Yuxiao Li
We propose a modular framework for single-view indoor scene 3D reconstruction, where several core modules are powered by diffusion techniques. Traditional approaches for this task often struggle with the complex instance shapes and occlusions inherent in indoor environments. They frequently overshoot by attempting to predict 3D shapes directly from incomplete 2D images, which results in limited reconstruction quality. We aim to overcome this limitation by splitting the process into two steps: first, we employ diffusion-based techniques to predict the complete views of the room background and occluded indoor instances, then transform them into 3D. Our modular framework makes contributions to this field through the following components: an amodal completion module for restoring the full view of occluded instances, an inpainting model specifically trained to predict room layouts, a hybrid depth estimation technique that balances overall geometric accuracy with fine detail expressiveness, and a view-space alignment method that exploits both 2D and 3D cues to ensure precise placement of instances within the scene. This approach effectively reconstructs both foreground instances and the room background from a single image. Extensive experiments on the 3D-Front dataset demonstrate that our method outperforms current state-of-the-art (SOTA) approaches in terms of both visual quality and reconstruction accuracy. The framework holds promising potential for applications in interior design, real estate, and augmented reality.
LG Sep 26, 2025
Analysis of Variational Sparse Autoencoders
Zachary Baker, Yuxiao Li
Sparse Autoencoders (SAEs) have emerged as a promising approach for interpreting neural network representations by learning sparse, human-interpretable features from dense activations. We investigate whether incorporating variational methods into SAE architectures can improve feature organization and interpretability. We introduce the Variational Sparse Autoencoder (vSAE), which replaces deterministic ReLU gating with stochastic sampling from learned Gaussian posteriors and incorporates KL divergence regularization toward a standard normal prior. Our hypothesis is that this probabilistic sampling creates dispersive pressure, causing features to organize more coherently in the latent space while avoiding overlap. We evaluate a TopK vSAE against a standard TopK SAE on Pythia-70M transformer residual stream activations using comprehensive benchmarks including SAE Bench, individual feature interpretability analysis, and global latent space visualization through t-SNE. The vSAE underperforms the standard SAE across core evaluation metrics, though it excels at feature independence and ablation metrics. The KL divergence term creates excessive regularization pressure that substantially reduces the fraction of living features, leading to the observed performance degradation. While vSAE features demonstrate improved robustness, they exhibit many more dead features than the baseline. Our findings suggest that naive application of variational methods to SAEs does not improve feature organization or interpretability.
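The two ingredients named in this abstract, sampling from a learned Gaussian posterior and a KL penalty toward a standard normal prior, have standard closed forms that can be sketched in a few lines. The sketch below uses the usual reparameterization `z = mu + sigma * eps` and the diagonal-Gaussian KL term; the `top_k` helper and the way it would be combined with sampling are assumptions for illustration, not the paper's architecture.

```python
import math
import random

def sample_latents(mu, log_var, rng):
    """Reparameterized sample z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ) summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def top_k(values, k):
    """Keep the k largest entries and zero the rest (hypothetical sparsity step)."""
    keep = set(sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]

rng = random.Random(0)
mu, log_var = [1.5, -0.3, 0.2, 2.0], [-1.0, -1.0, -1.0, -1.0]
z = sample_latents(mu, log_var, rng)   # stochastic latent code
sparse_z = top_k(z, k=2)               # at most two active features
penalty = kl_to_standard_normal(mu, log_var)
```

Because the KL term is minimized when every posterior mean is zero and every variance is one, a large weight on it pulls feature activations toward the uninformative prior, which is consistent with the abstract's report of more dead features.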
AI Jul 18, 2025
Cross-modal Causal Intervention for Alzheimer's Disease Prediction
Yutao Jin, Haowen Xiao, Junyong Zhai et al.
Mild Cognitive Impairment (MCI) serves as a prodromal stage of Alzheimer's Disease (AD), where early identification and intervention can effectively slow the progression to dementia. However, diagnosing AD remains a significant challenge in neurology due to the confounders caused mainly by the selection bias of multi-modal data and the complex relationships between variables. To address these issues, we propose a novel visual-language causality-inspired framework named Cross-modal Causal Intervention with Mediator for Alzheimer's Disease Diagnosis (MediAD) for diagnostic assistance. Our MediAD employs Large Language Models (LLMs) to summarize clinical data under strict templates, therefore enriching textual inputs. The MediAD model utilizes Magnetic Resonance Imaging (MRI), clinical data, and textual data enriched by LLMs to classify participants into Cognitively Normal (CN), MCI, and AD categories. Because of the presence of confounders, such as cerebral vascular lesions and age-related biomarkers, non-causal models are likely to capture spurious input-output correlations, generating less reliable results. Our framework implicitly mitigates the effect of both observable and unobservable confounders through a unified causal intervention method. Experimental results demonstrate the outstanding performance of our method in distinguishing CN/MCI/AD cases, outperforming other methods in most evaluation metrics. The study showcases the potential of integrating causal reasoning with multi-modal learning for neurological disease diagnosis.
IR Jun 22, 2024
LLM-Powered Explanations: Unraveling Recommendations Through Subgraph Reasoning
Guangsi Shi, Xiaofeng Deng, Linhao Luo et al.
Recommender systems are pivotal in enhancing user experiences across various web applications by analyzing the complicated relationships between users and items. Knowledge graphs (KGs) have been widely used to enhance the performance of recommender systems. However, KGs are known to be noisy and incomplete, which makes it hard for them to provide reliable explanations for recommendation results. An explainable recommender system is crucial for product development and subsequent decision-making. To address these challenges, we introduce a novel recommender that synergizes Large Language Models (LLMs) and KGs to enhance recommendation and provide interpretable results. Specifically, we first harness the power of LLMs to augment KG reconstruction. LLMs comprehend and decompose user reviews into new triples that are added to the KG. In this way, we can enrich KGs with explainable paths that express user preferences. To enhance recommendation on augmented KGs, we introduce a novel subgraph reasoning module that effectively measures the importance of nodes and discovers reasoning paths for recommendation. Finally, these reasoning paths are fed into the LLMs to generate interpretable explanations of the recommendation results. Our approach significantly enhances both the effectiveness and interpretability of recommender systems, especially in cross-selling scenarios where traditional methods falter. The effectiveness of our approach has been rigorously tested on four open real-world datasets, with our method outperforming contemporary state-of-the-art techniques by an average of 12%. The application of our model in the cross-selling recommendation system of a multinational engineering and technology company further underscores its practical utility and potential to redefine recommendation practices through improved accuracy and user trust.
LG May 23, 2023
A Deep Learning Approach for Generating Soft Range Information from RF Data
Yuxiao Li, Santiago Mazuelas, Yuan Shen
Radio frequency (RF)-based techniques are widely adopted for indoor localization despite the challenges in extracting sufficient information from measurements. Soft range information (SRI) offers a promising alternative for highly accurate localization that gives all probable range values rather than a single estimate of distance. We propose a deep learning approach to generate accurate SRI from RF measurements. In particular, the proposed approach is implemented by a network with two neural modules and conducts the generation directly from raw data. Extensive experiments on a case study with two public datasets are conducted to quantify the efficiency in different indoor localization tasks. The results show that the proposed approach can generate highly accurate SRI, and significantly outperforms conventional techniques in both non-line-of-sight (NLOS) detection and ranging error mitigation.
LG May 23, 2023
Deep GEM-Based Network for Weakly Supervised UWB Ranging Error Mitigation
Yuxiao Li, Santiago Mazuelas, Yuan Shen
Ultra-wideband (UWB)-based techniques, while becoming mainstream approaches for high-accuracy positioning, tend to be challenged by ranging bias in harsh environments. Emerging learning-based methods for error mitigation have shown great performance improvement by exploiting high-level semantic features from raw data. However, these methods rely heavily on fully labeled data, leading to a high cost for data acquisition. We present a learning framework based on weak supervision for UWB ranging error mitigation. Specifically, we propose a deep learning method based on the generalized expectation-maximization (GEM) algorithm for robust UWB ranging error mitigation under weak supervision. This method integrates probabilistic modeling into the deep learning scheme and adopts weakly supervised labels as prior information. Extensive experiments in various supervision scenarios illustrate the superiority of the proposed method.
SP May 23, 2023
Deep Generative Model for Simultaneous Range Error Mitigation and Environment Identification
Yuxiao Li, Santiago Mazuelas, Yuan Shen
Received waveforms contain rich information about both range and environment semantics. However, their full potential is hard to exploit under multipath and non-line-of-sight conditions. This paper proposes a deep generative model (DGM) for simultaneous range error mitigation and environment identification. In particular, we present a Bayesian model for the generative process of the received waveform, composed of latent variables for both range-related features and environment semantics. The simultaneous range error mitigation and environment identification is interpreted as an inference problem based on the DGM, and implemented in a unique end-to-end learning scheme. Comprehensive experiments on a general ultra-wideband dataset demonstrate superior performance on range error mitigation, scalability to different environments, and a novel capability for simultaneous environment identification.
SP May 23, 2023
A Semi-Supervised Learning Approach for Ranging Error Mitigation Based on UWB Waveform
Yuxiao Li, Santiago Mazuelas, Yuan Shen
Localization systems based on ultra-wideband (UWB) measurements can have unsatisfactory performance in harsh environments due to the presence of non-line-of-sight (NLOS) errors. Learning-based methods for error mitigation have shown great performance improvement via directly exploiting the wideband waveform instead of handcrafted features. However, these methods require data samples fully labeled with actual measurement errors for training, which leads to time-consuming data collection. In this paper, we propose a semi-supervised learning method based on variational Bayes for UWB ranging error mitigation. Combining deep learning techniques and statistical tools, our method can efficiently accumulate knowledge from both labeled and unlabeled data samples. Extensive experiments illustrate the effectiveness of the proposed method under different supervision rates, and its superiority over other fully supervised methods even at a low supervision rate.
CV May 23, 2023
Generalized Expectation Maximization Framework for Blind Image Super Resolution
Yuxiao Li, Zhiming Wang, Yuan Shen
Learning-based methods for blind single image super resolution (SISR) conduct the restoration by a learned mapping between high-resolution (HR) images and their low-resolution (LR) counterparts degraded with arbitrary blur kernels. However, these methods mostly require an independent step to estimate the blur kernel, leading to error accumulation between steps. We propose an end-to-end learning framework for the blind SISR problem, which enables image restoration within a unified Bayesian framework with either full or semi-supervision. The proposed method, namely SREMN, integrates learning techniques into the generalized expectation-maximization (GEM) algorithm and infers HR images from the maximum likelihood estimation (MLE). Extensive experiments show the superiority of the proposed method over existing work and its novel capability for semi-supervised learning.
CV May 23, 2023
Variational Bayesian Framework for Advanced Image Generation with Domain-Related Variables
Yuxiao Li, Santiago Mazuelas, Yuan Shen
Deep generative models (DGMs) and their conditional counterparts provide a powerful ability for general-purpose generative modeling of data distributions. However, it remains challenging for existing methods to address advanced conditional generative problems without annotations, which can enable multiple applications like image-to-image translation and image editing. We present a unified Bayesian framework for such problems, which introduces an inference stage on latent variables within the learning process. In particular, we propose a variational Bayesian image translation network (VBITN) that enables multiple image translation and editing tasks. Comprehensive experiments show the effectiveness of our method on unsupervised image-to-image translation, and demonstrate the novel advanced capabilities for semantic editing and mixed domain translation.
ML Jul 23, 2020
DeepKriging: Spatially Dependent Deep Neural Networks for Spatial Prediction
Wanfang Chen, Yuxiao Li, Brian J Reich et al.
In spatial statistics, a common objective is to predict values of a spatial process at unobserved locations by exploiting spatial dependence. Kriging provides the best linear unbiased predictor using covariance functions and is often associated with Gaussian processes. However, when considering non-linear prediction for non-Gaussian and categorical data, the Kriging prediction is no longer optimal, and the associated variance is often overly optimistic. Although deep neural networks (DNNs) are widely used for general classification and prediction, they have not been studied thoroughly for data with spatial dependence. In this work, we propose a novel DNN structure for spatial prediction, where the spatial dependence is captured by adding an embedding layer of spatial coordinates with basis functions. We show in theory and simulation studies that the proposed DeepKriging method has a direct link to Kriging in the Gaussian case, and it has multiple advantages over Kriging for non-Gaussian and non-stationary data, i.e., it provides non-linear predictions and thus has smaller approximation errors, it does not require operations on covariance matrices and thus is scalable for large datasets, and with sufficiently many hidden neurons, it provides the optimal prediction in terms of model capacity. We further explore the possibility of quantifying prediction uncertainties based on density prediction without assuming any data distribution. Finally, we apply the method to predicting PM2.5 concentrations across the continental United States.
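The "embedding layer of spatial coordinates with basis functions" described in this abstract can be pictured as a feature map from a location to a vector of basis-function values, which a neural network then consumes alongside any covariates. The sketch below uses Gaussian radial basis functions centered on a regular grid of knots over the unit square; the choice of Gaussian kernels, the knot layout, and the `bandwidth` value are assumptions for illustration and may differ from the basis used in the paper.

```python
import math

def make_knots(n_per_side):
    """Regular grid of knot locations on the unit square [0, 1] x [0, 1]."""
    step = 1.0 / (n_per_side - 1)
    return [(i * step, j * step)
            for i in range(n_per_side) for j in range(n_per_side)]

def basis_embedding(coord, knots, bandwidth):
    """Map one (x, y) location to a vector of Gaussian RBF values, one per knot."""
    x, y = coord
    return [math.exp(-((x - kx) ** 2 + (y - ky) ** 2) / (2.0 * bandwidth ** 2))
            for kx, ky in knots]

knots = make_knots(3)  # 9 knots: corners, edge midpoints, and center
features = basis_embedding((0.5, 0.5), knots, bandwidth=0.25)
# `features` would be fed to the first dense layer of the network.
```

Nearby locations receive similar embeddings, so a network trained on these features can represent spatial dependence without forming or inverting a covariance matrix, which is the scalability argument made in the abstract.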