CVApr 25, 2023
Segment anything, from space?Simiao Ren, Francesco Luzi, Saad Lahrichi et al.
Recently, the first foundation model developed specifically for image segmentation tasks was developed, termed the "Segment Anything Model" (SAM). SAM can segment objects in input imagery based on cheap input prompts, such as one (or more) points, a bounding box, or a mask. The authors examined the \textit{zero-shot} image segmentation accuracy of SAM on a large number of vision benchmark tasks and found that SAM usually achieved recognition accuracy similar to, or sometimes exceeding, vision models that had been trained on the target tasks. The impressive generalization of SAM for segmentation has major implications for vision researchers working on natural imagery. In this work, we examine whether SAM's performance extends to overhead imagery problems and help guide the community's response to its development. We examine SAM's performance on a set of diverse and widely studied benchmark tasks. We find that SAM does often generalize well to overhead imagery, although it fails in some cases due to the unique characteristics of overhead imagery and its common target objects. We report on these unique systematic failure cases for remote sensing imagery that may comprise useful future research for the community.
CVOct 23, 2022
Transformers For Recognition In Overhead Imagery: A Reality CheckFrancesco Luzi, Aneesh Gupta, Leslie Collins et al.
There is evidence that transformers offer state-of-the-art recognition performance on tasks involving overhead imagery (e.g., satellite imagery). However, it is difficult to make unbiased empirical comparisons between competing deep learning models, making it unclear whether, and to what extent, transformer-based models are beneficial. In this paper we systematically compare the impact of adding transformer structures into state-of-the-art segmentation models for overhead imagery. Each model is given a similar budget of free parameters, and their hyperparameters are optimized using Bayesian Optimization with a fixed quantity of data and computation time. We conduct our experiments with a large and diverse dataset comprising two large public benchmarks: Inria and DeepGlobe. We perform additional ablation studies to explore the impact of specific transformer-based modeling choices. Our results suggest that transformers provide consistent, but modest, performance improvements. We only observe this advantage however in hybrid models that combine convolutional and transformer-based structures, while fully transformer-based models achieve relatively poor performance.
AIAug 13, 2025Code
Improving and Evaluating Open Deep Research AgentsDoaa Allabadi, Kyle Bradbury, Jordan M. Malof
We focus here on Deep Research Agents (DRAs), which are systems that can take a natural language prompt from a user, and then autonomously search for, and utilize, internet-based content to address the prompt. Recent DRAs have demonstrated impressive capabilities on public benchmarks however, recent research largely involves proprietary closed-source systems. At the time of this work, we only found one open-source DRA, termed Open Deep Research (ODR). In this work we adapt the challenging recent BrowseComp benchmark to compare ODR to existing proprietary systems. We propose BrowseComp-Small (BC-Small), comprising a subset of BrowseComp, as a more computationally-tractable DRA benchmark for academic labs. We benchmark ODR and two other proprietary systems on BC-Small: one system from Anthropic and one system from Google. We find that all three systems achieve 0% accuracy on the test set of 60 questions. We introduce three strategic improvements to ODR, resulting in the ODR+ model, which achieves a state-of-the-art 10% success rate on BC-Small among both closed-source and open-source systems. We report ablation studies indicating that all three of our improvements contributed to the success of ODR+.
CVFeb 15, 2025
Is Self-Supervised Pre-training on Satellite Imagery Better than ImageNet? A Systematic Study with Sentinel-2Saad Lahrichi, Zion Sheng, Shufan Xia et al.
Self-supervised learning (SSL) has demonstrated significant potential in pre-training robust models with limited labeled data, making it particularly valuable for remote sensing (RS) tasks. A common assumption is that pre-training on domain-aligned data provides maximal benefits on downstream tasks, particularly when compared to ImageNet-pretraining (INP). In this work, we investigate this assumption by collecting GeoNet, a large and diverse dataset of global optical Sentinel-2 imagery, and pre-training SwAV and MAE on both GeoNet and ImageNet. Evaluating these models on six downstream tasks in the few-shot setting reveals that SSL pre-training on RS data offers modest performance improvements over INP, and that it remains competitive in multiple scenarios. This indicates that the presumed benefits of SSL pre-training on RS data may be overstated, and the additional costs of data curation and pre-training could be unjustified.
LGNov 24, 2025
Closing Gaps in Emissions Monitoring with Climate TRACEBrittany V. Lancellotti, Jordan M. Malof, Aaron Davitt et al.
Global greenhouse gas emissions estimates are essential for monitoring and mitigation planning. Yet most datasets lack one or more characteristics that enhance their actionability, such as accuracy, global coverage, high spatial and temporal resolution, and frequent updates. To address these gaps, we present Climate TRACE (climatetrace.org), an open-access platform delivering global emissions estimates with enhanced detail, coverage, and timeliness. Climate TRACE synthesizes existing emissions data, prioritizing accuracy, coverage, and resolution, and fills gaps using sector-specific estimation approaches. The dataset is the first to provide globally comprehensive emissions estimates for individual sources (e.g., individual power plants) for all anthropogenic emitting sectors. The dataset spans January 1, 2021, to the present, with a two-month reporting lag and monthly updates. The open-access platform enables non-technical audiences to engage with detailed emissions datasets for most subnational governments worldwide. Climate TRACE supports data-driven climate action at scales where decisions are made, representing a major breakthrough for emissions accounting and mitigation.
SPFeb 18, 2022
Automated Extraction of Energy Systems Information from Remotely Sensed Data: A Review and AnalysisSimiao Ren, Wei Hu, Kyle Bradbury et al.
High quality energy systems information is a crucial input to energy systems research, modeling, and decision-making. Unfortunately, actionable information about energy systems is often of limited availability, incomplete, or only accessible for a substantial fee or through a non-disclosure agreement. Recently, remotely sensed data (e.g., satellite imagery, aerial photography) have emerged as a potentially rich source of energy systems information. However, the use of these data is frequently challenged by its sheer volume and complexity, precluding manual analysis. Recent breakthroughs in machine learning have enabled automated and rapid extraction of useful information from remotely sensed data, facilitating large-scale acquisition of critical energy system variables. Here we present a systematic review of the literature on this emerging topic, providing an in-depth survey and review of papers published within the past two decades. We first taxonomize the existing literature into ten major areas, spanning the energy value chain. Within each research area, we distill and critically discuss major features that are relevant to energy researchers, including, for example, key challenges regarding the accessibility and reliability of the methods. We then synthesize our findings to identify limitations and trends in the literature as a whole, and discuss opportunities for innovation. These include the opportunity to extend the methods beyond electricity to broader energy systems and wider geographic areas; and the ability to expand the use of these methods in research and decision making as satellite data become cheaper and easier to access. We also find that there are persistent challenges: limited standardization and rigor of performance assessments; limited sharing of code, which would improve replicability; and a limited consideration of the ethics and privacy of data.
CVJun 29, 2021
SIMPL: Generating Synthetic Overhead Imagery to Address Zero-shot and Few-Shot Detection ProblemsYang Xu, Bohao Huang, Xiong Luo et al.
Recently deep neural networks (DNNs) have achieved tremendous success for object detection in overhead (e.g., satellite) imagery. One ongoing challenge however is the acquisition of training data, due to high costs of obtaining satellite imagery and annotating objects in it. In this work we present a simple approach - termed Synthetic object IMPLantation (SIMPL) - to easily and rapidly generate large quantities of synthetic overhead training data for custom target objects. We demonstrate the effectiveness of using SIMPL synthetic imagery for training DNNs in zero-shot scenarios where no real imagery is available; and few-shot learning scenarios, where limited real-world imagery is available. We also conduct experiments to study the sensitivity of SIMPL's effectiveness to some key design parameters, providing users for insights when designing synthetic imagery for custom objects. We release a software implementation of our SIMPL approach so that others can build upon it, or use it for their own custom problems.
CVApr 28, 2021
Randomized Histogram Matching: A Simple Augmentation for Unsupervised Domain Adaptation in Overhead ImageryCan Yaras, Kaleb Kassaw, Bohao Huang et al.
Modern deep neural networks (DNNs) are highly accurate on many recognition tasks for overhead (e.g., satellite) imagery. However, visual domain shifts (e.g., statistical changes due to geography, sensor, or atmospheric conditions) remain a challenge, causing the accuracy of DNNs to degrade substantially and unpredictably when testing on new sets of imagery. In this work, we model domain shifts caused by variations in imaging hardware, lighting, and other conditions as non-linear pixel-wise transformations, and we perform a systematic study indicating that modern DNNs can become largely robust to these types of transformations, if provided with appropriate training data augmentation. In general, however, we do not know the transformation between two sets of imagery. To overcome this, we propose a fast real-time unsupervised training augmentation technique, termed randomized histogram matching (RHM). We conduct experiments with two large benchmark datasets for building segmentation and find that despite its simplicity, RHM consistently yields similar or superior performance compared to state-of-the-art unsupervised domain adaptation approaches, while being significantly simpler and more computationally efficient. RHM also offers substantially better performance than other comparably simple approaches that are widely used for overhead imagery.
CVJan 16, 2021
GridTracer: Automatic Mapping of Power Grids using Deep Learning and Overhead ImageryBohao Huang, Jichen Yang, Artem Streltsov et al.
Energy system information valuable for electricity access planning such as the locations and connectivity of electricity transmission and distribution towers, termed the power grid, is often incomplete, outdated, or altogether unavailable. Furthermore, conventional means for collecting this information is costly and limited. We propose to automatically map the grid in overhead remotely sensed imagery using deep learning. Towards this goal, we develop and publicly-release a large dataset ($263km^2$) of overhead imagery with ground truth for the power grid, to our knowledge this is the first dataset of its kind in the public domain. Additionally, we propose scoring metrics and baseline algorithms for two grid mapping tasks: (1) tower recognition and (2) power line interconnection (i.e., estimating a graph representation of the grid). We hope the availability of the training data, scoring metrics, and baselines will facilitate rapid progress on this important problem to help decision-makers address the energy needs of societies around the world.
CVJan 15, 2020
The Synthinel-1 dataset: a collection of high resolution synthetic overhead imagery for building segmentationFanjie Kong, Bohao Huang, Kyle Bradbury et al.
Recently deep learning - namely convolutional neural networks (CNNs) - have yielded impressive performance for the task of building segmentation on large overhead (e.g., satellite) imagery benchmarks. However, these benchmark datasets only capture a small fraction of the variability present in real-world overhead imagery, limiting the ability to properly train, or evaluate, models for real-world application. Unfortunately, developing a dataset that captures even a small fraction of real-world variability is typically infeasible due to the cost of imagery, and manual pixel-wise labeling of the imagery. In this work we develop an approach to rapidly and cheaply generate large and diverse virtual environments from which we can capture synthetic overhead imagery for training segmentation CNNs. Using this approach, generate and publicly-release a collection of synthetic overhead imagery - termed Synthinel-1 with full pixel-wise building labels. We use several benchmark dataset to demonstrate that Synthinel-1 is consistently beneficial when used to augment real-world training imagery, especially when CNNs are tested on novel geographic locations or conditions.
CVFeb 28, 2019
What you get is not always what you see: pitfalls in solar array assessment using overhead imageryWei Hu, Kyle Bradbury, Jordan M. Malof et al.
Effective integration planning for small, distributed solar photovoltaic (PV) arrays into electric power grids requires access to high quality data: the location and power capacity of individual solar PV arrays. Unfortunately, national databases of small-scale solar PV do not exist; those that do are limited in their spatial resolution, typically aggregated up to state or national levels. While several promising approaches for solar PV detection have been published, strategies for evaluating the performance of these models are often highly heterogeneous from study to study. The resulting comparison of these methods for practical applications for energy assessments becomes challenging and may imply that the reported performance evaluations are overly optimistic. The heterogeneity comes in many forms, each of which we explore in this work: the level of spatial aggregation, the validation of ground truth, inconsistencies in the training and validation datasets, and the degree of diversity of the locations and sensors from which the training and validation data originate. For each, we discuss emerging practices from the literature to address them or suggest directions of future research. As part of our investigation, we evaluate solar PV identification performance in two large regions. Our findings suggest that traditional performance evaluation of the automated identification of solar PV from satellite imagery may be optimistic due to common limitations in the validation process. The takeaways from this work are intended to inform and catalyze the large-scale practical application of automated solar PV assessment techniques by energy researchers and professionals.
CVMay 30, 2018
Tiling and Stitching Segmentation Output for Remote Sensing: Basic Challenges and RecommendationsBohao Huang, Daniel Reichman, Leslie M. Collins et al.
In this work we consider the application of convolutional neural networks (CNNs) for pixel-wise labeling (a.k.a., semantic segmentation) of remote sensing imagery (e.g., aerial color or hyperspectral imagery). Remote sensing imagery is usually stored in the form of very large images, referred to as "tiles", which are too large to be segmented directly using most CNNs and their associated hardware. As a result, during label inference, smaller sub-images, called "patches", are processed individually and then "stitched" (concatenated) back together to create a tile-sized label map. This approach suffers from computational ineffiency and can result in discontinuities at output boundaries. We propose a simple alternative approach in which the input size of the CNN is dramatically increased only during label inference. This does not avoid stitching altogether, but substantially mitigates its limitations. We evaluate the performance of the proposed approach against a vonventional stitching approach using two popular segmentation CNN models and two large-scale remote sensing imagery datasets. The results suggest that the proposed approach substantially reduces label inference time, while also yielding modest overall label accuracy increases. This approach contributed to our wining entry (overall performance) in the INRIA building labeling competition.
CVJan 11, 2018
Application of a semantic segmentation convolutional neural network for accurate automatic detection and mapping of solar photovoltaic arrays in aerial imageryJoseph Camilo, Rui Wang, Leslie M. Collins et al.
We consider the problem of automatically detecting small-scale solar photovoltaic arrays for behind-the-meter energy resource assessment in high resolution aerial imagery. Such algorithms offer a faster and more cost-effective solution to collecting information on distributed solar photovoltaic (PV) arrays, such as their location, capacity, and generated energy. The surface area of PV arrays, a characteristic which can be estimated from aerial imagery, provides an important proxy for array capacity and energy generation. In this work, we employ a state-of-the-art convolutional neural network architecture, called SegNet (Badrinarayanan et. al., 2015), to semantically segment (or map) PV arrays in aerial imagery. This builds on previous work focused on identifying the locations of PV arrays, as opposed to their specific shapes and sizes. We measure the ability of our SegNet implementation to estimate the surface area of PV arrays on a large, publicly available, dataset that has been employed in several previous studies. The results indicate that the SegNet model yields substantial performance improvements with respect to estimating shape and size as compared to a recently proposed convolutional neural network PV detection algorithm.
CVJul 20, 2016
Automatic Detection of Solar Photovoltaic Arrays in High Resolution Aerial ImageryJordan M. Malof, Kyle Bradbury, Leslie M. Collins et al.
The quantity of small scale solar photovoltaic (PV) arrays in the United States has grown rapidly in recent years. As a result, there is substantial interest in high quality information about the quantity, power capacity, and energy generated by such arrays, including at a high spatial resolution (e.g., counties, cities, or even smaller regions). Unfortunately, existing methods for obtaining this information, such as surveys and utility interconnection filings, are limited in their completeness and spatial resolution. This work presents a computer algorithm that automatically detects PV panels using very high resolution color satellite imagery. The approach potentially offers a fast, scalable method for obtaining accurate information on PV array location and size, and at much higher spatial resolutions than are currently available. The method is validated using a very large (135 km^2) collection of publicly available [1] aerial imagery, with over 2,700 human annotated PV array locations. The results demonstrate the algorithm is highly effective on a per-pixel basis. It is likewise effective at object-level PV array detection, but with significant potential for improvement in estimating the precise shape/size of the PV arrays. These results are the first of their kind for the detection of solar PV in aerial imagery, demonstrating the feasibility of the approach and establishing a baseline performance for future investigations.