Christopher Stewart

CV
h-index27
6papers
38citations
Novelty37%
AI Score34

6 Papers

LGAug 1, 2022
A Case for Dataset Specific Profiling

Seth Ockerman, John Wu, Christopher Stewart

Data-driven science is an emerging paradigm where scientific discoveries depend on the execution of computational AI models against rich, discipline-specific datasets. With modern machine learning frameworks, anyone can develop and execute computational models that reveal concepts hidden in the data that could enable scientific applications. For important and widely used datasets, computing the performance of every computational model that can run against a dataset is cost prohibitive in terms of cloud resources. Benchmarking approaches used in practice use representative datasets to infer performance without actually executing models. While practicable, these approaches limit extensive dataset profiling to a few datasets and introduce bias that favors models suited for representative datasets. As a result, each dataset's unique characteristics are left unexplored and subpar models are selected based on inference from generalized datasets. This necessitates a new paradigm that introduces dataset profiling into the model selection process. To demonstrate the need for dataset-specific profiling, we answer two questions:(1) Can scientific datasets significantly permute the rank order of computational models compared to widely used representative datasets? (2) If so, could lightweight model execution improve benchmarking accuracy? Taken together, the answers to these questions lay the foundation for a new dataset-aware benchmarking paradigm.

DCMay 22, 2024
Carbon Connect: An Ecosystem for Sustainable Computing

Benjamin C. Lee, David Brooks, Arthur van Benthem et al.

Computing is at a moment of profound opportunity. Emerging applications -- such as capable artificial intelligence, immersive virtual realities, and pervasive sensor systems -- drive unprecedented demand for computer. Despite recent advances toward net zero carbon emissions, the computing industry's gross energy usage continues to rise at an alarming rate, outpacing the growth of new energy installations and renewable energy deployments. A shift towards sustainability is needed to spark a transformation in how computer systems are manufactured, allocated, and consumed. Carbon Connect envisions coordinated research thrusts that produce design and management strategies for sustainable, next-generation computer systems. These strategies must flatten and then reverse growth trajectories for computing power and carbon for society's most rapidly growing applications such as artificial intelligence and virtual spaces. We will require accurate models for carbon accounting in computing technology. For embodied carbon, we must re-think conventional design strategies -- over-provisioned monolithic servers, frequent hardware refresh cycles, custom silicon -- and adopt life-cycle design strategies that more effectively reduce, reuse and recycle hardware at scale. For operational carbon, we must not only embrace renewable energy but also design systems to use that energy more efficiently. Finally, new hardware design and management strategies must be cognizant of economic policy and regulatory landscape, aligning private initiatives with societal goals. Many of these broader goals will require computer scientists to develop deep, enduring collaborations with researchers in economics, law, and industrial ecology to spark change in broader practice.

CVSep 23, 2025
SmartWilds: Multimodal Wildlife Monitoring Dataset

Jenna Kline, Anirudh Potlapally, Bharath Pillai et al.

We present the first release of SmartWilds, a multimodal wildlife monitoring dataset. SmartWilds is a synchronized collection of drone imagery, camera trap photographs and videos, and bioacoustic recordings collected during summer 2025 at The Wilds safari park in Ohio. This dataset supports multimodal AI research for comprehensive environmental monitoring, addressing critical needs in endangered species research, conservation ecology, and habitat management. Our pilot deployment captured four days of synchronized monitoring across three modalities in a 220-acre pasture containing Pere David's deer, Sichuan takin, Przewalski's horses, as well as species native to Ohio. We provide a comparative analysis of sensor modality performance, demonstrating complementary strengths for landuse patterns, species detection, behavioral analysis, and habitat monitoring. This work establishes reproducible protocols for multimodal wildlife monitoring while contributing open datasets to advance conservation computer vision research. Future releases will include synchronized GPS tracking data from tagged individuals, citizen science data, and expanded temporal coverage across multiple seasons.

CVOct 11, 2025
Ortho-Fuse: Orthomosaic Generation for Sparse High-Resolution Crop Health Maps Through Intermediate Optical Flow Estimation

Rugved Katole, Christopher Stewart

AI-driven crop health mapping systems offer substantial advantages over conventional monitoring approaches through accelerated data acquisition and cost reduction. However, widespread farmer adoption remains constrained by technical limitations in orthomosaic generation from sparse aerial imagery datasets. Traditional photogrammetric reconstruction requires 70-80\% inter-image overlap to establish sufficient feature correspondences for accurate geometric registration. AI-driven systems operating under resource-constrained conditions cannot consistently achieve these overlap thresholds, resulting in degraded reconstruction quality that undermines user confidence in autonomous monitoring technologies. In this paper, we present Ortho-Fuse, an optical flow-based framework that enables the generation of a reliable orthomosaic with reduced overlap requirements. Our approach employs intermediate flow estimation to synthesize transitional imagery between consecutive aerial frames, artificially augmenting feature correspondences for improved geometric reconstruction. Experimental validation demonstrates a 20\% reduction in minimum overlap requirements. We further analyze adoption barriers in precision agriculture to identify pathways for enhanced integration of AI-driven monitoring systems.

SEJun 28, 2021
Avis: In-Situ Model Checking for Unmanned Aerial Vehicles

Max Taylor, Haicheng Chen, Feng Qin et al.

Control firmware in unmanned aerial vehicles (UAVs) uses sensors to model and manage flight operations, from takeoff to landing to flying between waypoints. However, sensors can fail at any time during a flight. If control firmware mishandles sensor failures, UAVs can crash, fly away, or suffer other unsafe conditions. In-situ model checking finds sensor failures that could lead to unsafe conditions by systematically failing sensors. However, the type of sensor failure and its timing within a flight affect its manifestation, creating a large search space. We propose Avis, an in-situ model checker to quickly uncover UAV sensor failures that lead to unsafe conditions. Widely used control firmware already support operating modes. Avis injects sensor failures as the control firmware transitions between modes - a key execution point where mishandled software exceptions can trigger unsafe conditions. We implemented Avis and applied it to ArduPilot and PX4. Avis found unsafe conditions 2.4X faster than Bayesian Fault Injection, the leading, state-of-the-art approach. Within the current code base of ArduPilot and PX4, Avis discovered 10 previously unknown software bugs that lead to unsafe conditions. Additionally, we reinserted 5 known bugs that caused serious, unsafe conditions and Avis correctly reported all of them.

SIMay 15, 2014
Topic words analysis based on LDA model

Xi Qiu, Christopher Stewart

Social network analysis (SNA), which is a research field describing and modeling the social connection of a certain group of people, is popular among network services. Our topic words analysis project is a SNA method to visualize the topic words among emails from Obama.com to accounts registered in Columbus, Ohio. Based on Latent Dirichlet Allocation (LDA) model, a popular topic model of SNA, our project characterizes the preference of senders for target group of receptors. Gibbs sampling is used to estimate topic and word distribution. Our training and testing data are emails from the carbon-free server Datagreening.com. We use parallel computing tool BashReduce for word processing and generate related words under each latent topic to discovers typical information of political news sending specially to local Columbus receptors. Running on two instances using paralleling tool BashReduce, our project contributes almost 30% speedup processing the raw contents, comparing with processing contents on one instance locally. Also, the experimental result shows that the LDA model applied in our project provides precision rate 53.96% higher than TF-IDF model finding target words, on the condition that appropriate size of topic words list is selected.