Ram Rajagopal

LG
h-index 18
51 papers
1,336 citations
Novelty 48%
AI Score 50

51 Papers

OC · Apr 22, 2014
Network Risk Limiting Dispatch: Optimal Control and Price of Uncertainty

Baosen Zhang, Ram Rajagopal, David Tse · stanford

Increased uncertainty due to high penetration of renewables imposes significant costs on system operators. The added costs depend on several factors, including market design, the performance of renewable generation forecasting, and the specific dispatch procedure. Quantifying these costs has been limited to small-sample Monte Carlo approaches applied to specific dispatch algorithms. The computational complexity and accuracy of these approaches have limited the understanding of tradeoffs between different factors. In this work we consider a two-stage stochastic economic dispatch problem. Our goal is to provide an analytical quantification and an intuitive understanding of the effects of uncertainties and network congestion on the dispatch procedure and the optimal cost. We first consider an uncongested network and calculate the risk limiting dispatch. In addition, we derive the price of uncertainty, a number that characterizes the intrinsic impact of uncertainty on the integration cost of renewables. Then we extend the results to a network where one link can become congested. Under mild conditions, we calculate the price of uncertainty even in this case. We show that risk limiting dispatch is given by a set of deterministic equilibrium equations. The dispatch solution yields an important insight: congested links do not create isolated nodes, even in a two-node network. In fact, the network can support backflows in congested links, which help reduce uncertainty by averaging supply across the network. We demonstrate the performance of our approach on standard IEEE benchmark networks.
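To make the two-stage idea concrete, here is a minimal single-bus sketch. It is a textbook newsvendor simplification, not the paper's derivation: it assumes Gaussian net-load forecast error, a day-ahead price `price_da`, a higher real-time price `price_rt`, and no network constraints. All function and parameter names are hypothetical.

```python
from statistics import NormalDist


def day_ahead_dispatch(mean_net_load, sigma, price_da, price_rt):
    """Day-ahead quantity minimizing expected cost in a simplified two-stage model.

    Assumed cost (illustrative only):
        price_da * g + price_rt * E[max(net_load - g, 0)]
    with net_load ~ Normal(mean_net_load, sigma^2).
    """
    if not 0 < price_da < price_rt:
        raise ValueError("expected 0 < price_da < price_rt")
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    # First-order condition: P(net_load > g) = price_da / price_rt,
    # so g is the (1 - price_da / price_rt) quantile of net load.
    critical_quantile = 1.0 - price_da / price_rt
    return mean_net_load + sigma * NormalDist().inv_cdf(critical_quantile)
```

In this toy model the dispatch equals the forecast mean when real-time energy costs exactly twice the day-ahead price, and rises above the mean as the real-time premium grows.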

OC · Feb 2, 2015
Competition and Coalition Formation of Renewable Power Producers

Baosen Zhang, Ramesh Johari, Ram Rajagopal · stanford

We investigate group formations and strategic behaviors of renewable power producers in electricity markets. These producers currently bid into the day-ahead market in a conservative fashion because of the real-time risk associated with not meeting their bid amount. It has been suggested in the literature that producers would bid less conservatively if they could form large groups to take advantage of spatial diversity to reduce the uncertainty in their aggregate output. We show that large groups of renewable producers would act strategically to lower the aggregate output because of market power. To maximize renewable power production, we characterize the trade-off between market power and generation uncertainty as a function of the size of the groups. We show there is a sweet spot in the sense that there exist groups that are large enough to achieve the uncertainty reduction of the grand coalition, but are small enough that they have no significant market power. We consider both independent and correlated forecast errors under a fixed real-time penalty. We also consider a real-time market where both selling and buying of energy are allowed. We validate our claims using PJM and NREL data.
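The uncertainty-reduction side of this trade-off can be illustrated with a short calculation. The sketch below assumes (as a simplification not taken from the paper) that every producer has the same forecast-error standard deviation `sigma` and that every pair of producers shares the same correlation `rho`; the function name is hypothetical.

```python
import math


def per_producer_error_std(n, sigma, rho=0.0):
    """Std of the average forecast error across a group of n producers.

    Assumes equal error variance sigma^2 and a common pairwise correlation rho.
    Var(sum) = n*sigma^2 + n*(n-1)*rho*sigma^2; dividing by n^2 gives the
    variance of the per-producer average.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    factor = (1.0 + (n - 1) * rho) / n
    if factor < 0:
        raise ValueError("rho is too negative for a valid covariance matrix")
    return sigma * math.sqrt(factor)
```

With independent errors the per-producer uncertainty falls like one over the square root of the group size, so most of the benefit arrives at moderate group sizes. With positive correlation it levels off at `sigma * sqrt(rho)` instead of vanishing.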

SY · Dec 17, 2018
PaToPaEM: A Data-Driven Parameter and Topology Joint Estimation Framework for Time Varying System in Distribution Grids

Jiafan Yu, Yang Weng, Ram Rajagopal

Grid topology and line parameters are essential for grid operation and planning, but they may be missing or inaccurate in distribution grids. Existing data-driven approaches for recovering such information usually suffer from ignoring 1) input measurement errors and 2) possible state changes among historical measurements. While using the errors-in-variables (EIV) model and letting the parameter and topology estimation interact with each other (PaToPa) can address input and output measurement error modeling, it only works when all measurements are from a single system state. To solve the two challenges simultaneously, we propose the PaToPaEM framework for joint line parameter and topology estimation with historical measurements from different unknown states. We improve the static framework that only works when measurements are from one single state, by further treating state changes in historical measurements as an unobserved latent variable. We then systematically analyze the new mathematical modeling, decouple the optimization problem, and incorporate the expectation-maximization (EM) algorithm to recover different hidden states in measurements. Combining these, the PaToPaEM framework enables joint topology and line parameter estimation using noisy measurements from multiple system states. It lays a solid foundation for data-driven system identification in distribution grids. Superior numerical results validate the practicability of the PaToPaEM framework.

IT · Dec 6, 2010
Simultaneous Sequential Detection of Multiple Interacting Faults

Ram Rajagopal, XuanLong Nguyen, Sinem Coleri Ergen et al.

Single-fault sequential change point problems have become important in modeling various phenomena in large distributed systems, such as sensor networks. However, such systems often present multiple interacting faults. For example, individual sensors in a network may fail and detection is performed by comparing measurements between sensors, resulting in statistical dependency among faults. We present a new formulation for multiple interacting faults in a distributed system. The formulation includes specifications of how individual subsystems composing the large system may fail, the information that can be shared among these subsystems and the interaction pattern between faults. We then specify a new sequential algorithm for detecting these faults. The main feature of the algorithm is that it uses composite stopping rules for a subsystem that depend on the decision of other subsystems. We provide asymptotic false alarm and detection delay analysis for this algorithm in the Bayesian setting and show that under certain conditions the algorithm is optimal. The analysis methodology relies on novel detailed comparison techniques between stopping times. We validate the approach with some simulations.

SY · Nov 17, 2016
Distribution System Outage Detection using Consumer Load and Line Flow Measurements

Raffi Sevlian, Yue Zhao, Andrea Goldsmith et al.

An outage detection framework for power distribution networks is proposed. Given the tree structure of the distribution system, a method is developed combining the use of real-time power flow measurements on edges of the tree with load forecasts at the nodes of the tree. A maximum a posteriori (MAP) detector is formulated for an arbitrary number and location of outages on trees and is shown to admit an efficient implementation. A framework relying on the maximum missed detection probability is used for optimal sensor placement and is solved for tree networks. Finally, a set of case studies is considered using feeder data from the Pacific Northwest National Laboratories. We show that accepting a 10% loss in network-wide mean detection reliability reduces the required sensor density by 60% for a typical feeder, provided the measurements are used efficiently.

SY · Sep 15, 2017
Distribution System Topology Detection Using Consumer Load and Line Flow Measurements

Raffi Avo Sevlian, Ram Rajagopal

This work presents a topology detection method combining home smart meter information and sparse line flow measurements. The problem is formulated as a spanning tree detection problem over a graph given partial nodal and edge flow information, in both a deterministic and a stochastic setting. In the deterministic case of known nodal power consumption and edge flows, we provide a sensor placement criterion which guarantees correct identification of all spanning trees. We then present a detection method whose complexity is polynomial in the size of the graph. In the stochastic case where loads are given by forecasts derived from delayed smart meter data, we provide a combinatorial Maximum a Posteriori (MAP) detector and a polynomial-complexity approximate MAP detector, which is shown in numerical cases to perform near-optimally in the low-noise regime and moderately well in the higher-noise regime.

SP · Nov 1, 2018
A Two-layer Decentralized Control Architecture for DER Coordination

Thomas Navidi, Abbas El Gamal, Ram Rajagopal

This paper presents a two-layer distributed energy resource (DER) coordination architecture that allows for separate ownership of data, operates with data subjected to a large buffering delay, and employs a new measure of power quality. The two-layer architecture comprises a centralized model predictive controller (MPC) and several decentralized MPCs each operating independently with no direct communication between them and with infrequent communication with the centralized controller. The goal is to minimize a combination of total energy cost and a measure of power quality while obeying cyber-physical constraints. The global controller utilizes a fast AC optimal power flow (OPF) solver and extensive parallelization to scale the solution to large networks. Each local controller attempts to maximize arbitrage profit while following the load profile and constraints dictated by the global controller. Extensive simulations are performed for two distribution networks under a wide variety of possible storage and solar penetrations enabled by the controller speed. The simulations show that (i) the two-layer architecture can achieve tenfold improvement in power quality relative to no coordination, while capturing nearly all of the available arbitrage profit for a moderate amount of storage penetration, and (ii) both power quality and arbitrage profits are optimized when the solar and storage are distributed more widely over the network, hence it is more effective to install storage closer to the consumer.

CY · Apr 22, 2022
Constructing dynamic residential energy lifestyles using Latent Dirichlet Allocation

Xiao Chen, Chad Zanocco, June Flora et al.

The rapid expansion of Advanced Meter Infrastructure (AMI) has dramatically altered the energy information landscape. However, our ability to use this information to generate actionable insights about residential electricity demand remains limited. In this research, we propose and test a new framework for understanding residential electricity demand by using a dynamic energy lifestyles approach that is iterative and highly extensible. To obtain energy lifestyles, we develop a novel approach that applies Latent Dirichlet Allocation (LDA), a method commonly used for inferring the latent topical structure of text data, to extract a series of latent household energy attributes. By doing so, we provide a new perspective on household electricity consumption where each household is characterized by a mixture of energy attributes that form the building blocks for identifying a sparse collection of energy lifestyles. We examine this approach by running experiments on one year of hourly smart meter data from 60,000 households and we extract six energy attributes that describe general daily use patterns. We then use clustering techniques to derive six distinct energy lifestyle profiles from energy attribute proportions. Our lifestyle approach is also flexible to varying time interval lengths, and we test our lifestyle approach seasonally (Autumn, Winter, Spring, and Summer) to track energy lifestyle dynamics within and across households and find that around 73% of households manifest multiple lifestyles across a year. These energy lifestyles are then compared to different energy use characteristics, and we discuss their practical applications for demand response program design and lifestyle change analysis.

SY · Jul 16, 2019
Electric vehicle charging during the day or at night: a perspective on carbon emissions

Xiao Chen, Chin-Woo Tan, Sila Kiliccote et al.

We propose an emission-oriented charging scheme to evaluate the electricity-sector emissions of electric vehicle (EV) charging in the Electric Reliability Council of Texas (ERCOT) region. We investigate both day- and night-charging scenarios combined with realistic system load demand under the emission-oriented vs direct charging schemes. Our emission-oriented charging scheme reduces carbon emissions in the day by 13.8% on average. We also find that emission-oriented charging results in a significant CO2 reduction in 30% of the days in a year compared with direct charging. Apart from offering a flat rebate for EV owners, our analysis reveals that certain policy incentives (e.g. pricing) regarding EV charging should be taken into account in order to reflect the benefits of emissions reduction that haven't been incorporated in the current market of electricity transactions.

OC · Dec 3, 2012
Risk Limiting Dispatch with Fast Ramping Storage

Junjie Qin, Han-I Su, Ram Rajagopal

Risk Limiting Dispatch (RLD) was proposed recently as a mechanism that utilizes information and market recourse to reduce reserve capacity requirements and emissions and to achieve other system operator objectives. It induces a set of simple dispatch rules that can be easily embedded into the existing dispatch systems to provide computationally efficient and reliable decisions. Storage is emerging as an alternative to mitigate the uncertainty in the grid. This paper extends the RLD framework to incorporate fast-ramping storage. It develops a closed-form threshold rule for the optimal stochastic dispatch incorporating a sequence of markets and real-time information. An efficient algorithm to evaluate the thresholds is developed based on analysis of the optimal storage operation. Simple approximations that rely on continuous-time approximations of the solution for the discrete-time control problem are also studied. The benefits of storage with respect to prediction quality and storage capacity are examined, and the overall effect on dispatch is quantified. Numerical experiments illustrate the proposed procedures.

53.8 · AR · Apr 16
EasyRider: Mitigating Power Transients in Datacenter-Scale Training Workloads

Dillon Jensen, Obi Nnorom, Grant Wilkins et al.

Large-scale AI model training workloads use thousands of GPUs operating in tightly synchronized loops. During synchronous communication, start-up, shut-down, and checkpointing, GPU power consumption can swing from peak to idle within milliseconds. These large and rapid load swings endanger grid infrastructure as they induce steep power ramp rates, voltage and frequency shifts, and reactive power transients that can damage transformers, converters, and protection equipment. To solve this problem, we introduce EasyRider, a power architecture to mitigate power fluctuations at the rack level. EasyRider uses passive components and actively-controlled auxiliary energy storage to attenuate rack power swings. A software system continually monitors the energy storage system to maximize its lifetime in the presence of frequent charge/discharge cycles. EasyRider filters rack power variations to be within grid safety requirements without requiring software modifications to AI training frameworks or wasting energy. We evaluate EasyRider on a 400VDC-rated prototype system against published workload traces and our own GPU testbed, demonstrating its effectiveness across heterogeneous power levels and workload power profiles.

CV · Jan 4, 2023
Detecting Neighborhood Gentrification at Scale via Street-level Visual Data

Tianyuan Huang, Timothy Dai, Zhecheng Wang et al.

Neighborhood gentrification plays a significant role in shaping the social and economic well-being of both individuals and communities at large. While some efforts have been made to detect gentrification in cities, existing approaches rely mainly on estimated measures from survey data, require substantial human labeling, and are limited in characterizing the neighborhood as a whole. We propose a novel approach to detecting neighborhood gentrification at large scale based on the physical appearance of neighborhoods by incorporating historical street-level visual data. We show the effectiveness of the proposed method by comparing results from our approach with gentrification measures from previous literature and case studies. Our approach has the potential to supplement existing indicators of gentrification and become a valid resource for urban researchers and policy makers.

CV · Jun 5, 2022
Estimating building energy efficiency from street view imagery, aerial imagery, and land surface temperature data

Kevin Mayer, Lukas Haas, Tianyuan Huang et al.

Current methods to determine the energy efficiency of buildings require on-site visits by certified energy auditors, which makes the process slow, costly, and geographically incomplete. To accelerate the identification of promising retrofit targets on a large scale, we propose to estimate building energy efficiency from widely available and remotely sensed data sources only, namely street view, aerial view, footprint, and satellite-borne land surface temperature (LST) data. After collecting data for almost 40,000 buildings in the United Kingdom, we combine these data sources by training multiple end-to-end deep learning models with the objective to classify buildings as energy efficient (EU rating A-D) or inefficient (EU rating E-G). After evaluating the trained models quantitatively as well as qualitatively, we extend our analysis by studying the predictive power of each data source in an ablation study. We find that the end-to-end deep learning model trained on all four data sources achieves a macro-averaged F1 score of 64.64% and outperforms the k-NN and SVM-based baseline models by 14.13 and 12.02 percentage points, respectively. Thus, this work shows the potential and complementary nature of remotely sensed data in predicting energy efficiency and opens up new opportunities for future work to integrate additional data sources.
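For readers unfamiliar with the reported metric, the sketch below shows how a macro-averaged F1 score is computed in general: F1 is calculated per class and then averaged with equal weight, so the minority class counts as much as the majority class. This is a generic illustration with made-up labels, not the paper's evaluation code; the function name is hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all labels that appear."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

For example, with true labels `[1, 1, 0, 0]` (1 = efficient, 0 = inefficient) and predictions `[1, 0, 0, 0]`, the per-class F1 scores are 2/3 and 4/5, giving a macro average of 11/15.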

83.8 · DC · Mar 19
From Servers to Sites: Compositional Power Trace Generation of LLM Inference for Infrastructure Planning

Grant Wilkins, Fiodar Kazhamiaka, Ram Rajagopal

Datacenter operators and electrical utilities rely on power traces at different spatiotemporal scales. Operators use fine-grained traces for provisioning, facility management, and scheduling, while utilities use site-level load profiles for capacity and interconnection planning. Existing datacenter power models do not capture LLM inference workloads, in which GPUs shift rapidly among compute-intensive prefill, lower-power decode, and idle states, and facility demand depends on how these states evolve and synchronize across many devices. We show that LLM inference power can be represented compositionally through two components: workload-driven transitions among operating states and configuration-specific power distributions within those states. Building on this observation, we develop a trace-generation framework that learns from measured traces and synthesizes power profiles for new traffic conditions and serving configurations. These traces aggregate from GPU servers to rack-, row-, and facility-scale load profiles at the temporal granularity required by the study. Across multiple LLMs, tensor-parallel settings, and GPU generations, our framework achieves median absolute energy error below 5% for most configurations while preserving temporal autocorrelation structure. The resulting traces support downstream analyses including oversubscription, power modulation, and utility-facing load characterization, enabling infrastructure evaluations that flat nameplate assumptions and static trace replay cannot support.

LG · Jun 29, 2023
TemperatureGAN: Generative Modeling of Regional Atmospheric Temperatures

Emmanuel Balogun, Ram Rajagopal, Arun Majumdar

Stochastic generators are useful for estimating climate impacts on various sectors. Projecting climate risk in various sectors, e.g. energy systems, requires generators that are accurate (statistical resemblance to ground-truth), reliable (do not produce erroneous examples), and efficient. Leveraging data from the North American Land Data Assimilation System, we introduce TemperatureGAN, a Generative Adversarial Network conditioned on months, locations, and time periods, to generate 2m above ground atmospheric temperatures at an hourly resolution. We propose evaluation methods and metrics to measure the quality of generated samples. We show that TemperatureGAN produces high-fidelity examples with good spatial representation and temporal dynamics consistent with known diurnal cycles.

CV · Dec 20, 2023
SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing

Zhecheng Wang, Rajanie Prabha, Tianyuan Huang et al.

Remote sensing imagery, despite its broad applications in helping achieve Sustainable Development Goals and tackle climate change, has not yet benefited from the recent advancements of versatile, task-agnostic vision language models (VLMs). A key reason is that the large-scale, semantically diverse image-text dataset required for developing VLMs is still absent for remote sensing images. Unlike natural images, remote sensing images and their associated text descriptions cannot be efficiently collected from the public Internet at scale. In this work, we bridge this gap by using geo-coordinates to automatically connect open, unlabeled remote sensing images with rich semantics covered in OpenStreetMap, and thus construct SkyScript, a comprehensive vision-language dataset for remote sensing images, comprising 2.6 million image-text pairs covering 29K distinct semantic tags. With continual pre-training on this dataset, we obtain a VLM that surpasses baseline models with a 6.2% average accuracy gain in zero-shot scene classification across seven benchmark datasets. It also demonstrates the ability of zero-shot transfer for fine-grained object attribute classification and cross-modal retrieval. We hope this dataset can support the advancement of VLMs for various multi-modal tasks in remote sensing, such as open-vocabulary classification, retrieval, captioning, and text-to-image synthesis.

CV · Jan 2, 2024
CityPulse: Fine-Grained Assessment of Urban Change with Street View Time Series

Tianyuan Huang, Zejia Wu, Jiajun Wu et al.

Urban transformations have profound societal impact on both individuals and communities at large. Accurately assessing these shifts is essential for understanding their underlying causes and ensuring sustainable urban planning. Traditional measurements often encounter constraints in spatial and temporal granularity, failing to capture real-time physical changes. By contrast, street view imagery, capturing the heartbeat of urban spaces from a pedestrian point of view, can serve as a high-definition, up-to-date, and on-the-ground visual proxy of urban change. We curate the largest street view time series dataset to date, and propose an end-to-end change detection model to effectively capture physical alterations in the built environment at scale. We demonstrate the effectiveness of our proposed method through benchmark comparisons with previous literature and by implementing it at the city-wide level. Our approach has the potential to supplement existing datasets and serve as a fine-grained and accurate assessment of urban change.

LG · Dec 13, 2024
Towards Using Machine Learning to Generatively Simulate EV Charging in Urban Areas

Marek Miltner, Jakub Zíka, Daniel Vašata et al.

This study addresses the challenge of predicting electric vehicle (EV) charging profiles in urban locations with limited data. Utilizing a neural network architecture, we aim to uncover latent charging profiles influenced by spatio-temporal factors. Our model focuses on peak power demand and daily load shapes, providing insights into charging behavior. Our results indicate significant impacts from the type of Basic Administrative Units on predicted load curves, which contributes to the understanding and optimization of EV charging infrastructure in urban settings and allows Distribution System Operators (DSO) to more efficiently plan EV charging infrastructure expansion.

LG · Sep 2, 2025
Extending Load Forecasting from Zonal Aggregates to Individual Nodes for Transmission System Operators

Oskar Triebe, Fletcher Passow, Simon Wittner et al.

The reliability of local power grid infrastructure is challenged by sustainable energy developments increasing electric load uncertainty. Transmission System Operators (TSOs) need load forecasts of higher spatial resolution, extending current forecasting operations from zonal aggregates to individual nodes. However, nodal loads are harder to forecast accurately and require a large number of individual forecasts, which are hard to manage for the human experts (operators) who assess risks in the control room's daily operations. In collaboration with a TSO, we design a multi-level system that meets the needs of operators for hourly day-ahead load forecasting. Utilizing a uniquely extensive dataset of zonal and nodal net loads, we experimentally evaluate our system components. First, we develop an interpretable and scalable forecasting model that allows TSOs to gradually extend zonal operations to include nodal forecasts. Second, we evaluate solutions to address the heterogeneity and volatility of nodal load, subject to a trade-off. Third, our system is manageable with a fully parallelized single-model forecasting workflow. Our results show accuracy and interpretability improvements for zonal forecasts, and substantial improvements for nodal forecasts. In practice, our multi-level forecasting system allows operators to adjust forecasts with unprecedented confidence and accuracy, and to diagnose otherwise opaque errors precisely.

HC · Apr 22, 2024
Designing forecasting software for forecast users: Empowering non-experts to create and understand their own forecasts

Richard Stromer, Oskar Triebe, Chad Zanocco et al.

Forecasts inform decision-making in nearly every domain. Forecasts are often produced by experts with rare or hard-to-acquire skills. In practice, forecasts are often used by domain experts and managers with little forecasting expertise. Our study focuses on how to design forecasting software that empowers non-expert users. We study how users can make use of state-of-the-art forecasting methods and embed their domain knowledge, and how they build understanding of and trust in generated forecasts. To do so, we co-designed a forecasting software prototype using feedback from users and then analyzed their interactions with our prototype. Our results identified three main considerations for non-expert users: (1) a safe stepwise approach facilitating causal understanding and trust; (2) a white box model supporting human-reasoning-friendly components; (3) the inclusion of domain knowledge. This paper contributes insights into how non-expert users interact with forecasting software and recommends ways to design more accessible forecasting software.

LG · Nov 29, 2021
NeuralProphet: Explainable Forecasting at Scale

Oskar Triebe, Hansika Hewamalage, Polina Pilyugina et al.

We introduce NeuralProphet, a successor to Facebook Prophet, which set an industry standard for explainable, scalable, and user-friendly forecasting frameworks. With the proliferation of time series data, explainable forecasting remains a challenging task for business and operational decision making. Hybrid solutions are needed to bridge the gap between interpretable classical methods and scalable deep learning models. We view Prophet as a precursor to such a solution. However, Prophet lacks local context, which is essential for forecasting the near-term future and is challenging to extend due to its Stan backend. NeuralProphet is a hybrid forecasting framework based on PyTorch and trained with standard deep learning methods, making it easy for developers to extend the framework. Local context is introduced with auto-regression and covariate modules, which can be configured as classical linear regression or as Neural Networks. Otherwise, NeuralProphet retains the design philosophy of Prophet and provides the same basic model components. Our results demonstrate that NeuralProphet produces interpretable forecast components of equivalent or superior quality to Prophet on a set of generated time series. NeuralProphet outperforms Prophet on a diverse collection of real-world datasets. For short to medium-term forecasts, NeuralProphet improves forecast accuracy by 55 to 92 percent.

LG · Aug 9, 2021
EVGen: Adversarial Networks for Learning Electric Vehicle Charging Loads and Hidden Representations

Robert Buechler, Emmanuel Balogun, Arun Majumdar et al.

The nexus between transportation, the power grid, and consumer behavior is more pronounced than ever before as the race to decarbonize the transportation sector intensifies. Electrification in the transportation sector has led to technology shifts and rapid deployment of electric vehicles (EVs). The potential increase in stochastic and spatially heterogeneous charging load presents a unique challenge that is not well studied, and will have significant impacts on grid operations, emissions, and system reliability if not managed effectively. Realistic scenario generators can help operators prepare, and machine learning can be leveraged to this end. In this work, we develop generative adversarial networks (GANs) to learn distributions of electric vehicle (EV) charging sessions and disentangled representations. We show that this model structure successfully parameterizes unlabeled temporal and power patterns without supervision and is able to generate synthetic data conditioned on these parameters. We benchmark the generation capability of this model with Gaussian Mixture Models (GMMs), and empirically show that our proposed model framework is better at capturing charging distributions and temporal dynamics.

SY · Jul 1, 2021
Joint Optimization of Autonomous Electric Vehicle Fleet Operations and Charging Station Siting

Justin Luke, Mauro Salazar, Ram Rajagopal et al.

Charging infrastructure is the coupling link between power and transportation networks, thus determining charging station siting is necessary for planning of power and transportation systems. While previous works have either optimized for charging station siting given historic travel behavior, or optimized fleet routing and charging given an assumed placement of the stations, this paper introduces a linear program that optimizes for station siting and macroscopic fleet operations in a joint fashion. Given an electricity retail rate and a set of travel demand requests, the optimization minimizes total cost for an autonomous EV fleet, comprising travel costs, station procurement costs, fleet procurement costs, and electricity costs, including demand charges. Specifically, the optimization returns the number of charging plugs for each charging rate (e.g., Level 2, DC fast charging) at each candidate location, as well as the optimal routing and charging of the fleet. In a case study of an electric vehicle fleet operating in San Francisco, our results show that, albeit with range limitations, small EVs with low procurement costs and high energy efficiencies are the most cost-effective in terms of total ownership costs. Furthermore, the optimal siting of charging stations is more spatially distributed than the current siting of stations, consisting mainly of high-power Level 2 AC stations (16.8 kW) with a small share of DC fast charging stations and no standard 7.7 kW Level 2 stations. Optimal siting reduces the total costs, empty vehicle travel, and peak charging load by up to 10%.

LG · May 6, 2021
Learning Neighborhood Representation from Multi-Modal Multi-Graph: Image, Text, Mobility Graph and Beyond

Tianyuan Huang, Zhecheng Wang, Hao Sheng et al.

Recent urbanization has coincided with the enrichment of geotagged data, such as street view and point-of-interest (POI). Region embedding enhanced by the richer data modalities has enabled researchers and city administrators to understand the built environment, socioeconomics, and the dynamics of cities better. While some efforts have been made to simultaneously use multi-modal inputs, existing methods can be improved by incorporating different measures of 'proximity' in the same embedding space - leveraging not only the data that characterizes the regions (e.g., street view, local business patterns) but also those that depict the relationship between regions (e.g., trips, road network). To this end, we propose a novel approach to integrate multi-modal geotagged inputs as either node or edge features of a multi-graph based on their relations with the neighborhood region (e.g., tiles, census block, ZIP code region, etc.). We then learn the neighborhood representation based on a contrastive-sampling scheme from the multi-graph. Specifically, we use street view images and POI features to characterize neighborhoods (nodes) and use human mobility to characterize the relationship between neighborhoods (directed edges). We show the effectiveness of the proposed methods with quantitative downstream tasks as well as qualitative analysis of the embedding space: the embedding we trained outperforms the ones using only unimodal data as regional inputs.

SPApr 1, 2021
Quick Line Outage Identification in Urban Distribution Grids via Smart Meters

Yizheng Liao, Yang Weng, Chin-woo Tan et al.

The growing integration of distributed energy resources (DERs) in distribution grids raises various reliability issues due to DERs' uncertain and complex behaviors. With a large-scale DER penetration in distribution grids, traditional outage detection methods, which rely on customer reports and smart meters' last gasp signals, will have poor performance, because the renewable generators, storage, and the mesh structure in urban distribution grids can continue supplying power after line outages. To address these challenges, we propose a data-driven outage monitoring approach based on the stochastic time series analysis with a theoretical guarantee. Specifically, we prove via power flow analysis that the dependency of time-series voltage measurements exhibits significant statistical changes after line outages. This makes the theory on optimal change-point detection suitable to identify line outages. However, existing change point detection methods require the post-outage voltage distribution, which is unknown in distribution systems. Therefore, we design a maximum likelihood estimator to directly learn the distribution parameters from voltage data. We prove that the estimated parameters-based detection also achieves the optimal performance, making it extremely useful for fast distribution grid outage identification. Furthermore, since smart meters have been widely installed in distribution grids and advanced infrastructure (e.g., PMU) has not been widely available, our approach only requires voltage magnitude for quick outage identification. Simulation results show highly accurate outage identification in eight distribution grids with 14 configurations with and without DERs using smart meter data.
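
To make the change-point idea in this abstract concrete, here is a minimal sketch of a classical CUSUM detector for a mean shift in Gaussian data. It is not the paper's method: it assumes the post-change mean `mu1` is known, which is exactly the information the paper says is unavailable and replaces with a maximum likelihood estimate. The function name, parameters, and sample values are hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence


def cusum_first_alarm(
    samples: Sequence[float],
    mu0: float,
    mu1: float,
    sigma: float,
    threshold: float,
) -> Optional[int]:
    """Return the index of the first CUSUM alarm, or None if no alarm fires.

    Assumption (illustration only): samples are Gaussian with known standard
    deviation `sigma`, mean `mu0` before the change and `mu1` after it.
    """
    stat = 0.0
    for t, x in enumerate(samples):
        # Log-likelihood ratio of the post-change vs. pre-change Gaussian.
        llr = (mu1 - mu0) / sigma**2 * (x - (mu0 + mu1) / 2.0)
        # CUSUM recursion: accumulate evidence, never dropping below zero.
        stat = max(0.0, stat + llr)
        if stat >= threshold:
            return t
    return None


# Synthetic signal: the mean shifts from 0.0 to 1.0 at index 20.
signal = [0.0] * 20 + [1.0] * 20
alarm_index = cusum_first_alarm(signal, mu0=0.0, mu1=1.0, sigma=1.0, threshold=2.0)
```

A higher `threshold` lowers the false-alarm rate at the cost of a longer detection delay, which is the trade-off that optimal change-point detection formalizes.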

CVDec 7, 2020
An Enriched Automated PV Registry: Combining Image Recognition and 3D Building Data

Benjamin Rausch, Kevin Mayer, Marie-Louise Arlt et al.

While photovoltaic (PV) systems are installed at an unprecedented rate, reliable information on an installation level remains scarce. As a result, automatically created PV registries are a timely contribution to optimize grid planning and operations. This paper demonstrates how aerial imagery and three-dimensional building data can be combined to create an address-level PV registry, specifying area, tilt, and orientation angles. We demonstrate the benefits of this approach for PV capacity estimation. In addition, this work presents, for the first time, a comparison between automated and officially-created PV registries. Our results indicate that our enriched automated registry proves to be useful to validate, update, and complement official registries.

SPDec 2, 2020
Generating private data with user customization

Xiao Chen, Thomas Navidi, Ram Rajagopal

Personal devices such as mobile phones can produce and store large amounts of data that can enhance machine learning models; however, this data may contain private information specific to the data owner that prevents the release of the data. We want to reduce the correlation between user-specific private information and the data while retaining the useful information. Rather than training a large model to achieve privatization from end to end, we first decouple the creation of a latent representation, and then privatize the data, which allows user-specific privatization to occur in a setting with limited computation and minimal disturbance on the utility of the data. We leverage a Variational Autoencoder (VAE) to create a compact latent representation of the data that remains fixed for all devices and all possible private labels. We then train a small generative filter to perturb the latent representation based on user-specified preferences regarding the private and utility information. The small filter is trained via a GAN-type robust optimization that can take place on a distributed device such as a phone or tablet. Under special conditions of our linear filter, we disclose the connections between our generative approach and Rényi differential privacy. We conduct experiments on multiple datasets including MNIST, UCI-Adult, and CelebA, and give a thorough evaluation including visualizing the geometry of the latent embeddings and estimating the empirical mutual information to show the effectiveness of our approach.

HCOct 26, 2020
Activity Detection And Modeling Using Smart Meter Data: Concept And Case Studies

Hao Wang, Gonzague Henri, Chin-Woo Tan et al.

Electricity consumed by residential consumers accounts for a significant part of global electricity consumption, and utility companies can collect high-resolution load data thanks to the widely deployed advanced metering infrastructure. There has been a growing research interest toward appliance load disaggregation via nonintrusive load monitoring. As the electricity consumption of appliances is directly associated with the activities of consumers, this paper proposes a new and more effective approach, i.e., activity disaggregation. We present the concept of activity disaggregation and discuss its advantage over traditional appliance load disaggregation. We develop a framework by leveraging machine learning for activity detection based on residential load data and features. We demonstrate through numerical case studies the effectiveness of the activity detection method and analyze consumer behaviors by time-dependent activity modeling. Last but not least, we discuss some potential use cases that can benefit from activity disaggregation and some future research directions.

LGOct 9, 2020
Short-Term Solar Irradiance Forecasting Using Calibrated Probabilistic Models

Eric Zelikman, Sharon Zhou, Jeremy Irvin et al.

Advancing probabilistic solar forecasting methods is essential to supporting the integration of solar energy into the electricity grid. In this work, we develop a variety of state-of-the-art probabilistic models for forecasting solar irradiance. We investigate the use of post-hoc calibration techniques for ensuring well-calibrated probabilistic predictions. We train and evaluate the models using public data from seven stations in the SURFRAD network, and demonstrate that the best model, NGBoost, achieves higher performance at an intra-hourly resolution than the best benchmark solar irradiance forecasting model across all stations. Further, we show that NGBoost with CRUDE post-hoc calibration achieves comparable performance to a numerical weather prediction model on hourly-resolution forecasting.

LGJun 12, 2020
FedGAN: Federated Generative Adversarial Networks for Distributed Data

Mohammad Rasouli, Tao Sun, Ram Rajagopal

We propose Federated Generative Adversarial Network (FedGAN) for training a GAN across distributed sources of non-independent-and-identically-distributed data subject to communication and privacy constraints. Our algorithm uses local generators and discriminators which are periodically synced via an intermediary that averages and broadcasts the generator and discriminator parameters. We theoretically prove the convergence of FedGAN with both equal and two time-scale updates of generator and discriminator, under standard assumptions, using stochastic approximations and communication efficient stochastic gradient descents. We experiment with FedGAN on toy examples (2D system, mixed Gaussian, and Swiss roll), image datasets (MNIST, CIFAR-10, and CelebA), and time series datasets (household electricity consumption and electric vehicle charging sessions). We show FedGAN converges and has similar performance to general distributed GAN, while reducing communication complexity. We also show its robustness to reduced communications.
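
The sync step described in this abstract (an intermediary averages the local parameters and broadcasts the result) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses an equal-weight average, whereas the actual algorithm may weight agents differently, and the names `average_parameters` and `sync` and the dict-of-lists parameter layout are hypothetical.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def average_parameters(local_params: List[Params]) -> Params:
    """Element-wise mean of each named parameter vector across agents.

    Assumption (illustration only): every agent holds the same parameter
    names with the same lengths, and all agents are weighted equally.
    """
    n_agents = len(local_params)
    reference = local_params[0]
    return {
        name: [
            sum(agent[name][i] for agent in local_params) / n_agents
            for i in range(len(values))
        ]
        for name, values in reference.items()
    }


def sync(local_params: List[Params]) -> List[Params]:
    """Average the parameters, then give every agent its own copy."""
    averaged = average_parameters(local_params)
    return [{name: list(values) for name, values in averaged.items()}
            for _ in local_params]


# Two agents, each with toy generator ("gen") and discriminator ("disc") weights.
agents = [
    {"gen": [1.0, 2.0], "disc": [0.0]},
    {"gen": [3.0, 4.0], "disc": [2.0]},
]
synced = sync(agents)
```

Between sync rounds each agent would run its own local generator and discriminator updates; only the parameter vectors, not the raw data, pass through the intermediary.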

CVMay 7, 2020
Effective Data Fusion with Generalized Vegetation Index: Evidence from Land Cover Segmentation in Agriculture

Hao Sheng, Xiao Chen, Jingyi Su et al.

How can we effectively leverage the domain knowledge from remote sensing to better segment agriculture land cover from satellite images? In this paper, we propose a novel, model-agnostic, data-fusion approach for vegetation-related computer vision tasks. Motivated by the various Vegetation Indices (VIs), which are introduced by domain experts, we systematically reviewed the VIs that are widely used in remote sensing and their feasibility to be incorporated in deep neural networks. To fully leverage the Near-Infrared channel, the traditional Red-Green-Blue channels, and Vegetation Index or its variants, we propose a Generalized Vegetation Index (GVI), a lightweight module that can be easily plugged into many neural network architectures to serve as an additional information input. To smoothly train models with our GVI, we developed an Additive Group Normalization (AGN) module that does not require extra parameters of the prescribed neural networks. Our approach has improved the IoUs of vegetation-related classes by 0.9-1.3 percent and consistently improves the overall mIoU by 2 percent on our baseline.

CVApr 21, 2020
The 1st Agriculture-Vision Challenge: Methods and Results

Mang Tik Chiu, Xingqian Xu, Kai Wang et al.

The first Agriculture-Vision Challenge aims to encourage research in developing novel and effective algorithms for agricultural pattern recognition from aerial images, especially for the semantic segmentation task associated with our challenge dataset. Around 57 participating teams from various countries compete to achieve state-of-the-art in aerial agriculture semantic segmentation. The Agriculture-Vision Challenge Dataset was employed, which comprises 21,061 aerial and multi-spectral farmland images. This paper provides a summary of notable methods and results in the challenge. Our submission server and leaderboard will remain open for researchers who are interested in this challenge dataset and task; the link can be found here.

LGJan 29, 2020
Urban2Vec: Incorporating Street View Imagery and POIs for Multi-Modal Urban Neighborhood Embedding

Zhecheng Wang, Haoyuan Li, Ram Rajagopal

Understanding intrinsic patterns and predicting spatiotemporal characteristics of cities require a comprehensive representation of urban neighborhoods. Existing works relied on either inter- or intra-region connectivities to generate neighborhood representations but failed to fully utilize the informative yet heterogeneous data within neighborhoods. In this work, we propose Urban2Vec, an unsupervised multi-modal framework which incorporates both street view imagery and point-of-interest (POI) data to learn neighborhood embeddings. Specifically, we use a convolutional neural network to extract visual features from street view images while preserving geospatial similarity. Furthermore, we model each POI as a bag-of-words containing its category, rating, and review information. Analogous to document embedding in natural language processing, we establish the semantic similarity between neighborhood ("document") and the words from its surrounding POIs in the vector space. By jointly encoding visual, textual, and geospatial information into the neighborhood representation, Urban2Vec can achieve performances better than baseline models and comparable to fully-supervised methods in downstream prediction tasks. Extensive experiments on three U.S. metropolitan areas also demonstrate the model interpretability, generalization capability, and its value in neighborhood similarity analysis.

LGNov 27, 2019
AR-Net: A simple Auto-Regressive Neural Network for time-series

Oskar Triebe, Nikolay Laptev, Ram Rajagopal

In this paper we present a new framework for time-series modeling that combines the best of traditional statistical models and neural networks. We focus on time-series with long-range dependencies, needed for monitoring fine granularity data (e.g. minutes, seconds, milliseconds), prevalent in operational use-cases. Traditional models, such as auto-regression fitted with least squares (Classic-AR), can model time-series with a concise and interpretable model. When dealing with long-range dependencies, Classic-AR models can become intractably slow to fit for large data. Recently, sequence-to-sequence models, such as Recurrent Neural Networks, which were originally intended for natural language processing, have become popular for time-series. However, they can be overly complex for typical time-series data and lack interpretability. A scalable and interpretable model is needed to bridge the statistical and deep learning-based approaches. As a first step towards this goal, we propose modelling AR-process dynamics using a feed-forward neural network approach, termed AR-Net. We show that AR-Net is as interpretable as Classic-AR but also scales to long-range dependencies. Our results lead to three major conclusions: First, AR-Net learns identical AR-coefficients as Classic-AR, thus being equally interpretable. Second, the computational complexity with respect to the order of the AR process is linear for AR-Net, as compared to quadratic for Classic-AR. This makes it possible to model long-range dependencies within fine granularity data. Third, by introducing regularization, AR-Net automatically selects and learns sparse AR-coefficients. This eliminates the need to know the exact order of the AR-process and allows learning sparse weights for a model with long-range dependencies.
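
The first conclusion of this abstract, that a linear feed-forward layer recovers the same coefficients as a least-squares AR fit, can be illustrated on a toy AR(2) series. This is a sketch under stated assumptions rather than the AR-Net implementation: it uses plain full-batch gradient descent on one bias-free linear layer, omits the regularization the paper uses for sparsity, and every function name and hyperparameter below is hypothetical.

```python
import random
from typing import List, Tuple

Row = Tuple[List[float], float]


def make_ar_series(coeffs: List[float], n: int, seed: int = 0) -> List[float]:
    """Simulate an AR(p) series with unit-variance Gaussian noise."""
    rng = random.Random(seed)
    p = len(coeffs)
    x = [rng.gauss(0.0, 1.0) for _ in range(p)]
    for _ in range(n - p):
        # coeffs[0] multiplies the most recent value, coeffs[1] the one before.
        x.append(sum(c * x[-1 - i] for i, c in enumerate(coeffs))
                 + rng.gauss(0.0, 1.0))
    return x


def lagged_rows(x: List[float], p: int) -> List[Row]:
    """Build ([x[t-1], ..., x[t-p]], x[t]) training pairs."""
    return [([x[t - 1 - i] for i in range(p)], x[t]) for t in range(p, len(x))]


def fit_linear_layer(rows: List[Row], p: int, lr: float = 0.1,
                     epochs: int = 200) -> List[float]:
    """Full-batch gradient descent on mean squared error, one linear layer."""
    w = [0.0] * p
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * p
        for lags, target in rows:
            err = sum(wi * li for wi, li in zip(w, lags)) - target
            for i in range(p):
                grad[i] += 2.0 * err * lags[i] / n
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def fit_least_squares_ar2(rows: List[Row]) -> List[float]:
    """Closed-form least squares for p = 2 via the 2x2 normal equations."""
    a11 = sum(l[0] * l[0] for l, _ in rows)
    a12 = sum(l[0] * l[1] for l, _ in rows)
    a22 = sum(l[1] * l[1] for l, _ in rows)
    b1 = sum(l[0] * y for l, y in rows)
    b2 = sum(l[1] * y for l, y in rows)
    det = a11 * a22 - a12 * a12
    return [(b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]


series = make_ar_series([0.5, -0.3], n=2000)
rows = lagged_rows(series, p=2)
w_gd = fit_linear_layer(rows, p=2)   # neural-network style fit
w_ls = fit_least_squares_ar2(rows)   # least-squares fit on the same data
```

Because the layer has no hidden units or nonlinearity, its weights can be read directly as AR coefficients, which is the sense in which the abstract calls the model interpretable.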

SYMay 1, 2019
On the Interaction between Autonomous Mobility on Demand Systems and Power Distribution Networks -- An Optimal Power Flow Approach

Alvaro Estandia, Maximilian Schiffer, Federico Rossi et al.

In future transportation systems, the charging behavior of electric Autonomous Mobility on Demand (AMoD) fleets, i.e., fleets of electric self-driving cars that service on-demand trip requests, will likely challenge power distribution networks (PDNs), causing overloads or voltage drops. In this paper, we show that these challenges can be significantly attenuated if the PDNs' operational constraints and exogenous loads (e.g., from homes or businesses) are accounted for when operating an electric AMoD fleet. We focus on a system-level perspective, assuming full coordination between the AMoD and the PDN operators. From this single entity perspective, we assess potential coordination benefits. Specifically, we extend previous results on an optimization-based modeling approach for electric AMoD systems to jointly control an electric AMoD fleet and a series of PDNs, and analyze the benefit of coordination under load balancing constraints. For a case study of Orange County, CA, we show that the coordination between the electric AMoD fleet and the PDNs eliminates 99% of the overloads and 50% of the voltage drops that the electric AMoD fleet would cause in an uncoordinated setting. Our results show that coordinating electric AMoD and PDNs can help maintain the reliability of PDNs under added electric AMoD charging load, thus significantly mitigating or deferring the need for PDN capacity upgrades.

LGApr 20, 2019
Distributed generation of privacy preserving data with user customization

Xiao Chen, Thomas Navidi, Stefano Ermon et al.

Distributed devices such as mobile phones can produce and store large amounts of data that can enhance machine learning models; however, this data may contain private information specific to the data owner that prevents the release of the data. We wish to reduce the correlation between user-specific private information and data while maintaining the useful information. Rather than learning a large model to achieve privatization from end to end, we introduce a decoupling of the creation of a latent representation and the privatization of data that allows user-specific privatization to occur in a distributed setting with limited computation and minimal disturbance on the utility of the data. We leverage a Variational Autoencoder (VAE) to create a compact latent representation of the data; however, the VAE remains fixed for all devices and all possible private labels. We then train a small generative filter to perturb the latent representation based on individual preferences regarding the private and utility information. The small filter is trained by utilizing a GAN-type robust optimization that can take place on a distributed device. We conduct experiments on three popular datasets: MNIST, UCI-Adult, and CelebA, and give a thorough evaluation including visualizing the geometry of the latent embeddings and estimating the empirical mutual information to show the effectiveness of our approach.

APNov 14, 2018
Structural Damage Detection and Localization with Unknown Post-Damage Feature Distribution Using Sequential Change-Point Detection Method

Yizheng Liao, Anne S. Kiremidjian, Ram Rajagopal et al.

The high rate of structural deficiency poses serious risks to the operation of many bridges and buildings. To prevent critical damage and structural collapse, a quick structural health diagnosis tool is needed during normal operation or immediately after extreme events. In structural health monitoring (SHM), many existing works will have limited performance in the quick damage identification process because 1) the damage event needs to be identified with short delay and 2) the post-damage information is usually unavailable. To address these drawbacks, we propose a new damage detection and localization approach based on stochastic time series analysis. Specifically, the damage sensitive features are extracted from vibration signals and follow different distributions before and after a damage event. Hence, we use the optimal change point detection theory to find damage occurrence time. As the existing change point detectors require the post-damage feature distribution, which is unavailable in SHM, we propose a maximum likelihood method to learn the distribution parameters from the time-series data. The proposed damage detection using estimated parameters also achieves the optimal performance. Also, we utilize the detection results to find damage location without any further computation. Validation results show highly accurate damage identification in the American Society of Civil Engineers benchmark structure and two shake table experiments.

SYNov 14, 2018
Fast Distribution Grid Line Outage Identification with $μ$PMU

Yizheng Liao, Yang Weng, Chin-Woo Tan et al.

The growing integration of distributed energy resources (DERs) in urban distribution grids raises various reliability issues due to DERs' uncertain and complex behaviors. With a large-scale DER penetration, traditional outage detection methods, which rely on customers making phone calls and smart meters' "last gasp" signals, will have limited performance, because the renewable generators can supply power after line outages and many urban grids are meshed, so line outages do not affect power supply. To address these drawbacks, we propose a data-driven outage monitoring approach based on the stochastic time series analysis from micro phasor measurement unit ($μ$PMU). Specifically, we prove via power flow analysis that the dependency of time-series voltage measurements exhibits significant statistical changes after line outages. This makes the theory on optimal change-point detection suitable to identify line outages via $μ$PMUs with fast and accurate sampling. However, existing change point detection methods require the post-outage voltage distribution, which is unknown in distribution systems. Therefore, we design a maximum likelihood-based method to directly learn the distribution parameters from $μ$PMU data. We prove that the estimated parameters-based detection still achieves the optimal performance, making it extremely useful for distribution grid outage identification. Simulation results show highly accurate outage identification in eight distribution grids with 14 configurations with and without DERs using $μ$PMU data.

LGSep 21, 2018
Understanding Compressive Adversarial Privacy

Xiao Chen, Peter Kairouz, Ram Rajagopal

Designing a data sharing mechanism without sacrificing too much privacy can be considered as a game between data holders and malicious attackers. This paper describes a compressive adversarial privacy framework that captures the trade-off between the data privacy and utility. We characterize the optimal data releasing mechanism through convex optimization when assuming that both the data holder and attacker can only modify the data using linear transformations. We then build a more realistic data releasing mechanism that can rely on a nonlinear compression model while the attacker uses a neural network. We demonstrate in a series of empirical applications that this framework, consisting of compressive adversarial privacy, can preserve sensitive information.

SYSep 18, 2018
Unbalanced Multi-Phase Distribution Grid Topology Estimation and Bus Phase Identification

Yizheng Liao, Yang Weng, Guangyi Liu et al.

There is an increasing need for monitoring and controlling uncertainties brought by distributed energy resources in distribution grids. For this goal, accurate multi-phase topology is the basis for correlating measurements in unbalanced distribution networks. Unfortunately, such topology knowledge is often unavailable due to limited investment, especially for low-voltage distribution grids. Also, the bus phase labeling information is inaccurate due to human errors or outdated records. To address this challenge, this paper utilizes smart meter data for an information-theoretic approach to learn the topology of distribution grids. Specifically, multi-phase unbalanced systems are converted into symmetrical components, namely positive, negative, and zero sequences. Then, this paper proves that the Chow-Liu algorithm finds the topology by utilizing power flow equations and the conditional independence relationships implied by the radial multi-phase structure of distribution grids with the presence of incorrect bus phase labels. Finally, by utilizing Carson's equation, this paper proves that the bus phase connection can be correctly identified using voltage measurements. For validation, IEEE systems are simulated using three real data sets. The simulation results demonstrate that the algorithm is highly accurate for finding multi-phase topology even with strong load unbalancing condition and DERs. This ensures close monitoring and control of DERs in distribution grids.

LGJul 13, 2018
Generative Adversarial Privacy

Chong Huang, Peter Kairouz, Xiao Chen et al.

We present a data-driven framework called generative adversarial privacy (GAP). Inspired by recent advancements in generative adversarial networks (GANs), GAP allows the data holder to learn the privatization mechanism directly from the data. Under GAP, finding the optimal privacy mechanism is formulated as a constrained minimax game between a privatizer and an adversary. We show that for appropriately chosen adversarial loss functions, GAP provides privacy guarantees against strong information-theoretic adversaries. We also evaluate GAP's performance on the GENKI face database.

SYSep 13, 2018
The Value of Distributed Energy Resources for Heterogeneous Residential Consumers

Siddharth Patel, Mohammad Rasouli, Junjie Qin et al.

The presence of behind-the-meter rooftop photovoltaics and storage in the residential sector is poised to increase significantly. Here we quantify in detail the value of these technologies to consumers and service providers. We characterize the heterogeneity in household electricity cost savings under time-varying prices due to consumption behavior differences. Different pricing policies significantly alter how households fare with respect to one another. Furthermore, household savings in absolute terms are not strongly correlated with savings normalized by PV and storage system size. We characterize the financial value of improved forecasting capabilities for a household, finding that it is a relatively small fraction of a household's cost savings. Coordination services that combine the resources available at all households can reduce costs by an additional 10% to 15% of the original total cost. Surprisingly, coordination service providers will not encourage adoption beyond 35-55% within a group. We present a simple model that explains the value of coordination and its relationship to the pricing of distribution services.

CVApr 23, 2018
Siamese Generative Adversarial Privatizer for Biometric Data

Witold Oleszkiewicz, Peter Kairouz, Karol Piczak et al.

State-of-the-art machine learning algorithms can be fooled by carefully crafted adversarial examples. As such, adversarial examples present a concrete problem in AI safety. In this work we turn the tables and ask the following question: can we harness the power of adversarial examples to prevent malicious adversaries from learning identifying information from data while allowing non-malicious entities to benefit from the utility of the same data? For instance, can we use adversarial examples to anonymize a biometric dataset of faces while retaining the usefulness of this data for other purposes, such as emotion recognition? To address this question, we propose a simple yet effective method, called Siamese Generative Adversarial Privatizer (SGAP), that exploits the properties of a Siamese neural network to find discriminative features that convey identifying information. When coupled with a generative model, our approach is able to correctly locate and disguise identifying information, while minimally reducing the utility of the privatized dataset. Extensive evaluation on a biometric dataset of fingerprints and cartoon faces confirms the usefulness of our simple yet effective method.

LGOct 26, 2017
Context-Aware Generative Adversarial Privacy

Chong Huang, Peter Kairouz, Xiao Chen et al.

Preserving the utility of published datasets while simultaneously providing provable privacy guarantees is a well-known challenge. On the one hand, context-free privacy solutions, such as differential privacy, provide strong privacy guarantees, but often lead to a significant reduction in utility. On the other hand, context-aware privacy solutions, such as information theoretic privacy, achieve an improved privacy-utility tradeoff, but assume that the data holder has access to dataset statistics. We circumvent these limitations by introducing a novel context-aware privacy framework called generative adversarial privacy (GAP). GAP leverages recent advancements in generative adversarial networks (GANs) to allow the data holder to learn privatization schemes from the dataset itself. Under GAP, learning the privacy mechanism is formulated as a constrained minimax game between two players: a privatizer that sanitizes the dataset in a way that limits the risk of inference attacks on the individuals' private variables, and an adversary that tries to infer the private variables from the sanitized dataset. To evaluate GAP's performance, we investigate two simple (yet canonical) statistical dataset models: (a) the binary data model, and (b) the binary Gaussian mixture model. For both models, we derive game-theoretically optimal minimax privacy mechanisms, and show that the privacy mechanisms learned from data (in a generative adversarial fashion) match the theoretically optimal ones. This demonstrates that our framework can be easily applied in practice, even in the absence of dataset statistics.

GTAug 14, 2017
Competition and Efficiency of Coalitions in Cournot Games with Uncertainty

Baosen Zhang, Ramesh Johari, Ram Rajagopal

We investigate the impact of coalition formation on the efficiency of Cournot games where producers face uncertainties. In particular, we study a market model where firms must determine their output before an uncertain production capacity is realized. In contrast to standard Cournot models, we show that the game is not efficient when there are many small firms. Instead, producers tend to act conservatively to hedge against their risks. We show that in the presence of uncertainty, the game becomes efficient when firms are allowed to take advantage of diversity to form groups of certain sizes. We characterize the tradeoff between market power and uncertainty reduction as a function of group size. In particular, we compare the welfare and output obtained with coalitional competition, with the same benchmarks when output is controlled by a single system operator. We show when there are $N$ firms present, competition between groups of size $Ω(\sqrt{N})$ results in equilibria that are socially optimal in terms of welfare and groups of size $Ω(N^{2/3})$ are socially optimal in terms of production. We also extend our results to the case of uncertain demand by establishing an equivalency between Cournot oligopoly and Cournot Oligopsony. We demonstrate our results with real data from electricity markets with significant wind power penetration.

OCAug 5, 2017
Pricing Residential Electricity Based on Individual Consumption Behaviors

Siddharth Patel, Raffi Sevlian, Baosen Zhang et al.

The conventional practice of retail electric utilities is to aggregate customers geographically. The utility purchases electricity for its customers via bulk transactions on the wholesale market, and it passes these costs along to its customers, the end consumers, through their rate plan. Typically, all residential consumers are offered the same per unit rate plan, which leads to cost sharing. Some consumers use their electricity at peak hours, when it is more expensive on the wholesale market, and others consume mostly at off peak hours, when it is cheaper, but they all enjoy the same per unit rate through their utility. This paper proposed a method for the utility to segment a population of consumers on the basis of their individual consumption patterns. An optimal recruitment algorithm was developed to aggregate consumers into groups with a relatively low per unit cost of electricity on the wholesale market. It was then proposed that the utility should group together enough consumers to ensure an adequately low forecast error, which is related to risks it faces in wholesale market transactions. Finally, it was shown that by repeated application of this process, the utility could segment the entire population into groups and offer them differentiated rate plans based on their actual consumption behavior. These groupings are stable in the sense that no one consumer can unilaterally improve her outcome.

SYJun 2, 2017
Mapping Rule Estimation for Power Flow Analysis in Distribution Grids

Jiafan Yu, Yang Weng, Ram Rajagopal

The increasing integration of distributed energy resources (DERs) calls for new monitoring and operational planning tools to ensure stability and sustainability in distribution grids. One idea is to use existing monitoring tools in transmission grids and some primary distribution grids. However, they usually depend on the knowledge of the system model, e.g., the topology and line parameters, which may be unavailable in primary and secondary distribution grids. Furthermore, a utility usually has limited modeling ability of active controllers for solar panels as they may belong to a third party like residential customers. To solve the modeling problem in traditional power flow analysis, we propose a support vector regression (SVR) approach to reveal the mapping rules between different variables and recover useful variables based on physical understanding and data mining. We illustrate the advantages of using the SVR model over traditional regression method which finds line parameters in distribution grids. Specifically, the SVR model is robust enough to recover the mapping rules while the regression method fails when 1) there are measurement outliers and missing data, 2) there are active controllers, or 3) measurements are only available at some part of a distribution grid. We demonstrate the superior performance of our method through extensive numerical validation on different scales of distribution grids.

SYMay 24, 2017
PaToPa: A Data-Driven Parameter and Topology Joint Estimation Framework in Distribution Grids

Jiafan Yu, Yang Weng, Ram Rajagopal

The increasing integration of distributed energy resources (DERs) calls for new planning and operational tools. However, such tools depend on system topology and line parameters, which may be missing or inaccurate in distribution grids. With abundant data, one idea is to use linear regression to find line parameters, based on which topology can be identified. Unfortunately, the linear regression method is accurate only if there is no noise in either the input measurements (e.g., voltage magnitude and phase angle) or the output measurements (e.g., active and reactive power). For topology estimation, even with a small error in measurements, the regression-based method is incapable of finding the topology using non-zero line parameters with a proper metric. To model input and output measurement errors simultaneously, we propose the error-in-variables (EIV) model in a maximum likelihood estimation (MLE) framework for joint line parameter and topology estimation. While directly solving the problem is NP-hard, we successfully adapt the problem into a generalized low-rank approximation problem via variable transformation and noise decorrelation. For accurate topology estimation, we let it interact with parameter estimation in a fashion similar to expectation-maximization in machine learning. The proposed PaToPa approach does not require a radial network setting and works for mesh networks. We demonstrate the superior accuracy of our method on IEEE test cases with actual feeder data from Southern California Edison.
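
Why input noise hurts ordinary linear regression can be seen in a small synthetic sketch. The code below is not the PaToPa algorithm; it uses classical total least squares (a low-rank approximation computed with an SVD), which is the textbook errors-in-variables estimator when input and output noise have equal variance. The slope, noise level, and sample size are assumed values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scalar model y = b_true * x, observed with noise on BOTH sides.
b_true = 2.0
n_samples = 5000
x_clean = rng.normal(0.0, 1.0, n_samples)
y_clean = b_true * x_clean

noise_std = 0.5  # assumed equal noise level for input and output
x_obs = x_clean + rng.normal(0.0, noise_std, n_samples)
y_obs = y_clean + rng.normal(0.0, noise_std, n_samples)

# Ordinary least squares: treats x_obs as exact, so the slope is biased
# toward zero (attenuation) when the input is noisy.
X = x_obs.reshape(-1, 1)
b_ols = float(np.linalg.lstsq(X, y_obs, rcond=None)[0][0])

# Total least squares: take the right singular vector of [X | y] that
# belongs to the smallest singular value and read the slope from it.
Z = np.column_stack([x_obs, y_obs])
_, _, vt = np.linalg.svd(Z, full_matrices=False)
v_min = vt[-1]
b_tls = float(-v_min[0] / v_min[1])
```

With these settings the least-squares slope is noticeably smaller than the true value, while the total-least-squares slope stays close to it. Real grid data would need the noise scaling and decorrelation that the abstract mentions before such an estimator applies.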

MLNov 6, 2016
Urban MV and LV Distribution Grid Topology Estimation via Group Lasso

Yizheng Liao, Yang Weng, Guangyi Liu et al.

The increasing penetration of distributed energy resources poses numerous reliability issues to the urban distribution grid. Topology estimation is a critical step to ensure the robustness of distribution grid operation. However, bus connectivity and grid topology estimation are usually hard in distribution grids. For example, it is technically challenging and costly to monitor the bus connectivity in urban grids, e.g., underground lines. It is also inappropriate to use the radial topology assumption exclusively because the grids of metropolitan cities and regions with dense loads may contain many mesh structures. To resolve these drawbacks, we propose a data-driven topology estimation method for MV and LV distribution grids that utilizes only historical smart meter measurements. In particular, a probabilistic graphical model is utilized to capture the statistical dependencies amongst bus voltages. We prove that the bus connectivity and grid topology estimation problems, in radial and mesh structures, can be formulated as a linear regression with a least absolute shrinkage regularization on grouped variables (group lasso). Simulations show highly accurate results in eight MV and LV distribution networks of different sizes and 22 topology configurations using PG&E residential smart meter data.
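
As a rough illustration of the group lasso mentioned above, the following sketch solves a generic group-lasso regression by proximal gradient descent on synthetic data. It is not the paper's grid formulation: the data, the grouping into pairs of columns, the penalty weight `lam`, and the function names are all assumptions. One could think of each group as the variables attached to one candidate neighbor bus, with a zeroed group read as "not connected", but that mapping is only an interpretation.

```python
import numpy as np

def group_soft_threshold(v, t):
    """Shrink the whole block v toward zero; zero it out if its norm <= t."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1.0 - t / norm) * v

def group_lasso(X, y, groups, lam, n_iter=500):
    """Minimize (1/2n)||y - X b||^2 + lam * sum_g ||b_g||_2 by proximal gradient."""
    n, p = X.shape
    # Step size 1/L, where L is the largest eigenvalue of X^T X / n.
    step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        for idx in groups:
            beta[idx] = group_soft_threshold(z[idx], step * lam)
    return beta

# Synthetic example: three groups of two columns, only the first is active.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
beta_true = np.array([1.5, -2.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true + 0.1 * rng.normal(size=200)

groups = [[0, 1], [2, 3], [4, 5]]
beta_hat = group_lasso(X, y, groups, lam=0.2)
```

The block-wise threshold is what distinguishes group lasso from the ordinary lasso: coefficients in a group are kept or discarded together, which suits problems where several variables describe the same candidate connection.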

MLNov 5, 2015
A Sparse Linear Model and Significance Test for Individual Consumption Prediction

Pan Li, Baosen Zhang, Yang Weng et al.

Accurate prediction of user consumption is a key part not only in understanding consumer flexibility and behavior patterns, but in the design of robust and efficient energy saving programs as well. Existing prediction methods usually have high relative errors that can be larger than 30% and have difficulties accounting for heterogeneity between individual users. In this paper, we propose a method to improve prediction accuracy of individual users by adaptively exploring sparsity in historical data and leveraging predictive relationships between different users. Sparsity is captured by the popular least absolute shrinkage and selection operator (LASSO), while user selection is formulated as an optimal hypothesis testing problem and solved via a covariance test. Using real-world data from PG&E, we provide extensive simulation validation of the proposed method against well-known techniques such as support vector machine, principal component analysis combined with linear regression, and random forest. The results demonstrate that our proposed methods are operationally efficient because of their linear nature, and achieve optimal prediction performance.