Matthias Kahl

CV
6 papers
40 citations
Novelty 28%
AI Score 34

6 Papers

LG · Apr 7, 2022
So2Sat POP -- A Curated Benchmark Data Set for Population Estimation from Space on a Continental Scale

Sugandha Doda, Yuanyuan Wang, Matthias Kahl et al.

Obtaining a dynamic population distribution is key to many decision-making processes such as urban planning, disaster management, and, most importantly, helping governments better allocate socio-technical supply. Achieving these objectives requires good population data. The traditional method of collecting population data through a census is expensive and tedious. In recent years, statistical and machine learning methods have been developed to estimate population distribution. Most of these methods use data sets that are either developed on a small scale or not yet publicly available. Thus, the development and evaluation of new methods become challenging. We fill this gap by providing a comprehensive data set for population estimation in 98 European cities. The data set comprises a digital elevation model, local climate zones, land use proportions, and nighttime lights in combination with multi-spectral Sentinel-2 imagery and data from the Open Street Map initiative. We anticipate that it will be a valuable addition to the research community for the development of sophisticated approaches in the field of population estimation.

SP · Aug 26, 2022
Representation Learning for Appliance Recognition: A Comparison to Classical Machine Learning

Matthias Kahl, Daniel Jorde, Hans-Arno Jacobsen

Non-intrusive load monitoring (NILM) aims to retrieve energy consumption and appliance state information from aggregated consumption measurements with the help of signal processing and machine learning algorithms. Representation learning with deep neural networks has been successfully applied in several related disciplines. The main advantage of representation learning lies in replacing expert-driven, hand-crafted feature extraction with hierarchical learning from many representations in raw data format. In this paper, we show how the NILM processing chain can be improved, reduced in complexity, and alternatively designed with recent deep learning algorithms. On the basis of an event-based appliance recognition approach, we evaluate seven different classification models: a classical machine learning approach based on hand-crafted feature extraction, three different deep neural network architectures for automated feature extraction on raw waveform data, and three baseline approaches for raw data processing. We evaluate all approaches on two large-scale energy consumption datasets with more than 50,000 events of 44 appliances. We show that with deep learning, we are able to reach and surpass the performance of the state-of-the-art classical machine learning approach for appliance recognition, with F-scores of 0.75 and 0.86 compared to 0.69 and 0.87 for the classical approach.

54.5 · CV · May 8
LAMES: A Large-Scale and Artisanal Mining Environmental Segmentation Dataset

Matthias Kahl, Zhaiyu Chen, Sudipan Saha et al.

Mining operations are of utmost importance to the economy of some nations. However, such operations result in land-use change, very high energy consumption, and negative impacts on the environment, including soil erosion and deforestation. The mining process can impact an area much larger than the mining site itself. Adding to the negative externalities linked to mining is the fact that, in addition to government-sanctioned legal mining operations, illegal mining is widespread, including in various countries of Africa. The ability to monitor remote mining site activities can be useful, e.g., for the detection of illegal artisanal mining activities and their environmental impacts. An important outcome of such monitoring could be a better understanding of the interrelationship between mine facility attributes (e.g., mining types, processing methods, commodities) and their impact on the natural environment. In this work, we present a data set that contains 150 Large Scale Mining (LSM) sites and 870 km^2 of annotated area of Artisanal Small-scale Mining (ASM) sites. The metadata includes nine eminent LSM sections and 27 mining site attributes for each LSM site. We also discuss the data set's possible contribution to the research community, social and environmental consequences, and researchers' responsibilities from an ethics perspective.

CV · Apr 26, 2021
Generative modeling of spatio-temporal weather patterns with extreme event conditioning

Konstantin Klemmer, Sudipan Saha, Matthias Kahl et al.

Deep generative models are increasingly used to gain insights into the geospatial data domain, e.g., for climate data. However, most existing approaches work with temporal snapshots or assume 1D time-series; few are able to capture spatio-temporal processes simultaneously. Beyond this, Earth-systems data often exhibit highly irregular and complex patterns, for example caused by extreme weather events. Because of climate change, these phenomena are only increasing in frequency. Here, we propose a novel GAN-based approach for generating spatio-temporal weather patterns conditioned on detected extreme events. Our approach augments the GAN generator and discriminator with an encoded extreme weather event segmentation mask. These segmentation masks can be created from raw input using existing event detection frameworks. As such, our approach is highly modular and can be combined with custom GAN architectures. We highlight the applicability of our proposed approach in experiments with real-world surface radiation and zonal wind data.

CV · Jul 2, 2019
Training Auto-encoder-based Optimizers for Terahertz Image Reconstruction

Tak Ming Wong, Matthias Kahl, Peter Haring Bolívar et al.

Terahertz (THz) sensing is a promising imaging technology for a wide variety of applications. Extracting interpretable and physically meaningful parameters for such applications, however, requires solving an inverse problem in which a model function determined by these parameters needs to be fitted to the measured data. Since the underlying optimization problem is nonconvex and very costly to solve, we propose learning the prediction of suitable parameters from the measured data directly. More precisely, we develop a model-based autoencoder in which the encoder network predicts suitable parameters and the decoder is fixed to a physically meaningful model function, such that we can train the encoding network in an unsupervised way. We illustrate numerically that the resulting network is more than 140 times faster than classical optimization techniques while making predictions with only slightly higher objective values. Using such predictions as starting points of local optimization techniques allows us to converge to better local minima about twice as fast as optimization without the network-based initialization.

OH · Apr 24, 2019
Appliance Event Detection -- A Multivariate, Supervised Classification Approach

Matthias Kahl, Thomas Kriechbaumer, Daniel Jorde et al.

Non-intrusive load monitoring (NILM) is a modern and still expanding technique, helping to understand fundamental energy consumption patterns and appliance characteristics. Appliance event detection is an elementary step in the NILM pipeline. Unfortunately, several types of appliances (e.g., switching mode power supply (SMPS) or multi-state) are known to challenge state-of-the-art event detection systems due to their noisy consumption profiles. Classical rule-based event detection systems become infeasible and complex for these appliances. By stepping away from distinct event definitions, we can learn from a consumer-configured event model to differentiate between relevant and irrelevant event transients. We introduce a boosting-oriented adaptive training that uses false positives from the initial training area to substantially reduce the number of false positives on the test area. The results show a false positive decrease by more than a factor of eight on a dataset that has a strong focus on SMPS-driven appliances. To obtain a stable event detection system, we ran several experiments over different parameters to measure its performance. These experiments include the evaluation of six event features from the spectral and time domains, different types of feature space normalization to eliminate undesired feature weighting, conventional and adaptive training, and two common classifiers with their optimal parameter settings. The evaluations are performed on two publicly available energy datasets with high sampling rates: BLUED and BLOND-50.