LG · Jul 11, 2022
Keep your Distance: Determining Sampling and Distance Thresholds in Machine Learning Monitoring
Al-Harith Farhad, Ioannis Sorokos, Andreas Schmidt et al.
Machine Learning (ML) has provided promising results in recent years across different applications and domains. However, in many cases, qualities such as reliability or even safety need to be ensured. To this end, one important aspect is to determine whether or not ML components are deployed in situations that are appropriate for their application scope. For components whose environments are open and variable, for instance those found in autonomous vehicles, it is therefore important to monitor their operational situation to determine its distance from the ML components' trained scope. If that distance is deemed too great, the application may choose to consider the ML component outcome unreliable and switch to alternatives, e.g. using human operator input instead. SafeML is a model-agnostic approach for performing such monitoring, using distance measures based on statistical testing of the training and operational datasets. Limitations in setting SafeML up properly include the lack of a systematic approach for determining, for a given application, how many operational samples are needed to yield reliable distance information and what distance threshold is appropriate. In this work, we address these limitations by providing a practical approach and demonstrate its use on a well-known traffic sign recognition problem and on an example using the CARLA open-source automotive simulator.
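The monitoring idea in this abstract can be illustrated with a minimal sketch. It assumes the two-sample Kolmogorov-Smirnov statistic as the distance measure (one common choice of statistical distance, not necessarily the one used in the paper), and the function names and the threshold value are hypothetical.

```python
from bisect import bisect_right


def ks_distance(train, operational):
    """Two-sample Kolmogorov-Smirnov statistic (assumed distance measure):
    the largest gap between the empirical CDFs of the two samples."""
    a, b = sorted(train), sorted(operational)

    def ecdf(sorted_sample, x):
        # Fraction of the sample that is <= x.
        return bisect_right(sorted_sample, x) / len(sorted_sample)

    # The largest gap is always attained at one of the observed values.
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)


def is_out_of_scope(train, operational, threshold):
    """Flag the operational batch when its distance from the training
    data exceeds a (hypothetical) application-specific threshold."""
    return ks_distance(train, operational) > threshold


if __name__ == "__main__":
    train = [0.1, 0.4, 0.5, 0.7, 0.9]
    shifted = [1.5, 1.7, 1.8, 2.0, 2.2]
    print(ks_distance(train, train))
    print(is_out_of_scope(train, shifted, threshold=0.5))
```

How many operational samples to collect before calling `ks_distance`, and which `threshold` to use, are exactly the two quantities the abstract says need a systematic procedure; the values above are placeholders.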
DB · Mar 8, 2022
It's AI Match: A Two-Step Approach for Schema Matching Using Embeddings
Benjamin Hättasch, Michael Truong-Ngoc, Andreas Schmidt et al.
Since data is often stored in different sources, it needs to be integrated to obtain the global view that is required in order to create value and derive knowledge from it. A critical step in data integration is schema matching, which aims to find semantic correspondences between elements of two schemata. In order to reduce the manual effort involved in schema matching, many solutions for the automatic determination of schema correspondences have already been developed. In this paper, we propose a novel end-to-end approach for schema matching based on neural embeddings. The main idea is to use a two-step approach consisting of a table matching step followed by an attribute matching step. In both steps, we use embeddings at different levels, representing either the whole table or single attributes. Our results show that our approach determines correspondences robustly and reliably and, compared to traditional schema matching approaches, can find non-trivial correspondences.
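The two-step structure described in this abstract (table matching, then attribute matching) might be sketched as follows. This is only an illustration under stated assumptions: the embeddings are toy vectors, cosine similarity with a best-match rule is an assumed scoring scheme, and all names are hypothetical rather than taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def best_match(query, candidates):
    """Name of the candidate embedding most similar to the query."""
    return max(candidates, key=lambda name: cosine(query, candidates[name]))


def two_step_match(source_table, target_tables):
    # Step 1: table matching on whole-table embeddings.
    table = best_match(
        source_table["embedding"],
        {name: t["embedding"] for name, t in target_tables.items()},
    )
    # Step 2: attribute matching, restricted to the matched table.
    target_attrs = target_tables[table]["attributes"]
    mapping = {
        attr: best_match(emb, target_attrs)
        for attr, emb in source_table["attributes"].items()
    }
    return table, mapping


if __name__ == "__main__":
    # Toy, hand-made embeddings (assumption: real ones come from a neural model).
    source = {
        "embedding": [1.0, 0.1],
        "attributes": {"cust_name": [1.0, 0.0], "cust_zip": [0.0, 1.0]},
    }
    targets = {
        "customers": {
            "embedding": [0.9, 0.2],
            "attributes": {"name": [0.9, 0.1], "postal_code": [0.1, 0.9]},
        },
        "orders": {
            "embedding": [0.1, 1.0],
            "attributes": {"order_id": [0.5, 0.5]},
        },
    }
    print(two_step_match(source, targets))
```

Restricting step 2 to the table chosen in step 1 keeps the attribute comparison small, at the cost of propagating any table-level mismatch.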
AI · Nov 14, 2025
A Workflow for Full Traceability of AI Decisions
Julius Wenzel, Syeda Umaima Alam, Andreas Schmidt et al.
An ever-increasing number of high-stakes decisions are made or assisted by automated systems employing brittle artificial intelligence technology. There is a substantial risk that some of these decisions induce harm to people, by infringing their well-being or their fundamental human rights. The state of the art in AI systems makes little effort with respect to appropriate documentation of the decision process. This obstructs the ability to trace what went into a decision, which in turn is a prerequisite to any attempt to reconstruct a responsibility chain. Specifically, such traceability is linked to documentation that will stand up in court when determining the cause of some AI-based decision that inadvertently or intentionally violates the law. This paper takes a radical, yet practical, approach to this problem, by enforcing the documentation of each and every component that goes into the training or inference of an automated decision. As such, it presents the first running workflow supporting the generation of tamper-proof, verifiable and exhaustive traces of AI decisions. In doing so, we expand the DBOM concept into an effective running workflow leveraging confidential computing technology. We demonstrate the inner workings of the workflow in the development of an app to tell poisonous and edible mushrooms apart, meant as a playful example of high-stakes decision support.
NI · Sep 27, 2018
Cross-Layer Effects on Training Neural Algorithms for Video Streaming
Pablo Gil Pereira, Andreas Schmidt, Thorsten Herfet
Nowadays, Dynamic Adaptive Streaming over HTTP (DASH) is the most prevalent solution on the Internet for multimedia streaming and is responsible for the majority of global traffic. DASH uses adaptive bit rate (ABR) algorithms, which select the video quality considering performance metrics such as throughput and playout buffer level. Pensieve is a system that trains ABR algorithms using reinforcement learning within a simulated network environment and outperforms existing approaches in terms of achieved performance. In this paper, we demonstrate that the performance of the trained ABR algorithms depends on the implementation of the simulated environment used to train the neural network. We also show that the congestion control algorithm in use impacts the algorithms' performance due to cross-layer effects.
OC · Sep 28, 2018
Feedback control of parametrized PDEs via model order reduction and dynamic programming principle
Alessandro Alla, Bernard Haasdonk, Andreas Schmidt
In this paper, we investigate infinite horizon optimal control problems for parametrized partial differential equations. We are interested in feedback control via dynamic programming equations, an approach that is well known to suffer from the curse of dimensionality. Thus, we apply parametric model order reduction techniques to construct low-dimensional subspaces with suitable information on the control problem, where the dynamic programming equations can be approximated. To guarantee a low number of basis functions, we combine recent basis generation methods and parameter partitioning techniques. Furthermore, we present a novel technique to construct nonuniform grids in the reduced domain, which is based on statistical information. Finally, we discuss numerical examples to illustrate the effectiveness of the proposed methods for PDEs in two space dimensions.