Wiebke Toussaint

8 papers

56 citations

Novelty 26%

AI Score 19

Ranked #196,700 of 205,806 authors (top 96%), #41,521 in LG (top 98%)

8 Papers

SD, Jul 26, 2021 (Code)

SVEva Fair: A Framework for Evaluating Fairness in Speaker Verification

Wiebke Toussaint, Aaron Yi Ding

Despite the success of deep neural networks (DNNs) in enabling on-device voice assistants, increasing evidence of bias and discrimination in machine learning is raising the urgency of investigating the fairness of these systems. Speaker verification is a form of biometric identification that gives access to voice assistants. Due to a lack of fairness metrics and evaluation frameworks that are appropriate for testing the fairness of speaker verification components, little is known about how model performance varies across subgroups, and what factors influence performance variation. To tackle this emerging challenge, we design and develop SVEva Fair, an accessible, actionable and model-agnostic framework for evaluating the fairness of speaker verification components. The framework provides evaluation measures and visualisations to interrogate model performance across speaker subgroups and compare fairness between models. We demonstrate SVEva Fair in a case study with end-to-end DNNs trained on the VoxCeleb datasets to reveal potential bias in existing embedded speech recognition systems based on the demographic attributes of speakers. Our evaluation shows that publicly accessible benchmark models are not fair and consistently produce worse predictions for some nationalities, and for female speakers of most nationalities. To pave the way for fair and reliable embedded speaker verification, SVEva Fair has been implemented as an open-source Python library and can be integrated into the embedded ML development pipeline to help developers and researchers troubleshoot unreliable speaker verification performance and select high-impact approaches for mitigating fairness challenges.

LG, Jan 19, 2022

Tiny, always-on and fragile: Bias propagation through design choices in on-device machine learning workflows

Wiebke Toussaint, Aaron Yi Ding, Fahim Kawsar et al.

Billions of distributed, heterogeneous and resource-constrained IoT devices deploy on-device machine learning (ML) for private, fast and offline inference on personal data. On-device ML is highly context dependent, and sensitive to user, usage, hardware and environment attributes. This sensitivity and the propensity towards bias in ML make it important to study bias in on-device settings. Our study is one of the first investigations of bias in this emerging domain, and lays important foundations for building fairer on-device ML. We apply a software engineering lens, investigating the propagation of bias through design choices in on-device ML workflows. We first identify reliability bias as a source of unfairness and propose a measure to quantify it. We then conduct empirical experiments for a keyword spotting task to show how complex and interacting technical design choices amplify and propagate reliability bias. Our results validate that design choices made during model training, like the sample rate and input feature type, and choices made to optimize models, like lightweight architectures, the pruning learning rate and pruning sparsity, can result in disparate predictive performance across male and female groups. Based on our findings, we suggest low-effort strategies for engineers to mitigate bias in on-device ML.

HC, Jun 28, 2021

Design Considerations for Data Daemons: Co-creating Design Futures to Explore Ethical Personal Data Management

Wiebke Toussaint, Alejandra Gomez Ortega, Jered Vroon et al.

Mobile applications and online service providers track our virtual and physical behaviour more actively and with a broader scope than ever before. This has given rise to growing concerns about ethical personal data management. Even though regulation and awareness around data ethics are increasing, end-users are seldom engaged when defining and designing what a future with ethical personal data management should look like. We explore a participatory process that uses design futures, the Future workshop method and design fictions to envision ethical personal data management with end-users and designers. To engage participants effectively, we needed to bridge their differential expertise and make the abstract concepts of data and ethics tangible. By concretely presenting personal data management and control as fictitious entities called Data Daemons, we created a shared understanding of these abstract concepts, and empowered non-expert end-users and designers to become actively engaged in the design process.

LG, Dec 1, 2020

Machine Learning Systems in the IoT: Trustworthiness Trade-offs for Edge Intelligence

Wiebke Toussaint, Aaron Yi Ding

Machine learning systems (MLSys) are emerging in the Internet of Things (IoT) to provision edge intelligence, which is paving our way towards the vision of ubiquitous intelligence. However, despite the maturity of machine learning systems and the IoT, we are facing severe challenges when integrating MLSys and IoT in practical contexts. For instance, many machine learning systems have been developed for large-scale production (e.g., cloud environments), but IoT introduces additional demands due to heterogeneous and resource-constrained devices and decentralized operating environments. To shed light on this convergence of MLSys and IoT, this paper analyzes the trade-offs by covering the latest developments (up to 2020) on scaling and distributing ML across cloud, edge, and IoT devices. We position machine learning systems as a component of the IoT, and edge intelligence as a socio-technical system. On the challenges of designing trustworthy edge intelligence, we advocate a holistic design approach that takes multi-stakeholder concerns, design requirements and trade-offs into consideration, and highlight future research opportunities in edge intelligence.

LG, Jun 11, 2020

Clustering Residential Electricity Consumption Data to Create Archetypes that Capture Household Behaviour in South Africa

Wiebke Toussaint, Deshendran Moodley

Clustering is frequently used in the energy domain to identify dominant electricity consumption patterns of households, which can be used to construct customer archetypes for long-term energy planning. Selecting a useful set of clusters, however, requires extensive experimentation and domain knowledge. While internal clustering validation measures are well established in the electricity domain, they are limited for selecting useful clusters. Based on an application case study in South Africa, we present an approach for formalising implicit expert knowledge as external evaluation measures to create customer archetypes that capture variability in residential electricity consumption behaviour. By combining internal and external validation measures in a structured manner, we were able to evaluate clustering structures based on the utility they present for our application. We validate the selected clusters in a use case where we successfully reconstruct customer archetypes previously developed by experts. Our approach shows promise for transparent and repeatable cluster ranking and selection by data scientists, even if they have limited domain knowledge.

CY, Jun 11, 2020

Design Considerations for High Impact, Automated Echocardiogram Analysis

Wiebke Toussaint, Dave Van Veen, Courtney Irwin et al.

Deep learning has the potential to automate echocardiogram analysis for early detection of heart disease. Based on a qualitative analysis of design concerns, this study suggests that predicting normal heart function instead of disease accounts for data quality bias and significantly increases efficiency in cardiologists' workflows.

LG, Jun 1, 2020

Using competency questions to select optimal clustering structures for residential energy consumption patterns

Wiebke Toussaint, Deshendran Moodley

During cluster analysis, domain experts and visual analysis are frequently relied on to identify the optimal clustering structure. This process tends to be ad hoc, subjective and difficult to reproduce. This work shows how competency questions can be used to formalise expert knowledge and application requirements for context-specific evaluation of a clustering application in the residential energy consumption sector.

CY, May 29, 2020

Machine Learning Systems for Intelligent Services in the IoT: A Survey

Wiebke Toussaint, Aaron Yi Ding

Machine learning (ML) technologies are emerging in the Internet of Things (IoT) to provision intelligent services. This survey moves beyond existing ML algorithms and cloud-driven design to investigate the less-explored systems, scaling and socio-technical aspects for consolidating ML and IoT. It covers the latest developments (up to 2020) on scaling and distributing ML across cloud, edge, and IoT devices. With a multi-layered framework to classify and illuminate system design choices, this survey exposes fundamental concerns of developing and deploying ML systems in the rising cloud-edge-device continuum in terms of functionality, stakeholder alignment and trustworthiness.