CL, CY | Sep 30, 2020

Ethically Collecting Multi-Modal Spontaneous Conversations with People that have Cognitive Impairments

arXiv:2009.14361v10.2

Originality: Synthesis-oriented

AI Analysis

This work tackles the problem of data scarcity and ethical concerns for researchers aiming to make AI assistants more accessible to vulnerable populations, offering a practical framework to facilitate such collections.

The paper addresses the challenge of ethically collecting multi-modal spontaneous conversations from people with cognitive impairments to improve spoken dialogue systems, by providing expert-derived guidance and a secure system called CUSCO for data capture and sharing.

To make spoken dialogue systems (such as Amazon Alexa or Google Assistant) more accessible and naturally interactive for people with cognitive impairments, appropriate data must be obtainable. Recordings of multi-modal spontaneous conversations with vulnerable user groups are scarce, however, and this valuable data is challenging to collect. Researchers who call for this data are often inexperienced with the ethical and legal issues around working with vulnerable participants. Additionally, standard recording equipment is insecure and should not be used to capture sensitive data. We spent a year consulting experts on how to ethically capture and share recordings of multi-modal spontaneous conversations with vulnerable user groups. In this paper we provide guidance, collated from these experts, on how to ethically collect such data, and we present a new system, "CUSCO", to capture, transport and exchange sensitive data securely. This framework is intended to be easy to follow and implement, to encourage further publication of similar corpora. Using this guide and secure recording system, researchers can review and refine their ethical measures.
