Marcos Baez

HC
20 papers
313 citations
Novelty 25%
AI Score 19

20 Papers

CL Sep 20, 2021
Crowdsourcing Diverse Paraphrases for Training Task-oriented Bots

Jorge Ramírez, Auday Berro, Marcos Baez et al.

A prominent approach to building datasets for training task-oriented bots is crowd-based paraphrasing. Current approaches, however, assume the crowd would naturally provide diverse paraphrases or focus only on lexical diversity. In this WiP we addressed an overlooked aspect of diversity, introducing an approach for guiding the crowdsourcing process towards paraphrases that are syntactically diverse.

HC Jul 28, 2021
On the state of reporting in crowdsourcing experiments and a checklist to aid current practices

Jorge Ramírez, Burcu Sayin, Marcos Baez et al.

Crowdsourcing is being increasingly adopted as a platform to run studies with human subjects. Running a crowdsourcing experiment involves several choices and strategies to successfully port an experimental design into an otherwise uncontrolled research environment, e.g., sampling crowd workers, mapping experimental conditions to micro-tasks, or ensuring quality contributions. While several guidelines inform researchers in these choices, guidance on how and what to report from crowdsourcing experiments has been largely overlooked. If under-reported, implementation choices constitute variability sources that can affect the experiment's reproducibility and prevent a fair assessment of research outcomes. In this paper, we examine the current state of reporting of crowdsourcing experiments and offer guidance to address associated reporting issues. We start by identifying sensible implementation choices, relying on existing literature and interviews with experts, to then extensively analyze the reporting of 171 crowdsourcing experiments. Informed by this process, we propose a checklist for reporting crowdsourcing experiments.

DL Dec 15, 2020
On how Cognitive Computing will plan your next Systematic Review

Maisie Badami, Marcos Baez, Shayan Zamanirad et al.

Systematic literature reviews (SLRs) are at the heart of evidence-based research, setting the foundation for future research and practice. However, producing good quality timely contributions is a challenging and highly cognitive endeavor, which has lately motivated the exploration of automation and support in the SLR process. In this paper we address an often overlooked phase in this process, that of planning literature reviews, and explore through the lens of cognitive process augmentation how to overcome its most salient challenges. In doing so, we report on the insights from 24 SLR authors on planning practices and their challenges, as well as feedback on support strategies inspired by recent advances in cognitive computing. We frame our findings under the cognitive augmentation framework, and report on a prototype implementation and evaluation focusing on further informing the technical feasibility.

CY Dec 7, 2020
Bringing Cognitive Augmentation to Web Browsing Accessibility

Alessandro Pina, Marcos Baez, Florian Daniel

In this paper we explore the opportunities brought by cognitive augmentation to provide a more natural and accessible web browsing experience. We explore these opportunities through conversational web browsing, an emerging interaction paradigm for the Web that enables blind and visually impaired users (BVIP), as well as regular users, to access the contents and features of websites through conversational agents. Informed by the literature, our previous work and prototyping exercises, we derive a conceptual framework for supporting BVIP conversational web browsing needs, to then focus on the challenges of automatically providing this support, describing our early work and prototype that leverage heuristics that consider structural and content features only.

HC Nov 8, 2020
Chatbots as conversational healthcare services

Mlađan Jovanović, Marcos Baez, Fabio Casati

Chatbots are emerging as a promising platform for accessing and delivering healthcare services. The evidence is in the growing number of publicly available chatbots aiming at taking an active role in the provision of prevention, diagnosis, and treatment services. This article takes a closer look at how these emerging chatbots address design aspects relevant to healthcare service provision, emphasizing the Human-AI interaction aspects and the transparency in AI automation and decision making.

HC Nov 5, 2020
On the impact of predicate complexity in crowdsourced classification tasks

Jorge Ramírez, Marcos Baez, Fabio Casati et al.

This paper explores and offers guidance on a specific and relevant problem in task design for crowdsourcing: how to formulate a complex question used to classify a set of items. In micro-task markets, classification is still among the most popular tasks. We situate our work in the context of information retrieval and multi-predicate classification, i.e., classifying a set of items based on a set of conditions. Our experiments cover a wide range of tasks and domains, and also consider crowd workers alone and in tandem with machine learning classifiers. We provide empirical evidence into how the resulting classification performance is affected by different predicate formulation strategies, emphasizing the importance of predicate formulation as a task design dimension in crowdsourcing.
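
To make the idea of multi-predicate classification concrete, the following is a minimal sketch, not the authors' method. It assumes each predicate is judged separately by several crowd workers, that votes are aggregated by simple majority, and that an item is included only when every predicate holds. The function names and predicate names are hypothetical.

```python
from collections import Counter


def majority_vote(votes):
    """Return True only if strictly more workers voted True than False."""
    counts = Counter(votes)
    return counts[True] > counts[False]


def classify_item(votes_by_predicate):
    """Include an item only if every predicate passes its majority vote."""
    return all(majority_vote(v) for v in votes_by_predicate.values())


# Hypothetical votes from three workers on two made-up predicates.
votes = {
    "targets_older_adults": [True, True, False],
    "describes_intervention": [True, False, False],
}
included = classify_item(votes)  # second predicate fails, so the item is excluded
```

Asking about each predicate separately is only one possible formulation; a single combined question is another, and the paper's point is that this choice can affect performance.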

HC Nov 5, 2020
Challenges and strategies for running controlled crowdsourcing experiments

Jorge Ramírez, Marcos Baez, Fabio Casati et al.

This paper reports on the challenges and lessons we learned while running controlled experiments in crowdsourcing platforms. Crowdsourcing is becoming an attractive technique to engage a diverse and large pool of subjects in experimental research, allowing researchers to achieve levels of scale and completion times that would otherwise not be feasible in lab settings. However, the scale and flexibility come at the cost of multiple and sometimes unknown sources of bias and confounding factors that arise from technical limitations of crowdsourcing platforms and from the challenges of running controlled experiments in the "wild". In this paper, we take our experience in running systematic evaluations of task design as a motivating example to explore, describe, and quantify the potential impact of running uncontrolled crowdsourcing experiments and derive possible coping strategies. Among the challenges identified, we can mention sampling bias, controlling the assignment of subjects to experimental conditions, learning effects, and reliability of crowdsourcing results. According to our empirical studies, the impact of potential biases and confounding factors can amount to a 38% loss in the utility of the data collected in uncontrolled settings, and it can significantly change the outcome of experiments. These issues ultimately inspired us to implement CrowdHub, a system that sits on top of major crowdsourcing platforms and allows researchers and practitioners to run controlled crowdsourcing projects.

SE Sep 7, 2020
Chatbot integration in few patterns

Marcos Baez, Florian Daniel, Fabio Casati et al.

Chatbots are software agents that are able to interact with humans in natural language. Their intuitive interaction paradigm is expected to significantly reshape the software landscape of tomorrow, while already today chatbots are invading a multitude of scenarios and contexts. This article takes a developer's perspective, identifies a set of architectural patterns that capture different chatbot integration scenarios, and reviews state-of-the-art development aids.

CY Aug 19, 2020
Automatic Generation of Chatbots for Conversational Web Browsing

Pietro Chittò, Marcos Baez, Florian Daniel et al.

In this paper, we describe the foundations for generating a chatbot out of a website equipped with simple, bot-specific HTML annotations. The approach is part of what we call conversational web browsing, i.e., a dialog-based, natural language interaction with websites. The goal is to enable users to use content and functionality accessible through rendered UIs by "talking to websites" instead of by operating the graphical UI using keyboard and mouse. The chatbot mediates between the user and the website, operates its graphical UI on behalf of the user, and informs the user about the state of interaction. We describe the conceptual vocabulary and annotation format, the supporting conversational middleware and techniques, and the implementation of a demo able to deliver conversational web browsing experiences through Amazon Alexa.

HC Sep 6, 2019
CrowdHub: Extending crowdsourcing platforms for the controlled evaluation of tasks designs

Jorge Ramírez, Simone Degiacomi, Davide Zanella et al.

We present CrowdHub, a tool for running systematic evaluations of task designs on top of crowdsourcing platforms. The goal is to support the evaluation process, avoiding potential experimental biases that, according to our empirical studies, can amount to a 38% loss in the utility of the collected dataset in uncontrolled settings. Using CrowdHub, researchers can map their experimental design and automate the complex process of managing task execution over time while controlling for returning workers and crowd demographics, thus reducing bias, increasing utility of collected data, and making more efficient use of a limited pool of subjects.
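
As an illustration of what controlling for returning workers can look like, here is a minimal sketch under stated assumptions. It is not CrowdHub's implementation or API: the function `assign_condition`, the in-memory `seen_workers` set, and the balancing rule (send each new worker to the least-filled condition) are all hypothetical.

```python
def assign_condition(worker_id, conditions, seen_workers, counts):
    """Return a condition for a new worker, or None for a returning worker."""
    if worker_id in seen_workers:
        # Returning worker: exclude to avoid learning effects across conditions.
        return None
    # Pick the least-filled condition; ties are broken by list order.
    condition = min(conditions, key=lambda c: counts.get(c, 0))
    seen_workers.add(worker_id)
    counts[condition] = counts.get(condition, 0) + 1
    return condition


# Hypothetical run with two conditions and one returning worker ("w1").
seen, counts = set(), {}
assigned = [
    assign_condition(w, ["A", "B"], seen, counts)
    for w in ["w1", "w2", "w1", "w3"]
]
```

A real deployment would also need persistent storage of worker IDs across task batches and platforms, which this sketch leaves out.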

HC Sep 6, 2019
Understanding the Impact of Text Highlighting in Crowdsourcing Tasks

Jorge Ramírez, Marcos Baez, Fabio Casati et al.

Text classification is one of the most common goals of machine learning (ML) projects, and also one of the most frequent human intelligence tasks in crowdsourcing platforms. ML has mixed success in such tasks depending on the nature of the problem, while crowd-based classification has proven to be surprisingly effective, but can be expensive. Recently, hybrid text classification algorithms, combining human computation and machine learning, have been proposed to improve accuracy and reduce costs. One way to do so is to have ML highlight or emphasize portions of text that it believes to be more relevant to the decision. Humans can then rely only on this text or read the entire text if the highlighted information is insufficient. In this paper, we investigate if and under what conditions highlighting selected parts of the text can (or cannot) improve classification cost and/or accuracy, and in general how it affects the process and outcome of the human intelligence tasks. We study this through a series of crowdsourcing experiments running over different datasets and with task designs imposing different cognitive demands. Our findings suggest that highlighting is effective in reducing classification effort but does not improve accuracy - and in fact, low-quality highlighting can decrease it.

IR Apr 1, 2019
Combining Crowd and Machines for Multi-predicate Item Screening

Evgeny Krivosheev, Fabio Casati, Marcos Baez et al.

This paper discusses how crowd and machine classifiers can be efficiently combined to screen items that satisfy a set of predicates. We show that this is a recurring problem in many domains, present machine-human (hybrid) algorithms that screen items efficiently and estimate the gain over human-only or machine-only screening in terms of performance and cost. We further show how, given a new classification problem and a set of classifiers of unknown accuracy for the problem at hand, we can identify how to manage the cost-accuracy trade off by progressively determining if we should spend budget to obtain test data (to assess the accuracy of the given classifiers), or to train an ensemble of classifiers, or whether we should leverage the existing machine classifiers with the crowd, and in this case how to efficiently combine them based on their estimated characteristics to obtain the classification. We demonstrate that the techniques we propose obtain significant cost/accuracy improvements with respect to the leading classification algorithms.
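
One simple way to picture combining a machine classifier with crowd votes is a Bayesian update, sketched below. This is an illustrative assumption rather than the algorithm from the paper: the machine's output is treated as a prior probability that a predicate holds, every crowd worker is assumed to have the same known accuracy, votes are assumed independent, and predicates are assumed independent of each other. All names are hypothetical.

```python
def posterior(prior, votes, accuracy):
    """Update P(predicate holds) with crowd votes.

    Assumes 0 < prior < 1 and 0.5 < accuracy < 1.
    """
    odds = prior / (1.0 - prior)
    ratio = accuracy / (1.0 - accuracy)  # likelihood ratio of one True vote
    for vote in votes:
        odds *= ratio if vote else 1.0 / ratio
    return odds / (1.0 + odds)


def prob_all_hold(predicates, accuracy=0.8):
    """Probability that an item satisfies every predicate (independence assumed)."""
    p = 1.0
    for prior, votes in predicates:
        p *= posterior(prior, votes, accuracy)
    return p


# Hypothetical item: (machine prior, crowd votes) for each of two predicates.
item = [(0.9, [True]), (0.5, [True, False])]
p_in = prob_all_hold(item)
```

Under this sketch, an item whose probability falls below a chosen threshold could be screened out without collecting further votes, which is where the cost savings would come from.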

HC Jan 14, 2019
Technologies for promoting social participation in later life

Marcos Baez, Radoslaw Nielek, Fabio Casati et al.

Social participation is known to bring great benefits to the health and well-being of people as they age. From being in contact with others to engaging in group activities, keeping socially active can help slow down the effects of age-related declines, reduce risks of loneliness and social isolation and even mortality in old age. There are unfortunately a variety of barriers that make it difficult for older adults to engage in social activities on a regular basis. In this chapter, we give an overview of the challenges to social participation and discuss how technology can help overcome these barriers and promote participation in social activities. We examine two particular research threads and designs, exploring ways in which technology can support co-located and virtual participation: i) an application that motivates the virtual participation in group training programs, and ii) a location-based game that supports co-located intergenerational ICT training classes. We discuss the effectiveness and limitations of various design choices in the two use cases and outline the lessons learned.

CY May 31, 2018
Designing for Co-located and Virtual Social Interactions in Residential Care

Francisco Ibarra, Marcos Baez, Francesca Fiore et al.

In this paper we explore the feasibility and design challenges in supporting co-located and virtual social interactions in residential care by building on the practice of reminiscence. Motivated by the challenges of social interaction in this context, we first explore the feasibility of a reminiscence-based social interaction tool designed to stimulate conversation in residential care with different stakeholders. Then, we explore the design challenges in supporting an assisting role in co-located reminiscence sessions, by running pilot studies with a technology probe. Our findings point to the feasibility of the tool and the willingness of stakeholders to contribute in the process, although with some skepticism about virtual interactions. The reminiscence sessions showed that compromises are needed when designing for both story collection and conversation stimulation, evidencing specific design areas where further exploration is needed.

HC May 31, 2018
CrowdRev: A platform for Crowd-based Screening of Literature Reviews

Jorge Ramírez, Evgeny Krivosheev, Marcos Baez et al.

In this paper and demo we present a crowd and crowd+AI based system, called CrowdRev, supporting the screening phase of literature reviews and achieving the same quality as author classification at a fraction of the cost, and near-instantly. CrowdRev makes it easy for authors to leverage the crowd, and ensures that no money is wasted even in the face of difficult papers or criteria: if the system detects that the task is too hard for the crowd, it just gives up trying (for that paper, or for that criterion, or altogether), without wasting money and never compromising on quality.

HC May 31, 2018
Crowdsourcing for Reminiscence Chatbot Design

Svetlana Nikitina, Florian Daniel, Marcos Baez et al.

In this work-in-progress paper we discuss the challenges in identifying effective and scalable crowd-based strategies for designing content, conversation logic, and meaningful metrics for a reminiscence chatbot targeted at older adults. We formalize the problem and outline the main research questions that drive the research agenda in chatbot design for reminiscence and for relational agents for older adults in general.

HC Apr 18, 2018
Smart Conversational Agents for Reminiscence

Svetlana Nikitina, Sara Callaioli, Marcos Baez

In this paper we describe the requirements and early system design for a smart conversational agent that can assist older adults in the reminiscence process. The practice of reminiscence has well documented benefits for the mental, social and emotional well-being of older adults. However, the technology support, valuable in many different ways, is still limited in terms of need of co-located human presence, data collection capabilities, and ability to support sustained engagement, thus missing key opportunities to improve care practices, facilitate social interactions, and bring the reminiscence practice closer to those with fewer opportunities to engage in co-located sessions with a (trained) companion. We discuss conversational agents and cognitive services as the platform for building the next generation of reminiscence applications, and introduce the concept application of a smart reminiscence agent.

HC Mar 18, 2017
Designing for older adults: review of touchscreen design guidelines

Leysan Nurgalieva, Juan Jose Jara Laconich, Marcos Baez et al.

The distinct abilities of older adults to interact with computers have motivated a wide range of contributions in the form of design guidelines for making technologies usable and accessible for the elderly population. However, despite the growing effort by the research community, the adoption of guidelines by developers and designers has been scant or not properly translated into more accessible interaction systems. In this paper we explore this issue by reporting on the qualitative outcomes of a systematic review of 204 research-derived design guidelines for touchscreen applications. We report first on the different definitions of "elderly" and assess the reliability, organization and accessibility of the guidelines. Then we present our early attempt at facilitating the reporting and access of such guidelines to researchers and practitioners.

HC Sep 17, 2016
Online Group-exercises for Older Adults of Different Physical Abilities

Marcos Baez, Francisco Ibarra, Iman Khaghani Far et al.

In this paper we describe the design and validation of a virtual fitness environment aiming at keeping older adults physically and socially active. We target particularly older adults who are socially more isolated, physically less active, and with fewer chances of training in a gym. The virtual fitness environment, namely Gymcentral, was designed to enable and motivate older adults to follow personalised exercises from home, with a (heterogeneous) group of remote friends and under the remote supervision of a Coach. We take the training activity as an opportunity to create social interactions, by complementing training features with social instruments. Finally, we report on the feasibility and effectiveness of the virtual environment, as well as its effects on the usage and social interactions, from an intervention study in Trento, Italy.

CY Mar 9, 2016
Personalized Persuasion for Social Interactions in Nursing Homes

Marcos Baez, Chiara Dalpiaz, Fatbardha Hoxha et al.

This paper presents our preliminary investigation and approach towards a mixed physical-virtual technology for stimulating social interactions among and with older adults in nursing homes. We report on a set of surveys, apps and focus groups aiming at understanding the different motivations and obstacles in promoting social interactions in institutionalised care. We then present our approach to address some of the key themes found, e.g., the technological disparity, lack of conversation topics and opportunities to interact.