Four Years in Review: Statistical Practices of Likert Scales in Human-Robot Interaction Studies
This review identifies critical methodological flaws in HRI research that may undermine the reliability of conclusions about human perceptions of and attitudes toward robots.
The paper reviews statistical practices for Likert scales in human-robot interaction studies published from 2016 to 2019 and finds that only 3 of 110 papers applied proper statistical testing to correctly designed Likert scales, indicating widespread problems in both scale design and analysis.
As robots become more prevalent, the importance of the field of human-robot interaction (HRI) grows accordingly, and so does the need to employ the best statistical practices. Likert scales are commonly used in HRI to measure perceptions and attitudes. Due to misinformation or honest mistakes, most HRI researchers do not adopt best practices when analyzing Likert data. We conduct a review of psychometric literature to determine the current standard for Likert scale design and analysis. Next, we conduct a survey of four years of the International Conference on Human-Robot Interaction (2016 through 2019) and report on incorrect statistical practices and incorrect design of Likert scales. During these years, only 3 of the 110 papers applied proper statistical testing to correctly designed Likert scales. Our analysis suggests there are areas for meaningful improvement in the design and testing of Likert scales. Lastly, we provide recommendations to improve the accuracy of conclusions drawn from Likert data.