IR Sep 30, 2021
Library of Congress Subject Heading (LCSH) Browsing and Natural Language Searching
Charles-Antoine Julien, Banafsheh Asadi, Jesse David Dinneen et al.
Controlled topical vocabularies (CVs) are built into information systems to aid browsing and retrieval of items that may be unfamiliar, but it is unclear how this feature should be integrated with standard keyword searching. Few systems or scholarly prototypes have attempted this, and none have used the most widely used CV, the Library of Congress Subject Headings (LCSH), which organizes monograph collections in academic libraries throughout the world. This paper describes a working prototype of a Web application that concurrently allows topic exploration using an outline tree view of the LCSH hierarchy and natural language keyword searching of a real-world Science and Engineering bibliographic collection. Pilot testing shows the system is functional, and work to fit the complex LCSH structure into a usable hierarchy is ongoing. This study contributes to knowledge of the practical design decisions required when developing linked interactions between topical hierarchy browsing and natural language searching, which promise to facilitate information discovery and exploration.
HC Sep 30, 2021
Mac Users Do It Differently: the Role of Operating System and Individual Differences in File Management
Jesse David Dinneen, Ilja Frissen
Despite much discussion in HCI research about how individual differences likely determine computer users' personal information management (PIM) practices, the extent of the influence of several important factors remains unclear, including users' personalities, spatial abilities, and the different software used to manage their collections. We therefore analyse data from prior CHI work to explore (1) associations of people's file collections with personality and spatial ability, and (2) differences between collections managed with different operating systems and file managers. We find no notable associations between users' attributes and their collections, and minimal predictive power, but do find considerable and surprising differences across operating systems. We discuss these findings and how they can inform future research.
AI Sep 20, 2021
Actionable Approaches to Promote Ethical AI in Libraries
Helen Bubinger, Jesse David Dinneen
The widespread use of artificial intelligence (AI) in many domains has revealed numerous ethical issues from data and design to deployment. In response, countless broad principles and guidelines for ethical AI have been published, and following those, specific approaches have been proposed for how to encourage ethical outcomes of AI. Meanwhile, library and information services too are seeing an increase in the use of AI-powered and machine learning-powered information systems, but no practical guidance currently exists for libraries to plan for, evaluate, or audit the ethics of intended or deployed AI. We therefore report on several promising approaches for promoting ethical AI that can be adapted from other contexts to AI-powered information services and in different stages of the software lifecycle.
HC Sep 20, 2021
The ubiquitous digital file: A review of file management research
Jesse David Dinneen, Charles-Antoine Julien
Computer users spend time every day interacting with digital files and folders, including downloading, moving, naming, navigating to, searching for, sharing, and deleting them. Such file management has been the focus of many studies across various fields, but has not been explicitly acknowledged nor made the focus of dedicated review. In this article we present the first dedicated review of this topic and its research, synthesizing more than 230 publications from various research domains to establish what is known and what remains to be investigated, particularly by examining the common motivations, methods, and findings evinced by the previously furcate body of work. We find three typical research motivations in the literature reviewed: understanding how and why users store, organize, retrieve, and share files and folders, understanding factors that determine their behavior, and attempting to improve the user experience through novel interfaces and information services. Relevant conceptual frameworks and approaches to designing and testing systems are described, and open research challenges and the significance for other research areas are discussed. We conclude that file management is a ubiquitous, challenging, and relatively unsupported activity that invites and has received attention from several disciplines and has broad importance for topics across information science.
HC Jul 7, 2021
Personal Information Management
William Jones, Jesse David Dinneen, Robert Capra et al.
Personal Information Management (PIM) refers to the practice and the study of the activities a person performs in order to acquire or create, store, organize, maintain, retrieve, use, and distribute information in each of its many forms (paper and digital, in e-mails, files, Web pages, text messages, tweets, posts, etc.) as needed to meet life's many goals (everyday and long-term, work-related and not) and to fulfill life's many roles and responsibilities (as parent, spouse, friend, employee, member of community, etc.). PIM activities are an effort to establish, use, and maintain a mapping between information and need. Activities of finding (and re-finding) move from a current need toward information while activities of keeping move from encountered information toward anticipated need. Meta-level activities such as maintaining, organizing, and managing the flow of information focus on the mapping itself. Tools and techniques of PIM can promote information integration with benefits for each kind of PIM activity and across the life cycle of personal information. Understanding how best to accomplish this integration without inadvertently creating problems along the way is a key challenge of PIM.
DL Jul 7, 2021
Not Quite 'Ask a Librarian': AI on the Nature, Value, and Future of LIS
Jesse David Dinneen, Helen Bubinger
AI language models trained on Web data generate prose that reflects human knowledge and public sentiments, but can also contain novel insights and predictions. We asked the world's best language model, GPT-3, fifteen difficult questions about the nature, value, and future of library and information science (LIS), topics that receive perennial attention from LIS scholars. We present highlights from its 45 different responses, which range from platitudes and caricatures to interesting perspectives and worrisome visions of the future, thus providing an LIS-tailored demonstration of the current performance of AI language models. We also reflect on the viability of using AI to forecast or generate research ideas in this way today. Finally, we have shared the full response log online for readers to consider and evaluate for themselves.
HC Jul 7, 2021
How Big Are Peoples' Computer Files? File Size Distributions Among User-managed Collections
Jesse David Dinneen, Ba Xuan Nguyen
Improving file management interfaces and optimising system performance requires current data about users' digital collections and particularly about the file size distributions of such collections. However, prior works have examined only the sizes of system files and users' work files in varied contexts, and there has been no such study since 2013; it therefore remains unclear how today's file sizes are distributed, particularly personal files, and further if distributions differ among the major operating systems or common occupations. Here we examine such differences among 49 million files in 348 user collections. We find that the average file size has grown more than ten-fold since the mid-2000s, though most files are still under 8 MB, and that there are demographic and technological influences in the size distributions. We discuss the implications for user interfaces, system optimisation, and PIM research.