Dagmar Kern

IR
9 papers
632 citations
Novelty 17%
AI Score 19

9 Papers

CL · Feb 13, 2023
Evaluation of Word Embeddings for the Social Sciences

Ricardo Schiffers, Dagmar Kern, Daniel Hienert

Word embeddings are an essential instrument in many NLP tasks. Most available resources are trained on general language from Web corpora or Wikipedia dumps. However, word embeddings for domain-specific language are rare, in particular for the social science domain. Therefore, in this work, we describe the creation and evaluation of word embedding models based on 37,604 open-access social science research papers. In the evaluation, we compare domain-specific and general language models for (i) language coverage, (ii) diversity, and (iii) semantic relationships. We found that the created domain-specific model, even with a relatively small vocabulary size, covers a large part of social science concepts, and that their neighborhoods are diverse in comparison to more general models. Across all relation types, we found a more extensive coverage of semantic relationships.
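As a rough illustration of the language-coverage comparison mentioned in this abstract, the sketch below computes the share of concept terms found in a model's vocabulary. It is not the authors' evaluation code: the function name `coverage`, the case-insensitive matching, and the toy word lists are all assumptions made for the example.

```python
def coverage(vocabulary, concepts):
    """Return the share of concept terms that appear in the vocabulary.

    Matching is case-insensitive (an assumption of this sketch).
    """
    vocab = {word.lower() for word in vocabulary}
    terms = [concept.lower() for concept in concepts]
    if not terms:
        return 0.0
    return sum(term in vocab for term in terms) / len(terms)


# Hypothetical toy data, not taken from the paper.
domain_vocab = ["anomie", "habitus", "survey", "panel"]
general_vocab = ["survey", "panel", "laptop"]
concepts = ["anomie", "habitus", "survey", "cohort"]

print(coverage(domain_vocab, concepts))   # 0.75
print(coverage(general_vocab, concepts))  # 0.25
```

With real models, the vocabulary would come from the embedding model's key list and the concepts from a domain thesaurus.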

HC · Aug 5, 2020
'A Modern Up-To-Date Laptop' -- Vagueness in Natural Language Queries for Product Search

Andrea Papenmeier, Alfred Sliwa, Dagmar Kern et al.

With the rise of voice assistants and an increase in mobile search usage, natural language has become an important query language. So far, most current systems are unable to process these queries because of the vagueness and ambiguity in natural language. Users have adapted their query formulation to what they think the search engine is capable of, which adds to their cognitive burden. With our research, we contribute to the design of interactive search systems by investigating the genuine information need in a product search scenario. In a crowd-sourcing experiment, we collected 132 information needs in natural language. We examine the vagueness of the formulations and their match to retailer-generated content and user-generated product reviews. Our findings reveal high variance in the level of vagueness and the potential of user reviews as a source for supporting users with rather vague search intents.

IR · Aug 5, 2020
The Role of Word-Eye-Fixations for Query Term Prediction

Masoud Davari, Daniel Hienert, Dagmar Kern et al.

Throughout the search process, the user's gaze on inspected SERPs and websites can reveal his or her search interests. Gaze behavior can be captured with eye tracking and described with word-eye-fixations. Word-eye-fixations contain the user's accumulated gaze fixation duration on each individual word of a web page. In this work, we analyze the role of word-eye-fixations for predicting query terms. We investigate the relationship between a range of in-session features, in particular gaze data, and the query terms, and train models for predicting query terms. We use a dataset of 50 search sessions obtained through a lab study in the social sciences domain. Using established machine learning models, we can predict query terms with comparatively high accuracy, even with little training data. Feature analysis shows that the categories Fixation, Query Relevance and Session Topic contain the most effective features for our task.
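The definition of word-eye-fixations given here (accumulated fixation duration per word) can be sketched in a few lines. This is only an illustration under assumptions: the input format of `(word, duration_ms)` pairs, the function name `word_eye_fixations`, and the lowercasing of words are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def word_eye_fixations(fixations):
    """Sum fixation durations per word.

    `fixations` is an iterable of (word, duration_ms) pairs, a
    hypothetical input format chosen for this sketch.
    """
    totals = defaultdict(float)
    for word, duration_ms in fixations:
        totals[word.lower()] += duration_ms  # lowercasing is an assumption
    return dict(totals)


# Toy gaze data for one page (invented values).
fixations = [("survey", 180), ("panel", 120), ("Survey", 220), ("data", 90)]
totals = word_eye_fixations(fixations)

# Words ordered by accumulated duration, longest first.
ranked = sorted(totals, key=totals.get, reverse=True)
print(ranked)
```

Totals of this kind could serve as one input feature among others when training a model to predict query terms.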

DL · Sep 24, 2019
Recognizing Topic Change in Search Sessions of Digital Libraries based on Thesaurus and Classification System

Daniel Hienert, Dagmar Kern

Log analysis in Web search showed that user sessions often contain several different topics. This means sessions need to be segmented into parts which handle the same topic in order to give appropriate user support based on the topic, and not on a mixture of topics. Different methods have been proposed to segment a user session into different topics based on timeouts, lexical analysis, query similarity or external knowledge sources. In this paper, we study the problem in a digital library for the social sciences. We present a method based on a thesaurus and a classification system, which are typical knowledge organization systems in digital libraries. Five experts evaluated our approach and rated it as good for the segmentation of search sessions into parts that treat the same topic.
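One simple way to picture segmentation with an external knowledge source is to start a new segment whenever consecutive queries share no concept. The sketch below does this with a hypothetical term-to-concept lookup; it is not the method from the paper, whose details the abstract does not give, and the names `segment_session` and `concept_lookup` are invented for the example.

```python
def segment_session(queries, concept_lookup):
    """Split a list of queries into topical segments.

    A new segment starts when a query shares no concept with the
    previous query. `concept_lookup` maps a term to a set of concepts
    and stands in for a thesaurus or classification (an assumption).
    """
    segments = []
    previous = None
    for query in queries:
        concepts = set()
        for token in query.lower().split():
            concepts |= concept_lookup.get(token, set())
        if segments and previous is not None and concepts & previous:
            segments[-1].append(query)
        else:
            segments.append([query])
        previous = concepts
    return segments


# Toy lookup and session (invented data).
lookup = {
    "unemployment": {"labour market"},
    "jobless": {"labour market"},
    "migration": {"migration"},
    "refugees": {"migration"},
}
session = ["unemployment youth", "jobless rate",
           "migration policy", "refugees germany"]
segments = segment_session(session, lookup)
print(segments)
```

Comparing only adjacent queries keeps the sketch short; a real system would likely also consider viewed documents and broader or narrower concept relations.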

IR · Sep 19, 2019
Understanding the Information needs of Social Scientists in Germany

Dagmar Kern, Daniel Hienert

The information needs of social science researchers are manifold and have been studied in almost every decade since the 1950s. With this paper, we contribute to this series and present the results of three studies. We asked 367 social science researchers in Germany about their information needs and identified needs in different categories: literature, research data, measurement instruments, support for data analysis, support for data collection, variables in research data, software support, networking/cooperation, and illustrative material. The search for literature and research data is still the main information need, with more than three-quarters of our participants expressing needs in these categories. With comprehensive lists of altogether 154 concrete information needs, including those expressed by only one participant, we contribute to the holistic understanding of the information needs of today's social science researchers.

IR · Feb 12, 2019
Reading Protocol: Understanding what has been Read in Interactive Information Retrieval Tasks

Daniel Hienert, Dagmar Kern, Matthew Mitsui et al.

In Interactive Information Retrieval (IIR) experiments, the user's gaze motion on web pages is often recorded with eye tracking. The data is used to analyze gaze behavior or to identify Areas of Interest (AOI) the user has looked at. So far, tools for analyzing eye tracking data have certain limitations in supporting the analysis of gaze behavior in IIR experiments. Experiments often consist of a huge number of different visited web pages. In existing analysis tools, the data can only be analyzed in videos or images, and AOIs for every single web page have to be specified by hand in a very time-consuming process. In this work, we propose the reading protocol software, which breaks eye tracking data down to the textual level by considering the HTML structure of the web pages. This has several advantages for the analyst. First and foremost, it can easily be identified on a large scale what has actually been viewed and read on the stimuli pages by the subjects. Second, the web page structure can be used to filter to AOIs. Third, gaze data of multiple users can be presented on the same page, and fourth, fixation times on text can be exported and further processed in other tools. We present the software, its validation, and example use cases with data from three existing IIR experiments.

IR · Sep 7, 2018
Term-Mouse-Fixations as an Additional Indicator for Topical User Interests in Domain-Specific Search

Daniel Hienert, Dagmar Kern

Models in Interactive Information Retrieval (IIR) are grounded largely in the user's task in order to give system support based on different task types and topics. However, the automatic recognition of user interests from log data in search systems is not trivial. Search queries entered by users are surely one such source. However, queries may be short, or users are only browsing. In this paper, we propose a method of term-mouse-fixations which takes the fixations on terms users are hovering over with the mouse into consideration to estimate topical user interests. We analyzed 22,259 search sessions of a domain-specific digital library over a period of about four months. We compared these mouse fixations to user-entered search terms and to titles and keywords from documents the user showed an interest in. These terms were found in 87.12% of all analyzed sessions; in this subset of sessions, on average 11.46 term-mouse-fixations from queries and viewed documents were found per session. These terms were fixated significantly longer, at about 7 seconds, than other terms, at about 4.4 seconds. This means that term-mouse-fixations provide indicators for topical user interests and that it is possible to extract them based on fixation time.
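The comparison described in this abstract, hover durations on terms that also occur in queries or viewed documents versus durations on all other terms, can be sketched as follows. The input format of `(term, seconds)` pairs, the function name `compare_fixations`, and the toy values are assumptions for illustration and do not reproduce the study's data or its significance test.

```python
from statistics import mean


def compare_fixations(hovers, interest_terms):
    """Return mean hover duration for interest terms and for other terms.

    `hovers` is a list of (term, seconds) pairs; `interest_terms` are
    terms from queries and viewed documents. Both formats are
    hypothetical. A mean is None when its group is empty.
    """
    interest = {term.lower() for term in interest_terms}
    matched = [secs for term, secs in hovers if term.lower() in interest]
    other = [secs for term, secs in hovers if term.lower() not in interest]
    return (mean(matched) if matched else None,
            mean(other) if other else None)


# Invented hover log for a single session.
hovers = [("migration", 7.5), ("policy", 6.5), ("login", 4.0), ("help", 5.0)]
matched_mean, other_mean = compare_fixations(hovers, ["Migration", "policy"])
print(matched_mean, other_mean)  # 7.0 4.5
```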

IR · Apr 27, 2015
WHOSE - A Tool for Whole-Session Analysis in IIR

Daniel Hienert, Wilko van Hoek, Alina Weber et al.

One of the main challenges in Interactive Information Retrieval (IIR) evaluation is the development and application of re-usable tools that allow researchers to analyze search behavior of real users in different environments and different domains, but with comparable results. Furthermore, IIR has recently focused more on the analysis of whole sessions, which includes all user interactions that are carried out within a session but also across several sessions by the same user. Some frameworks have already been proposed for the evaluation of controlled experiments in IIR, but no framework is yet available for interactive evaluation of search behavior from real-world information retrieval (IR) systems with real users. In this paper, we present a framework for whole-session evaluation that can also utilize these uncontrolled data sets. The logging component can easily be integrated into real-world IR systems for generating and analyzing new log data. Furthermore, due to a supplementary mapping, it is also possible to analyze existing log data. For every IR system, different actions and filters can be defined. This allows system operators and researchers to use the framework for the analysis of user search behavior in their IR systems and to compare it with others. Using a graphical user interface, they can interactively explore the data set from a broad overview down to individual sessions.

IR · Jul 6, 2014
Establishing an Online Access Panel for Interactive Information Retrieval Research

Dagmar Kern, Peter Mutschke, Philipp Mayr

We propose an online access panel, called IIRpanel, to support the evaluation process of Interactive Information Retrieval (IIR) systems. By maintaining an online access panel with users of IIR systems, we assume that the recurring effort to recruit participants for web-based as well as lab studies can be minimized. We aim to use the online access panel not only for our own development processes but also to open it to other interested researchers in the field of IIR. In this paper, we present the concept of IIRpanel as well as first implementation details.