IR, Feb 14, 2014
Designing an Ontology for the Data Documentation Initiative
Thomas Bosch, Andias Wira-Alam, Brigitte Mathiak
An ontology of the DDI 3 data model will be designed following an ontology engineering methodology that will itself be developed from state-of-the-art methodologies. DDI 3 data and metadata can then be represented in RDF, a standard web interchange format, and processed with widely available RDF tools. As a consequence, the DDI community will be able to publish and link its data sets as Linked Open Data (LOD) and become part of the LOD cloud.
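As a rough illustration of what representing DDI metadata in RDF could look like, the sketch below serializes one variable description as N-Triples using only the Python standard library. The namespace `http://example.org/ddi#`, the class name `Variable`, and the helper `variable_to_ntriples` are hypothetical placeholders chosen for this example; they are not taken from the ontology described in the abstract.

```python
# Hypothetical namespace and class name, used for illustration only.
EX = "http://example.org/ddi#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def uri(term):
    """Wrap an absolute IRI in angle brackets, as N-Triples requires."""
    return f"<{term}>"


def literal(text):
    """Quote a plain string literal, escaping the characters N-Triples reserves."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def variable_to_ntriples(var_id, label):
    """Return two N-Triples statements: a type assertion and a label."""
    subject = uri(EX + var_id)
    return [
        f"{subject} {uri(RDF_TYPE)} {uri(EX + 'Variable')} .",
        f"{subject} {uri(RDFS_LABEL)} {literal(label)} .",
    ]


if __name__ == "__main__":
    for line in variable_to_ntriples("var1", "Age of respondent"):
        print(line)
```

Output in this form can be loaded by generic RDF tooling, which is the practical benefit the abstract points to.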
IR, Aug 20, 2012
Dealing with Sparse Document and Topic Representations: Lab Report for CHiC 2012
Philipp Schaer, Daniel Hienert, Frank Sawitzki et al.
We report on the participation of GESIS in the first CHiC workshop (Cultural Heritage in CLEF). Because the workshop was held for the first time, no prior experience existed with the new data set, a document dump of Europeana with ca. 23 million documents. The most prominent issues that arose from pretests with this test collection were the very unspecific topics and the sparse document representations. Only about half of the topics (26/50) contained a description, and the titles were usually short, with only around two words. We therefore focused on three different term suggestion and query expansion mechanisms to overcome the sparse topical descriptions: two methods that build on concept extraction from Wikipedia and one method that applies co-occurrence statistics to the available Europeana corpus. In this paper we present the approaches and preliminary results from their assessments.
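To make the co-occurrence idea concrete, here is a minimal sketch of term suggestion from document-level co-occurrence counts. It assumes whitespace tokenization and scores candidates with the Dice coefficient; the abstract does not say which association measure or preprocessing the authors used, so both are assumptions, as are the function names `build_cooccurrence` and `suggest_terms`.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(docs):
    """Count document frequency per term and per unordered term pair."""
    term_counts = Counter()
    pair_counts = Counter()
    for doc in docs:
        terms = set(doc.lower().split())  # assumed tokenization: whitespace
        term_counts.update(terms)
        for a, b in combinations(sorted(terms), 2):
            pair_counts[(a, b)] += 1
    return term_counts, pair_counts


def suggest_terms(query, term_counts, pair_counts, k=3):
    """Rank terms that co-occur with query terms by summed Dice coefficient."""
    query_terms = set(query.lower().split())
    scores = Counter()
    for (a, b), n_ab in pair_counts.items():
        for q, cand in ((a, b), (b, a)):
            if q in query_terms and cand not in query_terms:
                scores[cand] += 2 * n_ab / (term_counts[q] + term_counts[cand])
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    docs = ["roman coin silver", "roman coin bronze", "roman vase", "silver ring"]
    tc, pc = build_cooccurrence(docs)
    print(suggest_terms("coin", tc, pc, k=2))
```

The suggested terms can be appended to a short topic title before retrieval, which is one simple way to compensate for two-word queries.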