Fedor Kozlov

1 Paper

IR, Aug 6, 2014
Unstable markup: A template-based information extraction from web sites with unstable markup

Maxim Kolchin, Fedor Kozlov

This paper presents the results of work on crawling the CEUR Workshop Proceedings website into a Linked Open Data (LOD) dataset, carried out within the framework of the ESWC 2014 Semantic Publishing Challenge. Our approach is based on an extensible template-dependent crawler and uses DBpedia to link extracted entities, such as the names of universities and countries.