Jan Sawicki

2 Papers

IR · Oct 5, 2021 · Code
Exploring usability of Reddit in data science and knowledge processing

Jan Sawicki, Maria Ganzha, Marcin Paprzycki et al.

This contribution argues that Reddit, as a massive, categorized, open-access dataset, is a useful data source for "almost any topic". Hence, it can be used in data science, e.g. for knowledge exploration. This claim is backed up by the presented analysis, based on 180 manually annotated papers related to Reddit itself and on data acquired from popular databases of scientific papers. Finally, an open-source tool is introduced, which provides easy access to Reddit resources and an exploratory data analysis of how Reddit covers selected topics. These functions can serve as a preliminary analysis before a broader exploration of Reddit's applicability.

HC · Jan 21, 2022
VisQualdex -- the comprehensive guide to good data visualization

Jan Sawicki, Michał Burdukiewicz

The rapid influx of low-quality data visualisations is one of the main challenges in today's communication. Misleading, unreadable, or confusing visualisations spread misinformation and fail to fulfill their purpose. The lack of proper tooling makes quality assessment even harder. Therefore, we propose VisQualdex, a systematic set of guidelines inspired by the Grammar of Graphics for evaluating the quality of data visualisations. To increase the practical impact of VisQualdex, we make these guidelines available in the form of a web server, visqual.info.