NewsHomepages: Homepage Layouts Capture Information Prioritization Decisions
This work addresses the problem of quantifying information prioritization, which matters to both researchers and organizations, but the contribution is incremental because it builds on existing concepts of layout analysis.
The authors approach information prioritization by analyzing homepage layouts. They build a dataset of over 3,000 news website homepages captured twice daily over three years, develop models that infer the relative significance of news items, and apply those models to rank local city council policies in San Francisco.
Information prioritization plays an important role in how humans perceive and understand the world. Homepage layouts serve as a tangible proxy for this prioritization. In this work, we present NewsHomepages, a large dataset of over 3,000 news website homepages (including local, national and topic-specific outlets) captured twice daily over a three-year period. We develop models to perform pairwise comparisons between news items to infer their relative significance. To illustrate that modeling organizational hierarchies has broader implications, we apply our models to rank-order a collection of local city council policies passed over a ten-year period in San Francisco, assessing their "newsworthiness". Our findings lay the groundwork for leveraging implicit organizational cues to deepen our understanding of information prioritization.
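To make the step from pairwise comparisons to a rank-ordering concrete, here is a minimal sketch. It is not the models described above: it assumes the pairwise judgments are already available as (preferred, other) pairs and aggregates them with a standard Bradley-Terry fit, a technique swapped in purely for illustration. The function name `bradley_terry_scores`, the item labels, and the toy comparisons are all hypothetical.

```python
from collections import defaultdict


def bradley_terry_scores(comparisons, n_iters=500, tol=1e-10):
    """Fit Bradley-Terry strengths from (preferred, other) pairs.

    Uses the classic minorization-maximization update:
        p_i <- wins_i / sum_j n_ij / (p_i + p_j)
    Assumption: pairwise outcomes are given as input; how they are
    produced (e.g. by a learned model) is outside this sketch.
    """
    items = sorted({item for pair in comparisons for item in pair})
    if not items:
        return {}

    wins = defaultdict(int)    # number of times each item was preferred
    games = defaultdict(int)   # number of comparisons per unordered pair
    for preferred, other in comparisons:
        if preferred == other:
            continue  # a self-comparison carries no information
        wins[preferred] += 1
        games[frozenset((preferred, other))] += 1

    scores = {item: 1.0 for item in items}
    for _ in range(n_iters):
        updated = {}
        for i in items:
            denom = 0.0
            for j in items:
                if j == i:
                    continue
                n_ij = games.get(frozenset((i, j)), 0)
                if n_ij:
                    denom += n_ij / (scores[i] + scores[j])
            # Items that were never compared keep their current score.
            updated[i] = wins[i] / denom if denom > 0 else scores[i]

        total = sum(updated.values())
        if total == 0:
            return scores  # no usable comparisons at all
        # Rescale so the scores sum to the number of items.
        updated = {k: v * len(items) / total for k, v in updated.items()}

        delta = max(abs(updated[k] - scores[k]) for k in items)
        scores = updated
        if delta < tol:
            break
    return scores


# Hypothetical toy data: each tuple is (item judged more significant, other item).
comparisons = (
    [("policy_a", "policy_b")] * 3 + [("policy_b", "policy_a")] * 1
    + [("policy_b", "policy_c")] * 3 + [("policy_c", "policy_b")] * 1
    + [("policy_a", "policy_c")] * 2 + [("policy_c", "policy_a")] * 1
)
scores = bradley_terry_scores(comparisons)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

A strength-based aggregate like this tolerates noisy or contradictory pairwise judgments, which a simple sort on a single comparator would not. An item that is never preferred ends up with a score of zero under this plain update, so a practical version would likely add smoothing.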