Leopoldo Lusquino Filho

2 Papers

40.9 CY Apr 23
Brazilian Social Media Anti-vaccine Information Disorder Dataset -- Telegram (2020-2025)

João Phillipe Cardenuto, Ana Carolina Monari, Michelle Diniz Lopes et al.

Over the past decade, Brazil has experienced a decline in vaccination coverage, reversing decades of public health progress achieved through the National Immunization Program (PNI). Growing evidence points to the widespread circulation of vaccine-related misinformation, particularly on social media platforms, as a key factor driving this decline. Among these platforms, Telegram remains the only major one that permits accessible and ethical data collection, offering insight into public channels where vaccine misinformation circulates extensively. This data paper introduces a curated dataset of about four million Telegram posts collected from 119 prominent Brazilian anti-vaccine channels between 2020 and 2025. The dataset includes message content, metadata, associated media, and classification of vaccine-related posts, enabling researchers to examine how false or misleading information spreads, evolves, and influences public sentiment. By providing this resource, we aim to support the scientific and public health community in developing evidence-based strategies to counter misinformation, promote trust in vaccination, and engage compassionately with individuals and communities affected by false narratives. The dataset and documentation are openly available for non-commercial research, under strict ethical and privacy guidelines, at https://doi.org/10.25824/redu/5JIVDT

31.1 LG May 9
METBRA25Y: Brazil Surface Meteorology Archive with Harmonized Variables and Quality Control

Matheus Lima Castro, William Dantas Vichete, Leopoldo Lusquino Filho

This data paper describes METBRA25Y, a harmonized archive of hourly surface meteorological observations from Brazil derived from public historical records of the Instituto Nacional de Meteorologia (INMET). The dataset was designed to support reproducible environmental, climatological, hydrological, agricultural, urban-risk, and machine-learning studies that require station-level meteorological time series with standardized variable names and explicit quality-control metadata. The processing workflow ingests annual INMET archives, parses station metadata from raw file headers, normalizes heterogeneous Portuguese column names into a canonical schema, constructs hourly timestamps, consolidates observations by city and station, and exports compressed CSV files together with station manifests, per-station quality flags, daily precipitation aggregates, variable-level failure summaries, and missing-data audits. The quality-control protocol follows a two-stage strategy: first, physically implausible values are converted to missing values and flagged; second, temporal and cross-variable consistency checks generate diagnostic flags without necessarily overwriting the original measurements. The resulting package covers observations between 2000 and 2025, with station-specific temporal coverage, and includes key meteorological variables such as precipitation, air temperature, dew point, relative humidity, atmospheric pressure, wind speed, wind gust, wind direction, and global solar radiation. Based on the summary files included in the current release snapshot, the archive contains 616 unique station codes across variable summaries, of which 605 have coordinates within a broad Brazil plausibility envelope. This paper documents the dataset provenance, file organization, harmonized schema, quality-control rules, technical validation outputs, limitations, and recommended usage practices.
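The two-stage quality-control strategy described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the column names (`air_temperature_c`, `dew_point_c`, `relative_humidity_pct`), the plausibility limits, the hourly step threshold, and the flag labels are all assumptions chosen for illustration, since the abstract does not state them.

```python
# Hypothetical plausibility limits; the actual METBRA25Y thresholds are not
# given in the abstract.
PLAUSIBLE_RANGE = {
    "air_temperature_c": (-40.0, 60.0),
    "relative_humidity_pct": (0.0, 100.0),
}
# Hypothetical maximum hour-to-hour change used by the temporal check.
MAX_HOURLY_STEP = {"air_temperature_c": 15.0}


def quality_control(records):
    """Apply a two-stage QC to hourly observation dicts given in time order.

    Missing values are represented as None. Returns (cleaned, flags), two
    lists parallel to the input; the input records are not modified.
    """
    cleaned, flags = [], []
    previous = {}
    for record in records:
        row, row_flags = dict(record), {}

        # Stage 1: physically implausible values become missing and are flagged.
        for name, (low, high) in PLAUSIBLE_RANGE.items():
            value = row.get(name)
            if value is not None and not (low <= value <= high):
                row[name] = None
                row_flags[name] = "range_fail"

        # Stage 2a: temporal check. Flags a large jump but keeps the value.
        for name, max_step in MAX_HOURLY_STEP.items():
            value, last = row.get(name), previous.get(name)
            if value is not None and last is not None:
                if abs(value - last) > max_step:
                    row_flags[name] = "step_suspect"

        # Stage 2b: cross-variable check. Dew point above air temperature is
        # flagged, and the measurement is again left untouched.
        temp, dew = row.get("air_temperature_c"), row.get("dew_point_c")
        if temp is not None and dew is not None and dew > temp:
            row_flags["dew_point_c"] = "exceeds_air_temperature"

        cleaned.append(row)
        flags.append(row_flags)
        previous = row
    return cleaned, flags
```

In this sketch only the first stage changes data; the second stage produces diagnostic flags alongside the original values, which mirrors the abstract's distinction between converting implausible values to missing and flagging inconsistencies without overwriting measurements.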