An autonomous living database for perovskite photovoltaics
This work addresses the knowledge gap facing photovoltaics researchers by enabling real-time, data-driven discovery, though the contribution is incremental because it applies existing methods to a new domain.
The authors tackled the bottleneck of manual curation in perovskite photovoltaics by creating an autonomous living database (PERLA) that extracts device data from literature with over 90% precision, revealing a shift toward inverted architectures and formamidinium-rich compositions that reduce voltage loss.
Scientific discovery is severely bottlenecked by the inability of manual curation to keep pace with exponential publication rates, which creates a widening knowledge gap. The gap is especially stark in photovoltaics, where the leading database for perovskite solar cells has been stagnant since 2021 despite massive ongoing research output. Here, we resolve this challenge by establishing an autonomous, self-updating living database (PERLA). Our pipeline integrates large language models with physics-aware validation to extract complex device data from the continuous literature stream, achieving human-level precision (>90%) and eliminating annotator variance. By applying this system to the previously inaccessible post-2021 literature, we uncover critical evolutionary trends hidden by data lag: the field has decisively shifted toward inverted architectures employing self-assembled monolayers and formamidinium-rich compositions, driving a clear trajectory of sustained voltage loss reduction. PERLA transforms static publications into dynamic knowledge resources that enable data-driven discovery to operate at the speed of publication.
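The abstract does not say which checks make up the physics-aware validation, so the following is only a minimal sketch of what one such check might look like, not PERLA's actual method. It assumes an extracted record carries the standard device metrics and tests them against the textbook relation PCE = Voc × Jsc × FF / P_in at one-sun illumination (100 mW/cm²). The names `DeviceRecord`, `implied_pce_pct`, `is_consistent`, and `voltage_loss_v`, the 0.5-point tolerance, and the sample values are all hypothetical.

```python
from dataclasses import dataclass

# Standard AM1.5G one-sun irradiance, in mW/cm^2.
ONE_SUN_MW_PER_CM2 = 100.0


@dataclass
class DeviceRecord:
    """Hypothetical schema for one extracted device; not PERLA's schema."""
    voc_v: float        # open-circuit voltage, V
    jsc_ma_cm2: float   # short-circuit current density, mA/cm^2
    ff: float           # fill factor as a fraction (0 to 1)
    pce_pct: float      # reported power conversion efficiency, %
    bandgap_ev: float   # absorber bandgap, eV


def implied_pce_pct(rec: DeviceRecord) -> float:
    """PCE implied by Voc, Jsc, and FF (V * mA/cm^2 gives mW/cm^2)."""
    return rec.voc_v * rec.jsc_ma_cm2 * rec.ff / ONE_SUN_MW_PER_CM2 * 100.0


def is_consistent(rec: DeviceRecord, tol_pct: float = 0.5) -> bool:
    """Reject records that break basic physical constraints.

    The tolerance is an assumed value chosen for illustration.
    """
    if not 0.0 < rec.ff <= 1.0:
        return False  # e.g. a fill factor extracted as 82 instead of 0.82
    if not 0.0 < rec.voc_v < rec.bandgap_ev:
        return False  # q*Voc cannot exceed the bandgap
    return abs(implied_pce_pct(rec) - rec.pce_pct) <= tol_pct


def voltage_loss_v(rec: DeviceRecord) -> float:
    """Voltage loss Eg/q - Voc, in volts."""
    return rec.bandgap_ev - rec.voc_v


# Made-up values for illustration only, not data from the paper.
good = DeviceRecord(voc_v=1.15, jsc_ma_cm2=25.0, ff=0.82, pce_pct=23.6, bandgap_ev=1.55)
bad = DeviceRecord(voc_v=1.15, jsc_ma_cm2=25.0, ff=82.0, pce_pct=23.6, bandgap_ev=1.55)
```

A filter of this kind would catch unit mix-ups and misattributed values before they enter the database, and the same record yields the voltage loss (bandgap minus Voc) that the trend analysis tracks.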