AGGA: A Dataset of Academic Guidelines for Generative AI and Large Language Models
AGGA provides a resource for researchers in requirements engineering and NLP, but the contribution is incremental: it focuses on data collection and introduces no new methods.
This study introduces AGGA, a dataset of 80 academic guidelines (188,674 words) on the use of Generative AI and Large Language Models in academic settings, intended to support natural language processing tasks such as model synthesis and ambiguity detection.
This study introduces AGGA, a dataset comprising 80 academic guidelines for the use of Generative AIs (GAIs) and Large Language Models (LLMs) in academic settings, collected from official university websites. The dataset contains 188,674 words and serves as a resource for natural language processing tasks commonly applied in requirements engineering, such as model synthesis, abstraction identification, and document structure assessment. AGGA can also be further annotated to serve as a benchmark for tasks including ambiguity detection, requirements categorization, and the identification of equivalent requirements. The universities were selected to represent a diverse range of global institutions, including top-ranked universities across six continents. The dataset captures perspectives from a variety of academic fields, including humanities and technology, and from both public and private institutions, offering a broad view of how GAIs and LLMs are being integrated into academia.