ClickSight: Interpreting Student Clickstreams to Reveal Insights on Learning Strategies via LLMs
The work addresses the challenge of extracting insights from educational data for researchers and educators, but its contribution is incremental because it builds on existing LLM methods.
The authors tackle the problem of interpreting high-dimensional student clickstream data from digital learning environments. They introduce ClickSight, an LLM-based pipeline that generates textual interpretations of learning strategies. They find that LLMs can reasonably interpret learning strategies, but that interpretation quality varies by prompting strategy and that self-refinement offers limited improvement.
Clickstream data from digital learning environments offer valuable insights into students' learning behaviors, but are challenging to interpret due to their high dimensionality and granularity. Prior approaches have relied mainly on handcrafted features, expert labeling, clustering, or supervised models, therefore often lacking generalizability and scalability. In this work, we introduce ClickSight, an in-context Large Language Model (LLM)-based pipeline that interprets student clickstreams to reveal their learning strategies. ClickSight takes raw clickstreams and a list of learning strategies as input and generates textual interpretations of students' behaviors during interaction. We evaluate four different prompting strategies and investigate the impact of self-refinement on interpretation quality. Our evaluation spans two open-ended learning environments and uses a rubric-based domain-expert evaluation. Results show that while LLMs can reasonably interpret learning strategies from clickstreams, interpretation quality varies by prompting strategy, and self-refinement offers limited improvement. ClickSight demonstrates the potential of LLMs to generate theory-driven insights from educational interaction data.
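The input/output contract described in the abstract (raw clickstream plus a list of learning strategies in, a textual interpretation out, with optional self-refinement) can be illustrated with a minimal sketch. This is not the authors' implementation: the event fields (`timestamp`, `action`, `target`), the prompt wording, the function names, and the `llm` callable are all hypothetical stand-ins, and the four prompting strategies evaluated in the paper are not reproduced here.

```python
from typing import Callable, Dict, List

# Hypothetical event schema: each click is a dict with
# 'timestamp', 'action', and 'target' keys (an assumption, not from the paper).
Event = Dict[str, str]


def serialize_clickstream(events: List[Event]) -> str:
    """Render each click event as one 'timestamp | action | target' line."""
    return "\n".join(
        f"{e['timestamp']} | {e['action']} | {e['target']}" for e in events
    )


def build_prompt(events: List[Event], strategies: List[str]) -> str:
    """Combine the strategy list and the serialized clickstream into one prompt."""
    strategy_list = "\n".join(f"- {s}" for s in strategies)
    return (
        "You are given a student's clickstream from a digital learning "
        "environment.\n"
        "For each learning strategy listed, state whether the clickstream "
        "shows evidence of it and explain why.\n\n"
        f"Learning strategies:\n{strategy_list}\n\n"
        f"Clickstream:\n{serialize_clickstream(events)}\n"
    )


def interpret(
    events: List[Event],
    strategies: List[str],
    llm: Callable[[str], str],
    refine_rounds: int = 0,
) -> str:
    """Generate an interpretation, then optionally ask the model to revise it.

    `llm` is any function mapping a prompt string to a response string;
    the refinement prompt below is an illustrative assumption.
    """
    base_prompt = build_prompt(events, strategies)
    interpretation = llm(base_prompt)
    for _ in range(refine_rounds):
        refine_prompt = (
            f"{base_prompt}\n"
            f"Previous interpretation:\n{interpretation}\n\n"
            "Check the interpretation against the clickstream and return "
            "an improved version."
        )
        interpretation = llm(refine_prompt)
    return interpretation
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM provider, so `refine_rounds=0` corresponds to a single-pass interpretation and higher values to repeated self-refinement.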