HC · AI · LG · Mar 12, 2024

TutoAI: A Cross-domain Framework for AI-assisted Mixed-media Tutorial Creation on Physical Tasks

arXiv:2403.08049v1 · 18 citations · h-index: 32 · CHI
AI Analysis

This paper addresses the challenge of creating browsable tutorials for procedural skills, but its contribution is incremental: it builds on existing AI models and focuses on a specific application domain.

The paper tackles the automation of mixed-media tutorial creation for physical tasks, which is tedious to do manually and for which existing automated solutions are domain-specific. It proposes TutoAI, a cross-domain framework that uses AI models to extract tutorial components and offers guidelines for designing user interfaces around those components. In preliminary user studies, TutoAI achieved higher or similar quality compared to a baseline model.

Mixed-media tutorials, which integrate videos, images, text, and diagrams to teach procedural skills, offer more browsable alternatives than timeline-based videos. However, manually creating such tutorials is tedious, and existing automated solutions are often restricted to a particular domain. While AI models hold promise, it is unclear how to effectively harness their powers, given the multi-modal data involved and the vast landscape of models. We present TutoAI, a cross-domain framework for AI-assisted mixed-media tutorial creation on physical tasks. First, we distill common tutorial components by surveying existing work; then, we present an approach to identify, assemble, and evaluate AI models for component extraction; finally, we propose guidelines for designing user interfaces (UI) that support tutorial creation based on AI-generated components. We show that TutoAI has achieved higher or similar quality compared to a baseline model in preliminary user studies.

