AI · Jul 6, 2021

Comparing PCG metrics with Human Evaluation in Minecraft Settlement Generation

arXiv:2107.02457v1 · 14 citations
AI Analysis

This work addresses the challenge of validating PCG metrics for game design. It is incremental, however: it applies known methods to a new domain without a major breakthrough.

The paper evaluates procedural content generation (PCG) metrics by adapting existing metrics and developing new ones for Minecraft settlements, then comparing the results to human evaluations. It finds relationships between human scores and metrics that count specific elements, measure block diversity, and assess the presence of crafting materials.

There is a range of metrics that can be applied to the artifacts produced by procedural content generation, and several of them come with qualitative claims. In this paper, we adapt a range of existing PCG metrics to generated Minecraft settlements, develop a few new metrics inspired by PCG literature, and compare the resulting measurements to existing human evaluations. The aim is to analyze how those metrics capture human evaluation scores in different categories, how the metrics generalize to another game domain, and how metrics deal with more complex artifacts. We provide an exploratory look at a variety of metrics, together with an information gain analysis and several correlation analyses. We found some relationships between human scores and metrics that count specific elements, measure the diversity of blocks, and measure the presence of crafting materials for the complex blocks that are present.
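To make the kind of analysis described above concrete, here is a minimal sketch of one possible block-diversity metric (Shannon entropy over block types) and a rank correlation against human scores. This is an illustration only: the abstract does not give the paper's metric definitions, so the choice of entropy and Spearman correlation, the function names, and the toy settlements and scores are all assumptions.

```python
import math
from collections import Counter


def block_entropy(blocks):
    """Shannon entropy (bits) of block types; one ASSUMED notion of 'diversity'."""
    counts = Counter(blocks)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def average_ranks(values):
    """Rank values starting at 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; raises ValueError if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # HYPOTHETICAL toy data: block lists and human scores are made up.
    settlements = {
        "a": ["stone"] * 8,
        "b": ["stone"] * 4 + ["planks"] * 4,
        "c": ["stone", "planks", "glass", "torch"] * 2,
    }
    human_scores = {"a": 1.5, "b": 2.0, "c": 3.5}
    names = sorted(settlements)
    entropies = [block_entropy(settlements[n]) for n in names]
    scores = [human_scores[n] for n in names]
    print("entropies:", entropies)
    print("spearman:", spearman(entropies, scores))
```

Rank correlation is used in this sketch because human evaluation scores are often ordinal. Any other correlation measure could be substituted without changing the overall shape of the comparison.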

