CL, AI · Sep 10, 2020

Classification of descriptions and summary using multiple passes of statistical and natural language toolkits

arXiv:2009.04953v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses a specific, incremental problem for package management or documentation systems by providing a tool to assess name-summary alignment.

The paper tackles the problem of checking the relevance of an entity's name to its summary or definition, specifically for package names and summaries from pypi.org, by developing a classifier that outputs a percentage score for this relevance.

This document describes a possible approach for checking how relevant the summary or definition of an entity is to its name. The classifier is, in other words, a name relevance check. The percentage score it produces can be used on its own or combined with scores from other metrics to arrive at a final classification. Potential improvements are outlined at the end of the document. The dataset used to obtain an objective score is a list of package names and their respective summaries, sourced from pypi.org.
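To make the idea of a percentage relevance score concrete, here is a minimal sketch in Python. It is not the paper's method, which the abstract does not spell out. It assumes a simple token-overlap heuristic: the score is the share of name tokens that also appear in the summary, either exactly or by sharing a short prefix. The function names `tokenize`, `name_relevance`, and `combine`, the prefix length, and the weighting scheme are all hypothetical choices made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it on any non-alphanumeric character."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def _matches(name_token, summary_token, min_prefix=4):
    """Exact match, or a crude stem match on the first `min_prefix` characters.

    The prefix rule is an assumption standing in for real stemming.
    """
    if name_token == summary_token:
        return True
    return (
        len(name_token) >= min_prefix
        and len(summary_token) >= min_prefix
        and name_token[:min_prefix] == summary_token[:min_prefix]
    )


def name_relevance(name, summary):
    """Return the percentage (0 to 100) of name tokens found in the summary."""
    name_tokens = tokenize(name)
    if not name_tokens:
        return 0.0
    summary_tokens = tokenize(summary)
    hits = sum(
        1 for n in name_tokens if any(_matches(n, s) for s in summary_tokens)
    )
    return 100.0 * hits / len(name_tokens)


def combine(name_score, other_score, weight=0.5):
    """Blend the name score with another metric (hypothetical weighting)."""
    return weight * name_score + (1.0 - weight) * other_score


if __name__ == "__main__":
    print(name_relevance("json-parser", "A fast parser for JSON documents"))
    print(name_relevance("foobar", "A fast parser for JSON documents"))
```

A score like this could be used on its own with a threshold, or passed to something like `combine` alongside a second metric, in line with the abstract's remark about supplementing other scores.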

