AI · LG · Sep 25, 2012

Feature selection with test cost constraint

arXiv:1209.5601v1
165 citations
Originality: Incremental advance
AI Analysis

This paper addresses the practical issue of limited resources for feature acquisition in machine learning applications. It offers an incremental improvement by redefining existing rough set problems from a constraint satisfaction perspective.

The paper tackles the problem of selecting informative yet affordable features under resource constraints by formulating feature selection with test cost as a constraint satisfaction problem, proposing a backtracking algorithm for medium-sized data and a heuristic for large datasets that finds optimal solutions in most cases.

Feature selection is an important preprocessing step in machine learning and data mining. In real-world applications, acquiring features incurs costs, including money, time and other resources. In some cases, limited resources impose a test cost constraint, and the task is then to select a feature subset that is both informative and cheap for classification. This paper proposes the problem of feature selection with a test cost constraint to address this issue. The new problem has a simple form when described as a constraint satisfaction problem (CSP). Backtracking is a general algorithm for CSP, and it is efficient in solving the new problem on medium-sized data. As the backtracking algorithm does not scale to large datasets, a heuristic algorithm is also developed. Experimental results show that the heuristic algorithm can find the optimal solution in most cases. We also redefine some existing feature selection problems in rough sets, especially in decision-theoretic rough sets, from the viewpoint of CSP. These new definitions provide insight into some new research directions.
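To make the backtracking idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes that informativeness is measured by the size of the rough-set positive region (the number of objects whose values on the selected features determine the class unambiguously), that ties are broken in favour of the cheaper subset, and that test costs are non-negative. The function names, the toy data, and the tie-breaking rule are all illustrative assumptions.

```python
from collections import defaultdict


def positive_region_size(rows, labels, subset):
    """Count rows whose values on `subset` map to exactly one class label."""
    seen_labels = defaultdict(set)
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        seen_labels[key].add(label)
        counts[key] += 1
    return sum(counts[k] for k, labs in seen_labels.items() if len(labs) == 1)


def best_subset_under_budget(rows, labels, costs, budget):
    """Backtracking search over feature subsets whose total cost <= budget.

    Assumed objective (hypothetical): maximise the positive region size,
    preferring the cheaper subset on ties.
    """
    n = len(costs)
    best = {
        "subset": (),
        "score": positive_region_size(rows, labels, ()),
        "cost": 0,
    }

    def backtrack(start, subset, cost):
        score = positive_region_size(rows, labels, subset)
        better = score > best["score"]
        cheaper_tie = score == best["score"] and cost < best["cost"]
        if better or cheaper_tie:
            best.update(subset=tuple(subset), score=score, cost=cost)
        for i in range(start, n):
            # Constraint check: never extend a subset beyond the budget.
            if cost + costs[i] <= budget:
                subset.append(i)
                backtrack(i + 1, subset, cost + costs[i])
                subset.pop()

    backtrack(0, [], 0)
    return best["subset"], best["score"], best["cost"]


# Toy decision table: feature 0 alone separates the classes but costs 5;
# features 1 and 2 together do the same for a total cost of 4.
rows = [(0, 0, 0), (1, 0, 1), (2, 1, 0), (3, 1, 1)]
labels = ["a", "b", "b", "a"]
costs = [5, 2, 2]

result = best_subset_under_budget(rows, labels, costs, budget=4)
print(result)  # ((1, 2), 4, 4)
```

The search visits every subset that fits the budget, so its running time grows exponentially with the number of affordable features. This is consistent with the abstract's remark that backtracking suits medium-sized data, while larger datasets call for a heuristic.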

