AI, Sep 29, 2012

Test-cost-sensitive attribute reduction of data with normal distribution measurement errors

arXiv:1210.0091v2, 26 citations
Originality: Incremental advance
AI Analysis

This work addresses cost-sensitive learning for realistic applications where measurement errors follow normal distributions, representing an incremental advancement over prior uniform distribution models.

The paper tackles the problem of selecting attributes with normal distribution measurement errors to minimize test costs in decision-making, proposing a new covering-based rough set model and a heuristic algorithm that shows improved effectiveness and efficiency on ten UCI datasets.

Measurement errors that follow a normal distribution are ubiquitous in applications. Generally, a smaller measurement error requires a better instrument and a higher test cost. In decision making based on the attribute values of objects, we should select an attribute subset with appropriate measurement errors to minimize the total test cost. Recently, an error-range-based covering rough set with uniformly distributed errors was proposed to investigate this issue. However, measurement errors follow a normal distribution rather than a uniform distribution, which is too simple a model for most applications. In this paper, we introduce normal distribution measurement errors into the covering-based rough set model and address the test-cost-sensitive attribute reduction problem in this new model. The major contributions of this paper are four-fold. First, we build a new data model based on normal distribution measurement errors. With the new data model, the error range is an ellipse in a two-dimensional space. Second, the covering-based rough set with normal distribution measurement errors is constructed through the "3-sigma" rule. Third, the test-cost-sensitive attribute reduction problem is redefined on this covering-based rough set. Fourth, a heuristic algorithm is proposed to deal with this problem. The algorithm is tested on ten UCI (University of California, Irvine) datasets. The experimental results show that the algorithm is more effective and efficient than the existing one. This study is a step toward realistic applications of cost-sensitive learning.
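
As a rough illustration of the "3-sigma" rule and the elliptical error range mentioned in the abstract, the following Python sketch computes the probability mass within k standard deviations and builds a neighborhood of objects that fall inside a 3-sigma ellipse around a given object. The function names, the per-attribute `sigmas` parameter, and the exact ellipse test are assumptions made for illustration; the abstract does not give the paper's formal definitions, so this should not be read as the authors' model.

```python
import math


def coverage_within(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))


def in_error_ellipse(x, y, sigmas, k=3.0):
    """Return True if y lies inside the k-sigma ellipse centred on x.

    Assumption (not taken from the paper): each attribute a has its own
    standard deviation sigmas[a], and the error range is the axis-aligned
    ellipse with semi-axes k * sigmas[a].
    """
    total = sum(((xi - yi) / (k * s)) ** 2 for xi, yi, s in zip(x, y, sigmas))
    return total <= 1.0


def neighborhood(index, data, sigmas, k=3.0):
    """Indices of all objects that fall inside the error ellipse of data[index]."""
    centre = data[index]
    return [j for j, row in enumerate(data) if in_error_ellipse(centre, row, sigmas, k)]


if __name__ == "__main__":
    # Hypothetical toy data: three objects measured on two numeric attributes.
    data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    sigmas = (0.5, 0.5)  # hypothetical per-attribute standard deviations
    print(round(coverage_within(3.0), 4))
    print([neighborhood(i, data, sigmas) for i in range(len(data))])
```

The neighborhoods of all objects together form a covering of the data set. In this sketch, a larger `sigmas` entry (a cheaper, less precise measurement) widens the ellipse and makes objects harder to tell apart, which is the trade-off between test cost and measurement error that the abstract describes.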
