Ranking Under Uncertainty
The paper addresses the unreliability of rankings computed from noisy real-world data, such as biomedical measurements. The contribution is incremental in that it builds on existing similarity measures.
The paper tackles ranking reliability under noisy measurements by introducing an analytical method for assessing the influence of noise. It finds that Top-K-List overlap is more sensitive to noise than Kendall's tau. Applied to gene selection in cancer microarray experiments, the method reveals poor reliability and shows that reasonably stable lists would require much larger experiment sizes than those currently available.
Ranking objects is a simple and natural procedure for organizing data. It is often performed by assigning a quality score to each object according to its relevance to the problem at hand. Ranking is widely used for object selection, when resources are limited and it is necessary to select a subset of the most relevant objects for further processing. In real-world situations, the objects' scores are often calculated from noisy measurements, casting doubt on the reliability of the ranking. We introduce an analytical method for assessing the influence of noise levels on ranking reliability. We use two similarity measures for reliability evaluation, Top-K-List overlap and Kendall's tau, and show that the former is much more sensitive to noise than the latter. We apply our method to gene selection in a series of microarray experiments on several cancer types. The results indicate that the reliability of the lists obtained from these experiments is very poor, and that the experiment sizes necessary for attaining reasonably stable Top-K-Lists are much larger than those currently available. Simulations support our analytical results.
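To make the two similarity measures concrete, the following is a minimal simulation sketch, not the paper's analytical method. It assumes Gaussian true scores and independent additive Gaussian noise, and it compares the rankings from two noisy replicates of the same scores. The function names (`rank_order`, `top_k_overlap`, `kendall_tau`, `simulate`) and all parameter values are hypothetical choices made for illustration.

```python
import random
from itertools import combinations


def rank_order(scores):
    """Return object indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def top_k_overlap(scores_a, scores_b, k):
    """Fraction of objects shared by the two top-k lists (0.0 to 1.0)."""
    top_a = set(rank_order(scores_a)[:k])
    top_b = set(rank_order(scores_b)[:k])
    return len(top_a & top_b) / k


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(scores_a)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) // 2)


def simulate(n_objects=300, k=20, noise_sd=0.5, seed=0):
    """Compare two noisy replicates of the same true scores.

    Assumption (ours, for illustration): true scores ~ N(0, 1) and each
    replicate adds independent N(0, noise_sd^2) noise to every score.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(0.0, 1.0) for _ in range(n_objects)]
    noisy_a = [s + rng.gauss(0.0, noise_sd) for s in true_scores]
    noisy_b = [s + rng.gauss(0.0, noise_sd) for s in true_scores]
    return top_k_overlap(noisy_a, noisy_b, k), kendall_tau(noisy_a, noisy_b)


if __name__ == "__main__":
    # Sweep the noise level and print both similarity measures.
    for sd in (0.0, 0.25, 0.5, 1.0):
        overlap, tau = simulate(noise_sd=sd)
        print(f"noise_sd={sd:.2f}  top-k overlap={overlap:.2f}  tau={tau:.2f}")
```

Sweeping `noise_sd` shows how each measure responds as noise grows under these assumptions. The two measures are on different scales (overlap lies in [0, 1], tau in [-1, 1]), so any comparison of their sensitivity from such a sweep is only indicative.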