How to Formulate and Solve Statistical Recognition and Learning Problems
This work addresses foundational issues in statistical learning and recognition, offering a novel framework that could impact a broad range of machine learning applications.
The paper tackles the problem of formulating statistical recognition and learning by unifying them under a complex hypothesis testing framework, showing that some common methods are improper and proposing a generalized solution that outperforms maximal likelihood estimation in illustrative cases.
We formulate problems of statistical recognition and learning in a common framework of complex hypothesis testing. Based on arguments from multi-criteria optimization, we identify strategies that are improper for solving these problems and derive a common form of the remaining strategies. We show that some widely used approaches to recognition and learning are improper in this sense. We then propose a generalized formulation of the recognition and learning problem which embraces the whole range of sizes of the learning sample, including the zero size. Learning becomes a special case of recognition without learning. We define the concept of closest to optimal strategy, being a solution to the formulated problem, and describe a technique for finding such a strategy. On several illustrative cases, the strategy is shown to be superior to the widely used learning methods based on maximal likelihood estimation.