Toward predictive machine learning for active vision
This work addresses active vision for machine learning applications, but its contribution appears incremental because it builds on existing frameworks such as Friston's active inference.
The paper tackles the problem of implementing active inference for vision by proposing a machine-learning-compliant cognitive architecture and control policies, showing that offline calculation with saliency maps saves processing costs with negligible impact on recognition and compression rates.
We develop a comprehensive description of the active inference framework, as proposed by Friston (2010), from a machine-learning-compliant perspective. Drawing on biological inspiration and auto-encoding principles, we sketch a cognitive architecture that should provide ways to implement estimation-oriented control policies. Computer simulations illustrate the effectiveness of the approach through foveated inspection of the input data. The pros and cons of the control policy are analyzed in detail, showing promise in terms of processing compression. Although optimizing the future posterior entropy over the action set is shown to be sufficient for locally optimal action selection, offline calculation using class-specific saliency maps is shown to be preferable: by pre-processing saccade pathways it saves processing costs, with a negligible effect on the recognition/compression rates.
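To make the idea of optimizing future posterior entropy over the action set concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions and not the paper's implementation: classes and observations are discrete, each action (for example a saccade target) has a known likelihood table p(o | class, action), and the function names `expected_posterior_entropy`, `select_action`, and `bayes_update` are hypothetical.

```python
import numpy as np


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    nz = p > 0  # skip zero entries, since 0 * log(0) is taken as 0
    return float(-np.sum(p[nz] * np.log(p[nz])))


def expected_posterior_entropy(prior, lik):
    """Expected entropy of the class posterior after one action.

    prior: shape (n_classes,), current belief over classes.
    lik:   shape (n_classes, n_obs), assumed table p(o | class)
           for the action under consideration.
    """
    joint = prior[:, None] * lik   # p(class, o)
    p_obs = joint.sum(axis=0)      # predictive distribution p(o)
    total = 0.0
    for o in range(lik.shape[1]):
        if p_obs[o] > 0:
            # weight the posterior entropy by how likely o is
            total += p_obs[o] * entropy(joint[:, o] / p_obs[o])
    return total


def select_action(prior, likelihoods):
    """Pick the action with the lowest expected posterior entropy.

    likelihoods: shape (n_actions, n_classes, n_obs).
    """
    scores = [expected_posterior_entropy(prior, lik) for lik in likelihoods]
    return int(np.argmin(scores)), scores


def bayes_update(prior, lik, obs):
    """Posterior over classes after observing `obs`.

    Assumes `obs` has non-zero probability under the prior.
    """
    post = prior * lik[:, obs]
    return post / post.sum()


# Toy example (made-up numbers): two classes, two candidate actions.
prior = np.array([0.5, 0.5])
likelihoods = np.array([
    [[0.5, 0.5], [0.5, 0.5]],  # action 0: observation carries no information
    [[0.9, 0.1], [0.1, 0.9]],  # action 1: observation discriminates the classes
])

best, scores = select_action(prior, likelihoods)
posterior = bayes_update(prior, likelihoods[best], obs=0)
print(best, scores, posterior)
```

In this toy setting the selection is greedy: it looks one action ahead, which matches the locally optimal character of the policy described above. An offline variant could, under the same assumptions, precompute such scores once per class hypothesis and reuse them at run time instead of recomputing them at every step.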