Context-based Deep Learning Architecture with Optimal Integration Layer for Image Parsing
This work addresses a specific bottleneck in image parsing for computer vision applications, but it appears incremental as it builds on existing deep learning methods with a novel integration approach.
The authors tackled the problem of deep learning models not fully exploiting visual and contextual information simultaneously in image parsing by proposing a three-layer context-based architecture with an integration layer using genetic algorithm-based optimal fusion, resulting in promising experimental outcomes on benchmark datasets.
Deep learning models have been efficient lately on image parsing tasks. However, deep learning models are not fully capable of exploiting visual and contextual information simultaneously. The proposed three-layer context-based deep architecture is capable of integrating context explicitly with visual information. The novel idea here is to have a visual layer to learn visual characteristics from binary class-based learners, a contextual layer to learn context, and then an integration layer to learn from both via genetic algorithm-based optimal fusion to produce a final decision. The experimental outcomes when evaluated on benchmark datasets are promising. Further analysis shows that optimized network weights can improve performance and make stable predictions.