Modelling the Human Intuition to Complete the Missing Information in Images for Convolutional Neural Networks
This work addresses the challenge of handling occluded or incomplete visual data in computer vision applications, but it is incremental: it builds on existing Gestalt theory and is evaluated on a single dataset (MNIST).
The study aims to improve Convolutional Neural Networks (CNNs) by modeling the human visual intuition that completes missing information in images. The augmented architecture achieves higher performance than classic models on incomplete images from the MNIST dataset.
In this study, we attempt to model intuition and incorporate this formalism to improve the performance of Convolutional Neural Networks (CNNs). Despite decades of research, ambiguities persist about the principles of intuition. Experimental psychology reveals many types of intuition, which depend on the state of the human mind. We focus on visual intuition, which is useful for completing missing information during visual cognitive tasks. First, we set up a scenario that gradually decreases the amount of visual information in the images of a dataset to examine its impact on CNN accuracy. Then, we present a model for visual intuition based on Gestalt theory. The theory claims that humans derive a set of templates from their subconscious experiences. When the brain decides that information is missing from a scene, for example because of occlusion, it instantaneously completes the scene by replacing the missing parts with the most similar templates. Based on Gestalt theory, we model visual intuition in two layers, whose details are provided throughout the paper. We use the MNIST dataset to test the suggested intuition model for completing the missing information. Experiments show that the augmented CNN architecture achieves higher performance than the classic models on incomplete images.
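The two steps described above, removing visual information and then completing it from stored templates, can be illustrated with a minimal NumPy sketch. This is not the paper's two-layer model, whose details are not given in the abstract. It assumes that information is removed by hiding random pixels and that completion copies the hidden pixels from the template closest to the visible ones. The function names `occlude` and `complete` are hypothetical.

```python
import numpy as np


def occlude(image, fraction, rng):
    """Hide a random `fraction` of pixels (assumed occlusion scheme).

    Returns the occluded image and a boolean mask that is True
    where a pixel is still visible.
    """
    mask = rng.random(image.shape) >= fraction
    return np.where(mask, image, 0.0), mask


def complete(occluded, mask, templates):
    """Fill hidden pixels from the most similar template.

    Similarity is the squared distance over visible pixels only,
    a simple stand-in for the template matching described in the text.
    `templates` has shape (n_templates, height, width).
    """
    diffs = (templates - occluded) * mask
    dists = (diffs ** 2).reshape(len(templates), -1).sum(axis=1)
    best = templates[np.argmin(dists)]
    # Keep visible pixels as they are; copy the rest from the template.
    return np.where(mask, occluded, best)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two toy 4x4 "templates": a diagonal and an anti-diagonal stroke.
    templates = np.stack([np.eye(4), np.fliplr(np.eye(4))])
    image = templates[1]
    # Gradually decrease the visible information, then complete it.
    for fraction in (0.25, 0.5, 0.75):
        occluded, mask = occlude(image, fraction, rng)
        restored = complete(occluded, mask, templates)
        print(fraction, int(mask.sum()), bool(np.array_equal(restored, image)))
```

In an experiment like the one described, the completed images would be passed to the CNN in place of the occluded ones, and accuracy would be compared across occlusion levels.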