A Multiclass Multiple Instance Learning Method with Exact Likelihood
This addresses weak supervision in MIL for tasks like image understanding, offering an incremental improvement by extending exact likelihood methods to multiclass settings.
The paper tackles multiclass multiple instance learning (MIL) with weak supervision, where labels only indicate class presence or absence, and solves it by maximizing exact model likelihood. It achieves application to real-world tasks like multiple object detection and localization, as demonstrated on the Google street view imagery dataset.
We study a multiclass multiple instance learning (MIL) problem where the labels only suggest whether any instance of a class exists or does not exist in a training sample or example. No further information, e.g., the number of instances of each class, relative locations or orders of all instances in a training sample, is exploited. Such a weak supervision learning problem can be exactly solved by maximizing the model likelihood fitting given observations, and finds applications to tasks like multiple object detection and localization for image understanding. We discuss its relationship to the classic classification problem, the traditional MIL, and connectionist temporal classification (CTC). We use image recognition as the example task to develop our method, although it is applicable to data with higher or lower dimensions without much modification. Experimental results show that our method can be used to learn all convolutional neural networks for solving real-world multiple object detection and localization tasks with weak annotations, e.g., transcribing house number sequences from the Google street view imagery dataset.