Sequential Density Ratio Estimation for Simultaneous Optimization of Speed and Accuracy
This work addresses the challenge of jointly optimizing speed and accuracy in sequential classification for applications with high sampling costs. It presents a new method rather than an incremental improvement.
The paper tackles early and accurate classification of sequential data under high sampling costs. It proposes SPRT-TANDEM, a deep neural network-based algorithm that overcomes two limitations of the original SPRT, and reports statistically significantly better classification accuracy with fewer data samples than baseline classifiers on three video databases: Nosaic MNIST, UCF101, and SiW.
Classifying sequential data as early and as accurately as possible is a challenging yet critical problem, especially when the sampling cost is high. One algorithm that achieves this goal is the sequential probability ratio test (SPRT), which is known to be Bayes-optimal: it keeps the expected number of data samples as small as possible, given a desired upper bound on the error. However, the original SPRT makes two critical assumptions that limit its application in real-world scenarios: (i) samples are independently and identically distributed, and (ii) the likelihood of the data being derived from each class can be calculated precisely. Here, we propose the SPRT-TANDEM, a deep neural network-based SPRT algorithm that overcomes these two obstacles. The SPRT-TANDEM sequentially estimates the log-likelihood ratio of two alternative hypotheses by leveraging a novel Loss function for Log-Likelihood Ratio estimation (LLLR) while allowing correlations up to $N (\in \mathbb{N})$ preceding samples. In tests on one original and two public video databases, Nosaic MNIST, UCF101, and SiW, the SPRT-TANDEM achieves statistically significantly better classification accuracy than other baseline classifiers, with a smaller number of data samples. The code and Nosaic MNIST are publicly available at https://github.com/TaikiMiyagawa/SPRT-TANDEM.
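To make the stopping rule concrete, the following is a minimal sketch of the classical (Wald) SPRT, not of SPRT-TANDEM itself. It relies on exactly the two assumptions the abstract identifies as limitations: samples are i.i.d., and the class likelihoods are known in closed form (here, two Gaussians with a shared standard deviation, chosen only for illustration). The function names, the Gaussian model, and the use of Wald's approximate thresholds are assumptions of this example and are not taken from the paper or its repository.

```python
import math
from typing import Iterable, Tuple


def gaussian_llr(x: float, mu1: float, mu0: float, sigma: float) -> float:
    """Per-sample log-likelihood ratio log p(x|H1) - log p(x|H0).

    Assumes both hypotheses are Gaussians with the same sigma, so the
    normalisation constants cancel.
    """
    return ((x - mu0) ** 2 - (x - mu1) ** 2) / (2.0 * sigma ** 2)


def sprt(
    samples: Iterable[float],
    mu1: float,
    mu0: float,
    sigma: float,
    alpha: float,
    beta: float,
) -> Tuple[str, int]:
    """Classical SPRT for i.i.d. samples (illustrative, hypothetical helper).

    alpha: target false-positive rate, beta: target false-negative rate.
    Returns the decision and the number of samples consumed.
    """
    # Wald's approximate decision thresholds on the cumulative LLR.
    upper = math.log((1.0 - beta) / alpha)  # cross this: accept H1
    lower = math.log(beta / (1.0 - alpha))  # cross this: accept H0

    llr = 0.0
    n = 0
    for n, x in enumerate(samples, start=1):
        # Under the i.i.d. assumption the LLR of the sequence is a plain sum.
        llr += gaussian_llr(x, mu1, mu0, sigma)
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    # Ran out of data before either threshold was reached.
    return "undecided", n


if __name__ == "__main__":
    # Each sample at x = 1.0 adds 0.5 to the LLR; log(19) is about 2.94,
    # so the test stops after the sixth sample.
    print(sprt([1.0] * 10, mu1=1.0, mu0=0.0, sigma=1.0, alpha=0.05, beta=0.05))
```

Tightening `alpha` and `beta` pushes the thresholds outward, so the test consumes more samples before deciding; this is the speed-accuracy trade-off the abstract refers to. In SPRT-TANDEM, as described above, the closed-form per-sample ratio would be replaced by a log-likelihood ratio estimated by a neural network that accounts for correlations among up to $N$ preceding samples.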