Summarization and Visualization of Large Volumes of Broadcast Video Data
This work addresses the need for efficient management of growing broadcast video data for media organizations, but it is incremental as it builds on existing methods for format detection and classification.
The paper tackled the problem of automatically summarizing and storing large volumes of broadcast news video data by developing a robust news format detector that identifies and classifies band elements in news frames, achieving a Jaccard Index of 0.8138 for band detection and a balanced accuracy of 96.5% for classification using an SVM classifier.
Over the past few years, there has been an astounding growth in the number of news channels as well as the amount of broadcast news video data. As a result, it is imperative that automated methods need to be developed in order to effectively summarize and store this voluminous data. Format detection of news videos plays an important role in news video analysis. Our problem involves building a robust and versatile news format detector, which identifies the different band elements in a news frame. Probabilistic progressive Hough transform has been used for the detection of band edges. The detected bands are classified as natural images, computer generated graphics (non-text) and text bands. A contrast based text detector has been used to identify the text regions from news frames. Two classifers have been trained and evaluated for the labeling of the detected bands as natural or artificial - Support Vector Machine (SVM) Classifer with RBF kernel, and Extreme Learning Machine (ELM) classifier. The classifiers have been trained on a dataset of 6000 images (3000 images of each class). The ELM classifier reports a balanced accuracy of 77.38%, while the SVM classifier outperforms it with a balanced accuracy of 96.5% using 10-fold cross-validation. The detected bands which have been fragmented due to the presence of gradients in the image have been merged using a three-tier hierarchical reasoning model. The bands were detected with a Jaccard Index of 0.8138, when compared to manually marked ground truth data. We have also presented an extensive literature review of previous work done towards news videos format detection, element band classification, and associative reasoning.