Walid Mahdi

MM
5 papers
67 citations
Novelty 31%
AI Score 19

5 Papers

CV · Dec 15, 2014
Automatic video scene segmentation based on spatial-temporal clues and rhythm

Walid Mahdi, Liming Chen, Mohsen Ardebilian

With ever-increasing computing power and data storage capacity, the potential for large digital video libraries is growing rapidly. However, the massive use of video is, for the moment, limited by its opaque characteristics. Indeed, a user who has to browse a video sequentially needs too much time to find segments of interest within it. Therefore, providing an environment that is both convenient and efficient for video storage and retrieval, especially for content-based searching as it exists in traditional text-based database systems, has been the focus of recent and important efforts by a large research community. In this paper, we propose a new automatic video scene segmentation method that explores two main video features: the spatial-temporal relationships and the rhythm of shots. The experimental evidence we obtained from an 80-minute video showed that our prototype provides very high accuracy for video segmentation.

CV · Jan 19, 2013
Lip Localization and Viseme Classification for Visual Speech Recognition

Salah Werda, Walid Mahdi, Abdelmajid Ben Hamadou

The need for an automatic lip-reading system is ever increasing. In fact, the extraction and reliable analysis of facial movements now form an important part of many multimedia systems, such as videoconferencing, low communication systems, and lip-reading systems. In addition, visual information is imperative for people with special needs. We can imagine, for example, a dependent person commanding a machine with a simple lip movement or by pronouncing a simple syllable. Moreover, people with hearing problems compensate for their special needs by lip-reading as well as listening to the person with whom they are talking.

MM · Jan 10, 2013
A Visual Grammar Approach for TV Program Identification

Tarek Zlitni, Walid Mahdi

Automatic identification of TV programs within TV streams is an important task for archive exploitation. This paper proposes a new spatial-temporal approach to identify programs in TV streams in two main steps. First, a reference catalogue of video grammars of visual jingles is constructed. We exploit visual grammars characterizing instances of the same program type in order to identify the various program types in the TV stream. The role of a video grammar is to represent the visual invariants of each visual jingle using a set of descriptors appropriate for each TV program. Second, programs in TV streams are identified by examining the similarity of the video signal to the visual grammars in the catalogue. The main idea of the identification process is to compare the signature of the video signal in the TV stream with the catalogue elements. After presenting the proposed approach, the paper gives an overview of the encouraging experimental results on several streams extracted from different channels and composed of several programs.
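The second step, matching a stream signature against the catalogue, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the descriptors or the similarity measure, so the fixed-length feature vectors, the cosine similarity, the function names, and the `min_similarity` threshold are all hypothetical choices.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length descriptor vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_program(signature, catalogue, min_similarity=0.9):
    """Return the catalogue entry most similar to the signature, or None.

    catalogue maps a program name to its reference descriptor vector.
    min_similarity is a hypothetical acceptance threshold.
    """
    best_name, best_score = None, min_similarity
    for name, reference in catalogue.items():
        score = cosine_similarity(signature, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy catalogue with made-up three-component descriptors.
catalogue = {"news": [1.0, 0.0, 0.0], "sports": [0.0, 1.0, 0.0]}
print(identify_program([0.9, 0.1, 0.0], catalogue))
```

A signature that is not close enough to any catalogue element yields `None`, which corresponds to a stream segment that is not recognized as a known jingle.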

MM · Jan 10, 2013
AViTExt: Automatic Video Text Extraction, A new Approach for video content indexing Application

Baseem Bouaziz, Tarek Zlitni, Walid Mahdi

In this paper, we propose a spatial-temporal video-text detection technique which proceeds in two principal steps: potential text region detection and a filtering process. In the first step, we dynamically divide each pair of consecutive video frames into sub-blocks in order to detect change. A significant difference between homologous blocks implies the appearance of an important object, which may be a text region. Temporal redundancy is then used to filter these regions and form effective text regions. Experiments conducted on a variety of video sequences show the effectiveness of our approach, which obtains a precision of 89.39% and a recall of 90.19%.
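The two steps can be illustrated with a minimal sketch. It makes several assumptions that are not in the abstract: frames are grayscale lists of rows, the blocks form a fixed grid rather than a dynamic subdivision, the block difference is the change in mean intensity, and "temporal redundancy" is read as "the block changes once and then stays stable for `min_stable` frame pairs". All names and thresholds are hypothetical.

```python
def block_means(frame, block):
    """Mean intensity of each block x block tile of a grayscale frame."""
    h, w = len(frame), len(frame[0])
    means = {}
    for by in range(0, h, block):
        for bx in range(0, w, block):
            tile = [frame[y][x]
                    for y in range(by, min(by + block, h))
                    for x in range(bx, min(bx + block, w))]
            means[(by // block, bx // block)] = sum(tile) / len(tile)
    return means


def changed_blocks(prev, curr, block=8, threshold=30.0):
    """Step 1: homologous blocks whose mean intensity differs by more than threshold."""
    a, b = block_means(prev, block), block_means(curr, block)
    return {key for key in a if abs(a[key] - b[key]) > threshold}


def persistent_candidates(frames, block=8, threshold=30.0, min_stable=2):
    """Step 2: keep blocks that change once, then stay unchanged for min_stable pairs.

    Returns a set of (frame_index, block_key), where frame_index is the frame
    in which the candidate region first appears.
    """
    changes = [changed_blocks(p, c, block, threshold)
               for p, c in zip(frames, frames[1:])]
    kept = set()
    for t, appeared in enumerate(changes):
        following = changes[t + 1:t + 1 + min_stable]
        if len(following) < min_stable:
            break  # not enough later frames to judge stability
        for key in appeared:
            if all(key not in later for later in following):
                kept.add((t + 1, key))
    return kept
```

The persistence test reflects the intuition that overlaid text stays on screen for several frames, whereas motion or flicker keeps changing the same block.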

MM · Jan 10, 2013
Content-Based Video Browsing by Text Region Localization and Classification

Bassem Bouaziz, Walid Mahdi, Tarek Zlitni et al.

The amount of digital video data is increasing worldwide. This highlights the need for efficient algorithms that can index, retrieve, and browse this data by content. This can be achieved by identifying semantic descriptions captured automatically from the video structure. Among these descriptions, text within video is considered a rich feature that enables effective video indexing and browsing. Unlike most video text detection and extraction methods, which treat video sequences as collections of still images, we propose in this paper a spatiotemporal video-text localization and identification approach which proceeds in two main steps: text region localization and text region classification. In the first step, we detect the significant appearance of new objects in a frame by split and merge processes applied to binarized edge frame pair differences. Detected objects are, a priori, considered as text. They are then filtered according to both local contrast variation and texture criteria in order to retain the effective ones. The resulting text regions are classified based on a visual grammar descriptor containing a set of semantic text class regions characterized by visual features. A visual table of contents is then generated from the extracted text regions occurring within the video sequence, enriched by a semantic identification. Experiments performed on a variety of video sequences show the efficiency of our approach.
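As a rough illustration of the "binarized edge frame pair difference" used in the localization step, the sketch below marks edge pixels that are present in the current frame but absent from the previous one. The abstract does not name an edge operator or a threshold, so the first-difference gradient, the `threshold` value, and the function names are assumptions. The split and merge processing and the contrast and texture filtering are not covered.

```python
def binary_edges(frame, threshold=50):
    """Binarized edge map: 1 where a horizontal or vertical intensity step exceeds threshold."""
    h, w = len(frame), len(frame[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = abs(frame[y][x] - frame[y][x - 1]) if x > 0 else 0
            dy = abs(frame[y][x] - frame[y - 1][x]) if y > 0 else 0
            edges[y][x] = 1 if max(dx, dy) > threshold else 0
    return edges


def new_edge_pixels(prev, curr, threshold=50):
    """Edge pixels that appear in curr but were not edges in prev."""
    a = binary_edges(prev, threshold)
    b = binary_edges(curr, threshold)
    return [[1 if (b[y][x] and not a[y][x]) else 0 for x in range(len(row))]
            for y, row in enumerate(b)]
```

Regions where many such pixels cluster would be the candidate "new objects" that later stages treat as text a priori.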