Mohamed Cheriet

h-index: 49

26 papers

573 citations

Novelty: 37%

AI Score: 29

Ranked #142,868 of 194,257 authors (top 74%); #46,915 in CV (top 79%)

26 Papers

1.2 | NI | Jan 24, 2023

Evolution of MAC Protocols in the Machine Learning Decade: A Comprehensive Survey

Mostafa Hussien, Islam A. T. F. Taj-Eddin, Mohammed F. A. Ahmed et al.

The last decade (2012-2022) saw an unprecedented advance in machine learning (ML) techniques, particularly deep learning (DL). As a result of the proven capabilities of DL, a large amount of work has been presented and studied in almost every field. Since 2012, when convolutional neural networks were reintroduced in the context of the ImageNet competition, DL has continued to achieve superior performance in many challenging tasks and problems. Wireless communications in general, and medium access control (MAC) techniques in particular, were among the fields heavily affected by this improvement. MAC protocols play a critical role in defining the performance of wireless communication systems. At the same time, the community lacks a comprehensive survey that collects, analyzes, and categorizes the recent work in ML-inspired MAC techniques. In this work, we fill this gap by surveying a long line of work in this era. We solidify the impact of machine learning on wireless MAC protocols. We provide a comprehensive background to the widely adopted MAC techniques, their design issues, and their taxonomy, in connection with the prominent application domains. Furthermore, we provide an overview of the ML techniques that have been considered in this context. Finally, we augment our work by proposing some promising future research directions and open research questions that are worth further investigation.

2.3 | SP | Mar 17, 2022

A Learning Framework for Bandwidth-Efficient Distributed Inference in Wireless IoT

Mostafa Hussien, Kim Khoa Nguyen, Mohamed Cheriet

In the wireless Internet of Things (IoT), sensors usually have limited bandwidth and power resources. Therefore, in a distributed setup, each sensor should compress and quantize the sensed observations before transmitting them to a fusion center (FC) where a global decision is inferred. Most of the existing compression techniques and entropy quantizers consider only the reconstruction fidelity as a metric, which means they decouple the compression from the sensing goal. In this work, we argue that data compression mechanisms and entropy quantizers should be co-designed with the sensing goal, specifically for machine-consumed data. To this end, we propose a novel deep learning-based framework for compressing and quantizing the observations of correlated sensors. Instead of maximizing the reconstruction fidelity, our objective is to compress the sensor observations in a way that maximizes the accuracy of the inferred decision (i.e., the sensing goal) at the FC. Unlike prior work, we do not impose any assumptions on the distribution of the observations, which underscores the wide applicability of our framework. We also propose a novel loss function that keeps the model focused on learning complementary features at each sensor. The results show the superior performance of our framework compared to other benchmark models.

1.8 | LG | Jun 9, 2022

Leveraging Centric Data Federated Learning Using Blockchain For Integrity Assurance

Riadh Ben Chaabene, Darine Amayed, Mohamed Cheriet

Machine learning capabilities have become a vital component of various solutions across industries, applications, and sectors. Many organizations seek to leverage AI-based solutions across their business services to unlock better efficiency and increase productivity. Problems, however, can arise if there is a lack of quality data for AI-model training, scalability, and maintenance. We propose a data-centric federated learning architecture leveraged by a public blockchain and smart contracts to overcome this significant issue. Our proposed solution provides a virtual public marketplace where developers, data scientists, and AI engineers can publish their models and collaboratively create and access quality data for training. We enhance data quality and integrity through an incentive mechanism that rewards contributors for data contribution and verification. In a simulation with only one user, these mechanisms combined with the proposed framework grew the training dataset by an average of 100 inputs daily and increased model accuracy by approximately 4%.

1.8 | LG | Sep 22, 2022

Non-Negative Matrix Factorization with Scale Data Structure Preservation

Rachid Hedjam, Abdelhamid Abdesselam, Abderrahmane Rahiche et al.

The model described in this paper belongs to the family of non-negative matrix factorization (NMF) methods designed for data representation and dimension reduction. In addition to preserving the data positivity property, it also aims to preserve the structure of the data during matrix factorization. The idea is to add to the NMF cost function a penalty term that imposes a scale relationship between the pairwise similarity matrices of the original and transformed data points. The solution of the new model involves deriving a new parametrized update scheme for the coefficient matrix, which makes it possible to improve the quality of the reduced data when used for clustering and classification. The proposed clustering algorithm is compared to some existing NMF-based algorithms and to some manifold learning-based algorithms when applied to some real-life datasets. The obtained results show the effectiveness of the proposed algorithm.

3.7 | CV | Dec 25, 2024 | Code

HAND: Hierarchical Attention Network for Multi-Scale Handwritten Document Recognition and Layout Analysis

Mohammed Hamdan, Abderrahmane Rahiche, Mohamed Cheriet

Handwritten document recognition (HDR) is one of the most challenging tasks in the field of computer vision, due to the various writing styles and complex layouts inherent in handwritten texts. Traditionally, this problem has been approached as two separate tasks, handwritten text recognition and layout analysis, and existing approaches have struggled to integrate the two processes effectively. This paper introduces HAND (Hierarchical Attention Network for Multi-Scale Document), a novel end-to-end and segmentation-free architecture for simultaneous text recognition and layout analysis. Our model's key components include an advanced convolutional encoder integrating Gated Depth-wise Separable and Octave Convolutions for robust feature extraction, a Multi-Scale Adaptive Processing (MSAP) framework that dynamically adjusts to document complexity, and a hierarchical attention decoder with memory-augmented and sparse attention mechanisms. These components enable our model to scale effectively from single-line to triple-column pages while maintaining computational efficiency. Additionally, HAND adopts curriculum learning across five complexity levels. To improve the recognition accuracy of complex ancient manuscripts, we fine-tune and integrate a Domain-Adaptive Pre-trained mT5 model for post-processing refinement. Extensive evaluations on the READ 2016 dataset demonstrate the superior performance of HAND, achieving up to 59.8% reduction in CER for line-level recognition and 31.2% for page-level recognition compared to state-of-the-art methods. The model also maintains a compact size of 5.60M parameters while establishing new benchmarks in both text recognition and layout analysis. Source code and pre-trained models are available at https://github.com/MHHamdan/HAND.

2.0 | CV | Dec 24, 2024 | Code

HTR-JAND: Handwritten Text Recognition with Joint Attention Network and Knowledge Distillation

Mohammed Hamdan, Abderrahmane Rahiche, Mohamed Cheriet

Despite significant advances in deep learning, current Handwritten Text Recognition (HTR) systems struggle with the inherent complexity of historical documents, including diverse writing styles, degraded text quality, and computational efficiency requirements across multiple languages and time periods. This paper introduces HTR-JAND (Handwritten Text Recognition with Joint Attention Network and Knowledge Distillation), an efficient HTR framework that combines advanced feature extraction with knowledge distillation. Our architecture incorporates three key components: (1) a CNN architecture integrating FullGatedConv2d layers with Squeeze-and-Excitation blocks for adaptive feature extraction, (2) a Combined Attention mechanism fusing Multi-Head Self-Attention with Proxima Attention for robust sequence modeling, and (3) a Knowledge Distillation framework enabling efficient model compression while preserving accuracy through curriculum-based training. The HTR-JAND framework implements a multi-stage training approach combining curriculum learning, synthetic data generation, and multi-task learning for cross-dataset knowledge transfer. We enhance recognition accuracy through context-aware T5 post-processing, which is particularly effective for historical documents. Comprehensive evaluations demonstrate HTR-JAND's effectiveness, achieving state-of-the-art Character Error Rates (CER) of 1.23%, 1.02%, and 2.02% on the IAM, RIMES, and Bentham datasets, respectively. Our Student model achieves a 48% parameter reduction (0.75M versus 1.5M parameters) while maintaining competitive performance through efficient knowledge transfer. Source code and pre-trained models are available at https://github.com/DocumentRecognitionModels/HTR-JAND.
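
The Character Error Rate reported above is conventionally computed as the Levenshtein (edit) distance between the recognized text and the reference transcription, divided by the reference length. The following is a minimal sketch of that conventional definition; the function names are illustrative and are not taken from the HTR-JAND code base, whose exact evaluation script may normalize text differently.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference transcription must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

Under this definition, a CER of 1.23% corresponds to roughly one character error per 81 reference characters.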

4.6 | LG | Oct 21, 2024

Small Contributions, Small Networks: Efficient Neural Network Pruning Based on Relative Importance

Mostafa Hussien, Mahmoud Afifi, Kim Khoa Nguyen et al.

Recent advancements have scaled neural networks to unprecedented sizes, achieving remarkable performance across a wide range of tasks. However, deploying these large-scale models on resource-constrained devices poses significant challenges due to substantial storage and computational requirements. Neural network pruning has emerged as an effective technique to mitigate these limitations by reducing model size and complexity. In this paper, we introduce an intuitive and interpretable pruning method based on activation statistics, rooted in information theory and statistical analysis. Our approach leverages the statistical properties of neuron activations to identify and remove weights with minimal contributions to neuron outputs. Specifically, we build a distribution of weight contributions across the dataset and utilize its parameters to guide the pruning process. Furthermore, we propose a Pruning-aware Training strategy that incorporates an additional regularization term to enhance the effectiveness of our pruning method. Extensive experiments on multiple datasets and network architectures demonstrate that our method consistently outperforms several baseline and state-of-the-art pruning techniques.
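
To illustrate the general idea of scoring weights by their contribution to a neuron's output rather than by magnitude alone, here is a minimal sketch for a single neuron. It is not the authors' method: the statistic used (mean absolute value of weight times input over the dataset), the `keep_ratio` parameter, and the function name are all assumptions made for illustration.

```python
def prune_by_contribution(weights, inputs, keep_ratio=0.5):
    """Zero out the weights whose average contribution |w_i * x_i| is smallest.

    weights: list of floats for one neuron.
    inputs:  list of input vectors (each the same length as weights).
    """
    n = len(weights)
    # Assumed statistic: mean absolute contribution of each weight over the data.
    scores = [
        sum(abs(w * x[i]) for x in inputs) / len(inputs)
        for i, w in enumerate(weights)
    ]
    k = max(1, round(keep_ratio * n))  # number of weights to keep
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:k])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


weights = [2.0, 0.1, -1.0, 5.0]
inputs = [[1.0, 1.0, 1.0, 0.0],
          [1.0, 2.0, 3.0, 0.01]]
pruned = prune_by_contribution(weights, inputs, keep_ratio=0.5)
```

In this toy example the largest weight (5.0) is removed because its input is almost always zero, which is the kind of case where contribution-based and magnitude-based pruning disagree.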

4.4 | LG | Mar 22, 2021 | Code

Energy Disaggregation using Variational Autoencoders

Antoine Langevin, Marc-André Carbonneau, Mohamed Cheriet et al.

Non-intrusive load monitoring (NILM) is a technique that uses a single sensor to measure the total power consumption of a building. Using an energy disaggregation method, the consumption of individual appliances can be estimated from the aggregate measurement. Recent disaggregation algorithms have significantly improved the performance of NILM systems. However, the generalization capability of these methods to different houses as well as the disaggregation of multi-state appliances are still major challenges. In this paper we address these issues and propose an energy disaggregation approach based on the variational autoencoders framework. The probabilistic encoder makes this approach an efficient model for encoding information relevant to the reconstruction of the target appliance consumption. In particular, the proposed model accurately generates more complex load profiles, thus improving the power signal reconstruction of multi-state appliances. Moreover, its regularized latent space improves the generalization capabilities of the model across different houses. The proposed model is compared to state-of-the-art NILM approaches on the UK-DALE and REFIT datasets, and yields competitive results. The mean absolute error reduces by 18% on average across all appliances compared to the state-of-the-art. The F1-Score increases by more than 11%, showing improvements for the detection of the target appliance in the aggregate measurement.

5.9 | NI | Nov 9, 2020

PRVNet: A Novel Partially-Regularized Variational Autoencoders for Massive MIMO CSI Feedback

Mostafa Hussien, Kim Khoa Nguyen, Mohamed Cheriet

In a multiple-input multiple-output frequency-division duplexing (MIMO-FDD) system, the user equipment (UE) sends the downlink channel state information (CSI) to the base station to report link status. Due to the complexity of MIMO systems, the overhead incurred in sending this information negatively affects the system bandwidth. Although this problem has been widely considered in the literature, prior work generally assumes an ideal feedback channel. In this paper, we introduce PRVNet, a neural network architecture inspired by variational autoencoders (VAE) to compress the CSI matrix before sending it back to the base station under noisy channel conditions. Moreover, we propose a customized loss function that best suits the special characteristics of the problem being addressed. We also introduce an additional regularization hyperparameter for the learning objective, which is crucial for achieving competitive performance. In addition, we provide an efficient way to tune this hyperparameter using KL-annealing. Experimental results show that the proposed model outperforms the benchmark models, including two deep learning-based models, under a noise-free feedback channel assumption. In addition, the proposed model achieves outstanding performance under different noise levels for additive white Gaussian noise feedback channels.
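
KL-annealing, as commonly used when training VAEs, gradually increases the weight on the KL-divergence term of the loss during training. The abstract does not state which schedule PRVNet uses, so the sketch below assumes a simple linear warm-up; `warmup_steps`, `beta_max`, and both function names are hypothetical.

```python
def kl_weight(step: int, warmup_steps: int, beta_max: float = 1.0) -> float:
    """Linear KL-annealing: weight ramps from 0 to beta_max over warmup_steps."""
    if warmup_steps <= 0:
        return beta_max  # no warm-up requested
    return beta_max * min(1.0, step / warmup_steps)


def annealed_loss(recon_loss: float, kl_div: float, step: int,
                  warmup_steps: int, beta_max: float = 1.0) -> float:
    """Reconstruction loss plus the KL term scaled by the current weight."""
    return recon_loss + kl_weight(step, warmup_steps, beta_max) * kl_div
```

Early in training the model is driven mostly by reconstruction error; the regularization term reaches its full weight only after the warm-up period.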

3.0 | SE | Apr 27, 2020

Internet of Things Architectures: A Comparative Study

Marcela G. dos Santos, Darine Ameyed, Fabio Petrillo et al.

Over the past two decades, the Internet of Things (IoT) has become an underlying concept for so many solutions and technologies that it is now hardly possible to enumerate and describe all of them. The concept behind the Internet of Things is as powerful as it is complex, and for the components in an IoT solution to mesh together perfectly, they all have to be part of a well-thought-out structure. That is where understanding the IoT architecture becomes paramount. Because of the vast domain of IoT, there is no single consensus on IoT architecture. Different researchers and organizations have proposed different architectures under a variety of classifications, mainly conceptual, standard, and industrial or commercial adoption. A systematic analysis of IoT architecture is indispensable for comparing the industrial proposals and identifying their similarities and differences. In this work, we summarize information about seven industrial IoT architectures in order to propose an approach that makes a comparative analysis between different IoT architectures possible. This work presents two main contributions: (i) an approach for analyzing and comparing IoT architectures using Layer-Model; (ii) a comparative study of seven industrial IoT architectures.

1.2 | NI | Feb 8, 2020

BLCS: Brain-Like based Distributed Control Security in Cyber Physical Systems

Hui Yang, Kaixuan Zhan, Michel Kadoch et al.

Cyber-physical systems (CPS) operate, control, and coordinate physical systems integrated with a computing and communication core, as applied in Industry 4.0. To accommodate CPS services, fog radio and optical networks (F-RON) have become an important supporting physical cyber infrastructure, taking advantage of both the inherent ubiquity of wireless technology and the large capacity of optical networks. However, cyber security is the biggest issue in the CPS scenario, as there is a tradeoff between security control and privacy exposure in F-RON. To deal with this issue, we propose a brain-like based distributed control security (BLCS) architecture for F-RON in CPS, by introducing a brain-like security (BLS) scheme. BLCS can accomplish secure cross-domain control through tripartite controller verification in the scenario of decentralized F-RON for distributed computing and communications, with no need to disclose the private information of each domain against cyber-attacks. BLS utilizes partial information to perform control identification through a relation network and deep learning of a behavior library. The functional modules of the BLCS architecture are illustrated, including various controllers and a brain-like knowledge base. The interworking procedures in distributed control security modes based on BLS are described. The overall feasibility and efficiency of the architecture are experimentally verified on a software-defined network testbed in terms of average mistrust rate, path provisioning latency, packet loss probability, and blocking probability. The emulation results are obtained and dissected based on the testbed.

8.3 | CV | Apr 7, 2018

Efficient No-Reference Quality Assessment and Classification Model for Contrast Distorted Images

Hossein Ziaei Nafchi, Mohamed Cheriet

In this paper, an efficient Minkowski Distance based Metric (MDM) for no-reference (NR) quality assessment of contrast distorted images is proposed. It is shown that higher orders of Minkowski distance and entropy provide accurate quality prediction for contrast distorted images. The proposed metric performs predictions by extracting only three features from the distorted images, followed by a regression analysis. Furthermore, the proposed features are able to classify the type of the contrast distorted images with high accuracy. Experimental results on four datasets (CSIQ, TID2013, CCID2014, and SIQAD) show that the proposed metric, with very low complexity, provides better quality predictions than the state-of-the-art NR metrics. The MATLAB source code of the proposed metric is publicly available at http://www.synchromedia.ca/system/files/MDM.zip.

2.1 | CV | Sep 12, 2016

MUG: A Parameterless No-Reference JPEG Quality Evaluator Robust to Block Size and Misalignment

Hossein Ziaei Nafchi, Atena Shahkolaei, Rachid Hedjam et al.

In this letter, a very simple no-reference image quality assessment (NR-IQA) model for JPEG compressed images is proposed. The proposed metric, called median of unique gradients (MUG), is based on very simple facts about the unique gradient magnitudes of JPEG compressed images. MUG is a parameterless metric and does not need training. Unlike other NR-IQAs, MUG is independent of block size and cropping. A more stable index called MUG+ is also introduced. The experimental results on six benchmark datasets of natural images and a benchmark dataset of synthetic images show that MUG is comparable to the state-of-the-art indices in the literature. In addition, its performance remains unchanged for cropped images in which block boundaries are not known. The MATLAB source code of the proposed metrics is available at https://dl.dropboxusercontent.com/u/74505502/MUG.m and https://dl.dropboxusercontent.com/u/74505502/MUGplus.m.

17.4 | CV | Aug 26, 2016

Mean Deviation Similarity Index: Efficient and Reliable Full-Reference Image Quality Evaluator

Hossein Ziaei Nafchi, Atena Shahkolaei, Rachid Hedjam et al.

Applications of perceptual image quality assessment (IQA) in image and video processing, such as image acquisition, image compression, image restoration, and multimedia communication, have led to the development of many IQA metrics. In this paper, a reliable full-reference IQA model is proposed that utilizes gradient similarity (GS), chromaticity similarity (CS), and deviation pooling (DP). By considering the shortcomings of the commonly used GS in modeling the human visual system (HVS), a new GS is proposed through a fusion technique that is more likely to follow the HVS. We propose an efficient and effective formulation to calculate the joint similarity map of two chromatic channels for the purpose of measuring color changes. In comparison with a commonly used formulation in the literature, the proposed CS map is shown to be more efficient and to provide comparable or better quality predictions. Motivated by a recent work that utilizes standard deviation pooling, a general formulation of the DP is presented in this paper and used to compute a final score from the proposed GS and CS maps. This proposed formulation of DP benefits from Minkowski pooling and a proposed power pooling as well. The experimental results on six datasets of natural images, a synthetic dataset, and a digitally retouched dataset show that the proposed index provides comparable or better quality predictions than the most recent and competing state-of-the-art IQA metrics in the literature; it is also reliable and has low complexity. The MATLAB source code of the proposed metric is available at https://www.mathworks.com/matlabcentral/fileexchange/59809.

2.3 | MM | Jun 20, 2016

A Note on Efficiency of Downsampling and Color Transformation in Image Quality Assessment

Hossein Ziaei Nafchi, Mohamed Cheriet

Several existing and successful full-reference image quality assessment (IQA) models use linear color transformation and downsampling before measuring the similarity or quality of images. This paper points out the right order of these two procedures and shows that the existing models have not chosen the more efficient approach. In addition, the efficiency of these metrics is not compared on a fair basis in the literature.

1.3 | CV | May 13, 2015

Modified Hausdorff Fractal Dimension (MHFD)

Reza Farrahi Moghaddam, Mohamed Cheriet

The Hausdorff fractal dimension has been a fast-to-calculate method for estimating the complexity of fractal shapes. In this work, a modified version of this fractal dimension is presented in order to make it more robust when applied to estimating the complexity of non-fractal images. The modified Hausdorff fractal dimension stands on two features that weaken the requirement for the presence of a shape and also reduce the impact of noise possibly present in the input image. The new algorithm has been evaluated on a set of images of different character with promising performance.
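
For context, fractal dimension is often estimated in practice by box counting: count the boxes of side s that contain foreground pixels, then fit the slope of log N(s) against log(1/s). The sketch below shows only this standard estimate, under the assumption that it resembles the baseline the abstract refers to; it does not reproduce the paper's modifications, and the function names and box sizes are illustrative.

```python
import math


def box_count(points, size):
    """Number of size-by-size boxes containing at least one foreground pixel."""
    return len({(r // size, c // size) for r, c in points})


def box_counting_dimension(points, sizes=(1, 2, 4, 8, 16)):
    """Least-squares slope of log N(s) versus log(1/s) over the given box sizes.

    points: non-empty collection of (row, col) integer pixel coordinates.
    """
    if not points:
        raise ValueError("at least one foreground pixel is required")
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(points, s)) for s in sizes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Sanity checks on shapes with known dimension.
square = [(r, c) for r in range(64) for c in range(64)]  # filled square
line = [(0, c) for c in range(64)]                        # straight line
dim_square = box_counting_dimension(square)
dim_line = box_counting_dimension(line)
```

A filled square yields a dimension of about 2 and a straight line about 1, which is a quick way to check any implementation before applying it to real images.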

2.3 | MM | Apr 26, 2015

Deviation Based Pooling Strategies For Full Reference Image Quality Assessment

Hossein Ziaei Nafchi, Rachid Hedjam, Atena Shahkolaei et al.

The state-of-the-art pooling strategies for perceptual image quality assessment (IQA) are based on the mean and the weighted mean. They are robust pooling strategies that usually provide moderate to high performance for different IQAs. Recently, standard deviation (SD) pooling was also proposed. Although this deviation pooling provides very high performance for a few IQAs, its performance is lower than that of mean pooling for many other IQAs. In this paper, we propose to use the mean absolute deviation (MAD) and show that it is a more robust and accurate pooling strategy for a wider range of IQAs. In fact, MAD pooling has the advantages of both mean pooling and SD pooling. The joint computation and use of the MAD and SD pooling strategies is also considered in this paper. Experimental results provide useful information on the choice of the proper deviation pooling strategy for different IQA models.
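
The three pooling strategies compared above each reduce a local quality map to a single score. A minimal sketch follows, assuming the map has been flattened into a list of local similarity values and that MAD is taken around the mean; the function names are illustrative.

```python
from statistics import fmean, pstdev


def mean_pooling(quality_map):
    """Average of the local quality values."""
    return fmean(quality_map)


def sd_pooling(quality_map):
    """Population standard deviation of the local quality values."""
    return pstdev(quality_map)


def mad_pooling(quality_map):
    """Mean absolute deviation of the local quality values around their mean."""
    m = fmean(quality_map)
    return fmean([abs(v - m) for v in quality_map])


# One badly degraded region among otherwise perfect local scores.
q = [1.0, 1.0, 1.0, 0.0]
scores = (mean_pooling(q), sd_pooling(q), mad_pooling(q))
```

Because MAD uses absolute rather than squared deviations, a single outlying region moves it less than it moves SD, while it still reflects the spread that mean pooling ignores.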

1.2 | CY | Apr 23, 2015

40 Gbps Access for Metro networks: Implications in terms of Sustainability and Innovation from an LCA Perspective

Reza Farrahi Moghaddam, Yves Lemieux, Mohamed Cheriet

In this work, the implications of new technologies, more specifically the new optical FTTH technologies, are studied from both the functional and non-functional perspectives. In particular, some direct impacts are listed in the form of abandoning non-functional technologies, such as micro-registration, which would be implicitly required for a functioning operation before the arrival of the new high-bandwidth access technologies. It is shown that such abandonment of non-functional best practices, which are mainly at the management level of ICT, immediately results in additional consumption and environmental footprint, and that there is also a chance that some other new innovations might be 'missed.' Therefore, unconstrained deployment of these access technologies is not aligned with a possible sustainable ICT picture unless they are regulated. An approach to pricing the best practices, including both functional and non-functional technologies, is proposed in order to develop a regulation and policy framework for sustainable broadband access.

1.3 | CV | Feb 4, 2015

A Multiple-Expert Binarization Framework for Multispectral Images

Reza Farrahi Moghaddam, Mohamed Cheriet

In this work, a multiple-expert binarization framework for multispectral images is proposed. The framework is based on a constrained subspace selection limited to the spectral bands combined with state-of-the-art gray-level binarization methods. The framework uses a binarization wrapper to enhance the performance of the gray-level binarization. Nonlinear preprocessing of the individual spectral bands is used to enhance the textual information. An evolutionary optimizer is considered to obtain the optimal and some suboptimal 3-band subspaces from which an ensemble of experts is then formed. The framework is applied to a ground truth multispectral dataset with promising results. In addition, a generalization of the cross-validation approach is developed that not only evaluates the generalizability of the framework but also provides a practical instance of the selected experts that could then be applied to unseen inputs despite the small size of the given ground truth dataset.

6.2 | HC | Jul 18, 2014

Quality of Experience (QoE) beyond Quality of Service (QoS) as its baseline: QoE at the Interface of Experience Domains

Reza Farrahi Moghaddam, Mohamed Cheriet

In this work, a new approach to the definition of the quality of experience is presented. By considering the quality of service as a baseline, that portion of the QoE that can be inferred from the QoS is excluded, and then the rest of the QoE is approached with the notion of QoE at a Boundary (QoEaaB). With the QoEaaB as the core of the proposed approach, various potential boundaries, and their associated unseen opportunities to improve the QoE are discussed. In particular, property, contract, SLA, and content are explored in terms of their boundaries and also their associated QoEaaB. With an interest in online video delivery, management of resource sharing and isolation associated with multi-tenant operations is considered. It is concluded that the proposed QoEaaB can bring a new perspective in QoE modeling and assessment toward a more enriched approach to improving the experience based on innovation and deep connectivity among actors.

1.9 | CV | Apr 15, 2014

Spiralet Sparse Representation

Reza Farrahi Moghaddam, Mohamed Cheriet

This is the first report on Working Paper WP-RFM-14-01. The potential and capability of sparse representations are well known. However, their (multivariate variable) vectorial form, which is completely fine in many fields and disciplines, results in the removal and filtering of important "spatial" relations that are implicitly carried by two-dimensional [or multi-dimensional] objects, such as images. In this paper, a new approach, called spiralet sparse representation, is proposed in order to develop an augmented representation, and therefore a modified sparse representation and theory, that is capable of preserving the data associated with the spatial relations.

3.3 | CY | Mar 12, 2014

Challenges and complexities in application of LCA approaches in the case of ICT for a sustainable future

Reza Farrahi Moghaddam, Fereydoun Farrahi Moghaddam, Thomas Dandres et al.

In this work, three of the many ICT-specific challenges of LCA are discussed. First, inconsistency versus uncertainty is reviewed with regard to the meta-technological nature of ICT. As an example, semiconductor technologies are used to highlight the complexities, especially with respect to energy and water consumption. The need for specific representations and metrics to separately assess products and technologies is discussed. It is highlighted that applying product-oriented approaches would result in abandoning or disfavoring new technologies that could otherwise help toward a better world. Second, several believed-untouchable hot spots are highlighted to emphasize their importance and footprint. The list includes, but is not limited to, i) User-Computer Interfaces (UCIs), especially screens and displays, ii) Network-Computer Interfaces (NCIs), such as electronic and optical ports, and iii) electricity power interfaces. In addition, considering cross-regional social and economic impacts, and also taking into account the marketing nature of the need for many ICT products and services in both hardware and software forms, the complexity of the End of Life (EoL) stage of ICT products, technologies, and services is explored. Finally, the impact of smart management and intelligence, and of software in general, in ICT solutions and products is highlighted. In particular, it is observed that, even using the same technology, the significance of software could be highly variable depending on the level of intelligence and awareness deployed. With examples from an interconnected network of data centers managed using Dynamic Voltage and Frequency Scaling (DVFS) technology and smart cooling systems, it is shown that unadjusted assessments could be highly uncertain, and even inconsistent, in calculating the significance of the management component for ICT impacts.

5.5 | CV | Jun 25, 2013

A maximal-information color to gray conversion method for document images: Toward an optimal grayscale representation for document image binarization

Reza Farrahi Moghaddam, Shaohua Chen, Rachid Hedjam et al.

A novel method to convert color/multi-spectral images to gray-level images is introduced to increase the performance of document binarization methods. The method uses the distribution of the pixel data of the input document image in a color space to find a transformation, called the dual transform, which balances the amount of information on all color channels. Furthermore, in order to reduce the intensity variations on the gray output, a color reduction preprocessing step is applied. Then, a channel is selected as the gray value representation of the document image based on the homogeneity criterion on the text regions. In this way, the proposed method can provide a luminance-independent contrast enhancement. The performance of the method is evaluated, subjectively and objectively, on various images from the ICDAR'03 Robust Reading, KAIST, and DIBCO'09 datasets, with promising results. The ground truth images for the images from the ICDAR'03 Robust Reading dataset have been created manually by the authors.

5.3 | LG | Jun 11, 2013

Large Margin Low Rank Tensor Analysis

Guoqiang Zhong, Mohamed Cheriet

Rather than vector representations, the direct objects of human cognition are generally high-order tensors, such as 2D images and 3D textures. From this fact, two interesting questions naturally arise: How does the human brain represent these tensor perceptions in a "manifold" way, and how can they be recognized on the "manifold"? In this paper, we present a supervised model to learn the intrinsic structure of the tensors embedded in a high dimensional Euclidean space. With the fixed point continuation procedures, our model automatically and jointly discovers the optimal dimensionality and the representations of the low dimensional embeddings. This makes it an effective simulation of the cognitive process of the human brain. Furthermore, the generalization of our model based on similarity between the learned low dimensional embeddings can be viewed as a counterpart of recognition in the human brain. Experiments on applications for object recognition and face recognition demonstrate the superiority of our proposed model over state-of-the-art approaches.

12.0 | CV | May 13, 2013

Unsupervised ensemble of experts (EoE) framework for automatic binarization of document images

Reza Farrahi Moghaddam, Fereydoun Farrahi Moghaddam, Mohamed Cheriet

In recent years, a large number of binarization methods have been developed, with varying performance generalization and strength against different benchmarks. In this work, to leverage these methods, an ensemble of experts (EoE) framework is introduced to efficiently combine the outputs of various methods. The proposed framework offers a new selection process for the binarization methods, which are the experts in the ensemble, by introducing three concepts: confidentness, endorsement, and schools of experts. The framework, which is highly objective, is built on two general principles: (i) consolidation of saturated opinions and (ii) identification of schools of experts. After building the endorsement graph of the ensemble for an input document image based on the confidentness of the experts, the saturated opinions are consolidated, and then the schools of experts are identified by thresholding the consolidated endorsement graph. A variation of the framework, in which no selection is made, is also introduced that combines the outputs of all experts using endorsement-dependent weights. The EoE framework is evaluated on the set of participating methods in the H-DIBCO'12 contest and also on an ensemble generated from various instances of the grid-based Sauvola method, with promising performance.

11.2 | NE | Aug 10, 2012

Curved Space Optimization: A Random Search based on General Relativity Theory

Fereydoun Farrahi Moghaddam, Reza Farrahi Moghaddam, Mohamed Cheriet

Designing a fast and efficient optimization method with local optima avoidance capability on a variety of optimization problems is still an open problem for many researchers. In this work, the concept of a new global optimization method with an open implementation area is introduced as the Curved Space Optimization (CSO) method, which is a simple probabilistic optimization method enhanced by concepts of general relativity theory. To address global optimization challenges such as performance and convergence, this new method is designed based on the transformation of a random search space into a new search space based on concepts of space-time curvature in general relativity theory. In order to evaluate the performance of our proposed method, an implementation of CSO is deployed and its results are compared on benchmark functions with state-of-the-art optimization methods. The results show that the performance of CSO is promising on unimodal and multimodal benchmark functions with different search space dimension sizes.