LG Jul 2, 2017
Dimensionality reduction with missing values imputation
Rania Mkhinini Gahar, Olfa Arfaoui, Minyar Sassi Hidri et al.
In this study, we propose a new statistical approach for high-dimensionality reduction of heterogeneous data that limits the curse of dimensionality and deals with missing values. To handle the latter, we propose to use the Random Forest imputation method. The main purpose here is to extract useful information and thereby reduce the search space to facilitate the data exploration process. Several illustrative numeric examples, using data from publicly available machine learning repositories, are also included. The experimental component of the study shows the efficiency of the proposed analytical approach.
DB Jul 2, 2017
Classification non supervisée des données hétérogènes à large échelle [Unsupervised classification of large-scale heterogeneous data]
Mohamed Ali Zoghlami, Olfa Arfaoui, Minyar Sassi Hidri et al.
When it comes to clustering massive data, response time, disk access, and the quality of the formed classes become major issues for companies. It is in this context that we define a clustering framework for large-scale heterogeneous data that helps resolve these issues. The proposed framework is based, firstly, on descriptive analysis using MCA and, secondly, on the MapReduce paradigm in a large-scale environment. The results are encouraging and demonstrate the efficiency of the hybrid deployment in terms of response quality and response time, on both qualitative and quantitative data.
IR Dec 6, 2013
Flexible queries in XML native databases
Olfa Arfaoui, Minyar Sassi-Hidri
To date, most flexible querying systems for native XML databases (DB) exploit the tree structure of their semi-structured data (SSD). However, it is important to test the efficiency of the Formal Concept Analysis (FCA) formalism for this type of data, since it has shown strong performance in the field of information retrieval (IR). IR in XML databases based on FCA relies mainly on the lattice structure, where each concept of the lattice can be interpreted as a pair (response, query). In this work, we provide a new flexible model of XML DB based on fuzzy FCA as a first step towards flexible querying of SSD.
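To illustrate the (response, query) reading of a concept, the following is a minimal sketch of crisp (non-fuzzy) FCA on a toy context. It is not the authors' method: the paper works with fuzzy FCA, and the context `CONTEXT`, the document names, the attribute names, and the helper functions are all hypothetical, chosen only for illustration. Each concept is a pair (extent, intent), read here as (set of matching documents, set of query attributes).

```python
from itertools import combinations

# Hypothetical toy context: which attributes (e.g. XML element names)
# occur in which documents. Not taken from the paper.
CONTEXT = {
    "doc1": {"author", "title"},
    "doc2": {"author", "year"},
    "doc3": {"author", "title", "year"},
}


def common_attributes(objects, context):
    """Attributes shared by every object in `objects`.

    For an empty set of objects this is the full attribute set.
    """
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return frozenset(shared)


def matching_objects(attributes, context):
    """Objects that carry every attribute in `attributes`."""
    return frozenset(o for o, attrs in context.items() if attributes <= attrs)


def formal_concepts(context):
    """Enumerate all formal concepts by closing every subset of objects.

    Exponential in the number of objects, so only suitable for toy inputs.
    """
    concepts = set()
    objects = sorted(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context)
            extent = matching_objects(intent, context)
            concepts.add((extent, intent))
    return concepts


if __name__ == "__main__":
    # Print concepts from the most specific query to the most general one.
    for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: len(c[0])):
        print("query:", sorted(intent), "-> response:", sorted(extent))
```

Under this reading, answering a query amounts to locating the concept whose intent matches the query attributes, and relaxing the query corresponds to moving to a neighbouring concept with a smaller intent and a larger extent.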