LG Jun 28, 2023
Feature Selection: A perspective on inter-attribute cooperation
Gustavo Sosa-Cabrera, Santiago Gómez-Guerrero, Miguel García-Torres et al.
High-dimensional datasets pose a challenge for learning tasks in data mining and machine learning. Feature selection is an effective technique for dimensionality reduction, and it is often an essential data processing step before a learning algorithm is applied. Over the decades, filter feature selection methods have evolved from simple univariate relevance ranking algorithms to more sophisticated relevance-redundancy trade-offs and, in recent years, to approaches based on multivariate dependencies. This tendency to capture multivariate dependence aims at obtaining unique information about the class from the intercooperation among features. This paper presents a comprehensive survey of state-of-the-art work on filter feature selection methods assisted by feature intercooperation, and summarizes the contributions of the different approaches found in the literature. Furthermore, current issues and challenges are introduced to identify promising directions for future research and development.
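The idea that features can carry class information only in cooperation can be illustrated with a toy XOR problem. The sketch below is not taken from the survey; the helper names `entropy` and `mutual_information` are hypothetical, and the data are a hand-built four-row table. It estimates mutual information from empirical frequencies and compares each feature alone against the pair taken jointly.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(rows):
    """Shannon entropy (in bits) of the empirical distribution over rows."""
    counts = Counter(rows)
    total = len(rows)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Toy data (an assumption for illustration): every combination of two
# binary features, with the class defined as their XOR.
pairs = list(product([0, 1], repeat=2))
x1 = [a for a, _ in pairs]
x2 = [b for _, b in pairs]
y = [a ^ b for a, b in pairs]

mi_x1 = mutual_information(x1, y)        # relevance of feature 1 alone
mi_x2 = mutual_information(x2, y)        # relevance of feature 2 alone
mi_joint = mutual_information(pairs, y)  # relevance of the pair taken jointly
print(mi_x1, mi_x2, mi_joint)
```

On this table each feature alone shares no information with the class, while the pair determines it completely, so a purely univariate ranking would discard both features.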
IT Mar 27, 2024
Representatividad Muestral en la Incertidumbre Simétrica Multivariada para la Selección de Atributos [Sample Representativeness in Multivariate Symmetric Uncertainty for Feature Selection]
Gustavo Sosa-Cabrera
In this work, we analyze the behavior of the multivariate symmetric uncertainty (MSU) measure through the use of statistical simulation techniques under various mixes of informative and non-informative randomly generated features. Experiments show how the number of attributes, their cardinalities, and the sample size affect the MSU. In this thesis, based on the observed results, we propose a heuristic condition that preserves good quality in the MSU under different combinations of these three factors, providing a new and useful criterion to help drive the process of dimension reduction.
LG Sep 25, 2017
Understanding a Version of Multivariate Symmetric Uncertainty to assist in Feature Selection
Gustavo Sosa-Cabrera, Miguel García-Torres, Santiago Gómez et al.
In this paper, we analyze the behavior of the multivariate symmetric uncertainty (MSU) measure through the use of statistical simulation techniques under various mixes of informative and non-informative randomly generated features. Experiments show how the number of attributes, their cardinalities, and the sample size affect the MSU. We discovered a condition that preserves good quality in the MSU under different combinations of these three factors, providing a new useful criterion to help drive the process of dimension reduction.
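A minimal sketch of this kind of simulation follows. The abstract does not state the MSU formula, so the code assumes the form MSU = n/(n-1) * (1 - H(X_1..X_n) / sum_i H(X_i)), which reduces to the usual symmetric uncertainty for n = 2. The function names, the zero-entropy convention, and the chosen values (4 features, cardinality 3, three sample sizes) are all hypothetical and are not the authors' experimental setup.

```python
import random
from collections import Counter
from math import log2


def entropy(rows):
    """Shannon entropy (in bits) of the empirical distribution over rows."""
    counts = Counter(rows)
    total = len(rows)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def msu(columns):
    """Assumed MSU: n/(n-1) * (1 - H(X_1..X_n) / sum_i H(X_i))."""
    n = len(columns)
    if n < 2:
        raise ValueError("MSU needs at least two variables")
    marginal_sum = sum(entropy(col) for col in columns)
    if marginal_sum == 0:
        return 0.0  # every variable is constant; convention assumed here
    joint = entropy(list(zip(*columns)))
    return n / (n - 1) * (1 - joint / marginal_sum)


def random_columns(n_features, cardinality, sample_size, rng):
    """Independent, uniformly distributed (non-informative) discrete features."""
    return [
        [rng.randrange(cardinality) for _ in range(sample_size)]
        for _ in range(n_features)
    ]


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
for sample_size in (20, 200, 2000):
    cols = random_columns(4, 3, sample_size, rng)
    print(sample_size, round(msu(cols), 4))
```

Because the generated features are independent, their true MSU under this definition is zero, so any positive value printed reflects estimation error at that sample size. Varying the number of features and the cardinality in the same loop gives a rough view of how the three factors named in the abstract interact.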