Christian E. Schaerer

2 Papers

LG Jun 28, 2023
Feature Selection: A perspective on inter-attribute cooperation

Gustavo Sosa-Cabrera, Santiago Gómez-Guerrero, Miguel García-Torres et al.

High-dimensional datasets pose a challenge for learning tasks in data mining and machine learning. Feature selection is an effective technique for dimensionality reduction and is often an essential data processing step prior to applying a learning algorithm. Over the decades, filter feature selection methods have evolved from simple univariate relevance ranking algorithms, through more sophisticated relevance-redundancy trade-offs, to multivariate dependency-based approaches in recent years. This tendency to capture multivariate dependence aims to obtain unique information about the class from the intercooperation among features. This paper presents a comprehensive survey of state-of-the-art work on filter feature selection methods assisted by feature intercooperation and summarizes the contributions of the different approaches found in the literature. Furthermore, current issues and challenges are discussed to identify promising directions for future research and development.
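The idea that features can carry class information only jointly can be illustrated with a toy example. The sketch below is not taken from the surveyed paper; the XOR data and the helper names `entropy` and `mutual_information` are assumptions chosen for illustration. Each feature alone shares no information with the class, so a univariate relevance ranking would discard both, while the pair determines the class completely.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) = H(X) + H(Y) - H(X,Y) from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Hypothetical toy data: the class label is the XOR of two binary features.
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
label = [a ^ b for a, b in zip(f1, f2)]

# Univariate relevance: each feature on its own against the class.
mi_f1 = mutual_information(f1, label)
mi_f2 = mutual_information(f2, label)

# Joint relevance: the two features treated as a single compound variable.
mi_joint = mutual_information(list(zip(f1, f2)), label)

print(f"I(f1; class)     = {mi_f1:.3f} bits")
print(f"I(f2; class)     = {mi_f2:.3f} bits")
print(f"I(f1,f2; class)  = {mi_joint:.3f} bits")
```

On this data the two univariate scores are zero and the joint score is one bit, which is the kind of cooperative information that multivariate filter methods try to capture.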

2.7 IT Mar 25
On topological and algebraic structures of categorical random variables

Inocencio Ortiz, Santiago Gómez-Guerrero, Christian E. Schaerer

Based on entropy and symmetrical uncertainty (SU), we define a metric for categorical random variables and show that this metric can be promoted to an appropriate quotient space of categorical random variables. We also show that there is a natural commutative monoid structure on the same quotient space, which is compatible with the topology induced by the metric, in the sense that the monoid operation is continuous.
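For readers unfamiliar with symmetrical uncertainty, the sketch below computes the standard quantity SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) from paired samples. It is only an illustration: the abstract does not state the formula of the metric the paper builds from SU, so none is attempted here. The function names, the sample data, and the convention of returning 0 when both variables are constant are assumptions.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(xs, ys):
    """Plug-in estimate of SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), in [0, 1]."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        # Assumed convention: two constant variables share no uncertainty.
        return 0.0
    mi = hx + hy - entropy(list(zip(xs, ys)))
    return 2.0 * mi / (hx + hy)


# Hypothetical categorical samples.
x = ["a", "a", "b", "b"]
y = ["u", "u", "v", "v"]  # a relabeling of x: fully dependent
z = ["p", "q", "p", "q"]  # empirically independent of x

su_dependent = symmetrical_uncertainty(x, y)
su_independent = symmetrical_uncertainty(x, z)

print(f"SU(x, y) = {su_dependent:.3f}")
print(f"SU(x, z) = {su_independent:.3f}")
```

Here `x` and `y` differ only in their category labels and reach the maximum SU of 1, which hints at why a quotient space is natural: variables that are relabelings of one another cannot be told apart by SU.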