LG Oct 13, 2020
Similarity Based Stratified Splitting: an approach to train better classifiers
Felipe Farias, Teresa Ludermir, Carmelo Bastos-Filho
We propose a Similarity-Based Stratified Splitting (SBSS) technique, which uses both output-space and input-space information to split the data. The splits are generated using similarity functions among samples so that similar samples are placed in different splits. This approach allows for a better representation of the data in the training phase and leads to a more realistic performance estimate when the model is used in real-world applications. We evaluate our proposal on twenty-two benchmark datasets with four classifiers (Multi-Layer Perceptron, Support Vector Machine, Random Forest, and K-Nearest Neighbors) and five similarity functions (Cityblock, Chebyshev, Cosine, Correlation, and Euclidean). According to the Wilcoxon signed-rank test, our approach outperformed ordinary stratified 10-fold cross-validation in 75% of the assessed scenarios.
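The abstract does not spell out the splitting procedure, so the following is only a minimal sketch of one way the idea could be realised, not the authors' algorithm. It assumes Euclidean distance (one of the five similarity functions listed), a greedy nearest-neighbour ordering within each class, and round-robin assignment to folds; the function name `similarity_stratified_folds` and its parameters are hypothetical.

```python
import numpy as np


def similarity_stratified_folds(X, y, n_splits=2):
    """Assign each sample a fold index (hypothetical helper, not from the paper).

    Within each class, samples are ordered by a greedy nearest-neighbour
    chain (Euclidean distance, an assumption) and then dealt round-robin
    into folds, so that adjacent, similar samples land in different folds
    while class proportions stay balanced.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    folds = np.empty(len(y), dtype=int)
    offset = 0  # keeps the round-robin going across classes

    for label in np.unique(y):
        remaining = list(np.flatnonzero(y == label))
        current = remaining.pop(0)
        chain = [current]
        # Repeatedly hop to the closest sample not yet visited.
        while remaining:
            dists = np.linalg.norm(X[remaining] - X[current], axis=1)
            current = remaining.pop(int(np.argmin(dists)))
            chain.append(current)
        # Neighbours in the chain receive consecutive fold indices.
        for position, sample in enumerate(chain):
            folds[sample] = (offset + position) % n_splits
        offset += len(chain)

    return folds


# Toy data: each class consists of tight pairs of near-duplicates.
X_demo = [[0, 0], [0.1, 0], [5, 5], [5.1, 5], [20, 20], [20.1, 20]]
y_demo = [0, 0, 0, 0, 1, 1]
print(similarity_stratified_folds(X_demo, y_demo, n_splits=2))
```

In this toy example each near-duplicate pair is split across the two folds, which is the behaviour the abstract describes. A faithful reproduction would need the paper's actual procedure and its handling of the other similarity functions.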
LG Jan 23, 2020
Towards Automatic Clustering Analysis using Traces of Information Gain: The InfoGuide Method
Paulo Rocha, Diego Pinheiro, Martin Cadeiras et al.
Clustering analysis has become a ubiquitous information retrieval tool in a wide range of domains, but a more automatic framework is still lacking. Though internal metrics are key to the successful retrieval of clusters, their effectiveness on real-world datasets is not fully understood, mainly because of the unrealistic assumptions they make about the underlying datasets. We hypothesized that capturing traces of information gain between increasingly complex clustering retrievals (InfoGuide) enables an automatic clustering analysis with improved clustering retrievals. We validated the InfoGuide hypothesis by capturing the traces of information gain using the Kolmogorov-Smirnov statistic and comparing the clusters retrieved by InfoGuide against those retrieved by other commonly used internal metrics on artificially generated, benchmark, and real-world datasets. Our results suggest that InfoGuide can enable a more automatic clustering analysis and may be more suitable for retrieving clusters in real-world datasets displaying nontrivial statistical properties.
NE Apr 8, 2019
Characterizing the Social Interactions in the Artificial Bee Colony Algorithm
Lydia Taw, Nishant Gurrapadi, Mariana Macedo et al.
Computational swarm intelligence consists of multiple simple artificial agents exchanging information while exploring a search space. Despite a rich literature in the field, with works improving old approaches and proposing new ones, the mechanism by which complex behavior emerges in these systems is still not well understood. This gap in the literature hinders researchers' ability to deal with known problems in swarm intelligence, such as premature convergence and the balance of coordination and diversity among agents. Recent advances in the literature, however, have proposed studying these systems via the network that emerges from the social interactions within the swarm (i.e., the interaction network). In our work, we propose a definition of the interaction network for the Artificial Bee Colony (ABC) algorithm. With our approach, we captured striking idiosyncrasies of the algorithm. We uncovered the different patterns of social interactions that emerge from each type of bee, revealing the importance of the bees' variations throughout the iterations of the algorithm. We found that ABC exhibits a dynamic information flow through the use of different bees but lacks continuous coordination between the agents.
NE Nov 8, 2018
Uncovering the Social Interaction in Swarm Intelligence with Network Science
Marcos Oliveira, Diego Pinheiro, Mariana Macedo et al.
Swarm intelligence is the collective behavior emerging in systems with locally interacting components. Because of their self-organization capabilities, swarm-based systems show properties essential for handling real-world problems, such as robustness, scalability, and flexibility. Yet we do not know why swarm-based algorithms work well, nor can we compare the different approaches in the literature. The lack of a common framework capable of characterizing these many swarm-based algorithms, transcending their particularities, has led to a stream of publications inspired by different aspects of nature without a systematic comparison with existing approaches. Here, we address this gap by introducing a network-based framework, the interaction network, to examine computational swarm-based systems through the lens of the social dynamics of that network. This is a clear example of network science being applied to bring further clarity to a complicated field within artificial intelligence. We discuss the social interactions of four well-known swarm-based algorithms and provide an in-depth case study of Particle Swarm Optimization. The interaction network enables researchers to study swarm algorithms as systems, removing the algorithm particularities from the analyses while focusing on the structure of the social interactions.
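To make the notion of an interaction network concrete, here is a minimal sketch for Particle Swarm Optimization. The abstract does not give the formal definition, so the sketch assumes one plausible reading: the entry `weights[i, j]` counts how many times particle `i` drew on the personal best of particle `j`. The ring topology, the sphere objective, the coefficient values, and the function name `pso_interaction_network` are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def pso_interaction_network(n_particles=10, dim=2, iterations=50, seed=0):
    """Run a small PSO and count who influenced whom (illustrative only)."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-5.0, 5.0, (n_particles, dim))
    vel = np.zeros((n_particles, dim))
    best_pos = pos.copy()
    best_val = np.sum(pos ** 2, axis=1)  # sphere function (assumed objective)

    # weights[i, j]: number of updates in which i used j's personal best.
    weights = np.zeros((n_particles, n_particles), dtype=int)

    for _ in range(iterations):
        for i in range(n_particles):
            # Ring topology (assumption): a particle sees itself and the
            # two particles next to it.
            neighbours = [(i - 1) % n_particles, i, (i + 1) % n_particles]
            j = min(neighbours, key=lambda k: best_val[k])
            if j != i:
                weights[i, j] += 1  # record the social interaction

            # Standard velocity update with inertia 0.7 and c1 = c2 = 1.5.
            r1, r2 = rng.random(dim), rng.random(dim)
            vel[i] = (0.7 * vel[i]
                      + 1.5 * r1 * (best_pos[i] - pos[i])
                      + 1.5 * r2 * (best_pos[j] - pos[i]))
            pos[i] = pos[i] + vel[i]

            value = float(np.sum(pos[i] ** 2))
            if value < best_val[i]:
                best_val[i] = value
                best_pos[i] = pos[i].copy()

    return weights


network = pso_interaction_network()
print(network.sum(), "recorded interactions")
```

The resulting matrix can be treated as a weighted directed graph and analysed with ordinary network measures, which is the kind of algorithm-independent view the abstract argues for.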