LG Jan 26, 2021
Pitfalls of Assessing Extracted Hierarchies for Multi-Class Classification
Pablo del Moral, Slawomir Nowaczyk, Anita Sant'Anna et al.
Using hierarchies of classes is one of the standard methods to solve multi-class classification problems. In the literature, selecting the right hierarchy is considered to play a key role in improving classification performance. Although different methods have been proposed, there is still a lack of understanding of what makes one hierarchy-extraction method perform better or worse than another. To this end, we analyze and compare some of the most popular approaches to extracting hierarchies. We identify some common pitfalls that may lead practitioners to draw misleading conclusions about their methods. In addition, to address some of these problems, we demonstrate that using random hierarchies is an appropriate benchmark to assess how the hierarchy's quality affects the classification performance. In particular, we show how the hierarchy's quality can become irrelevant depending on the experimental setup: when using powerful enough classifiers, the final performance is not affected by the quality of the hierarchy. We also show how comparing the effect of the hierarchies against non-hierarchical approaches might incorrectly indicate their superiority. Our results confirm that datasets with a high number of classes generally present complex structures in how these classes relate to each other. In these datasets, the right hierarchy can dramatically improve classification performance.
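As a rough illustration of the random-hierarchy baseline mentioned in this abstract, the sketch below builds a random binary tree over a set of class labels. The abstract does not say how the authors sample their hierarchies, so the recursive random split, the function names (`random_hierarchy`, `leaves`, `internal_nodes`) and the tuple-based tree representation are all assumptions made for this example. The split used here does not sample uniformly over all possible trees.

```python
import random


def random_hierarchy(classes, rng):
    """Recursively split `classes` into a random binary tree.

    A leaf is a single class label; an internal node is a (left, right) tuple.
    Class labels must not themselves be tuples (assumption of this sketch).
    """
    classes = list(classes)
    if len(classes) == 1:
        return classes[0]
    rng.shuffle(classes)
    cut = rng.randint(1, len(classes) - 1)  # both sides stay non-empty
    return (random_hierarchy(classes[:cut], rng),
            random_hierarchy(classes[cut:], rng))


def leaves(tree):
    """Return the class labels found at the leaves of a hierarchy."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]


def internal_nodes(tree):
    """Count internal nodes, i.e. the binary classifiers one would train."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + internal_nodes(tree[0]) + internal_nodes(tree[1])
```

Each internal node would correspond to one binary classifier separating its left classes from its right classes. Scoring an extracted hierarchy against several such random trees, with the same base classifier, is one way to read the comparison the abstract argues for.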
LG Sep 16, 2019
No Free Lunch But A Cheaper Supper: A General Framework for Streaming Anomaly Detection
Ece Calikus, Slawomir Nowaczyk, Anita Sant'Anna et al.
In recent years, there has been increased research interest in detecting anomalies in temporal streaming data. A variety of algorithms have been developed in the data mining community, which can be divided into two categories: general and ad hoc. In most cases, general approaches assume a one-size-fits-all solution model where a single anomaly detector can detect all anomalies in any domain. To date, there exists no single general method that has been shown to outperform the others across different anomaly types, use cases and datasets. In this paper, we propose SAFARI, a general framework formulated by abstracting and unifying the fundamental tasks in streaming anomaly detection, which provides a flexible and extensible anomaly detection procedure to overcome the limitations of one-size-fits-all solutions. SAFARI helps to facilitate more elaborate algorithm comparisons by allowing us to isolate the effects of shared and unique characteristics of different algorithms on detection performance. Using SAFARI, we have implemented various anomaly detectors and identified a research gap that motivates us to propose a novel learning strategy in this work. We conducted an extensive evaluation study of 20 detectors that are composed using SAFARI and compared their performance using real-world benchmark datasets with different properties. The results indicate that there is no single superior detector that works well for every case, proving our hypothesis that "there is no free lunch" in the streaming anomaly detection world. Finally, we discuss the benefits and drawbacks of each method in depth and draw a set of conclusions to guide future users of SAFARI.
CE Jan 14, 2019
A data-driven approach for discovering heat load patterns in district heating
Ece Calikus, Slawomir Nowaczyk, Anita Sant'Anna et al.
Understanding the heat usage of customers is crucial for effective district heating operations and management. Unfortunately, existing knowledge about customers and their heat load behaviors is quite scarce. Most previous studies are limited to small-scale analyses that are not representative enough to understand the behavior of the overall network. In this work, we propose a data-driven approach that enables large-scale automatic analysis of heat load patterns in district heating networks without requiring prior knowledge. Our method clusters the customer profiles into different groups, extracts their representative patterns, and detects unusual customers whose profiles deviate significantly from the rest of their group. Using our approach, we present the first large-scale, comprehensive analysis of heat load patterns by conducting a case study on buildings in six different customer categories connected to two district heating networks in the south of Sweden. The 1222 buildings had a total floor space of 3.4 million square meters and used 1540 TJ of heat during 2016. The results show that the proposed method has a high potential to be deployed and used in practice to analyze and understand customers' heat-use habits.
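The three steps named in this abstract (cluster the profiles, take a representative pattern per group, flag customers that deviate from their group) can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not name a clustering algorithm or a deviation criterion, so plain k-means, the cluster mean as the representative pattern, and the "distance greater than `factor` times the group's median distance" rule are all assumptions, as are the function names.

```python
import math
from statistics import median


def dist(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_profile(profiles):
    """Element-wise mean, used here as the group's representative pattern."""
    return [sum(col) / len(profiles) for col in zip(*profiles)]


def kmeans(profiles, initial_centers, iterations=10):
    """Plain k-means from given initial centers; returns (labels, centers)."""
    centers = [list(c) for c in initial_centers]  # do not mutate the input
    labels = []
    for _ in range(iterations):
        # Assign every profile to its nearest center.
        labels = [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
                  for p in profiles]
        # Move every center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:  # keep the old center if a cluster empties out
                centers[k] = mean_profile(members)
    return labels, centers


def unusual(profiles, labels, centers, factor=3.0):
    """Indices of profiles far from their own group's representative pattern."""
    flagged = []
    for k, center in enumerate(centers):
        idx = [i for i, lab in enumerate(labels) if lab == k]
        if not idx:
            continue
        d = {i: dist(profiles[i], center) for i in idx}
        cutoff = factor * median(d.values())  # assumed deviation rule
        flagged += [i for i in idx if d[i] > cutoff]
    return flagged
```

In practice each profile would be a customer's normalized heat load over a typical period, and the number of groups and the threshold `factor` would need to be chosen for the data at hand.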
RO Oct 4, 2017
Exploring home robot capabilities by medium fidelity prototyping
Martin Cooney, Sepideh Pashami, Yuantao Fan et al.
For autonomous robots to support people's well-being in homes and everyday environments, new interactive capabilities will be required, as exemplified by the soft design used for Disney's recent robot character Baymax in popular fiction. Home robots will be required to be easy to interact with and intelligent--adaptive, fun, unobtrusive and involving little effort to power and maintain--and capable of carrying out useful tasks both on an everyday level and during emergencies. The current article adopts an exploratory medium fidelity prototyping approach for testing some new robotic capabilities with regard to recognizing people's activities and intentions and behaving in a way that is transparent to people. Results are discussed with the aim of informing future designs.