MultiDendrograms: Variable-Group Agglomerative Hierarchical Clusterings
The work solves a specific computational issue that data analysts face in clustering, but the contribution is incremental: it builds on existing agglomerative methods to address tie-breaking.
The paper tackles the non-uniqueness problem that arises in agglomerative hierarchical clustering when minimum distances are tied. It presents MultiDendrograms, a Java application whose variable-group algorithm groups more than two clusters at once when ties occur, so that the resulting dendrogram is unique.
MultiDendrograms is an application written in Java that computes agglomerative hierarchical clusterings of data. Starting from a distance (or weight) matrix, MultiDendrograms calculates dendrograms using the most common agglomerative hierarchical clustering methods. The application implements a variable-group algorithm that solves the non-uniqueness problem found in the standard pair-group algorithm. This problem arises when two or more minimum distances between different clusters are equal during the agglomerative process: different output clusterings are then possible, depending on the criterion used to break ties between distances. The variable-group algorithm avoids the problem by grouping more than two clusters at the same time when ties occur.
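The variable-group idea can be illustrated with a minimal Java sketch. It is not the MultiDendrograms source code: the class name `VariableGroupSketch` and the methods `cluster`, `linkage`, and `find` are hypothetical, complete linkage is chosen only as an example method, and ties are detected by exact floating-point equality on a finite, symmetric distance matrix. At each step the sketch finds the minimum inter-cluster distance, links every pair of clusters that attains it, and merges each connected group of tied clusters in a single step.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Illustrative sketch only; names and design are assumptions, not the tool's API.
final class VariableGroupSketch {

    // Complete linkage: the largest original distance between any two members.
    static double linkage(double[][] d, TreeSet<Integer> a, TreeSet<Integer> b) {
        double max = Double.NEGATIVE_INFINITY;
        for (int i : a) {
            for (int j : b) {
                max = Math.max(max, d[i][j]);
            }
        }
        return max;
    }

    // Union-find root lookup with path halving.
    static int find(int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]];
            i = parent[i];
        }
        return i;
    }

    // Returns the partition of item indices obtained after each merge step.
    // Assumes d is square, symmetric, and contains only finite values.
    static List<List<TreeSet<Integer>>> cluster(double[][] d) {
        List<TreeSet<Integer>> clusters = new ArrayList<>();
        for (int i = 0; i < d.length; i++) {
            TreeSet<Integer> single = new TreeSet<>();
            single.add(i);
            clusters.add(single);
        }
        List<List<TreeSet<Integer>>> levels = new ArrayList<>();
        while (clusters.size() > 1) {
            int n = clusters.size();
            double[][] cd = new double[n][n];
            double min = Double.POSITIVE_INFINITY;
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    cd[i][j] = linkage(d, clusters.get(i), clusters.get(j));
                    min = Math.min(min, cd[i][j]);
                }
            }
            // Link every pair of clusters whose distance ties with the minimum.
            int[] parent = new int[n];
            for (int i = 0; i < n; i++) {
                parent[i] = i;
            }
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    if (cd[i][j] == min) {
                        parent[find(parent, i)] = find(parent, j);
                    }
                }
            }
            // Merge each connected group of tied clusters at once.
            Map<Integer, TreeSet<Integer>> merged = new LinkedHashMap<>();
            for (int i = 0; i < n; i++) {
                merged.computeIfAbsent(find(parent, i), k -> new TreeSet<>())
                      .addAll(clusters.get(i));
            }
            clusters = new ArrayList<>(merged.values());
            levels.add(clusters);
        }
        return levels;
    }

    public static void main(String[] args) {
        // Items 0-1 and 1-2 are tied at distance 1; item 3 is far from all.
        double[][] d = {
            {0, 1, 3, 5},
            {1, 0, 1, 5},
            {3, 1, 0, 5},
            {5, 5, 5, 0}
        };
        for (List<TreeSet<Integer>> level : cluster(d)) {
            System.out.println(level);
        }
    }
}
```

In the example matrix, a pair-group algorithm would have to choose between merging items 0 and 1 first or items 1 and 2 first, and the two choices give different binary dendrograms. The sketch instead merges items 0, 1, and 2 in one step, so the result does not depend on any tie-breaking rule.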