MultiDendrograms is an open-source, Java-based application, and the name of the algorithm it implements, for computing and plotting agglomerative hierarchical clustering (AHC) of data. It was developed by researchers at the Universitat Rovira i Virgili, and its defining feature is that it solves the non-uniqueness problem (the “ties in proximity” problem) in cluster analysis.

The Problem It Solves: Ties in Proximity
In standard hierarchical clustering, a “pair-group” algorithm joins exactly two clusters at a time: at each step it finds the pair of clusters separated by the smallest distance and merges them. However, if two or more pairs share exactly the same minimum distance (a tie), standard software must break the tie arbitrarily, often at random or according to the order in which the data were entered. Depending on how the tie is broken, the result can be a different tree structure and therefore a different final classification of the data.
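To make the tie situation concrete, here is a minimal Python sketch that scans a distance matrix and reports every pair sitting at the minimum distance. The function name `tied_minimum_pairs`, the `tol` parameter, and the toy matrix are illustrative assumptions; this is not code from MultiDendrograms itself.

```python
from itertools import combinations


def tied_minimum_pairs(dist, tol=0.0):
    """Return the minimum distance and every pair (i, j), i < j, at that distance.

    `dist` is a symmetric matrix given as a list of lists. `tol` is a
    hypothetical tolerance for treating nearly equal distances as tied.
    """
    n = len(dist)
    pairs = list(combinations(range(n), 2))
    d_min = min(dist[i][j] for i, j in pairs)
    tied = [(i, j) for i, j in pairs if dist[i][j] - d_min <= tol]
    return d_min, tied


# Hypothetical toy data: four points on a line at positions 0, 1, 2 and 5.
dist = [
    [0, 1, 2, 5],
    [1, 0, 1, 4],
    [2, 1, 0, 3],
    [5, 4, 3, 0],
]

d_min, tied = tied_minimum_pairs(dist)
print(d_min, tied)  # 1 [(0, 1), (1, 2)]
```

Two pairs share the minimum distance here, so a pair-group method has to pick one of them first, and the choice decides which two points end up grouped at the lowest level of the tree.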
MultiDendrograms removes this arbitrariness by implementing a variable-group algorithm. When a tie occurs, it merges all the tied clusters in a single step, so more than two clusters can join at once. The result is a unique, repeatable tree known as a multifurcated dendrogram (or multidendrogram).

Core Features

sergio-gomez/MultiDendrograms – GitHub
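The variable-group idea of merging every tied cluster at once can be sketched as follows. This is a simplified illustration, not the MultiDendrograms source: it assumes the tied pairs have already been found, and it groups the clusters they link into connected components, each of which would be merged in one step. The function name `merge_groups` is hypothetical.

```python
def merge_groups(tied_pairs):
    """Group clusters linked by tied pairs into connected components.

    Each returned group is a set of clusters that would be merged together
    in a single step instead of being joined two at a time.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in tied_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), []).append(x)
    return sorted(sorted(g) for g in groups.values())


# Tied pairs that share a cluster collapse into one three-way merge.
print(merge_groups([(0, 1), (1, 2)]))  # [[0, 1, 2]]

# Tied pairs with no cluster in common stay as separate merges.
print(merge_groups([(0, 1), (2, 3)]))  # [[0, 1], [2, 3]]
```

Because the grouping depends only on which pairs are tied and not on the order in which they are listed, the merge step gives the same result however the input data are ordered.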