Visualizing Complex Data: An Introduction to MultiDendrograms Software


MultiDendrograms is an open-source, Java-based software application and algorithm used to compute and plot Agglomerative Hierarchical Clustering (AHC) of data. Developed by researchers at the Universitat Rovira i Virgili, its defining feature is the ability to solve the non-uniqueness problem (the “ties in proximity” problem) in cluster analysis.

The Problem It Solves: Ties in Proximity

In standard hierarchical clustering, a “pair-group” algorithm joins exactly two clusters at each step: the pair separated by the smallest distance. However, if two or more pairs share exactly the same minimum distance (a tie), standard software must break the tie arbitrarily (often randomly or based on data entry order). Depending on how the tie is broken, the result can be a completely different tree structure and a different final classification of the data.
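The effect of an arbitrary tie-break can be seen in a minimal Python sketch. It is not taken from MultiDendrograms: the function names, the three-point data set, and the choice of complete linkage are all assumptions made for illustration. The toy pair-group routine breaks ties by taking the first minimal pair it meets, so reordering the input changes the first merge.

```python
from itertools import combinations


def complete_linkage(a, b, dist):
    """Cluster distance = largest distance between any two members."""
    return max(dist[frozenset((x, y))] for x in a for y in b)


def pair_group(labels, dist):
    """Toy pair-group clustering: merge exactly two clusters per step.

    Ties are broken by taking the first minimal pair encountered,
    so the result depends on the order of `labels`.
    """
    clusters = [frozenset([label]) for label in labels]
    merges = []
    while len(clusters) > 1:
        # min() returns the first minimal pair: an arbitrary tie-break.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: complete_linkage(pair[0], pair[1], dist),
        )
        merges.append((sorted(a | b), complete_linkage(a, b, dist)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


# Hypothetical data: points A, B, C at positions 0, 1, 2 on a line,
# so the pairs (A, B) and (B, C) are tied at distance 1.
dist = {frozenset("AB"): 1, frozenset("BC"): 1, frozenset("AC"): 2}

first = pair_group(["A", "B", "C"], dist)
second = pair_group(["C", "B", "A"], dist)
print(first)   # first merge joins A and B
print(second)  # first merge joins B and C
```

Both runs use the same distances, yet one tree groups A with B first and the other groups B with C first.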

MultiDendrograms eliminates this bias by implementing a variable-group algorithm. If a tie occurs, it merges more than two clusters simultaneously, producing a unique, non-arbitrary, and repeatable tree known as a multifurcated dendrogram (or multidendrogram). The source code is available on GitHub at sergio-gomez/MultiDendrograms.
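The variable-group idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the MultiDendrograms implementation. The function names and the three-point data set are hypothetical, complete linkage is an arbitrary choice, and ties are detected by exact equality (real-valued data would likely need a tolerance). The sketch records only the distance at which the tie occurs and leaves out any further detail the published algorithm attaches to a merged group.

```python
from itertools import combinations


def complete_linkage(a, b, dist):
    """Cluster distance = largest distance between any two members."""
    return max(dist[frozenset((x, y))] for x in a for y in b)


def variable_group(labels, dist):
    """Toy variable-group clustering: merge all tied clusters at once."""
    clusters = [frozenset([label]) for label in labels]
    merges = []
    while len(clusters) > 1:
        pairs = list(combinations(clusters, 2))
        d_min = min(complete_linkage(a, b, dist) for a, b in pairs)
        # Every pair at the minimum distance takes part in this step.
        tied = [(a, b) for a, b in pairs
                if complete_linkage(a, b, dist) == d_min]

        # Clusters linked through tied pairs form one group
        # (connected components of the "tied" graph).
        groups = []
        for a, b in tied:
            touching = [g for g in groups if a in g or b in g]
            merged = {a, b}.union(*touching)
            groups = [g for g in groups if g not in touching] + [merged]

        # Merge each group in a single step, however many clusters it has.
        for group in groups:
            members = frozenset().union(*group)
            merges.append((sorted(members), d_min))
            clusters = [c for c in clusters if c not in group] + [members]
    return merges


# Hypothetical data: points A, B, C at positions 0, 1, 2 on a line,
# so the pairs (A, B) and (B, C) are tied at distance 1.
dist = {frozenset("AB"): 1, frozenset("BC"): 1, frozenset("AC"): 2}

forward = variable_group(["A", "B", "C"], dist)
backward = variable_group(["C", "B", "A"], dist)
print(forward)
print(backward)
```

Because the tied pairs (A, B) and (B, C) share the cluster B, all three points are merged in one step, and the outcome no longer depends on the order in which the data were entered.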
