In this post I’m sharing the R code snippet I used to draw a pretty graph,
an alternative way to visualize dendrograms and clusters.
Recipe
The general recipe consists of the following steps:
1. Obtain a distance matrix from your data set with dist()
2. Perform a hierarchical clustering analysis with hclust()
3. Examine the dendrogram to determine the number of clusters
4. Cut the dendrogram to obtain clusters with cutree()
5. Convert the cluster structure into a "phylo" object with as.phylo() (package ape)
6. Use the tree edges from the "phylo" object to obtain a graph with graph.edgelist() (package igraph)
7. Obtain a graph layout, in this case with layout.auto() (package igraph)
8. Plot the data with the x-y coordinates from the graph layout!
Example with data “USArrests”
For this example I’m going to use the data set USArrests that comes with R.
The idea is to get a dendrogram from a hierarchical clustering analysis. For
illustration purposes I’m going to cut the dendrogram into 4 clusters.
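
As a minimal sketch of these first steps, the code below builds the dendrogram and cuts it into 4 clusters using base R only. It assumes the defaults of dist() (Euclidean distance on the raw, unscaled variables) and hclust() (complete linkage); those choices, the variable names, and the call to rect.hclust() are illustrative assumptions.

```r
# Distance matrix between the 50 states
# (assumption: default Euclidean distance on unscaled data)
d <- dist(USArrests)

# Hierarchical clustering (assumption: default complete linkage)
hc <- hclust(d)

# Plain dendrogram, with boxes drawn around the 4 clusters
plot(hc, cex = 0.6, main = "USArrests dendrogram", xlab = "", sub = "")
rect.hclust(hc, k = 4)

# Cluster membership for each state (named integer vector)
clus <- cutree(hc, k = 4)

# Number of states per cluster
table(clus)
```

The vector clus is named by state and follows the row order of USArrests, so it can be used later to color points or labels.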
Code in R: Pretty Tree Graph
Once we have the “not very outstanding” dendrogram, we can do some data wrangling
to obtain a layout that displays the clusters in a more visually appealing
way. Here’s the code snippet in R (feel free to adapt it for your
own visualizations).
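
One way such a snippet could look is sketched below, assuming the ape and igraph packages are installed. It uses graph_from_edgelist() and layout_nicely(), the current igraph names for graph.edgelist() and layout.auto(). The colors, the seed, and the variable names are my own illustrative choices, and the layout algorithm igraph picks may differ between versions.

```r
library(ape)     # as.phylo()
library(igraph)  # graph_from_edgelist(), layout_nicely()

# Clustering and cluster membership (4 clusters, as in the example)
hc <- hclust(dist(USArrests))
clus <- cutree(hc, k = 4)

# Convert the tree to a "phylo" object; its edge matrix lists
# parent-child pairs, with tips numbered 1..n in the data's row order
phylo_tree <- as.phylo(hc)
edges <- phylo_tree$edge

# Build a graph from the edges and compute x-y coordinates for every node
net <- graph_from_edgelist(edges, directed = FALSE)
set.seed(1)  # the layout is randomized; fix the seed for reproducibility
xy <- layout_nicely(net)

n <- nrow(USArrests)
# Hypothetical palette, one color per cluster
cols <- c("#1B9E77", "#D95F02", "#7570B3", "#E7298A")

# Empty canvas, then branches, then the states (tips) colored by cluster
plot(xy[, 1], xy[, 2], type = "n", axes = FALSE, xlab = "", ylab = "")
segments(xy[edges[, 1], 1], xy[edges[, 1], 2],
         xy[edges[, 2], 1], xy[edges[, 2], 2],
         col = "gray70")
points(xy[1:n, 1], xy[1:n, 2], pch = 19, col = cols[clus])
text(xy[1:n, 1], xy[1:n, 2], labels = phylo_tree$tip.label,
     col = cols[clus], cex = 0.6, pos = 3)
```

Only the first n rows of xy are labeled because they correspond to the states; the remaining rows are the internal nodes of the tree and serve only as branch junctions.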