ggbiplot and plotly R packages
This post is part of my R notes series.
Take a usual PCA biplot and make it interactive via the functionality of R packages ggbiplot and plotly.
These notes build on the PCA example from An Introduction to Statistical Learning - with Applications in R (James et al. 2013), using as an example the PCA biplot from Figure 10.1, "The first two principal components for the USArrests data".
Load R packages:
# Note: install `ggbiplot` from GitHub;
# at the time of writing these notes, ggbiplot was not on CRAN.
# devtools::install_github("vqv/ggbiplot")
library(ggbiplot)
library(plotly)
Print version information about R, the OS and attached or loaded packages (output of sessionInfo()):
## R version 3.4.3 (2017-11-30)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 7 x64 (build 7601) Service Pack 1
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] plotly_4.8.0 ggbiplot_0.55 scales_0.5.0 plyr_1.8.4 ggplot2_3.0.0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.18 pillar_1.3.0 compiler_3.4.3
## [4] bindr_0.1.1 tools_3.4.3 digest_0.6.15
## [7] viridisLite_0.3.0 jsonlite_1.5 evaluate_0.11
## [10] tibble_1.4.2 gtable_0.2.0 pkgconfig_2.0.1
## [13] rlang_0.2.1 yaml_2.2.0 bindrcpp_0.2.2
## [16] withr_2.1.2 dplyr_0.7.6 stringr_1.3.1
## [19] httr_1.3.1 knitr_1.20 htmlwidgets_1.2
## [22] rprojroot_1.3-2 tidyselect_0.2.4 glue_1.3.0
## [25] data.table_1.11.4 R6_2.2.2 rmarkdown_1.10
## [28] tidyr_0.8.1 purrr_0.2.5 magrittr_1.5
## [31] backports_1.1.2 htmltools_0.3.6 assertthat_0.2.0
## [34] colorspace_1.3-2 stringi_1.2.4 lazyeval_0.2.1
## [37] munsell_0.5.0 crayon_1.3.4
Run principal components analysis (PCA) and the usual biplot.
The usual PCA biplot
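A minimal sketch of this step, assuming it follows the ISLR lab: the variables are standardized before the PCA, and the base R biplot is drawn with scale = 0. The name pr.out matches the object passed to ggbiplot later in these notes; the exact original call is an assumption.

```r
# PCA on the built-in USArrests data.
# scale. = TRUE standardizes each variable to unit variance first,
# since the four features are measured in different units.
pr.out <- prcomp(USArrests, scale. = TRUE)

# The usual base R biplot: states as points (PC scores),
# features as arrows (loadings).
biplot(pr.out, scale = 0)
```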
ggbiplot & ggplotly
First steps with ggbiplot & ggplotly.
A simple PCA biplot with ggbiplot
A simple interactive PCA biplot with plotly (hover the mouse pointer)
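The two plots above could be produced along these lines. This is a sketch that assumes the ggbiplot and plotly packages are installed and that the PCA object is built with prcomp as in the ISLR lab; p_simple is a name chosen here for illustration.

```r
library(ggbiplot)
library(plotly)

# PCA on standardized USArrests data (assumed setup).
pr.out <- prcomp(USArrests, scale. = TRUE)

# Static ggplot2-based biplot.
p_simple <- ggbiplot(pcobj = pr.out, scale = 0)
p_simple

# Convert the ggplot object into an interactive plotly widget;
# hovering over a point shows its coordinates.
p_simple_interactive <- ggplotly(p_simple)
p_simple_interactive
```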
Example of mapping one of the variables (features) to color and size.
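# alpha = 0 makes ggbiplot's own points fully transparent,
# so they can be redrawn below with custom aesthetics.
p <- ggbiplot(pcobj = pr.out,
              scale = 0,
              alpha = 0)
p1 <- p + geom_point(aes(color = USArrests$UrbanPop,
                         size = USArrests$UrbanPop))
p1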
Map the urban population feature to color and size
Take the draft plot from above and make it interactive.
Interactive PCA biplot
The two aesthetics need to be combined into a single legend. While this is possible in ggplot with the guides function, the change does not carry over to the ggplotly output. This SO link could be interesting to investigate.
# Combine size and color in legend;
# see https://stackoverflow.com/a/32652899/5193830
p2 <- p1 +
  guides(color = guide_legend(),
         size = guide_legend())
p2
If you need to adjust the limits and breaks for the bubbles, use the scale_ functions. Giving both scales the same name, limits and breaks lets them share one legend. These legend changes also do not carry over to the ggplotly output.
# Combine size and color in legend, adjust the limits and breaks;
# see https://stackoverflow.com/a/32652899/5193830
p2 <- p1 +
  scale_color_continuous("Urban pop %",
                         limits = c(30, 100),
                         breaks = seq(30, 100, by = 10)) +
  scale_size_continuous("Urban pop %",
                        limits = c(30, 100),
                        breaks = seq(30, 100, by = 10)) +
  guides(color = guide_legend(),
         size = guide_legend())
p2
A simple example that adds a single new field to the plotly popup.
p3 <- p + geom_point(aes(color = USArrests$UrbanPop,
                         size = USArrests$UrbanPop,
                         # Text aesthetic will be ignored in `ggplot`,
                         # but will be used for mouse hovering in `ggplotly`.
                         text = paste("Murder:", USArrests$Murder)))
## Warning: Ignoring unknown aesthetics: text
Interactive PCA biplot - add a new field in the popup (hover the mouse pointer)
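A sketch of how a plot with a text aesthetic can be turned into the interactive version. By default ggplotly shows every mapped aesthetic in the popup; its tooltip argument can restrict the popup to the text field only. The setup lines repeat the assumed PCA step, and whether the original notes used tooltip = "text" is not known.

```r
library(ggbiplot)
library(plotly)

# Assumed setup: PCA object and base biplot with transparent default points.
pr.out <- prcomp(USArrests, scale. = TRUE)
p <- ggbiplot(pcobj = pr.out, scale = 0, alpha = 0)

# ggplot2 warns "Ignoring unknown aesthetics: text"; this is expected.
p3 <- p + geom_point(aes(color = USArrests$UrbanPop,
                         size = USArrests$UrbanPop,
                         text = paste("Murder:", USArrests$Murder)))

# Default popup: all mapped aesthetics, including the text field.
p3_all <- ggplotly(p3)

# Popup limited to the custom text field.
p3_text_only <- ggplotly(p3, tooltip = "text")
p3_text_only
```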
Add more fields to the plotly popup. This was inspired by a plotly example.
p4 <- p + geom_point(aes(color = USArrests$UrbanPop,
                         size = USArrests$UrbanPop,
                         # Text aesthetic will be ignored in `ggplot`,
                         # but will be used for mouse hovering in `ggplotly`.
                         text = paste("Murder:", USArrests$Murder,
                                      "</br> Assault:", USArrests$Assault,
                                      "</br> UrbanPop:", USArrests$UrbanPop,
                                      "</br> Rape:", USArrests$Rape)))
## Warning: Ignoring unknown aesthetics: text
Interactive PCA biplot - use the </br> trick to add several fields in the popup (hover the mouse pointer)
A way to automatically add all feature values from the data as fields in the plotly popup.
# Field names for the popup: "State" plus every column of USArrests.
features <- paste0(c('State', names(USArrests)), ": ")
# One label per row: "name: value" pairs joined with </br>.
my_labels <- apply(X = cbind(rownames(USArrests), USArrests),
                   MARGIN = 1,
                   FUN = function(row) paste(paste(features, row),
                                             collapse = "</br>"))
p5 <- p + geom_point(aes(color = USArrests$UrbanPop,
                         size = USArrests$UrbanPop,
                         # Text aesthetic will be ignored in `ggplot`,
                         # but will be used for mouse hovering in `ggplotly`.
                         text = my_labels))
## Warning: Ignoring unknown aesthetics: text
Interactive PCA biplot - add all feature values as fields in the popup (hover the mouse pointer)
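One detail of the apply() approach: apply() first converts the data frame to a character matrix, which may pad numeric values with spaces to a common width. A possible alternative, sketched here with base R only, pastes the columns together in a vectorized way. The names fields, labelled and my_labels_alt are hypothetical and not part of the original notes.

```r
# Put the state names next to the features as an ordinary column.
fields <- cbind(State = rownames(USArrests), USArrests)

# For every column, build "name: value" strings (one per state).
labelled <- Map(function(name, values) paste0(name, ": ", values),
                names(fields), fields)

# Paste the columns together row by row, separated by </br>.
my_labels_alt <- do.call(paste, c(unname(labelled), sep = "</br>"))

my_labels_alt[1]
```

The resulting vector has one label per state and can be passed to the text aesthetic in the same way as my_labels.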