Sankey diagrams with googleVis

Sankey diagrams are great for visualising flows from one set of data values to another. Although named after Irish Captain Matthew Henry Phineas Riall Sankey, who used this type of diagram in 1898 to show the energy efficiency of a steam engine, the best-known Sankey diagram is probably Charles Minard's Map of Napoleon's Russian Campaign of 1812, which he actually produced in 1869.

Thomas Rahlf: Datendesign mit R

The above example from Thomas Rahlf's book Datendesign mit R shows that Minard's plot can be reproduced with base graphics in R. In 2010 Aaron Berdanier posted the SankeyR function, and January Weiner published the riverplot package on CRAN, which also allows users to create static Sankey charts.

Interactive Sankey diagrams can be generated with rCharts and now also with googleVis (version >= 0.5.0). For my first example I use UK visitor data from VisitBritain.org. The following diagram visualises the flow of visitors in 2012: where they came from and which parts of the UK they visited. This example already illustrates the key concept: I need a data frame with three columns that describes the flow from a source to a target and the strength or weight of each connection.
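A minimal sketch of that three-column structure could look like the following. The column names and the gvisSankey() call mirror the one quoted in the comments on this post; the rows themselves are made-up placeholder values, not the VisitBritain figures, and the version guard is my own addition.

```r
# Hypothetical placeholder rows: NOT the real VisitBritain data.
# Each row is one flow: where visitors came from, where they went, how many.
UKvisits <- data.frame(
  origin = c("France", "Germany", "USA", "France", "Germany", "USA"),
  visit  = c("London", "London", "London", "Scotland", "Scotland", "Wales"),
  weight = c(10, 8, 7, 2, 3, 1),
  stringsAsFactors = FALSE
)

# Build the chart only if a recent enough googleVis is installed.
if (requireNamespace("googleVis", quietly = TRUE) &&
    utils::packageVersion("googleVis") >= "0.5.0") {
  sk <- googleVis::gvisSankey(
    UKvisits,
    from = "origin", to = "visit", weight = "weight",
    options = list(
      height = 250,
      # Sankey-specific options are handed to the Google API as a string
      sankey = "{link:{color:{fill:'lightblue'}}}"
    )
  )
  if (interactive()) plot(sk)  # opens the chart in the browser
}
```

The from, to and weight arguments simply name the columns that hold the source, the target and the strength of each connection.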




My next example uses a graph data set that I visualise in the same way again, but here I start to play around with the various parameters of the Google API.
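As a rough sketch of this second idea, the code below builds an edge list for a small tree-shaped graph in base R and styles the chart through the sankey option string. The graph, the random weights and the colour values are illustrative assumptions, not necessarily the data set behind the chart shown in the post; the option names (link.color.fill, node.width, node.color.fill, node.label) follow the Google Sankey documentation.

```r
# Hypothetical graph: a tree with 24 nodes where every parent has 4 children.
# Node i (i >= 2) hangs off parent (i - 2) %/% 4 + 1, so there are no cycles.
n      <- 24
child  <- 2:n
parent <- (child - 2) %/% 4 + 1

set.seed(123)  # reproducible random weights
edgelist <- data.frame(
  source = LETTERS[parent],
  target = LETTERS[child],
  value  = rpois(n - 1, 4) + 1,  # strictly positive weights
  stringsAsFactors = FALSE
)

# Styling options for links, nodes and node labels, as one option string.
sankey_opts <- paste0(
  "{link: {color: {fill: '#d799ae'}}, ",
  "node: {width: 4, color: {fill: '#a61d4c'}, ",
  "label: {fontSize: 14, color: '#871b47', bold: true}}}"
)

if (requireNamespace("googleVis", quietly = TRUE) &&
    utils::packageVersion("googleVis") >= "0.5.0") {
  sk2 <- googleVis::gvisSankey(
    edgelist,
    from = "source", to = "target", weight = "value",
    options = list(sankey = sankey_opts)
  )
  if (interactive()) plot(sk2)
}
```

Any graph works the same way as long as it can be written as a source/target/weight edge list; a tree is used here because Google's Sankey chart does not render data that contains cycles.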




As stated by Google, the Sankey chart may be undergoing substantial revisions in future Google Charts releases.

For more information and installation instructions see the googleVis project site and Google documentation.
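Before trying the examples, it may help to check that the installed googleVis actually provides gvisSankey(). The snippet below is only a sketch: the 0.5.0 threshold comes from this post, the install_github() call is the one suggested in the comments, and it assumes that a current CRAN release also meets the version requirement.

```r
# Is googleVis installed, recent enough, and does it contain gvisSankey()?
has_googleVis <- requireNamespace("googleVis", quietly = TRUE)
has_sankey <- has_googleVis &&
  utils::packageVersion("googleVis") >= "0.5.0" &&
  exists("gvisSankey", envir = asNamespace("googleVis"))

if (!has_sankey) {
  message(
    "gvisSankey() is not available. Possible ways to get it:\n",
    "  install.packages(\"googleVis\")              # CRAN release\n",
    "  devtools::install_github(\"mages/googleVis\") # development version"
  )
}
```

If has_sankey is FALSE after installing, restarting R and calling library(googleVis) again is a reasonable next step.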

Session Info

R version 3.0.3 (2014-03-06)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets  methods  
[7] base     

other attached packages:
[1] googleVis_0.5.0-4 igraph_0.7.0     

loaded via a namespace (and not attached):
[1] RJSONIO_1.0-3 tools_3.0.3 

25 comments:

Somewhere_In_The_Middle said...

The report windows just say SSL Protocol error :(

Markus Gesmann said...

I believe this is a result of a security setting at your end. The googleVis charts are hosted on Dropbox, and I am sure your proxy regards them as a potential threat.

D said...

I get this message trying to reproduce the plots: Error in plot(gvisSankey(UKvisits, from = "origin", to = "visit", weight = "weight", :
could not find function "gvisSankey"

Markus Gesmann said...

You have to install the developer version of googleVis first. Visit http://github.com/mages/googleVis for more details.

D said...

Thanks! Really cool, now it works!

January Weiner said...

Just a minor thing -- I am the author of the riverplot package :-)

Anopheles said...

I got an error message to the code

Error in gvisSankey(UKvisits, from = "origin", to = "visit", weight = "weight", :

could not find function "gvisChart" :(

Markus Gesmann said...

Oops, I am really sorry about this, now corrected.

Markus Gesmann said...

Please send me more details by email, in particular the output of sessionInfo(). It appears that you didn't install googleVis correctly.

Anopheles said...

Hi Markus, Thanks for your help

sessionInfo()
R version 3.0.3 (2014-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rCharts_0.4.2 devtools_1.4.1 googleVis_0.5.0-4

loaded via a namespace (and not attached):
[1] coin_1.0-23 digest_0.6.4 evaluate_0.5.1 grid_3.0.3 httr_0.3 lattice_0.20-27 memoise_0.1
[8] modeltools_0.2-21 mvtnorm_0.9-9997 parallel_3.0.3 party_1.0-13 plyr_1.8.1 Rcpp_0.11.1 RCurl_1.95-4.1
[15] rJava_0.9-6 RJSONIO_1.0-3 rpart_4.1-5 sandwich_2.3-0 splines_3.0.3 stats4_3.0.3 stringr_0.6.2
[22] strucchange_1.5-0 survival_2.37-7 tools_3.0.3 whisker_0.3-2 yaml_2.1.11 zoo_1.7-11

aduncan5 said...

Hi, Markus. Great post. I'm trying to implement gvisSankey(). Just having trouble installing the dev version from your github repo. Getting the following error:

install_github("mages/googleVis")
Installing github repo(s) mages/googleVis/master from hadley
Downloading mages/googleVis.zip from https://github.com/hadley/mages/googleVis/archive/master.zip

Error: client error: (404) Not Found

Many thanks! -adam (p.s. please don't spend more than 1 minute thinking about this.)

Session info:
R version 3.0.2 (2013-09-25)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] httpuv_1.2.1 Rcpp_0.11.0 shiny_0.8.0 knitr_1.5 RJSONIO_1.0-3 devtools_1.3

loaded via a namespace (and not attached):
[1] bitops_1.0-6 caTools_1.14 digest_0.6.3 evaluate_0.5 formatR_0.9
[6] grid_3.0.2 httr_0.2 lattice_0.20-23 memoise_0.1 parallel_3.0.2
[11] plyr_1.8 rCharts_0.4.1 RCurl_1.95-4.1 stringr_0.6.2 tools_3.0.2
[16] whisker_0.3-2 xtable_1.7-1 yaml_2.1.7

Markus Gesmann said...

It sounds to me like you are behind a firewall that is causing the problem. You could try to download the source zip archive and install the package manually, or alternatively install my pre-built binary version for Windows.
I hope this helps.

aduncan5 said...

That worked. Thanks, Markus. Much appreciated.

alessandro benedetti said...

Hi Markus,
excuse me, I am an unskilled worker.
About the error message "could not find function "gvisChart"": could I ask for some help with the developer version?
Thank you, Ale

sessionInfo()
R version 3.0.3 (2014-03-06)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Italian_Italy.1252 LC_CTYPE=Italian_Italy.1252
[3] LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C
[5] LC_TIME=Italian_Italy.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] igraph_0.7.0 devtools_1.4.1

loaded via a namespace (and not attached):
[1] digest_0.6.4 evaluate_0.5.3 formatR_0.10 httr_0.3 knitr_1.5
[6] memoise_0.1 parallel_3.0.3 RCurl_1.95-4.1 stringr_0.6.2 tools_3.0.3
[11] whisker_0.3-2

Markus Gesmann said...

Ale, I believe you forgot to load the package.
Try library(googleVis)
example(gvisSankey)

alessandro benedetti said...

Thanks Markus,

I wouldn't like to abuse your tolerance, but I find your work very interesting. I tried a new path, but I still cannot replicate your Sankey code. (I am trying to introduce R in our poor public school.)

Here is my session:

R version 3.0.3 (2014-03-06) -- "Warm Puppy"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: i386-w64-mingw32/i386 (32-bit)
...
update.packages(ask='graphics',checkBuilt=TRUE)
install.packages(googleVis)
library(googleVis)
require(igraph)
library(igraph)
require(RJSONIO)
library(RJSONIO)
#on http://cran.r-project.org/bin/windows/Rtools/ download Rtools31.exe
install.packages(devtools)
library(devtools)

>UKvisits <- data.frame(origin=c(....
.....
> plot(
gvisSankey(UKvisits, from="origin",
to="visit", weight="weight",
options=list(
height=250,
sankey="{link:{color:{fill:'lightblue'}}}"
+ ))
+ )

Error in plot(gvisSankey(UKvisits, from = "origin", to = "visit", weight = "weight", :
could not find function "gvisSankey"
> gvisSankey <- function(data, from="", to="", weight="",
options=list(), chartid){
my.type <- "Sankey"
dataName <- deparse(substitute(data))
my.options <- list(gvis=modifyList(list(width=400, height=400),options), dataName=dataName,
data=list(from=from, to=to, weight=weight,
+allowed=c("number", "string"))
)
#checked.data <- gvisCheckSankeyData(data, my.options)
output <- gvisChart(type=my.type, checked.data=data, options=my.options,
chartid=chartid, package="sankey")
return(output)
}

> plot(
gvisSankey(UKvisits, from="origin",
to="visit", weight="weight",
options=list(
height=250,
sankey="{link:{color:{fill:'lightblue'}}}"
))
)

Error in gvisSankey(UKvisits, from = "origin", to = "visit", weight = "weight", :
could not find function "gvisChart"


Thanks in any case.
ale

Markus Gesmann said...

You have to install the developer version from GitHub. The following lines of R code will do just that

install.packages(c("devtools","RJSONIO", "knitr", "shiny", "httpuv"))
library(devtools)
install_github("mages/googleVis")

alessandro benedetti said...

Thank you for your kindness,
and congratulations on your work.
Ale

Brent Schneeman said...

In comparing gvisSankey and the rCharts sankey, there are parts of both that I like. When you hover over rCharts, tool tips popup indicating the number of items in the edge or the node. Any chance of getting that into gvisSankey?

http://timelyportfolio.github.io/rCharts_d3_sankey/example_build_network_sankey.html <- live version of rCharts.



I really like the gvisSankey behavior of highlighting all edges connected to a node when hovering over the node.

Peter said...

Hello Markus! I'm wondering if it is possible to change the order of the nodes/branches somehow?

Mohammad Moazzam Mehar said...

Hi,
I downloaded your code and tried to run it in my R console, but I get the error below. Please help me solve it.

Error in texi2dvi(file = file, pdf = TRUE, clean = clean, quiet = quiet, :
pdflatex is not available
Calls: -> texi2pdf -> texi2dvi
Execution halted
Error: Command failed (1)

Markus Gesmann said...

Well, the statement tells you that pdflatex is not available. Google how to install pdflatex on your computer.

Kim said...

Thank you for your interesting post!
I have one question.
Can I add labels to the links? I need to show the specific weight values in the Sankey diagram.

Markus Gesmann said...

I don't think so. Check the Google documentation for more details: https://developers.google.com/chart/interactive/docs/gallery/sankey

Shalin Siriwaradhana said...

Sankey is great, as there are interactive diagrams as well. But I use Creately, a cloud-based diagram software, to host my diagrams.
