Showing posts with label World Bank. Show all posts

New R package to access World Bank data

Staying on top of new CRAN packages is quite a challenge nowadays. However, thanks to Dirk's CRANberries service I occasionally spot a new gem, such as wbstats, which appeared on CRAN last week.

Like the WDI package, wbstats offers an interface to the World Bank database.

The functions of wbstats let me search the World Bank data and request data for several indicators at once. Unlike WDI, the data is returned in a 'long' table with one column for all values and a separate column for the indicators. Additionally, the function wb allows me to specify how many of the most recent values (mrv) I am interested in.

Thus, to recreate the famous Gapminder chart by Hans Rosling, showing the correlation between fertility, i.e. number of children per woman, and life expectancy over time by country and region, I can write (note, a Flash player is required):
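A minimal sketch of how this could look, assuming the wbstats 0.1 functions wb() and wbcountries() and their column names (iso2c, country, date, indicatorID, value, region). The helper names to_wide() and gapminder_chart() are made up for illustration, and the small toy table holds placeholder numbers that only serve to check the reshaping step offline:

```r
# Reshape the 'long' wbstats table (one row per country, year and indicator)
# into a 'wide' table with one column per indicator.
to_wide <- function(long) {
  long <- as.data.frame(long)[, c("country", "region", "date",
                                  "indicatorID", "value")]
  wide <- reshape(long,
                  v.names = "value",
                  idvar = c("country", "region", "date"),
                  timevar = "indicatorID",
                  direction = "wide")
  # reshape() prefixes the new columns with "value."; drop that prefix
  names(wide) <- sub("^value\\.", "", names(wide))
  # wbstats returns the year as character; the motion chart needs a number
  wide$date <- as.integer(wide$date)
  rownames(wide) <- NULL
  wide
}

# Toy input with placeholder values (not real World Bank figures)
toy <- data.frame(
  country = rep(c("A", "B"), each = 2),
  region = rep(c("R1", "R2"), each = 2),
  date = "2014",
  indicatorID = rep(c("SP.DYN.TFRT.IN", "SP.DYN.LE00.IN"), times = 2),
  value = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)
toy_wide <- to_wide(toy)

# Download, join and plot. Needs wbstats, googleVis and internet access,
# so it is only defined here, not called.
gapminder_chart <- function() {
  long <- wbstats::wb(indicator = c("SP.POP.TOTL",     # population
                                    "SP.DYN.LE00.IN",  # life expectancy
                                    "SP.DYN.TFRT.IN"), # fertility rate
                      mrv = 60)
  # Assumption: wbcountries() has columns iso2c and region,
  # with aggregates labelled "Aggregates"
  regions <- as.data.frame(wbstats::wbcountries())[, c("iso2c", "region")]
  long <- merge(long, regions, by = "iso2c")
  long <- long[long$region != "Aggregates", ]
  googleVis::gvisMotionChart(to_wide(long),
                             idvar = "country", timevar = "date",
                             xvar = "SP.DYN.LE00.IN", yvar = "SP.DYN.TFRT.IN",
                             sizevar = "SP.POP.TOTL", colorvar = "region")
}
```

Calling plot(gapminder_chart()) would then download the data and open the chart in a browser. Newer releases of wbstats may expose these functions under different names, so check the package documentation for the version you have installed.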


If you'd like to learn more about how to create interactive charts with googleVis, then check out the free tutorial on DataCamp.

Session Info

R version 3.2.4 (2016-03-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.4 (El Capitan)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets 
[6] methods   base     

other attached packages:
[1] googleVis_0.5.10 data.table_1.9.6 wbstats_0.1     

loaded via a namespace (and not attached):
[1] httr_1.1.0        R6_2.1.2          rsconnect_0.4.2.1
[4] tools_3.2.4       curl_0.9.7        RJSONIO_1.3-0    
[7] jsonlite_0.9.19   chron_2.3-47 

googleVis 0.4.2 with support for shiny released on CRAN

The new version of googleVis 0.4.2 is now available via CRAN. Many thanks to all who provided feedback on version 0.4.0 and particularly to Sebastian Campbell, John Maindonald and Aonan Zhang. As usual, if you find any issues or bugs, please send us an email or add a line to our online issues log.

With version 0.4.0 we introduced support for googleVis on shiny. See my previous post for more details and examples.

The CRAN release of googleVis 0.4.2 has some further improvements:
  • New shiny and FAQ sections in package vignette
  • Core charts (e.g. line, area, scatter, bar, column and combo charts) now also accept date variables for the x-axis
  • Typos in the Stock and Andrew data sets have been fixed
  • The WorldBank demo now uses the WDI package to download data from the World Bank, see the R code below

Changes in life expectancy animated with geo charts

The data of the World Bank is absolutely amazing. I had said this before, but their updated iPhone App gives me a reason to return to this topic. Version 3 of the DataFinder App allows you to visualise the data on your phone, including motion maps, see the screen shot below.

Screen shot of DataFinder 3.0

I was intrigued by the changes in life expectancy over time around the world. The average life expectancy of a newborn baby in 1960 was only 52.6 years, and don't forget we are in 2012, 52 years later. Babies born in 2009 can expect to live to the age of 69.4 years, an increase of nearly 17 years or 32%. That is remarkable. But these are average figures and vary a lot by country.
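The arithmetic behind those figures can be checked quickly in R. The two life expectancy values are the ones quoted above; the variable names are arbitrary:

```r
# World average life expectancy at birth, in years, as quoted in the text
le_1960 <- 52.6
le_2009 <- 69.4

# Absolute gain in years and relative gain in percent
gain_years <- le_2009 - le_1960
gain_pct <- 100 * gain_years / le_1960

round(gain_years, 1)  # 16.8, i.e. nearly 17 years
round(gain_pct)       # 32
```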

The World Bank's online version of the map unfortunately lacks the animation of the smartphone app. Thus I was keen to find out if I could reproduce something similar in R based on code and ideas provided to me by Manoj Ananthapadmanabhan and Anand Ramalingam last year. Those ideas had helped me to create the animated geo maps demo "AnimatedGeoMap" of the googleVis R package.

Setting the initial view of a motion chart in R

Following on from my article about accessing and plotting World Bank data with R I want to talk about how to change the initial view of a motion chart.

Over the last couple of weeks I have been asked a few times how to do this. For instance, Stephen O'Grady wanted to create a motion chart that initially shows a line chart rather than a bubble chart.

Changing the initial settings of a motion chart is actually quite easy, once you know how. The trick is to use the state argument in the list of options of gvisMotionChart.

As a case study I will use the World Bank data set and try to do some homework given by Duncan Temple Lang in his introduction to statistical computing course. Duncan asked his students to query the World Bank database to create a line chart, which would show the number of internet users per 1000 in Africa over time. Further, he would like to see a legend next to the chart to identify which country is which and tooltips for each curve to identify the country.

A motion chart, displayed as a line chart, would do the trick.

Okay, getting the data is easy, thanks to the WDI package or a direct download, and so is creating a motion chart with bubbles. Interactively I can change the bubble chart into a line chart, select some countries and change the y-axis to log-scale. However, when I reload the page I am back to square one: a bubble chart. So the idea is to pass the changed chart settings on to the initial plot. I find the settings of the current view as a string in the advanced tab of the settings window. I click on the wrench symbol in the bottom right hand corner of a motion chart to access this window.

Screen shot of the settings window of a motion chart

Next I copy this string and paste it into the state argument of the options list. Note the line break at the beginning and at the end of the state string in the example. Alternatively I can add \n to both sides of the state string.
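As a small sketch of the \n variant: the helper wrap_state() is a made-up name, and the state string here is only an abbreviated placeholder, to be replaced by the full string copied from the advanced tab of your own chart:

```r
# Abbreviated placeholder; paste the full string from the advanced tab here
state_copied <- '{"iconType":"LINE"}'

# Hypothetical helper: strip stray whitespace, then add a line break
# at the beginning and at the end of the state string
wrap_state <- function(state) paste0("\n", trimws(state), "\n")

chart_options <- list(state = wrap_state(state_copied))

# The options list would then be passed on to gvisMotionChart, e.g.
# (assuming a data frame `africa` with columns country and year):
# M <- googleVis::gvisMotionChart(africa, idvar = "country", timevar = "year",
#                                 options = chart_options)
```

Keeping the wrapping in one place means the copied string can be pasted as it is, without editing it by hand.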

Here is an example, where I pre-selected Sierra Leone and Seychelles (countries with the lowest and highest number of internet users) together with Africa, North Africa and Sub-Saharan Africa (all income levels). You find the R code below to replicate the plot.

What does the data tell you? Play around with the graph, e.g. change it to a column graph, deselect all countries, change the y-axis to linear again, and hit the play button. How could we improve the plot?

Accessing and plotting World Bank data with R

Over the past couple of days I played around with the data sets of the World Bank, and I have to admit that I am blown away by them. It is amazing to see what is available on their web site, and it is worth visiting their Data Visualisation Tools page. It is fantastic that they provide an API to their data. They have used it to build an iPhone App which is pretty cool. You can have the world's data in your pocket.

In this post I will show you how we can access data from the World Bank in R. As an example we create a motion chart, in the Hans Rosling style, as you find it on the Google Public Data Explorer site, which also uses data from the World Bank. Doing this should give us the confidence that we understand the World Bank's interface. You can find this example as demo WorldBank as part of the googleVis package from version 0.2.10 onwards.

So let's try to replicate the initial plot of the Google Public Data Explorer, which shows fertility rate against life expectancy for each country from 1960 to today, whereby the countries are represented as bubbles, with the size reflecting the population and the colour the region.