Monday, 26 March 2012

Wednesday, 21 March 2012

Copy and paste small data sets into R

How can I embed a small data set into my R code? That was the question I came across today while preparing my talk about Dynamical Systems in R with simecol for the forthcoming Cologne R user group meeting.

I wanted to add all the R code of the talk to the last slide. That's easy, but the presentation makes use of a small data set of 3 columns and 21 rows. Surely there must be an elegant way to embed the data in the R code without writing x <- c(x1, x2,...).

Of course there is a solution, but let's look at the data first. It shows the numbers of trapped lynx and snowshoe hares recorded by the Hudson's Bay Company in northern Canada from 1900 to 1920.


Data sourced from Joseph M. Mahaffy. Original data believed to be published
in E. P. Odum (1953), Fundamentals of Ecology, Philadelphia, W. B. Saunders.
Another source with data from 1845 to 1935 can be found on D. Hundley's page.
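
Returning to the question of embedding a small data set in the code: one possible approach, sketched here under the assumption that your version of R supports the text argument of read.table(), is to paste the table into a character string and parse it. The three rows and the column names Year, Hares and Lynx are made-up placeholders for illustration, not the Hudson's Bay Company figures.

```r
## Placeholder values only; these are NOT the recorded lynx and hare numbers.
txt <- "
Year Hares Lynx
1 10.0 2.0
2 20.5 3.5
3 15.0 6.0
"
## Parse the pasted text as if it were a file; blank lines are skipped
dat <- read.table(text = txt, header = TRUE)
str(dat)
```

Going the other way, dput(dat) prints an R expression that recreates the data frame, which can also be pasted into a script.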

Saturday, 17 March 2012

Logistic map: Feigenbaum diagram in R

The other day I found some old BASIC code I had written about 15 years ago on a Mac Classic II to plot the Feigenbaum diagram for the logistic map. I remember that it took the little computer the whole night to produce the bifurcation chart.

With today's computers even a for-loop in a scripting language like R takes only a few seconds.

logistic.map <- function(r, x, N, M){
  ## r: bifurcation parameter
  ## x: initial value
  ## N: number of iterations
  ## M: number of trailing iterations to keep; the M+1 points
  ##    with indices N-M to N are returned
  z <- numeric(N)
  z[1] <- x
  for(i in seq_len(N-1)){
    z[i+1] <- r * z[i] * (1 - z[i])
  }
  ## Return the last M+1 iterates
  z[(N-M):N]
}

## Set scanning range for bifurcation parameter r
my.r <- seq(2.5, 4, by=0.003)
system.time(Orbit <- sapply(my.r, logistic.map,  x=0.1, N=1000, M=300))
##   user  system elapsed (on a 2.4GHz Core2Duo)
##   2.910   0.018   2.919 

Orbit <- as.vector(Orbit)
## Each value of r contributes M + 1 = 301 points
r <- rep(my.r, each=301)

plot(Orbit ~ r, pch=".")

Let's not forget that when Mitchell Feigenbaum started this work in 1975, he did it on his little calculator!

Update, 18 March 2012

The comment from Berend has helped to speed up the code by a factor of about four, thanks to byte compiling (using the same parameters as above), and Owe got me thinking about the alpha value of the plotting colour. Here is the updated result, with the R code below:

library(compiler) ## requires R >= 2.13.0
logistic.map <- cmpfun(logistic.map) # same function as above
my.r <- seq(2.5, 4, by=0.001)
N <- 2000; M <- 500; start.x <- 0.1
orbit <- sapply(my.r, logistic.map,  x=start.x, N=N, M=M)
Orbit <- as.vector(orbit)
r <- rep(my.r, each=M+1) ## M+1 points per value of r
plot(Orbit ~ r, pch=".", col=rgb(0,0,0,0.05))
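
Byte compiling is one route to a faster run. Another, sketched here as an alternative rather than as the code used for the chart above, is to iterate all values of r at once, so that the loop runs over the iterations only. The function name logistic.orbit and its defaults are my own choices for illustration, and the sketch assumes M < N - 1.

```r
## Sketch: vectorised over r. Returns a matrix with M + 1 rows
## (iterates N-M to N) and one column per value of r.
logistic.orbit <- function(r, x0 = 0.1, N = 1000, M = 300) {
  x <- rep(x0, length(r))
  out <- matrix(NA_real_, nrow = M + 1, ncol = length(r))
  for (i in seq_len(N - 1)) {
    x <- r * x * (1 - x)
    ## x now holds iterate i + 1 for every r; store it once we reach N - M
    if (i + 1 >= N - M) out[i + 1 - (N - M) + 1, ] <- x
  }
  out
}

my.r <- seq(2.5, 4, by = 0.003)
Orbit <- logistic.orbit(my.r, x0 = 0.1, N = 1000, M = 300)
dim(Orbit)
```

The matrix has the same layout as the sapply result, so as.vector() and the plot call can be applied to it in the same way.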

Monday, 12 March 2012

Changes in life expectancy animated with geo charts

The World Bank's data is absolutely amazing. I have said this before, but their updated iPhone app gives me a reason to return to the topic. Version 3 of the DataFinder app allows you to visualise the data on your phone, including motion maps; see the screenshot below.

Screen shot of DataFinder 3.0

I was intrigued by the changes in life expectancy over time around the world. The average life expectancy of a newborn baby in 1960 was only 52.6 years, and don't forget we are in 2012, 52 years later. Babies born in 2009 can expect to live to the age of 69.4 years, an increase of nearly 17 years or 32%. That is remarkable. But these are average figures and vary a lot by country.
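
The increase quoted above can be checked with a couple of lines of R, using only the two averages given in the text (the variable names are mine):

```r
le.1960 <- 52.6  # average life expectancy at birth in 1960, from the text
le.2009 <- 69.4  # average life expectancy at birth in 2009, from the text

increase <- le.2009 - le.1960        # absolute increase in years
pct <- 100 * increase / le.1960      # relative increase in percent
round(c(years = increase, percent = pct), 1)
```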

The World Bank's online version of the map unfortunately lacks the animation of the smartphone app. Thus I was keen to find out if I could reproduce something similar in R, based on code and ideas provided to me by Manoj Ananthapadmanabhan and Anand Ramalingam last year. Those ideas had helped me to create the animated geo maps demo "AnimatedGeoMap" of the googleVis R package.

Saturday, 10 March 2012

German train monitor provides access to train delay data

The German newspaper Süddeutsche Zeitung (SZ) worked together with OpenDataCity to create an online train monitor of the German network: Zugmonitor. This is another great example of the new form of data journalism.

The project provides access to data of train delays collected over 150 days between 2 October 2011 and 1 March 2012 and allows you to analyse the delays in more detail.

Here is an example showing the delays by station.

This SZ article (in German) gives you an overview of the data and how to access it. I believe the most convenient method to query the data is to use Google Fusion Tables. It allows you to import the data into R with the read.csv function. The file name to use is a URL mixed with a little bit of SQL syntax.

Here is an example extracting the station data from the map above (Fusion table 3166152):
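
A minimal sketch of such a query follows. The base URL is my assumption about the form of the Fusion Tables SQL endpoint, the helper name fusion.url is hypothetical, and the service may no longer respond, so the read.csv call is left commented out.

```r
## Hypothetical helper: build a CSV query URL for a Fusion table.
## The base URL is an assumption, not taken from the SZ article.
fusion.url <- function(table.id,
                       base = "https://www.google.com/fusiontables/api/query") {
  sql <- paste("SELECT * FROM", table.id)
  ## Encode spaces and other special characters in the SQL statement
  paste0(base, "?sql=", URLencode(sql, reserved = TRUE))
}

station.url <- fusion.url(3166152)
## stations <- read.csv(station.url)  # needs network access and a live service
```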

The other sources can be accessed in the same way:

Delay                               Fusion table ID
by station                          3166152
between stations (all trains)       3166064
between stations (ICE trains only)  3166328
by country                          3166042
by cause                            3165200
by daytime                          3164289
by train type                       3165124

I am curious what people will make of the data. Apparently more data will be made available in the future. I will keep an eye on the project page.


Sunday, 4 March 2012

googleVis 0.2.15 is released: Improved geo and bubble charts

The guys behind the Google Visualisation API don't seem to rest. On 22 February 2012 they released an update of their API. Google added options for a gradient colour axis to the bubble chart and a magnifying glass to the geo chart, which opens when the user hovers over cluttered markers (excluding IE<=8). Those updates have been incorporated into version 0.2.15 of the googleVis package for R.

Examples of new features

Here are two examples demonstrating the new features.

Bubble chart with gradient colour mode

Example of a bubble chart with a colour gradient axis.

Thursday, 1 March 2012

Kölner R User Meeting 30 March 2012

Panorama of Cologne by Elke Wetzig. Source: Wikipedia, License: CC-BY-SA

I would like to organise the first R user group meeting in Cologne, Germany, on 30 March 2012. In the past few years I have attended the London R user group meetings, and I hope to find like-minded people in Cologne as well, who would like to catch up over a Kölsch on R and life in general.

Please let me know if you would like to come along. It helps with the planning. I would very much appreciate suggestions for talks.

Topics on which I can contribute are, for example, R in the insurance industry, data visualisation, and using R with databases, LaTeX and MS Office.

I will provide further information and updates on the following page.