Wednesday, 19 December 2012

R in Insurance Conference, London, 15 July 2013

The first conference on R in Insurance will be held on Monday 15 July 2013 at Cass Business School in London, UK.

The intended audience of the conference includes both academics and practitioners who are active or interested in the applications of R in insurance.

This one-day conference will focus on applications in insurance and actuarial science that use R, the lingua franca for statistical computation. Topics covered may include actuarial statistics, capital modelling, pricing, reserving, reinsurance and extreme events, portfolio allocation, advanced risk tools, high-performance computing, econometrics and more. All topics will be discussed within the context of using R as a primary tool for insurance risk management, analysis and modelling.

The 2013 R in Insurance conference builds upon the success of the R in Finance and R/Rmetrics events. We expect invited keynote lectures by:

We invite you to submit a one-page abstract for consideration. Both academic and practitioner proposals related to R are encouraged.

Details about the registration and abstract submission are given on the R in Insurance event web site of Cass Business School.

You can contact us via rinsuranceconference at gmail dot com.

The organisers, Andreas Tsanakas and Markus Gesmann, gratefully acknowledge the sponsorship of Mango Solutions.

Monday, 17 December 2012

Now I see it! K-means cluster analysis in R

Of course, a picture on a computer monitor is a coloured plot of x and y coordinates or pixels. Still, I was smitten by David Sparks' posts on is.r(), where he shows how easy it is to read images into R to analyse them. In two posts [1], [2] he replicates functionality of image manipulation programmes like GIMP.

I can't resist writing about this here as well. David's first post is about k-means cluster analysis. One of the popular algorithms for k-means is Lloyd's algorithm. So, on that note, I will use a picture of the Lloyd's of London building to play around with David's code, despite the fact that the two Lloyds have nothing to do with each other. Lloyd's provides pictures of its building copyright free on its web site. However, I will use a reduced file size version hosted on Wikimedia.

The jpeg package by Simon Urbanek [3] allows me to load a jpeg file into R. The resulting R object is an array made up of three layered matrices, which hold the red, green and blue values for each x and y coordinate. I convert the array into a data frame, as this is a structure that kmeans accepts, and plot the data.
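library(jpeg)
## URL of the image (left empty in the source)
url <-""
fn <- paste(tempfile(), "jpeg", sep=".")
## Download in binary mode on Windows
download.file(url, fn, 
              mode = ifelse(Sys.info()['sysname'] == "Windows", 
                            'wb', 'w'))
readImage <- readJPEG(fn)
dm <- dim(readImage)
## One row per pixel: coordinates plus red, green and blue values
rgbImage <- data.frame(
                    x=rep(1:dm[2], each=dm[1]),
                    y=rep(dm[1]:1, dm[2]),
                    r.value=as.vector(readImage[,,1]),
                    g.value=as.vector(readImage[,,2]),
                    b.value=as.vector(readImage[,,3]))

plot(y ~ x, data=rgbImage, main="Lloyd's building",
     col = rgb(rgbImage[c("r.value", "g.value", "b.value")]), 
     asp = 1, pch = ".")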

Running a k-means analysis on the three colour columns in my data frame allows me to reduce the picture to k colours. The output gives me for each x and y coordinate the colour cluster it belongs to. Thus, I plot my picture again, but replace the original colours with the cluster colours.
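The step described above might look roughly like the following. This is only a sketch: instead of the downloaded image it uses a small data frame of random colours (px, a stand-in I made up with the same column names as rgbImage), and the choice of k = 3 is arbitrary.

```r
set.seed(1)
## Stand-in for rgbImage: a 20 x 15 grid of pixels with random colours
## (assumption: same column layout as the image data frame)
px <- data.frame(x = rep(1:20, each = 15),
                 y = rep(15:1, 20),
                 r.value = runif(300),
                 g.value = runif(300),
                 b.value = runif(300))

k <- 3  ## number of colours to keep (arbitrary choice)
kMeans <- kmeans(px[, c("r.value", "g.value", "b.value")], centers = k)

## Replace each pixel's colour with the centre of its cluster
clusterColours <- rgb(kMeans$centers[kMeans$cluster, ])

plot(y ~ x, data = px, main = paste("k-means with", k, "colours"),
     col = clusterColours, asp = 1, pch = 15)
```

With the real image, px would simply be replaced by rgbImage, and a smaller plotting symbol such as pch = "." would be more appropriate.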

Tuesday, 11 December 2012

Comparing regions: maps, cartograms and tree maps

Last week I attended a seminar where a talk was given about the economic opportunities in the SAAAME (South-America, Asia, Africa and Middle East) regions. Of course a map was shown with those regions highlighted. The map was not that dissimilar to the one below.

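library(rworldmap)
library(RColorBrewer)
data(countryExData)
## nameDataColumn is missing in the source; "CLIMATE" is an assumption
mapByRegion( countryExData, nameDataColumn="CLIMATE",
             joinCode="ISO3", nameJoinColumn="ISO3V10",
             regionType="Stern", mapTitle=" ", addLegend=FALSE,
             FUN="mean", colourPalette=brewer.pal(6, "Blues"))

It is a map that most of us in the Northern hemisphere see often. However, it shows a distorted picture of the world.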

Greenland appears to be the same size as Brazil or Australia, and Africa seems to be only about four times as big as Greenland. Of course this is not true. Africa is about 14 times the size of Greenland, while Brazil and Australia are each a little under four times the size of Greenland, with Brazil slightly larger than Australia and with about nine times the population. Thus, talking about regional opportunities without a comparable scale can give a misleading picture.
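The ratios quoted above can be checked with a few lines of R. The land areas below are my own rounded figures in million square kilometres, not numbers from the talk or from any data set used in this post, so treat them as approximate.

```r
## Approximate land areas in million km^2 (rounded, assumed figures)
area <- c(Greenland = 2.17, Africa = 30.4,
          Brazil = 8.5, Australia = 7.7)

## Size of each region relative to Greenland
ratio <- area / area["Greenland"]
round(ratio, 1)
```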

Of course you can use a different projection and XKCD helps you to find one which fits your personality. On the other hand, Michael Gastner and Mark Newman developed cartograms, which can reshape the world based on data [1]. Doing this in R is a bit tricky. Duncan Temple Lang provides the Rcartogram package on Omegahat based on Mark Newman's code and Mark Ward has some examples using the package on his 2009 fall course page to get you started.

A simple example of a cartogram is given as part of the maps package. It shows the US population by state:
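A minimal sketch, assuming the cartogram meant here is the state.carto database that ships with the maps package (the fill colour is my own choice):

```r
library(maps)
## Population cartogram of the US states from the maps package
map("state.carto", fill = TRUE, col = "lightblue")

## The same database without plotting, e.g. to inspect the polygon names
carto <- map("state.carto", fill = TRUE, plot = FALSE)
head(carto$names)
```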

Tuesday, 4 December 2012

Changing colours and legends in lattice plots

Lattice plots are a great way of displaying multivariate data in R. Deepayan Sarkar, the author of lattice, has written a fantastic book, Multivariate Data Visualization with R [1]. However, I often have to refer back to the help pages to remind myself how to set and change the legend and how to ensure that the legend uses the same colours as my plot. Thus, I thought I would write down an example for future reference.

Eventually I would like to get the following result:

I start with a simple bar chart, using the Insurance data set of the MASS package.

Please note that if you use R-2.15.2 or higher you may get a warning message with the current version of lattice (0.20-10), but this is a known issue, and I don't think you need to worry about it.
library(MASS)     ## provides the Insurance data set
library(lattice)
## Plot the claims frequency against age group by engine size and district
barchart(Claims/Holders ~ Age | Group, groups=District,
         data=Insurance, origin=0, auto.key=TRUE)

To change the legend I specify a list with the various options for auto.key. Here I want to set the legend items next to each other, add a title for the legend and change its font size:
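A sketch of what such a call could look like, assuming the legend should sit above the plot; the title text, the font size and the main title are illustrative choices of mine:

```r
library(MASS)     ## provides the Insurance data set
library(lattice)

## Legend on top, four items side by side, with a title
p <- barchart(Claims/Holders ~ Age | Group, groups = District,
              data = Insurance, origin = 0,
              main = "Claims frequency",
              auto.key = list(space = "top", columns = 4,
                              title = "District", cex.title = 1))
print(p)
```

Here columns = 4 matches the four districts in the data, so all legend items end up in a single row.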