Tuesday, 15 January 2013

Reserving based on log-incremental payments in R, part II

Following on from last week's post I will continue to go through the paper Regression models based on log-incremental payments by Stavros Christofides [1]. In the previous post I introduced the model from the first 15 pages up to section F. Today I will progress with sections G to K which illustrate the model with a more realistic incremental claims payments triangle from a UK Motor Non-Comprehensive account:
# Page D5.17
tri <- t(matrix(
  c(3511, 3215, 2266, 1712, 1059, 587, 340,
    4001, 3702, 2278, 1180,  956, 629,  NA,
    4355, 3932, 1946, 1522, 1238,  NA,  NA,
    4295, 3455, 2023, 1320,   NA,  NA,  NA,
    4150, 3747, 2320,   NA,   NA,  NA,  NA,
    5102, 4548,   NA,   NA,   NA,  NA,  NA,
    6283,   NA,   NA,   NA,   NA,  NA,  NA), nc=7))
The rows show origin period data, e.g. accident years, underwriting years or years of account and the columns present the development periods or lags. The triangle appears to be fairly well behaved. The last two years in rows 6 and 7 appear to be slightly higher than rows 2 to 5 and the values in row 1 are lower in comparison to the later years. The last payment of £1,238 in the third row stands out a bit as well.

Before I plot the data, I will transform the triangle into a data frame and add extra columns:
m <- dim(tri)[1]; n <- dim(tri)[2]
dat <- data.frame(
  origin=rep(0:(m-1), n),
  dev=rep(0:(n-1), each=m),
  value=as.vector(tri))

## Add dimensions as factors
dat <- with(dat, data.frame(origin, dev, cal=origin+dev, 
                            value, logvalue=log(value),
                            originf=factor(origin),
                            devf=as.factor(dev),
                            calf=as.factor(origin+dev)))
I am particularly interested in the decay of claims payments in the development year direction for each origin year on the original and log-scale. The interaction.plot of the stats package does an excellent job for this:
op <- par(mfrow=c(2,1), mar=c(4,4,2,2))
with(dat, interaction.plot(x.factor=dev, trace.factor=origin, 
          response=value))
points(dat$devf, dat$value, pch=16, cex=0.5)
with(dat, interaction.plot(x.factor=dev, trace.factor=origin, 
          response=logvalue))
points(dat$devf, dat$logvalue, pch=16, cex=0.5)
par(op)
Indeed the origin years 1 to 4 (rows 2 to 5) look quite similar and the decay of claims in development year direction appears to be linear on a log-scale from development year 1 onwards.

Based on those observations Christofides suggests two models; the first one will have a unique level for each origin year and a unique level for the zero development period. The parameters for development periods 1 to 6 are assumed to follow a linear relationship with the same slope \(s\):
\begin{align}
\ln(P_{ij}) & = Y_{ij} = a_i + d_j + \epsilon_{ij}
&\mbox{for } i,\,j \mbox{ from } 0 \mbox{ to } 6\\
\mbox{where } d_0 &= d,\quad d_j = s \cdot j
&\mbox{for } j > 0
\end{align}and \(\epsilon_{ij} \sim N(0, \sigma^2)\). The second model will be a reduced version of the above with only two levels for the origin years 5 and 6. Hence, I add four more columns to my data frame:
dat <- with(dat, 
            data.frame(dat,
                       a6 = ifelse(origin == 6, 1, 0),
                       a5 = ifelse(origin == 5, 1, 0),
                       d = ifelse(dev < 1, 1, 0),
                       s = ifelse(dev < 1, 0, dev)))        

dat <- dat[order(dat$origin),] # to have similar output as in the paper 
As in the earlier section of the paper and my previous post, I start with a model using all levels of the origin factor:
# Page D5.20/21
summary(Fit1 <- lm(logvalue ~ 0 + originf + d + s, data=na.omit(dat)))

# Call: 
# lm(formula = logvalue ~ 0 + originf + d + s, data = na.omit(dat))
# 
# Residuals:
#      Min       1Q   Median       3Q      Max 
# -0.22142 -0.03973  0.01116  0.03289  0.19622 
# 
# Coefficients:
#          Estimate Std. Error t value Pr(>|t|)    
# originf0  8.57284    0.07569 113.262  < 2e-16 ***
# originf1  8.57379    0.07152 119.884  < 2e-16 ***
# originf2  8.66487    0.06936 124.923  < 2e-16 ***
# originf3  8.55412    0.07022 121.814  < 2e-16 ***
# originf4  8.63666    0.07587 113.841  < 2e-16 ***
# originf5  8.84550    0.09059  97.646  < 2e-16 ***
# originf6  9.04182    0.13368  67.638  < 2e-16 ***
# d        -0.29622    0.06990  -4.238 0.000445 ***
# s        -0.43496    0.01849 -23.526 1.63e-15 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
# 
# Residual standard error: 0.1139 on 19 degrees of freedom
# Multiple R-squared: 0.9999,  Adjusted R-squared: 0.9998 
# F-statistic: 1.428e+04 on 9 and 19 DF,  p-value: < 2.2e-16 
Clearly all origin levels are significantly different from zero, yet most of them are actually quite similar. The tests in the above output don't make much sense as they test the various intercepts against zero. That's easy to fix, I take out the first term '0 +', which forced lm to fit individual intercepts. The next model makes much more sense:
# Page D5.21/22
summary(Fit2 <- lm(logvalue ~ originf + d + s, data=na.omit(dat)))

# Call:
# lm(formula = logvalue ~ originf + d + s, data = na.omit(dat))
# 
# Residuals:
#      Min       1Q   Median       3Q      Max 
# -0.22142 -0.03973  0.01116  0.03289  0.19622 
# 
# Coefficients:
#               Estimate Std. Error t value Pr(>|t|)    
# (Intercept)  8.5728353  0.0756903 113.262  < 2e-16 ***
# originf1     0.0009565  0.0639346   0.015 0.988220    
# originf2     0.0920366  0.0686753   1.340 0.195997    
# originf3    -0.0187149  0.0752609  -0.249 0.806287    
# originf4     0.0638284  0.0843021   0.757 0.458254    
# originf5     0.2726676  0.0982451   2.775 0.012052 *  
# originf6     0.4689829  0.1315930   3.564 0.002072 ** 
# d           -0.2962154  0.0699026  -4.238 0.000445 ***
# s           -0.4349597  0.0184885 -23.526 1.63e-15 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
# 
# Residual standard error: 0.1139 on 19 degrees of freedom
# Multiple R-squared: 0.9832,  Adjusted R-squared: 0.9762 
# F-statistic: 139.1 on 8 and 19 DF,  p-value: 3.287e-15 
Following Christofides I reduce the model, assuming a constant intercept (or constant claims in development period 0) for the origin years 0 to 4 and separate ones for the the origin years 5 and 6:
# Page D5.23
summary(Fit3 <- lm(logvalue ~ a5 + a6 + d + s, data=na.omit(dat)))

# Call:
# lm(formula = logvalue ~ a5 + a6 + d + s, data = na.omit(dat))
# 
# Residuals:
#      Min       1Q   Median       3Q      Max 
# -0.21567 -0.04910  0.00654  0.05137  0.27198 
# 
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept)  8.60795    0.05150 167.142  < 2e-16 ***
# a5           0.24353    0.08517   2.859 0.008870 ** 
# a6           0.44111    0.12170   3.625 0.001421 ** 
# d           -0.30345    0.06779  -4.476 0.000172 ***
# s           -0.43967    0.01666 -26.390  < 2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
# 
# Residual standard error: 0.1119 on 23 degrees of freedom
# Multiple R-squared: 0.9804,  Adjusted R-squared: 0.977 
# F-statistic: 287.7 on 4 and 23 DF,  p-value: < 2.2e-16 
Wonderful, all coefficients appear to be significant and I am left with a model of only five parameters. Next I look at the residual plots:
# Resdiual plots
op <- par(mfrow=c(2,2), mar=c(4,4,2,2))
attach(model.frame(Fit3))
with(na.omit(dat), 
plot.default(rstandard(Fit3) ~ origin,
             main="Residuals vs. origin years"))
abline(h=0, lty=2)

with(na.omit(dat), 
plot.default(rstandard(Fit3) ~ dev,
             main="Residuals vs. dev. years"))
abline(h=0, lty=2)
with(na.omit(dat), 
     plot.default(rstandard(Fit3) ~ cal,
                  main="Residuals vs. payments years"))
abline(h=0, lty=2)
plot.default(rstandard(Fit3) ~ logvalue,
             main="Residuals vs. fitted")
abline(h=0, lty=2)
detach(model.frame(Fit3))
par(op)
The residual plots look quite well behaved. The last payment in year 2 does appear to be a bit of an outlier and also the pattern in the payment or calendar year direction gives an indication of potential claims inflation which should be investigated further.

Following the paper and for the purpose of this post I want to focus on deriving estimators for the payments in future years and their standard errors. I did most of the leg work last week, so I here bring the various calculations for the forecast and error estimation together into a new function:
log.incr.predict <- function(
            model, # lm output
            newdata, # same argument as in predict
            origin.var="origin", # name of the origin column in newdata
            dev.var="dev" # name of the dev column in newdata
  ){
  origin <- newdata[[origin.var]] 
  dev <- newdata[[dev.var]]

  Pred <- predict(model, newdata=newdata, se.fit=TRUE)
  
  Y <- Pred$fit
  VarY <- Pred$se.fit^2 + Pred$residual.scale^2 
  P <- exp(Y + VarY/2)
  VarP <-  exp(2*Y + VarY)*(exp(VarY)-1)
  seP <- sqrt(VarP)
  
  ## Recreate formula to derive the model.frame and future design matrix   
  model.formula <- as.formula(paste("~", as.character(formula(model)[3])))
  ## See also package formula.tool
  
  mframe <- model.frame(model.formula, data=newdata)
  fdm <- model.matrix(model.formula, data=newdata)
  varcovar <- fdm %*% vcov(model) %*% t(fdm)
  CoVar <-  sweep(sweep((exp(varcovar)-1), 1, P, "*"), 2, P, "*")
  CoVar[col(CoVar)==row(CoVar)] <- 0
  
  OverallVar <- sum(CoVar) + sum(VarP)
  Total.SE <- sqrt(OverallVar)
  Total.Reserve <- sum(P)
  
  # Prepare output
  Incr=data.frame(origin, dev, Y, VarY, P, seP, CV=seP/P)
  out <- list(Forecast=Incr[order(newdata[[origin.var]]),],              
              Totals=data.frame(Total.Reserve, 
                                Total.SE=Total.SE, 
                                CV=Total.SE/Total.Reserve)
              )
                
  return(out)
  }
With the above function it is straightforward to carry out the prediction of future claims payments and standard errors. As a bonus I can estimate claims payments beyond the available data, as my models assume an exponential decay. Here I will assume in line with Christofides that claims payments will stop 6 years after the latest available development data:
tail.years <- 6
# Create a data frame for the future periods
fdat <- data.frame(
  origin=rep(0:(m-1), n+tail.years),
  dev=rep(0:(n+tail.years-1), each=m)
  )
fdat$cal <- with(fdat, origin + dev) 

fdat <- with(fdat, 
             data.frame(fdat,
                        originf = factor(origin),
                        a6 = ifelse(origin == 6, 1, 0),
                        a5 = ifelse(origin == 5, 1, 0),
                        d = ifelse(dev < 1, 1, 0),
                        s = ifelse(dev < 1, 0, dev)))        
So, here are the results for my two models:
FM2 <- log.incr.predict(Fit2, subset(fdat, cal>6))
FM2 # Page D5.32/33

# $Forecast
#    origin dev        Y       VarY          P        seP        CV
# 50      0   7 5.528118 0.01947679  254.13254  35.639951 0.1402416
# 57      0   8 5.093158 0.02232053  164.73171  24.748983 0.1502381
# 64      0   9 4.658198 0.02584791  106.81754  17.284935 0.1618174
# 71      0  10 4.223239 0.03005895   69.28774  12.103614 0.1746862
# 78      0  11 3.788279 0.03495363   44.95921   8.479513 0.1886046
# 85      0  12 3.353319 0.04053196   29.18296   5.935315 0.2033829
# 44      1   6 5.964034 0.01852071  392.79744  53.704591 0.1367234
# 51      1   7 5.529074 0.02098384  254.56748  37.070434 0.1456212
# 58      1   8 5.094114 0.02413062  165.03864  25.792598 0.1562822
# 65      1   9 4.659155 0.02796105  107.03279  18.023389 0.1683913
# 72      1  10 4.224195 0.03247512   69.43788  12.615584 0.1816816
# 79      1  11 3.789235 0.03767285   45.06346   8.829618 0.1959374
# 86      1  12 3.354276 0.04355423   29.25507   6.172516 0.2109896
# 38      2   5 6.490074 0.01787515  664.48422  89.238683 0.1342977
# 45      2   6 6.055114 0.01994216  430.55926  61.106559 0.1419237
# 52      2   7 5.620154 0.02269282  279.08060  42.280723 0.1515000
# 59      2   8 5.185195 0.02612713  180.95676  29.441747 0.1627005
# 66      2   9 4.750235 0.03024509  117.37306  20.567807 0.1752345
# 73      2  10 4.315275 0.03504670   76.15712  14.383028 0.1888599
# 80      2  11 3.880315 0.04053196   49.43118  10.053456 0.2033829
# 87      2  12 3.445356 0.04670087   32.09519   7.017665 0.2186516
# 32      3   4 6.814282 0.01765354  918.83680 122.623541 0.1334552
# 39      3   5 6.379322 0.01929729  595.24359  83.088604 0.1395876
# 46      3   6 5.944362 0.02162469  385.74431  57.033048 0.1478519
# 53      3   7 5.509403 0.02463574  250.06492  39.492611 0.1579294
# 60      3   8 5.074443 0.02833043  162.16401  27.489339 0.1695157
# 67      3   9 4.639483 0.03270878  105.19731  19.182172 0.1823447
# 74      3  10 4.204524 0.03777077   68.26581  13.393528 0.1961967
# 81      3  11 3.769564 0.04351641   44.31495   9.345848 0.2108960
# 88      3  12 3.334604 0.04994571   28.77702   6.512388 0.2263051
# 26      4   3 7.331785 0.01813609 1542.02654 208.610284 0.1352832
# 33      4   4 6.896825 0.01930228  998.72172 139.427318 0.1396058
# 40      4   5 6.461865 0.02115212  647.06155  94.606974 0.1462102
# 47      4   6 6.026906 0.02368561  419.36786  64.925326 0.1548171
# 54      4   7 5.591946 0.02690275  271.88995  44.897168 0.1651299
# 61      4   8 5.156986 0.03080354  176.33544  31.188388 0.1768697
# 68      4   9 4.722027 0.03538797  114.40224  21.712803 0.1897935
# 75      4  10 4.287067 0.04065606   74.24682  15.124104 0.2037003
# 82      4  11 3.852107 0.04660779   48.20251  10.528803 0.2184285
# 89      4  12 3.417148 0.05324318   31.30473   7.320629 0.2338506
# 20      5   2 7.975584 0.02024479 2938.65106 420.248726 0.1430074
# 27      5   3 7.540624 0.02079769 1902.68762 275.827175 0.1449671
# 34      5   4 7.105664 0.02203424 1232.35383 183.942131 0.1492608
# 41      5   5 6.670704 0.02395444  798.45748 124.322818 0.1557037
# 48      5   6 6.235745 0.02655829  517.50747  84.899789 0.1640552
# 55      5   7 5.800785 0.02984579  335.52888  58.400959 0.1740564
# 62      5   8 5.365825 0.03381694  217.61641  40.359033 0.1854595
# 69      5   9 4.930866 0.03847173  141.18931  27.961671 0.1980438
# 76      5  10 4.495906 0.04381018   91.63481  19.391972 0.2116223
# 83      5  11 4.060946 0.04983227   59.49323  13.447943 0.2260416
# 90      5  12 3.625987 0.05653801   38.63875   9.318818 0.2411780
# 14      6   1 8.606859 0.02956698 5550.49295 961.508610 0.1732294
# 21      6   2 8.171899 0.02896367 3591.69920 615.713978 0.1714269
# 28      6   3 7.736939 0.02904400 2324.96711 399.122363 0.1716680
# 35      6   4 7.301979 0.02980798 1505.50472 261.874245 0.1739445
# 42      6   5 6.867020 0.03125561  975.20492 173.764972 0.1781830
# 49      6   6 6.432060 0.03338689  631.91417 116.434338 0.1842566
# 56      6   7 5.997100 0.03620181  409.60831  78.645947 0.1920028
# 63      6   8 5.562141 0.03970039  265.59988  53.450274 0.2012436
# 70      6   9 5.127181 0.04388262  172.28023  36.489139 0.2118011
# 77      6  10 4.692221 0.04874849  111.78704  24.985402 0.2235089
# 84      6  11 4.257262 0.05429801   72.55978  17.139964 0.2362185
# 91      6  12 3.822302 0.06053118   47.11388  11.769112 0.2498014
# 
# $Totals
#   Total.Reserve Total.SE         CV
# 1       34377.1 2742.493 0.07977674


FM3 <- log.incr.predict(Fit3, subset(fdat, cal>6))
FM3 # Page D5.35

# $Forecast
#    origin dev        Y       VarY          P        seP        CV
# 50      0   7 5.530256 0.01823949  254.51909  34.531073 0.1356718
# 57      0   8 5.090586 0.02089800  164.19172  23.860337 0.1453200
# 64      0   9 4.650915 0.02411166  105.95042  16.551577 0.1562200
# 71      0  10 4.211245 0.02788045   68.38717  11.498964 0.1681450
# 78      0  11 3.771575 0.03220439   44.15371   7.987864 0.1809104
# 85      0  12 3.331904 0.03708347   28.51545   5.542545 0.1943699
# 44      1   6 5.969926 0.01613612  394.64811  50.334287 0.1275422
# 51      1   7 5.530256 0.01823949  254.51909  34.531073 0.1356718
# 58      1   8 5.090586 0.02089800  164.19172  23.860337 0.1453200
# 65      1   9 4.650915 0.02411166  105.95042  16.551577 0.1562200
# 72      1  10 4.211245 0.02788045   68.38717  11.498964 0.1681450
# 79      1  11 3.771575 0.03220439   44.15371   7.987864 0.1809104
# 86      1  12 3.331904 0.03708347   28.51545   5.542545 0.1943699
# 38      2   5 6.409597 0.01458789  612.09698  74.199726 0.1212222
# 45      2   6 5.969926 0.01613612  394.64811  50.334287 0.1275422
# 52      2   7 5.530256 0.01823949  254.51909  34.531073 0.1356718
# 59      2   8 5.090586 0.02089800  164.19172  23.860337 0.1453200
# 66      2   9 4.650915 0.02411166  105.95042  16.551577 0.1562200
# 73      2  10 4.211245 0.02788045   68.38717  11.498964 0.1681450
# 80      2  11 3.771575 0.03220439   44.15371   7.987864 0.1809104
# 87      2  12 3.331904 0.03708347   28.51545   5.542545 0.1943699
# 32      3   4 6.849267 0.01359481  949.62249 111.100293 0.1169942
# 39      3   5 6.409597 0.01458789  612.09698  74.199726 0.1212222
# 46      3   6 5.969926 0.01613612  394.64811  50.334287 0.1275422
# 53      3   7 5.530256 0.01823949  254.51909  34.531073 0.1356718
# 60      3   8 5.090586 0.02089800  164.19172  23.860337 0.1453200
# 67      3   9 4.650915 0.02411166  105.95042  16.551577 0.1562200
# 74      3  10 4.211245 0.02788045   68.38717  11.498964 0.1681450
# 81      3  11 3.771575 0.03220439   44.15371   7.987864 0.1809104
# 88      3  12 3.331904 0.03708347   28.51545   5.542545 0.1943699
# 26      4   3 7.288937 0.01315686 1473.67697 169.593220 0.1150817
# 33      4   4 6.849267 0.01359481  949.62249 111.100293 0.1169942
# 40      4   5 6.409597 0.01458789  612.09698  74.199726 0.1212222
# 47      4   6 5.969926 0.01613612  394.64811  50.334287 0.1275422
# 54      4   7 5.530256 0.01823949  254.51909  34.531073 0.1356718
# 61      4   8 5.090586 0.02089800  164.19172  23.860337 0.1453200
# 68      4   9 4.650915 0.02411166  105.95042  16.551577 0.1562200
# 75      4  10 4.211245 0.02788045   68.38717  11.498964 0.1681450
# 82      4  11 3.771575 0.03220439   44.15371   7.987864 0.1809104
# 89      4  12 3.331904 0.03708347   28.51545   5.542545 0.1943699
# 20      5   2 7.972137 0.01949165 2927.43785 410.706570 0.1402956
# 27      5   3 7.532467 0.01989258 1886.37635 267.385161 0.1417454
# 34      5   4 7.092796 0.02084866 1215.87674 176.480269 0.1451465
# 41      5   5 6.653126 0.02235988  783.91921 117.879486 0.1503720
# 48      5   6 6.213456 0.02442624  505.56105  79.498576 0.1572482
# 55      5   7 5.773785 0.02704774  326.13428  54.001429 0.1655804
# 62      5   8 5.334115 0.03022439  210.44560  36.864508 0.1751736
# 69      5   9 4.894445 0.03395617  135.83253  25.244126 0.1858474
# 76      5  10 4.454774 0.03824310   87.69772  17.315310 0.1974431
# 83      5  11 4.015104 0.04308517   56.63610  11.883707 0.2098257
# 90      5  12 3.575434 0.04848238   36.58634   8.154476 0.2228831
# 14      6   1 8.609387 0.02849728 5561.56881 945.584646 0.1700212
# 21      6   2 8.169717 0.02791130 3581.98421 602.630554 0.1682393
# 28      6   3 7.730046 0.02788045 2307.65333 388.020444 0.1681450
# 35      6   4 7.290376 0.02840476 1487.09261 252.420631 0.1697410
# 42      6   5 6.850706 0.02948420  958.57481 165.817262 0.1729831
# 49      6   6 6.411035 0.03111878  618.06557 109.883703 0.1777865
# 56      6   7 5.971365 0.03330851  398.62418  73.361414 0.1840365
# 63      6   8 5.531694 0.03605338  257.16584  49.273391 0.1916016
# 70      6   9 5.092024 0.03935339  165.95237  33.247674 0.2003447
# 77      6  10 4.652354 0.04320854  107.12090  22.509569 0.2101324
# 84      6  11 4.212683 0.04761883   69.16486  15.274447 0.2208412
# 91      6  12 3.773013 0.05258426   44.67014  10.379573 0.2323604
# 
# $Totals
#   Total.Reserve Total.SE         CV
# 1      33846.53 2545.076 0.07519459
The two models produce very similar results and it shouldn't be much of a surprise as they are quite similar indeed. The second model has thanks to its smaller number of parameters a proportionally smaller standard error and may hence be the preferred choice.

For comparison here is the output of a Mack chain-ladder model [2] run on the same triangle using the ChainLadder package [3], assuming a tail factor of 1.05 and standard error of 0.02:
library(ChainLadder)
M <- MackChainLadder(incr2cum(tri), est.sigma="Mack", 
                     tail=1.05, tail.se=0.02)
M
# MackChainLadder(Triangle = incr2cum(tri), est.sigma = "Mack", 
#                 tail = 1.05, tail.se = 0.02)
# 
# Latest Dev.To.Date Ultimate   IBNR Mack.S.E CV(IBNR)
# 1 12,690       0.952   13,324    634      254   0.4010
# 2 12,746       0.927   13,752  1,006      263   0.2611
# 3 12,993       0.882   14,732  1,739      282   0.1623
# 4 11,093       0.804   13,795  2,702      303   0.1120
# 5 10,217       0.701   14,574  4,357      527   0.1210
# 6  9,650       0.547   17,653  8,003      801   0.1001
# 7  6,283       0.289   21,714 15,431    1,032   0.0669
# 
# Totals
# Latest:     75,672.00
# Dev:             0.69
# Ultimate:  109,544.16
# IBNR:       33,872.16
# Mack S.E.:   2,563.40
# CV(IBNR):        0.08

The chain ladder method provides similar forecast to the log-incremental regression model, but at the price of many more parameters. Also my estimation of the tail factor and standard error is just my wet finger in the air.

Next week I will discuss sections L - M, which examine how data normalisation and claims inflation can be addressed.

Conclusions

The log-incremental regression model provides an intuitive and elegant stochastic claims reserving model. By the way, at the time of writing Christofides noted that the calculation on a 12Mhz computer with maths co-processor took just less than two minutes(!).

I wonder if the first step of the model selection process could be simplified further. In particular the aspect of finding periods of developments where the decay in claims payments could be regarded linear. Nathan Lemoine posted an interesting article on picewise linear regression using the segmented package [4] . An alternative might use functions of the strucchange package [5].

Using hierarchical or multilevel models as presented by J. Guszcza [6] appears a natural next step for claims reserving. Thankfully Jim provides his R code as well. The Clark LDF model which Jim mentions, has already been implemented by Dan Murphy in R as part of the ChainLadder package.

Lastly I wonder if the plm package [7] for linear models of panel data could be useful as well?

As usual, you find the R code of this post as a gist on Github and feedback and comments are appreciated.

References

[1] Stavros Christofides. Regression models based on log-incremental payments. Claims Reserving Manual. Volume 2 D5. September 1997

[2] Thomas Mack. The standard error of chain ladder reserve estimates: Recursive calculation and inclusion of a tail factor. Astin Bulletin, Vol. 29(2):361 – 266, 1999.

[3] Markus Gesmann, Dan Murphy, and Wayne Zhang. ChainLadder: Mack-, Bootstrap and Munich-chain-ladder methods for insurance claims reserving, 2012. R package version 0.1.5-4.

[4] Vito M.R. Muggeo. segmented: Segmented relationships in regression models with breakpoints/changepoints estimation, 2012. R package version 0.2-9.3.

[5] Achim Zeileis, Friedrich Leisch, Kurt Hornik, Christian Kleiber. strucchange: Testing, Monitoring, and Dating Structural Changes, 2012. R package version 1.4-7.

[6] James Guszcza. Hierarchical Growth Curve Models for Loss Reserving, Fall 2008. Casualty Actuarial Society E-Forum.

[7] Yves Croissant, Giovanni Millo. plm: Linear Models for Panel Data, 2012. R package version 1.3-1.

Session Info

R version 2.15.2 Patched (2013-01-01 r61512)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] grid splines stats graphics grDevices utils datasets methods base     

other attached packages:
[1] ChainLadder_0.1.5-4 tweedie_2.1.5 statmod_1.4.16  cplm_0.6-4       
lme4_0.999999-0  ggplot2_0.9.3 coda_0.16-1 biglm_0.8 DBI_0.2-5 
reshape2_1.2.2 actuar_1.1-5 RUnit_0.4.26 systemfit_1.1-14 lmtest_0.9-30
zoo_1.7-9 car_2.0-15 nnet_7.3-5 MASS_7.3-22 Matrix_1.0-10 lattice_0.20-10
Hmisc_3.10-1 survival_2.37-2 
   
loaded via a namespace (and not attached):
[1] cluster_1.14.3 colorspace_1.2-0 dichromat_1.2-4 digest_0.6.0
gtable_0.1.2 labeling_0.1 minqa_1.2.1 munsell_0.4 nlme_3.1-106
plyr_1.8 proto_0.3-10 RColorBrewer_1.0-5 sandwich_2.2-9 scales_0.2.3
stats4_2.15.2 stringr_0.6.2     

No comments:

Post a Comment