How many more R-bloggers posts can I expect?

I noticed that the monthly number of posts on R-bloggers stopped increasing over the last year. Indeed, the last couple of months saw a decline in posts compared to the previous year. So, has most of what there is to say about R already been said and written?

Who knows? Well, I took a stab at looking into the future. However, I can tell you already that I am not convinced by my predictions. But maybe someone else will be inspired to take this work forward.

First, I have to get the data. That's easy: I can scrape the monthly post counts from the R-bloggers homepage.

Looking at the incremental and cumulative plots, and believing that eventually the number of R posts will decrease, I thought that a logistic growth function would provide a nice fit to the data and also give an asymptotic view of the total number of posts on R-bloggers.
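To make the shape of that assumption concrete, here is a minimal sketch of a three-parameter logistic growth curve. It is written in Python rather than R and is not the code behind the fit discussed in this post. The parameter names (K for the asymptote, t0 for the midpoint, s for the time scale) and all numbers are illustrative assumptions, not fitted values.

```python
import math

def logistic(t, K, t0, s):
    """Cumulative count at time t: approaches the asymptote K, midpoint at t0, time scale s."""
    return K / (1.0 + math.exp(-(t - t0) / s))

def monthly_increment(t, K, t0, s):
    """Posts added during month t under the model: difference of the cumulative curve."""
    return logistic(t, K, t0, s) - logistic(t - 1, K, t0, s)

# Illustrative parameters only (NOT fitted to R-bloggers data).
K, t0, s = 1000.0, 36.5, 8.0

cumulative = [logistic(t, K, t0, s) for t in range(0, 121)]
increments = [monthly_increment(t, K, t0, s) for t in range(1, 121)]

# increments[0] belongs to month 1, hence the offset of one.
peak_month = 1 + increments.index(max(increments))
print(f"cumulative after 120 months: {cumulative[-1]:.1f} (asymptote K = {K:.0f})")
print(f"monthly increments peak around month {peak_month}, then fall towards zero")
```

The point of the sketch is that the model builds in its own conclusion: the cumulative curve can never exceed K, so the monthly increments must peak near t0 and then shrink symmetrically.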

Although the fit, see below, looks reasonable at first glance, I don't believe it provides a sensible prediction of the future. The model would forecast only another 1,269 posts by the end of 2016, with not much more to expect after that. Indeed, the asymptotic total number of posts K is only 14,396. I don't believe this can be right, not even as a proxy, when the current count of monthly posts is well above 100.

I played around with the data and the logistic growth function a little further, using annual instead of monthly data, changing the time horizon and fixing K, yet without much success.
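One way to see what fixing K does: once K is held constant, the logistic curve can be linearised, because log(K / y - 1) = -(t - t0) / s is a straight line in t. The sketch below (Python, not the R code used for this post) fits that line by ordinary least squares. The function name fit_logistic_fixed_K and the synthetic data are assumptions made purely for illustration.

```python
import math

def fit_logistic_fixed_K(times, cumulative, K):
    """Least-squares estimate of midpoint t0 and scale s for a fixed asymptote K.

    With K fixed, z = log(K / y - 1) = -(t - t0) / s is linear in t,
    so a straight-line fit of z on t recovers both parameters.
    """
    # Only observations strictly between 0 and K can be transformed.
    pts = [(t, math.log(K / y - 1.0)) for t, y in zip(times, cumulative) if 0 < y < K]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_z = sum(z for _, z in pts) / n
    slope = (sum((t - mean_t) * (z - mean_z) for t, z in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))
    intercept = mean_z - slope * mean_t
    s = -1.0 / slope      # slope of the line is -1/s
    t0 = intercept * s    # intercept of the line is t0/s
    return t0, s

# Synthetic, noise-free cumulative counts from known parameters (illustration only).
K_true, t0_true, s_true = 5000.0, 40.0, 10.0
months = list(range(1, 61))
observed = [K_true / (1.0 + math.exp(-(t - t0_true) / s_true)) for t in months]

for K_fixed in (5000.0, 8000.0):
    t0_hat, s_hat = fit_logistic_fixed_K(months, observed, K_fixed)
    print(f"K fixed at {K_fixed:.0f}: t0 = {t0_hat:.1f}, s = {s_hat:.1f}")
```

Fixing K at the value that generated the data recovers the midpoint and scale. Fixing it at a different value still yields a fit, but with a shifted midpoint and scale, which is one reason results from this kind of exercise depend so heavily on the chosen K.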

Eventually I recalled a talk by Rob Hyndman about his forecast package. After all, I have a time series here. So, applying the forecast function to the incremental data provides a somewhat more realistic prediction of 2,695 posts for the next 12 months, but with an increasing trend in monthly posts for 2014, which I find hard to believe given the observations over the last year.
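For intuition on why a time-series forecast can keep pointing upwards after a flat year, here is a small sketch of Holt's linear-trend exponential smoothing in Python. This is not what the forecast package does internally (it selects and tunes a model automatically). The smoothing parameters alpha and beta are fixed by assumption, and the monthly counts are made up.

```python
def holt_forecast(series, horizon, alpha=0.5, beta=0.1):
    """Holt's linear-trend exponential smoothing; returns `horizon` point forecasts."""
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        previous_level = level
        # Blend the new observation with the previous one-step-ahead prediction.
        level = alpha * y + (1 - alpha) * (level + trend)
        # Update the trend estimate from the change in level.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # Extrapolate the last level along the last trend estimate.
    return [level + h * trend for h in range(1, horizon + 1)]

# Made-up monthly counts: three years of steady growth, then one flat year.
monthly_posts = [20 + 5 * m for m in range(36)] + [195] * 12
forecast_12 = holt_forecast(monthly_posts, horizon=12)
print(f"sum of the next 12 monthly forecasts: {sum(forecast_12):.0f}")
```

Because the trend component is a smoothed memory of past growth, a small beta lets several years of increases outweigh one flat year for some time, and the extrapolation carries that residual trend forward.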

Well, I presented two models here: One predicts a rapid decline in monthly posts on R-bloggers, while the other forecasts an increase. Neither feels right to me. Of course time will tell, but have you got any ideas or views?

Session Info

R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats  graphics  grDevices  utils  datasets  methods   base     

other attached packages:
[1] forecast_4.8 xts_0.9-7    zoo_1.7-10   XML_3.95-0.2

loaded via a namespace (and not attached):
 [1] colorspace_1.2-4        fracdiff_1.4-2         
 [3] grid_3.0.2              lattice_0.20-23        
 [5] nnet_7.3-7              parallel_3.0.2         
 [7] quadprog_1.5-5          Rcpp_0.10.6            
 [9] RcppArmadillo_0.3.920.1 tools_3.0.2            
[11] tseries_0.10-32


Andrej-Nikolai Spiess said...

Interesting! I felt my own motivation for submitting posts also declining a bit because I often got no comments. This may be due to having written something uninteresting of course, or missing the right audience or touching a topic not en vogue. Another reason could be that people tend to consume rather than to get involved. Anyone out there with the same feelings?


Rees Morrison said...

Two quick points:
First, I very much appreciate R-bloggers but don't know enough to comment and add value.

Second, might it be useful to see the distribution of posts on R-bloggers by blog? Has there been a drop-off in blogs or in posts, or both? Possibly, a posts-by-blog analysis over time would give more insight.


David said...

I have been looking at somewhat similar web time-course data, which has produced a similar graph, but without the dip over the past year. I think that a logistic (or similar s-shaped curve) fitted to the frequency data (rather than the cumulative data) would meet the case. That is, the blog posts have reached their upper limit, and you can continue to expect a yearly average similar to the current one.

adunaic said...

I read R-bloggers every day for a long time but this is the first time I've commented on an article!
