Happy Birthday R! R is turning 20 years old next Saturday

Jozef's Rblog

Originally posted on February 22, 2020 on Jozef’s Rblog:
https://jozef.io/r921-happy-birthday-r/

Here is how much bigger, stronger and faster it got over the years

Excerpt

It is almost the 29th of February 2020! This day is special for R, because it marks 20 years since the release of R v1.0.0, the first official public release of the R programming language.

The first release of R, 29th February 2000

The first official public release of R happened on the 29th of February, 2000. In the release announcement, Peter Dalgaard notes:

“The release of a current major version indicates that we believe that R has reached a level of stability and maturity that makes it suitable for production use. Also, the release of 1.0.0 marks that the base language and the API for extension writers will remain stable for the foreseeable future. In addition we have taken the opportunity to tie up as many loose ends as we could.”

Today, 20 years later, it is quite amazing how true the statement about the API remaining stable has proven to be. The original release announcement and full release statement are still available online.

You can also still download the very first public version of R. For Windows, for instance, it is available on the Previous Releases of R for Windows page, and it still runs, even under Windows 10.

Faster – R today versus 20 years ago?

With the 20th birthday of R approaching, I was curious how much faster the implementation of R has become from version to version. I wrote a very simple benchmark that solves the Longest Collatz sequence problem for all starting numbers below 1 million with a brute-force-ish algorithm.

I then executed it on the same hardware using 20 different versions of R, starting with the very original 1.0, through 2.0 and 3.0, all the way to today’s development version.

Benchmarking code

Below is the code snippet with the implementation to be benchmarked:

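# Number of Collatz steps needed to get from n down to 1
col_len <- function(n) {
  len <- 0
  while (n > 1) {
    len <- len + 1
    if ((n %% 2) == 0)
      n <- n / 2
    else {
      # 3n + 1 is always even for odd n, so halve it right away
      # and count that halving as a second step
      n <- (n * 3 + 1) / 2
      len <- len + 1
    }
  }
  len
}

# Time the full search 10 times, collecting garbage before each run
res <- lapply(
  1:10,
  function(i) {
    gc()
    system.time(
      max(sapply(seq(from = 1, to = 999999), col_len))
    )
  }
)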
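Before running the full benchmark, it can help to check the step-counting function on a few small inputs. The sketch below repeats the col_len definition from above so that it runs on its own; the variable name lens and the choice of the starting numbers 1 to 10 are only illustrative assumptions, not part of the original benchmark.

```r
# Same step-counting function as in the benchmark above
col_len <- function(n) {
  len <- 0
  while (n > 1) {
    len <- len + 1
    if ((n %% 2) == 0)
      n <- n / 2
    else {
      n <- (n * 3 + 1) / 2
      len <- len + 1
    }
  }
  len
}

# Chain lengths for the starting numbers 1 to 10 (illustrative range)
lens <- sapply(1:10, col_len)
print(lens)

# Starting number with the longest chain, and the length of that chain
print(which.max(lens))
print(max(lens))
```

Among the first ten starting numbers, 9 has the longest chain, with 19 steps. The benchmark itself only takes max() of the lengths, so it reports the length of the longest chain rather than the starting number that produces it.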

Results

Now to the interesting part, the results. The chart below shows boxplots of the time required to execute the code, in seconds, with R versions on the horizontal axis.

[Figure: Execution time by R version. Boxplots of time (seconds), on a scale from 0 to 1250, for R versions 1.0.0 through devel.]
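The quantities behind such a boxplot can be pulled out of a list of system.time() results like res. The following is a minimal sketch, not the code used to produce the chart: demo_res is a stand-in with a tiny workload so that it finishes quickly, and the helper name elapsed_seconds is an assumption made for illustration.

```r
# Stand-in for `res`: a list of system.time() results.
# A tiny workload is used so the sketch finishes quickly.
demo_res <- lapply(1:10, function(i) {
  gc()
  system.time(sum(sqrt(seq(from = 1, to = 1e5))))
})

# Hypothetical helper: elapsed (wall-clock) seconds of each run
elapsed_seconds <- function(timings) {
  sapply(timings, function(run) run[["elapsed"]])
}

elapsed <- elapsed_seconds(demo_res)

# fivenum() returns minimum, lower hinge, median, upper hinge and maximum,
# the values a boxplot is drawn from before any outlier handling
timing_summary <- fivenum(elapsed)
names(timing_summary) <- c("min", "lower_hinge", "median", "upper_hinge", "max")
print(timing_summary)

# boxplot(elapsed, ylab = "time (seconds)") would draw the box for one version
```

Running the same extraction under each R version and collecting the elapsed vectors in a named list would give boxplot() one box per version.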

We can see that the median time to execute the above code to find the longest Collatz sequence amongst the first million numbers was:

  • February 2000: More than 17 minutes with the first R version, 1.0.0
  • January 2002: A large performance boost already came with the 1.4.1 release, cutting the time almost 4x, to around 4.5 minutes
  • October 2004: Even more interestingly, I measured another big improvement with version 2.0.0, to just 168 seconds, less than 3 minutes. However, I was not able to get such good results with any of the later 2.x versions
  • April 2014: Another speed improvement came 10 years later, with version 3.1 decreasing the time to around 145 seconds
  • April 2017: Finally, the 3.4 release brought another significant performance boost. From this version on, the time needed to perform this calculation is less than 30 seconds
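As a rough cross-check of the figures listed above, the speedup relative to 1.0.0 can be worked out directly. The numbers below are approximations taken from the wording of the list: "more than 17 minutes" is entered as 17 * 60 seconds and "less than 30 seconds" as 30, so the last ratio is a lower bound rather than a measured value.

```r
# Approximate median times in seconds, read off the list above.
# The 1.0.0 value is a lower bound and the 3.4.0 value an upper bound.
approx_seconds <- c(
  "1.0.0" = 17 * 60,
  "1.4.1" = 4.5 * 60,
  "2.0.0" = 168,
  "3.1.0" = 145,
  "3.4.0" = 30
)

# Speedup of each version relative to the first release
speedup <- approx_seconds[["1.0.0"]] / approx_seconds
print(round(speedup, 1))
```

With these inputs the ratios come out at about 3.8x for 1.4.1 (the "almost 4x" mentioned above), 6.1x for 2.0.0, 7.0x for 3.1.0 and at least 34x for 3.4.0.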

Visit Jozef’s Rblog to read more about the history and development of R: https://jozef.io/r921-happy-birthday-r/
