A thought experiment: How CRAN saved 3,620 (working) lives

Given the vast number of R packages available today, it makes sense (at least to me, as a trained economist) to ask a simple yet difficult question: How much value has been created by all those packages?

As everything on CRAN is open-source (which is a blessing), there is no measurable GDP contribution in terms of market value that we can use to provide a quick answer. But all of us R users know the pleasant feeling, not to say the excitement, of finding a package that provides exactly the functionality we have been looking for for so long. This saves us the time of developing the functionality ourselves. So, apparently, the time saved is one way to estimate the beneficial effect of package sharing on CRAN.
Here comes a simple (and not too serious) approach to estimating this effect. 
(Side note: I am well aware of the extremely high concentration of capable statisticians and data scientists in the R community, so be clement with my approach; I am, as you will see shortly, not aimi…

New package 'packagefinder' - Search for packages from the R console

There are over 12,700 R packages on CRAN. How to find the right one for you? The new package 'packagefinder' helps you search for packages on CRAN right from the R console.

With 'packagefinder' you can search for multiple keywords in the name, title and description of a CRAN package, in either case-sensitive or case-insensitive mode, and define your own weighting scheme for the search results, if you like. Once you have found a promising package, you can use the simple function go() to go to the package's CRAN webpage or view its PDF manual, directly from the R console, without having to install the package first. Of course, you can also install the package easily if you want to try it out.
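To make the idea of a weighted keyword search concrete, here is a minimal base-R sketch. It does not use 'packagefinder' itself and is not how the package is implemented: the toy index, the function name search_packages() and the weighting scheme are assumptions made purely for illustration.

```r
# Toy package index; the three entries are made up for illustration only.
pkgs <- data.frame(
  name = c("fastcsv", "plotkit", "csvtools"),
  title = c("Fast CSV Import", "Plotting Helpers", "Tools for Delimited Files"),
  description = c(
    "Reads large CSV files quickly.",
    "Convenience wrappers for base graphics.",
    "Import, validate and export CSV and TSV files."
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for every keyword found in a field, add that
# field's weight to the package's score.
search_packages <- function(index, keywords,
                            weights = c(name = 2, title = 2, description = 1),
                            case_sensitive = FALSE) {
  score <- numeric(nrow(index))
  for (field in names(weights)) {
    for (kw in keywords) {
      text <- index[[field]]
      if (!case_sensitive) {
        # Compare in lower case to ignore capitalisation
        text <- tolower(text)
        kw <- tolower(kw)
      }
      hit <- grepl(kw, text, fixed = TRUE)
      score <- score + weights[[field]] * hit
    }
  }
  result <- data.frame(name = index$name, score = score,
                       stringsAsFactors = FALSE)
  result <- result[result$score > 0, ]   # keep only packages with a match
  result[order(-result$score), ]         # best match first
}

hits <- search_packages(pkgs, c("csv", "import"))
print(hits)
```

Raising the weight of the name field, for example, pushes packages whose name contains a keyword further up the list, which is the kind of tuning a custom weighting scheme allows.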

Check out 'packagefinder' on CRAN:

And leave your comments on GitHub, or contact me via Twitter or e-mail. Your ideas are highly appreciated!

New R package flatxml: working with XML files as R dataframes

The world is flat

The new R package flatxml provides functions to easily deal with XML files. When parsing an XML document, fxml_importXMLFlat produces a special dataframe that is 'flat' by its very nature but contains all necessary information about the hierarchical structure of the underlying XML document (for details on the dataframe, see the reference for the fxml_importXMLFlat function). flatxml offers a set of functions to work with this dataframe.
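How a flat table can still carry a hierarchy may be easier to see in a small base-R sketch. It does not use flatxml, and the column layout (id, parent, depth, elem, value) is an assumption chosen for illustration, not the layout that fxml_importXMLFlat actually produces. The document is written as nested lists rather than parsed from an XML file.

```r
# Hypothetical helper to build a tiny document tree, standing in for
# <library><book><title>..</title><year>..</year></book>...</library>.
node <- function(name, value = NA_character_, children = list()) {
  list(name = name, value = value, children = children)
}

doc <- node("library", children = list(
  node("book", children = list(node("title", "R Basics"), node("year", "2018"))),
  node("book", children = list(node("title", "XML in R"), node("year", "2019")))
))

# Walk the tree depth-first and emit one row per element. Storing each
# element's id and its parent's id is enough to rebuild the hierarchy.
flatten <- function(root) {
  rows <- list()
  walk <- function(n, parent_id, depth) {
    id <- length(rows) + 1L
    rows[[id]] <<- data.frame(id = id, parent = parent_id, depth = depth,
                              elem = n$name, value = n$value,
                              stringsAsFactors = FALSE)
    for (child in n$children) walk(child, id, depth + 1L)
  }
  walk(root, NA_integer_, 1L)
  do.call(rbind, rows)
}

flat <- flatten(doc)
print(flat)

# Drop the structure and keep only the data: one row per <book>.
leaves <- flat[!is.na(flat$value), ]
books <- as.data.frame(
  do.call(rbind, lapply(split(leaves, leaves$parent),
                        function(b) setNames(b$value, b$elem))),
  stringsAsFactors = FALSE
)
print(books)
```

The first table keeps every element together with its position in the tree; the second is an ordinary data frame in which the tree structure is no longer represented.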

Apart from representing the XML document in a dataframe structure, there is yet another way in which flatxml relates to dataframes: the fxml_toDataFrame function can be used to extract data from an XML document into a dataframe, e.g. to work on the data with statistical functions. Because in this case there is no need to represent the XML document structure as such (it's all about the data contained in the document), there is no representation of the hierarchical structure of the document any more; it's just a normal dat…

New version of package 'xplain' - Contribute your ideas!

As I am preparing the next version of my package 'xplain', I invite everybody to share their ideas for improvements and new features.

I'm currently planning to release the new version in August.

Your ideas are highly appreciated! Leave your comments on GitHub, or contact me via Twitter or e-mail. Thank you for your contribution!

Sign up for my newsletter!

I just set up a new newsletter to provide you with updates on my R packages. Don't worry: I will not spam you all the time but only reach out to you when there is really important news!
Stay in the loop and click here to subscribe.

New R package xplain: Providing interactive interpretations and explanations of statistical results

The package xplain is designed to help users interpret the results of their statistical analyses.

It does so not in an abstract way as textbooks do. Textbooks do not help the user of a statistical method understand his findings directly. What does a result of 3.14 actually mean? This is often hard to answer with a textbook alone because the book may provide its own examples but cannot refer to the specifics of the user's case. However, as we all know, we understand things best when they are explained to us with reference to the actual problem we are working on. xplain is made to fill this gap that textbooks (and other learning materials) leave.

The basic idea behind xplain is simple: Package authors or other people interested in explaining statistics provide interpretation information for a statistical method (i.e. an R function) in the format of an XML file. With a simple syntax this interpretation information can reference the results of the user's call of the explained R fun…

New introduction article to the R language (in German)

About two months ago, the German online magazine 'Informatik Aktuell' asked me to write an introductory article on R. And so I did.

The article was published just a few days ago. It focuses on key concepts of the R language and provides an overview of R's ecosystem. Given the space constraints, it does not discuss more advanced topics like environments or package development.

Read the full article here (in German):

Everyone who wants to read a more comprehensive introduction to R in German should try my book "Statistik mit R" (O'Reilly).