Читать книгу Computational Statistics in Data Science - Группа авторов - Страница 40

2.1.4 What is the downside of R?

Оглавление

R is slow. Or at least that is the perception and sometimes the case. This is because R is not a compiled language, so methods of flow control such as for‐loops are not optimized. This shortcoming is easily circumvented by taking advantage of the vectorization offered through other built‐in functions like those from the apply family in R, but these faster techniques often go unused through lack of proficiency or because it is easier to write a for‐loop. Intrinsically slow functions can be written in C++ and run via Rcpp, but then that negates the simplicity of writing R. This is a special case where Python easily surpasses R. Python is also a scripted language, but through the use of NumPy and numba it can gain fast vectorized operations, loops, and utilize a just‐in‐time (JIT) compiler. Ergo, any performance shortcoming of Python can be taken care of through a decorator.

Packages are not written by programmers, or at least not programmers by trade or education. A great deal of libraries for R are written by researchers and analysts who needed a tool and created the tool. Because of this, there is often fragmentation in the syntax or incompatibility between packages, or generally a lack of best practices that leads to poorly performing code, or, in the most drastic setting, code that simply gives erroneous results.

Computational Statistics in Data Science

Подняться наверх