Optimising R Workflows: Welcome!


Introduction

Dr Anna Krystalli

R-RSE


https://optimising-r.netlify.app

👋 Hello

me: Dr Anna Krystalli

Objectives

In this course we’ll explore:

  • Benchmarking and profiling code

  • Best practice for writing performant code in R

  • Best practice in working efficiently with data

  • Parallelising workflows

Background

Computation


Transistor icons created by surang - Flaticon

Moore’s law

Yet…

we’ve hit clock speed stagnation

50 Years of Processor Trends. Distributed by Karl Rupp under a CC-BY 4.0 License

About computer hardware

CPU (Processing)

RAM (memory)

Hard Disks, Networks (I/O)

About R

R is an interpreted language

Compiled Languages

Converted directly into machine code that the processor can execute.

  • Tend to be faster and more efficient to execute.

  • Need a “build” step that compiles the code for the system it will run on

  • Examples: C, C++, Erlang, Haskell, Rust, and Go

Interpreted Languages

Code is interpreted line by line at run time.

  • Tend to be significantly slower, although just-in-time compilation is closing that gap.

  • Tend to be much more compact and easier to write

  • Examples: R, Ruby, Python, and JavaScript.
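
As a small illustration of the just-in-time point above, here is a minimal sketch using the `compiler` package that ships with R. The function `sum_to()` is a hypothetical helper invented for this example, not part of the course materials. Recent versions of R byte-compile functions automatically, so the explicit `cmpfun()` call is only there to make the step visible.

```r
# Sketch only: sum_to() is a hypothetical helper used for illustration.
library(compiler)

# A plain R function with an explicit loop
sum_to <- function(n) {
  total <- 0
  for (i in seq_len(n)) {
    total <- total + i
  }
  total
}

# Byte-compile the function explicitly; recent R versions also do this
# automatically through the just-in-time compiler
sum_to_compiled <- cmpfun(sum_to)

# Compilation should change how fast the code runs, not what it returns
identical(sum_to(1000), sum_to_compiled(1000))
```

Byte-compiled code is still run by the R interpreter rather than turned into machine code, so this narrows the gap with compiled languages without removing it.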

R performance

  • R offers some excellent features: dynamic typing, lazy evaluation of function arguments and object-orientation

  • Side effect: by default, operations run in single-threaded mode, i.e. sequentially

  • Many routines in R are written in compiled languages like C & Fortran.

  • R performance can be enhanced by linking to optimised Linear Algebra Libraries.

  • R offers many ways to parallelise computations.

  • Many packages wrap more performant C, Fortran, C++ code.
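
The points above can be sketched with base R alone. The example below compares a hand-written loop with the built-in `sum()`, which is implemented in compiled code, and then asks how many cores are available for parallel work. The function `loop_sum()` and the vector size are assumptions chosen for illustration; timings will vary by machine, so none are shown.

```r
# Sketch only: loop_sum() is a hypothetical helper used for illustration.
x <- runif(1e6)

# Summation written as an interpreted R loop
loop_sum <- function(v) {
  total <- 0
  for (value in v) {
    total <- total + value
  }
  total
}

# Time the interpreted loop against the built-in sum(),
# which calls a compiled routine
system.time(loop_sum(x))
system.time(sum(x))

# Both approaches should agree up to floating point tolerance
all.equal(loop_sum(x), sum(x))

# Number of cores R can see, a starting point for parallelisation
parallel::detectCores()
```

On most machines the built-in version should be noticeably faster, which is one reason to prefer vectorised functions over explicit loops where they exist.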

About this course

  • I normally like to live code…BUT!

  • There’s a lot of material to get through, so I will be copying & pasting from the materials a lot

  • Have the materials handy to follow along

  • Please stop me for questions or to share your own experiences

  • Lunch around 1pm

Let’s go!