Monitor the Total and Peak RAM Used by an Expression or Function

When working with big data sets, RAM conservation is critically important. However, monitoring the size of the objects created is not always enough. R's so-called "copy-on-modify" behavior means that some expressions or functions may require an unexpectedly large amount of RAM overhead. For example, replacing a single value in a matrix can duplicate that matrix in the back-end, so that the task requires twice as much RAM as the matrix itself. This package makes it easy to monitor the total and peak RAM used so that developers can quickly identify and eliminate RAM-hungry code.


Welcome to the peakRAM GitHub page!

When working with big datasets, RAM conservation is critically important. However, monitoring the size of the objects created is not always enough. R's so-called "copy-on-modify" behavior means that some expressions or functions may require an unexpectedly large amount of RAM overhead. For example, replacing a single value in a matrix (e.g., with '[<-') can duplicate that matrix in the backend, so that the task requires twice as much RAM as the matrix itself. The peakRAM package makes it easy to monitor the total and peak RAM used so that developers can quickly identify and eliminate RAM-hungry code. You can get started with peakRAM by installing the most up-to-date version of this package directly from GitHub.

# install.packages("devtools")  # uncomment if devtools is not installed yet
devtools::install_github("tpq/peakRAM")
library(peakRAM)
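
Before measuring anything, it may help to see copy-on-modify in isolation. The sketch below uses only base R; the names `m` and `backup` are illustrative and not part of peakRAM. `tracemem` reports duplications only when R was built with memory profiling, so the call is guarded by `capabilities("profmem")`.

```r
# A 2000 x 2000 matrix of doubles (about 30.5 MiB).
m <- matrix(0, nrow = 2000, ncol = 2000)

# A second name for the same data; R does not copy anything yet.
backup <- m

# Ask R to report when this object is duplicated (if this build supports it).
traced <- capabilities("profmem")
if (traced) tracemem(m)

# Changing a single element forces R to duplicate the whole matrix first,
# because `m` still refers to the original data.
backup[1, 1] <- 1

if (traced) untracemem(m)

m[1, 1]       # the original is untouched
backup[1, 1]  # only the copy was modified
```

For a moment both matrices exist in full, which is the kind of overhead that the size of the final objects alone would not reveal.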

The peakRAM package, inspired by the very elegant microbenchmark package, offers an easy way to monitor the total and peak RAM used by any number of R expressions or functions, including anonymous functions. Simply call peakRAM with any number of comma-separated expressions or functions as arguments. The function executes each argument in turn, recording the amount of RAM allocated as a result of that call (i.e., "Total RAM Used") as well as the maximum amount of RAM allocated at any point during that call (i.e., "Peak RAM Used"). Note that throughout this package, all RAM use is measured in mebibytes (MiB).

peakRAM(function() 1:1e7,
        1:1e7,
        1:1e7 + 1:1e7,
        1:1e7 * 2)
##        Function_Call Elapsed_Time_sec Total_RAM_Used_MiB Peak_RAM_Used_MiB
## 1 function() 1:1e+07            0.110               38.2              38.2
## 2            1:1e+07            0.087               38.2              38.2
## 3  1:1e+07 + 1:1e+07            0.211               38.2              76.3
## 4        1:1e+07 * 2            0.142               76.3             114.5

What happened here? Well, we see that initializing the vector 1:1e7 requires ~38 MiB of RAM, whether done through an anonymous function or not. Also, as we might expect, we see that adding 1:1e7 to 1:1e7 requires ~76 MiB of RAM at its peak, even though the result only occupies ~38 MiB of RAM, because the vector 1:1e7 is initialized twice.

When RAM is valuable and the object is large, we want to avoid this kind of overhead. To achieve this, we might try instead to double the vector 1:1e7, avoiding addition altogether. But wait, this uses even more RAM. Why? Well, multiplying by 2 in this case first coerces the integer vector to a double vector, then multiplies the double vector by 2. For an instant, the original integer vector and the new double vector exist simultaneously, occupying ~38 MiB plus ~76 MiB of RAM.

Alas, R is a most precarious lover. To conserve the maximum amount of RAM, we need an approach that makes no needless copies. In this case, we just need to force 2 to exist as an integer.

peakRAM(1:1e7 * 2:2)
##   Function_Call Elapsed_Time_sec Total_RAM_Used_MiB Peak_RAM_Used_MiB
## 1 1:1e+07 * 2:2            0.099               38.1              38.1
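
The type explanation above can be checked directly with base R's `typeof` and `object.size`. This is a small sketch rather than peakRAM output; the variable names and the helper `to_mib` are illustrative, and it uses the integer literal `2L`, which has the same type as `2:2`.

```r
x <- 1:1e7

as_double  <- x * 2    # 2 is a double, so the result is a double vector
as_integer <- x * 2L   # 2L is an integer, so the result stays integer

typeof(as_double)      # "double"
typeof(as_integer)     # "integer"

# Convert bytes to MiB for comparison with the tables above.
to_mib <- function(obj) as.numeric(object.size(obj)) / 2^20
to_mib(as_double)      # roughly 76.3
to_mib(as_integer)     # roughly 38.1
```

A double takes 8 bytes per element and an integer takes 4, which accounts for the factor of two between the results.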

Now, we have a solution that we can scale confidently, knowing for sure that we will not unwittingly exceed memory capacity through superfluous RAM overhead.
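
For readers curious where "total" and "peak" figures of this kind can come from, here is a rough sketch built on base R's `gc()`. The helper `measure_ram` is hypothetical and is not claimed to match peakRAM's internals; it only illustrates the difference between memory still in use after a call and the maximum in use during it. It allocates with `numeric(1e7)` rather than `1:1e7` because recent R versions can store integer sequences compactly, which would hide the allocation.

```r
# Hypothetical helper for illustration; not part of the peakRAM package.
measure_ram <- function(expr) {
  # Run the garbage collector and reset the "max used" statistics.
  before <- gc(reset = TRUE)
  # Force evaluation of the lazily passed expression and keep the result alive.
  value <- expr
  after <- gc()

  # Column 2 of gc() output is memory in use; the last column is the maximum
  # in use since the last reset. Both are the "(Mb)" columns, and the rows
  # (Ncells and Vcells) are summed.
  used_before <- sum(before[, 2])
  c(total_mb = sum(after[, 2]) - used_before,
    peak_mb  = sum(after[, ncol(after)]) - used_before)
}

measure_ram(numeric(1e7))       # the large vector is still alive when measured
measure_ram(sum(numeric(1e7)))  # the vector is temporary, so only the peak is large
```

Resetting the statistics before the call is what lets the peak be attributed to that call alone; without it, the maximum would reflect everything the session had done so far.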

News


  • Use inherits to avoid warnings with some results

  • Introduced peakRAM function
    • Core function measures total and peak RAM use
    • Handles expression and function arguments


install.packages("peakRAM")

1.0.2 by Thomas Quinn, 4 months ago


http://github.com/tpq/peakRAM


Report a bug at http://github.com/tpq/peakRAM/issues


Browse source code at https://github.com/cran/peakRAM


Authors: Thomas Quinn [aut, cre]




GPL-2 license


Suggests knitr, rmarkdown

