Functions that simplify submitting R scripts to a Slurm workload manager, in part by automating the division of embarrassingly parallel calculations across cluster nodes.
Development of this R package was supported by the National Socio-Environmental Synthesis Center (SESYNC) under funding received from the National Science Foundation DBI-1052875.
The package was developed by Philippe Marchand, with Ian Carroll (current maintainer) and Mike Smorul contributing.
Install the package from R with install.packages('rslurm'). Note that job
submission is only possible on a system with access to a Slurm workload manager
(i.e. a system where the Slurm command line utilities return
information from a Slurm head node).
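A quick way to check this requirement from R is to look for a Slurm executable on the search path. The sketch below is only an illustration: the helper name slurm_available is hypothetical, and it assumes that the sbatch command is on the PATH whenever Slurm is reachable, which may not hold on every cluster.

```r
# Hypothetical helper (not part of rslurm): returns TRUE if the given
# command is found on the PATH. Assumes "sbatch" is installed alongside
# the other Slurm command line utilities.
slurm_available <- function(cmd = "sbatch") {
  unname(nzchar(Sys.which(cmd)))
}

if (slurm_available()) {
  message("Slurm tools found; job submission should be possible.")
} else {
  message("Slurm tools not found on this system.")
}
```

Finding the executable does not guarantee that the head node responds, so this is a first check rather than a full test.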
Package documentation is accessible from the R console.
First version on CRAN
Major update to the package interface and implementation:
Added a submit argument to the job submission functions. If submit = FALSE,
the submission scripts are created but not run. This is useful if the files need
to be transferred from a local machine to the cluster and run at a later time.
Added new optional arguments to slurm_call, allowing users to give
informative names to Slurm jobs (jobname) and to set SBATCH options for them.
The data_file argument to slurm_call is replaced with
add_objects, which accepts a vector of R object names from the active workspace
and automatically saves them in a .RData file to be loaded on each node.
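The mechanism behind add_objects can be sketched with base R's save() and load(). This is a minimal illustration of saving named objects to a .RData file and restoring them elsewhere, not the package's actual code. The object names, the temporary file, and the worker_env environment (standing in for an R session on a node) are all assumptions made for the example.

```r
# Objects in the active workspace (hypothetical examples).
threshold <- 10
lookup <- c(a = 1, b = 2)

# A vector of object names, as add_objects would receive.
object_names <- c("threshold", "lookup")

# Save the named objects to a .RData file.
data_file <- tempfile(fileext = ".RData")
save(list = object_names, file = data_file, envir = environment())

# On each node, the file would be loaded before the function is evaluated.
# A fresh environment stands in for the node's R session here.
worker_env <- new.env()
loaded <- load(data_file, envir = worker_env)
```

After the load() call, loaded holds the names of the restored objects and worker_env contains their values.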
slurm_call now generates R and Bash scripts through
whisker templates. Advanced users may want
to edit those templates in the
templates folder of the installed R package
(e.g. to set default SBATCH options).
Files generated by the package (scripts, data files and output) are now saved
in a subfolder named
_rslurm_[jobname] in the current working directory.
Minor updates, including reformatted output and the removal of a package dependency.
Changed the slurm_apply function to use
parallel::mcMap,
which fixes a bug where list outputs (i.e. each function call returns a list)
would be collapsed into a single list (rather than returned as a list of lists).
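The kind of collapsing described in this entry can be reproduced with base R. The sketch below uses mapply (with its default simplification) and Map as serial stand-ins. It is an illustration of the behaviour, not the code that slurm_apply used, and the function f is a made-up example.

```r
# Hypothetical function whose every call returns a list.
f <- function(a, b) list(sum = a + b, prod = a * b)

# With simplification, the three 2-element lists are merged into a
# single list-mode matrix holding 6 elements.
collapsed <- mapply(f, 1:3, 4:6)

# Map never simplifies: the result is a list of 3 lists.
nested <- Map(f, 1:3, 4:6)

length(collapsed)  # 6 elements in one flat structure
length(nested)     # 3 elements, one list per function call
nested[[2]]$prod   # result of the second call: 2 * 5
```

parallel::mcMap is the forking counterpart of Map, so it returns the same list-of-lists shape.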
Changed the interface so that the output type (table or raw) is now an argument
of get_slurm_out rather than of
slurm_apply.
Added a cpus_per_node argument to
slurm_apply, indicating the number of
parallel processes to be run on each node.
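The two levels of parallelism implied here (parameter sets divided across nodes, then several processes per node) can be sketched in plain R. This is a minimal sketch under stated assumptions, not the package's implementation: the function f, the params data frame, and the chunking rule are invented for illustration, and each "node" is simply processed in turn on the local machine.

```r
library(parallel)

# Hypothetical function and parameter sets (one row per evaluation).
f <- function(x, y) x + y
params <- data.frame(x = 1:10, y = 10:1)

nodes <- 3          # number of chunks, one per node
cpus_per_node <- 2  # parallel processes within each chunk

# Assumed chunking rule: equal-sized blocks of consecutive rows.
chunk_size <- ceiling(nrow(params) / nodes)
chunk_id <- ceiling(seq_len(nrow(params)) / chunk_size)
chunks <- split(params, chunk_id)

# What a single node would do: evaluate f over its rows in parallel.
run_chunk <- function(chunk) {
  # Forking is unavailable on Windows, so fall back to one process there.
  cores <- if (.Platform$OS.type == "windows") 1L else cpus_per_node
  do.call(mcmapply, c(list(FUN = f), as.list(chunk),
                      list(SIMPLIFY = FALSE, mc.cores = cores)))
}

# Run each chunk in turn and flatten to one result per parameter set.
results <- unlist(lapply(chunks, run_chunk),
                  recursive = FALSE, use.names = FALSE)
```

On a real cluster each chunk would run as a separate Slurm task. Here the lapply() call only imitates that step.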
Added the slurm_call function, which submits a single function evaluation
on the cluster, with syntax similar to that of a base R function.
get_slurm_out can now process the output even if some files are missing,
in which case it issues a warning.
Added an argument to slurm_apply indicating which packages should be loaded on each node (by default, all packages currently attached to the user's R session).
Added an optional argument to
slurm_apply that sets the output type:
table (each function evaluation returns a row, output is a data frame) or
raw (each function evaluation returns an arbitrary R object, output is a list).
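The difference between the two output types can be illustrated in base R. The sketch below only mimics the shapes described in this entry and does not call the package. The evaluate function and its inputs are hypothetical.

```r
# Hypothetical function: each evaluation returns a one-row data frame.
evaluate <- function(mu, sd) {
  data.frame(mu = mu, sd = sd, upper = mu + 2 * sd)
}

# "raw"-style output: a list with one arbitrary R object per evaluation.
raw_out <- Map(evaluate, c(0, 1), c(1, 2))

# "table"-style output: the rows bound together into one data frame.
table_out <- do.call(rbind, raw_out)
```

The table form is convenient when every evaluation returns the same columns. The raw form is the safer choice when results differ in shape.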
Fixed a bug in the chunk size calculation.