Several fast random number generators are provided as C++
header-only libraries: the PCG family by O'Neill (2014
<https://www.cs.hmc.edu/tr/hmc-cs-2014-0905.pdf>) as well as
Xoroshiro128+ and Xoshiro256+ by Blackman and Vigna (2018
The dqrng package provides fast random number generators (RNGs) with good statistical properties for use with R. It combines these RNGs with fast distribution functions to sample from uniform, normal, or exponential distributions. Both the RNGs and the distribution functions are distributed as a C++ header-only library.
The currently released version is available from CRAN via
install.packages("dqrng")
Intermediate releases can also be obtained via drat:
if (!requireNamespace("drat", quietly = TRUE)) install.packages("drat")
drat::addRepo("daqana")
install.packages("dqrng")
Using the provided RNGs from R is deliberately similar to using R’s built-in RNGs:
library(dqrng)
dqset.seed(42)
dqrunif(5, min = 2, max = 10)
#> [1] 9.211802 2.616041 6.236331 4.588535 5.764814
dqrexp(5, rate = 4)
#> [1] 0.35118613 0.17656197 0.06844976 0.16984095 0.10096744
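Since the interface mirrors base R, reseeding should reproduce a sequence exactly. The following is a minimal sketch, assuming dqrng is installed and that `dqset.seed()` fully resets the generator state for a given seed; the variable names `first` and `second` are illustrative only.

```r
library(dqrng)

# Draw five uniforms on [2, 10] after seeding.
dqset.seed(42)
first <- dqrunif(5, min = 2, max = 10)

# Reseed with the same value and draw again.
dqset.seed(42)
second <- dqrunif(5, min = 2, max = 10)

# The two draws should match, and every value should lie in [min, max].
identical(first, second)
all(first >= 2 & first <= 10)
```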
They are quite a bit faster, though:
N <- 1e4
bm <- bench::mark(rnorm(N), dqrnorm(N), check = FALSE)
bm[, 1:5]
#> # A tibble: 2 x 5
#>   expression      min     mean   median      max
#>   <chr>      <bch:tm> <bch:tm> <bch:tm> <bch:tm>
#> 1 rnorm(N)      657µs  752.4µs  727.5µs   1.09ms
#> 2 dqrnorm(N)     72µs   85.8µs   80.8µs 166.02µs
This is also true for the provided sampling functions with replacement:
m <- 1e7
n <- 1e5
bm <- bench::mark(sample.int(m, n, replace = TRUE),
                  sample.int(1e3*m, n, replace = TRUE),
                  dqsample.int(m, n, replace = TRUE),
                  dqsample.int(1e3*m, n, replace = TRUE),
                  check = FALSE)
bm[, 1:5]
#> # A tibble: 4 x 5
#>   expression                                  min     mean   median      max
#>   <chr>                                  <bch:tm> <bch:tm> <bch:tm> <bch:tm>
#> 1 sample.int(m, n, replace = TRUE)       905.05µs   1.11ms   1.08ms   1.81ms
#> 2 sample.int(1000 * m, n, replace = TR…    1.69ms   1.97ms   1.92ms   2.85ms
#> 3 dqsample.int(m, n, replace = TRUE)     274.76µs 333.97µs 315.47µs 604.48µs
#> 4 dqsample.int(1000 * m, n, replace = …  340.61µs 413.71µs 377.36µs 888.39µs
And without replacement:
bm <- bench::mark(sample.int(m, n),
                  sample.int(1e3*m, n),
                  sample.int(m, n, useHash = TRUE),
                  dqsample.int(m, n),
                  dqsample.int(1e3*m, n),
                  check = FALSE)
bm[, 1:5]
#> # A tibble: 5 x 5
#>   expression                            min     mean   median      max
#>   <chr>                            <bch:tm> <bch:tm> <bch:tm> <bch:tm>
#> 1 sample.int(m, n)                  21.97ms  21.97ms  21.97ms  21.97ms
#> 2 sample.int(1000 * m, n)            5.21ms   6.34ms   5.78ms  11.28ms
#> 3 sample.int(m, n, useHash = TRUE)   3.25ms   3.97ms   3.61ms   8.43ms
#> 4 dqsample.int(m, n)                  1.2ms   1.62ms    1.4ms   4.37ms
#> 5 dqsample.int(1000 * m, n)          1.77ms   2.32ms   2.09ms   4.87ms
Note that sampling from 10^10 elements triggers “long-vector support” in R.
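The effect of the larger population size can be seen with base R alone. The sketch below assumes a 64-bit build of R and uses only `sample.int()` from base R; the names `big`, `small_idx`, and `big_idx` are illustrative. It checks that 10^10 no longer fits into a 32-bit integer and compares the storage type of the sampled indices in both cases.

```r
m <- 1e7
big <- 1e3 * m  # 10^10 elements, as in the benchmarks above

# The population size no longer fits into R's 32-bit integer type.
exceeds_int <- big > .Machine$integer.max

# Sample a few indices from the small and from the large population.
small_idx <- sample.int(m, 3)
big_idx <- sample.int(big, 3)

# Indices from the large population cannot be stored as integers,
# so base R falls back to doubles.
exceeds_int
typeof(small_idx)
typeof(big_idx)
```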
In addition, the RNGs support multiple independent streams for parallel usage:
N <- 1e7
dqset.seed(42, 1)
u1 <- dqrunif(N)
dqset.seed(42, 2)
u2 <- dqrunif(N)
cor(u1, u2)
#> [1] -0.0005787967
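One way to organise this is to give each task its own stream number while sharing a single seed. The sketch below assumes dqrng is installed; `draw_stream()` is a hypothetical helper and not part of the package, and `lapply()` runs the tasks sequentially here to stand in for a parallel backend.

```r
library(dqrng)

# Hypothetical helper: draw n uniforms from one stream of a common seed.
draw_stream <- function(stream, n = 1000, seed = 42) {
  dqset.seed(seed, stream)
  dqrunif(n)
}

# One stream per task; a parallel apply function could replace lapply().
streams <- lapply(1:4, draw_stream)

# The same seed and stream give the same numbers again ...
identical(streams[[1]], draw_stream(1))
# ... while different streams give different numbers.
identical(streams[[1]], streams[[2]])
```

Selecting the stream inside each task keeps results reproducible regardless of which worker ends up running the task.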
All feedback (bug reports, security issues, feature requests, …) should be provided as issues.
long_jump() for Xo(ro)shiro as alternative to jump() providing fewer streams with longer period.
dqsample and dqsample.int using an unbiased sampling algorithm.
Use R_unif_index() instead of unif_rand() to retrieve random data from R's RNG in generateSeedVectors().
int is used for seeding (Aaron Lun in #10)
dqrng::dqset_seed() expects a Rcpp::IntegerVector instead of an int
generateSeedVectors() for generating a list of random int vectors from R's RNG. These vectors can be used as seed (Aaron Lun in #10).
Removed std::random_device as source of the default seed, since std::random_device is deterministic with MinGW (c.f. #2)
dqrng_distribution.h can now be used independently of Rcpp
Replaced xorshift.hpp and xoroshiro.hpp with xoshiro.h. This implementation is directly derived from the original C implementations. It provides v1.0 of Xoroshiro128+ and Xoshiro256+.
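The entry on unbiased sampling for dqsample and dqsample.int does not say which algorithm is used. The sketch below therefore swaps in plain rejection sampling, only to illustrate why naive reduction is biased; it is not the package's implementation. It assumes a hypothetical 3-bit generator with eight equally likely outputs and uses base R only.

```r
# All equally likely outputs of a hypothetical 3-bit generator: 0, 1, ..., 7.
raw <- 0:7
n <- 5  # target range: integers 0..4

# Naive approach: reduce modulo n. Eight outputs cannot be spread evenly
# over five results, so some results occur more often than others.
biased_counts <- table(factor(raw %% n, levels = 0:(n - 1)))

# Rejection: discard raw values at or above the largest multiple of n
# that fits into the generator's range, then reduce modulo n.
limit <- (length(raw) %/% n) * n
accepted <- raw[raw < limit]
unbiased_counts <- table(factor(accepted %% n, levels = 0:(n - 1)))

biased_counts
unbiased_counts
```

With rejection, every result in the target range is backed by the same number of raw outputs, at the cost of occasionally drawing again.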