Read Large Text Files

Read large text files by splitting them into smaller files.


Reading is based on splitting the file into parts and calling data.table::fread on each part.
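The idea of reading by blocks of lines can be sketched with data.table alone. This is only an illustration, not how bigreadr is implemented: the package splits the file into smaller files, whereas the sketch below swaps in `fread`'s `skip` and `nrows` arguments on the original file. The helpers `count_lines()` and `read_by_chunks()` are hypothetical names introduced here.

```r
library(data.table)

# Hypothetical helper: count lines without loading the whole file in memory
count_lines <- function(file, block = 1e5) {
  con <- file(file, open = "r")
  on.exit(close(con))
  n <- 0L
  while ((k <- length(readLines(con, n = block))) > 0L) n <- n + k
  n
}

# Hypothetical helper (not part of bigreadr): read a CSV with a header line
# in blocks of `every_nlines` rows, transform each block, then stack them
read_by_chunks <- function(file, every_nlines, transform = identity) {
  col_names <- names(fread(file, nrows = 0L, header = TRUE))
  n_rows <- count_lines(file) - 1L              # minus the header line
  starts <- seq(1L, n_rows, by = every_nlines)  # first data row of each block
  parts <- lapply(starts, function(start) {
    chunk <- fread(file,
                   skip = start,                # header + rows before `start`
                   nrows = min(every_nlines, n_rows - start + 1L),
                   header = FALSE, col.names = col_names)
    transform(chunk)
  })
  rbindlist(parts)
}

# Usage on a small example file
tmp <- tempfile(fileext = ".csv")
fwrite(iris, tmp)
all_rows <- read_by_chunks(tmp, every_nlines = 40)
setosa <- read_by_chunks(tmp, every_nlines = 40,
                         transform = function(df) df[Species == "setosa"])
```

Applying the transform to each block before stacking is what keeps memory use low when only a subset of rows is needed.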


# devtools::install_github("privefl/bigreadr")
library(bigreadr)

# Create a temporary file of ~141 MB (just as an example)
csv <- fwrite2(iris[rep(seq_len(nrow(iris)), 1e4), rep(1:5, 4)], tempfile())
format(file.size(csv), big.mark = ",")

## Splitting lines (1)
# Read all data by parts (for a file of this size, a single `fread` would be faster)
nlines(csv)  ## ~1.5M lines -> split every 500,000
big_iris1 <- big_fread1(csv, every_nlines = 5e5)
# Read and subset (by parts): keep only the "setosa" rows of each part
big_iris1_setosa <- big_fread1(csv, every_nlines = 5e5, .transform = function(df) {
  dplyr::filter(df, Species == "setosa")
})

## Splitting columns (2)
big_iris2 <- big_fread2(csv, nb_parts = 3)
# Read and subset (by parts): compute the row filter once from column 5,
# then apply it to each block of columns
species_setosa <- (fread2(csv, select = 5)[[1]] == "setosa")
big_iris2_setosa <- big_fread2(csv, nb_parts = 3, .transform = function(df) {
  dplyr::filter(df, species_setosa)
})

## Verification
identical(big_iris1_setosa, dplyr::filter(big_iris1, Species == "setosa"))
identical(big_iris2, big_iris1)
identical(big_iris2_setosa, big_iris1_setosa)
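
The column-splitting approach can be illustrated in the same spirit with data.table alone. This is a rough sketch under stated assumptions, not bigreadr's implementation: it reads each group of columns with `fread`'s `select` argument and binds the groups side by side. The helper `read_by_col_groups()` is a hypothetical name, and the sketch assumes a CSV with a header line and unique column names.

```r
library(data.table)

# Hypothetical helper (not part of bigreadr): read a CSV in `nb_parts`
# groups of columns, transform each group, then bind the groups by column
read_by_col_groups <- function(file, nb_parts, transform = identity) {
  n_cols <- ncol(fread(file, nrows = 0L, header = TRUE))
  # Assign each column index to a group id in 1..nb_parts
  group_id <- ceiling(seq_len(n_cols) * nb_parts / n_cols)
  groups <- split(seq_len(n_cols), group_id)
  parts <- lapply(groups, function(cols) {
    transform(fread(file, select = cols, header = TRUE))
  })
  # Drop the list names so they are not used as column-name prefixes
  do.call(cbind, unname(parts))
}

# Usage on a small example file
tmp2 <- tempfile(fileext = ".csv")
fwrite(iris, tmp2)
by_cols <- read_by_col_groups(tmp2, nb_parts = 3)

# Row filter computed once, then applied to every group of columns
keep <- iris$Species == "setosa"
by_cols_setosa <- read_by_col_groups(tmp2, nb_parts = 3,
                                     transform = function(df) df[keep])
```

Because every group holds all rows, a row subset has to be expressed as a precomputed logical vector rather than a condition on a column that may not be in the current group.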




0.1.6 by Florian Privé, 18 days ago

Report a bug at

Browse source code at

Authors: Florian Privé [aut, cre]


GPL-3 license

Imports data.table, Rcpp, parallel, fpeek, utils

Suggests spelling, testthat, covr, RSQLite

Linking to Rcpp

Imported by bigstatsr.

See at CRAN