Utility Functions to Execute Standard Extract/Transform/Load Operations (using Package 'ff') on Large Data

Provides functions to facilitate the use of the 'ff' package in interaction with big data in 'SQL' databases (e.g. in 'Oracle', 'MySQL', 'PostgreSQL', 'Hive') by allowing easy importing directly into 'ffdf' objects using 'DBI', 'RODBC' and 'RJDBC'. Also contains some basic utility functions to do fast left outer join merging based on 'match', factorisation of data and a basic function for re-coding vectors.


ETLUtils

ETLUtils provides utility functions to execute standard ETL operations (using package ff) on large data. Currently the following functions might be useful to you if you have some large dataset in SQL and want to import it immediately in an ffdf object

  • read.dbi.ffdf (reading of SQL data through DBI)
  • read.odbc.ffdf (reading of SQL data through RODBC)
  • read.jdbc.ffdf (reading of SQL data through RJDBC)

An example can be found at http://www.bnosac.be/index.php/blog/5-get-your-large-sql-data-in-ff-swiftly and at http://www.bnosac.be/index.php/blog/6-readodbcffdf-a-readdbiffdf-for-fetching-large-corporate-sql-data

For users who want to store data from an ffdf back in a database, the package also provides

  • write.dbi.ffdf (writing of ffdf data to a database table through DBI)
  • write.odbc.ffdf (writing of ffdf data to a database table through RODBC)
  • write.jdbc.ffdf (writing of ffdf data to a database table through RJDBC)

Other utilities

Other functions include factorise, matchmerge, recoder, naLOCFPlusone and renameColumns.

CRAN

This is the development version of the package which is available at CRAN.

To install the latest version from github remotes::install_github("jwijffels/ETLUtils")

To get the lastest version from CRAN:

News

Package: ETLUtils

Version: 1.4.1 [2018-01-25]

  • Copy .sqlite db to temp folder and chmod for complying to R CMD check

Version: 1.4 [2018-01-17]

  • Added argument odbcDriverConnect.args as requested in issue #2
  • Changed argument fetch to dbFetch in read.dbi.ffdf and read.jdbc.ffdf
  • Copy .sqlite db to temp folder for complying to R CMD check

Version: 1.3 [2015-05-12]

  • Added factorise
  • Added write.dbi.ffdf, write.odbc.ffdf and write.jdbc.ffdf
  • Changes regarding CRAN checks - import ff and import chunk from bit
  • Small changes in examples in the manuals

Version: 1.2 [2013-01-03]

  • Added function read.jdbc.ffdf to read in data from an RJDBC channel into an ffdf object. This can be used to import data from Hive into an ffdf
  • Added RJDBC as a suggested package

Version: 1.1 [2012-05-18]

  • Added function read.odbc.ffdf to read in data from an ODBC channel into an ffdf object
  • Packages DBI, RODBC and zoo are now Suggested packages instead of Required

Version: 1.0 [2012-03-26]

  • Initial package providing functions matchmerge, recoder, naLOCFPlusone, renameColumns, read.dbi.ffdf

Reference manual

It appears you don't have a PDF plugin for this browser. You can click here to download the reference manual.

install.packages("ETLUtils")

1.4.1 by Jan Wijffels, 10 months ago


https://github.com/jwijffels/ETLUtils


Browse source code at https://github.com/cran/ETLUtils


Authors: Jan Wijffels


Documentation:   PDF Manual  


GPL-2 license


Imports bit

Depends on ff

Suggests RSQLite, zoo, DBI, RODBC, RJDBC


See at CRAN