Connect to ODBC Compatible Databases (using the DBI Interface)

A DBI-compatible interface to ODBC databases.


Project Status: Active – The project has reached a stable, usable state and is being actively developed. CRAN_Status_Badge Travis-CI Build Status Coverage Status AppVeyor Build Status

The goal of the odbc package is to provide a DBI-compliant interface to Open Database Connectivity (ODBC) drivers. This allows for an efficient, easy to setup connection to any database with ODBC drivers available, including SQL Server, Oracle, MySQL, PostgreSQL, SQLite and others. The implementation builds on the nanodbc C++ library.

Installation

For Unix and MacOS ODBC drivers should be compiled against unixODBC. Drivers compiled against iODBC may also work, but are not fully supported.

After installation of the driver manager and driver, you will have to register the driver in a odbcinst.ini file for it to appear in odbc::odbcListDrivers().

odbc and it's dependencies use C++11 features. Therefore you need gcc 4.8, clang 3.3 or Rtools 3.3 or later.

Windows is bundled with ODBC libraries however drivers for each database need to be installed separately. Windows ODBC drivers typically include an installer that needs to be run and will install the driver to the proper locations.

MacOS

homebrew can be used to easily install database drivers on MacOS.

UnixODBC - Required for all databases

# Install the unixODBC library 
brew install unixodbc

Common DB drivers

# SQL Server ODBC Drivers (Free TDS) 
brew install freetds --with-unixodbc
 
# PostgreSQL ODBC ODBC Drivers 
brew install psqlodbc
 
# MySQL ODBC Drivers (and database) 
brew install mysql
 
# SQLite ODBC Drivers 
brew install sqliteodbc

Linux - Debian / Ubuntu

apt-get can be used to easily install database drivers on Linux distributions which support it, such as Debian and Ubuntu.

UnixODBC - Required for all databases

# Install the unixODBC library 
apt-get install unixodbc unixodbc-dev

Common DB drivers

# SQL Server ODBC Drivers (Free TDS) 
apt-get install tdsodbc
 
# PostgreSQL ODBC ODBC Drivers 
apt-get install odbc-postgresql
 
# MySQL ODBC Drivers 
apt-get install libmyodbc
 
# SQLite ODBC Drivers 
apt-get install libsqliteodbc

R

# Install the latest odbc release from CRAN:
install.packages("odbc")
 
# Or the the development version from GitHub:
# install.packages(devtools)
devtools::install_github("r-dbi/odbc")

Connecting to a Database

Databases can be connected by specifying a connection string directly, or with DSN configuration files.

Connection Strings

Specify a connection string as named arguments directly in the dbConnect() method.

library(DBI)
con <- dbConnect(odbc::odbc(),
  driver = "PostgreSQL Driver",
  database = "test_db",
  uid = "postgres",
  pwd = "password",
  host = "localhost",
  port = 5432)

Alternatively you can pass a complete connection string as the .connection_string argument. The Connection Strings Reference is a useful resource that has example connection strings for a large variety of databases.

library(DBI)
con <- dbConnect(odbc::odbc(),
  .connection_string = "Driver={PostgreSQL Driver};Uid=postgres;Pwd=password;Host=localhost;Port=5432;Database=test_db;")

DSN Configuration files

ODBC configuration files are another option to specify connection parameters and allow one to use a Data Source Name (DSN) to make it easier to connect to a database.

con <- dbConnect(odbc::odbc(), "PostgreSQL")

Windows

The ODBC Data Source Administrator application is used to manage ODBC data sources on Windows.

MacOS / Linux

On MacOS and Linux there are two separate text files that need to be edited. UnixODBC includes a command line executable odbcinst which can be used to query and modify the DSN files. However these are plain text files you can also edit by hand if desired.

There are two different files used to setup the DSN information.

  • odbcinst.ini - which defines driver options
  • odbc.ini - which defines connection options

The DSN configuration files can be defined globally for all users of the system, often at /etc/odbc.ini or /opt/local/etc/odbc.ini, the exact location depends on what option was used when compiling unixODBC. odbcinst -j can be used to find the exact location. Alternatively the ODBCSYSINI environment variable can be used to specify the location of the configuration files. Ex. ODBCSYSINI=~/ODBC

A local DSN file can also be used with the files ~/.odbc.ini and ~/.odbcinst.ini.

odbcinst.ini

Contains driver information, particularly the name of the driver library. Multiple drivers can be specified in the same file.

[PostgreSQL Driver]
Driver          = /usr/local/lib/psqlodbcw.so
 
[SQLite Driver]
Driver          = /usr/local/lib/libsqlite3odbc.dylib
odbc.ini

Contains connection information, particularly the username, password, database and host information. The Driver line corresponds to the driver defined in odbcinst.ini.

[PostgreSQL]
Driver              = PostgreSQL Driver
Database            = test_db
Servername          = localhost
UserName            = postgres
Password            = password
Port                = 5432
 
[SQLite]
Driver          = SQLite Driver
Database=/tmp/testing

See also: unixODBC without the GUI for more information and examples.

Usage

All of the following examples assume you have already created a connection con. See Connecting to a database for more information on establishing a connection.

Table and Field information

dbListTables() is used for listing all existing tables in a database.

dbListTables(con)
 
# List tables beginning with f
dbListTables(con, table_name = "f%")
 
# List all fields in the 'flights' database
dbListFields(con, "flights")

Reading

dbReadTable() will read a full table into an R data.frame().

data <- dbReadTable(con, "flights")

Writing

dbWriteTable() will write an R data.frame() to an SQL table.

data <- dbWriteTable(con, "iris", iris)

Querying

dbGetQuery() will submit a query and fetch the results. It is also possible to submit the query and fetch separately with dbSendQuery() and dbFetch(). The n= argument to dbFetch() can be used to fetch only the part of a query result (the next n rows).

result <- dbSendQuery(con, "SELECT flight, tailnum, origin FROM flights ORDER BY origin")
 
# Retrieve the first 100 results
first_100 <- dbFetch(result, n = 100)
 
# Retrieve the rest of the results
rest <- dbFetch(result)

Benchmarks

The odbc package is often much faster than the existing RODBC and DBI compatible RODBCDBI packages.

Reading

Reading a table from a PostgreSQL database with the 'flights' dataset (336,776 rows, 19 columns) of the package nytflights13.

# First using RODBC / RODBCDBI
library(DBI)
#> Loading required package: methods
rodbc <- dbConnect(RODBCDBI::ODBC(), dsn = "PostgreSQL")
system.time(rodbc_result <- dbReadTable(rodbc, "flights"))
#>    user  system elapsed 
#>  12.130   2.139  15.590
 
# Now using odbc
odbc <- dbConnect(odbc::odbc(), dsn = "PostgreSQL")
system.time(odbc_result <- dbReadTable(odbc, "flights"))
#>    user  system elapsed 
#>   4.923   0.285   6.723
 
library(tibble)
as_tibble(odbc_result)
#> # A tibble: 336,776 x 19
#>     year month   day dep_time sched_dep_time dep_delay arr_time
#>    <int> <int> <int>    <int>          <int>     <dbl>    <int>
#>  1  2013     1     1      517            515         2      830
#>  2  2013     1     1      533            529         4      850
#>  3  2013     1     1      542            540         2      923
#>  4  2013     1     1      544            545        -1     1004
#>  5  2013     1     1      554            600        -6      812
#>  6  2013     1     1      554            558        -4      740
#>  7  2013     1     1      555            600        -5      913
#>  8  2013     1     1      557            600        -3      709
#>  9  2013     1     1      557            600        -3      838
#> 10  2013     1     1      558            600        -2      753
#> # ... with 336,766 more rows, and 12 more variables: sched_arr_time <int>,
#> #   arr_delay <dbl>, carrier <chr>, flight <int>, tailnum <chr>,
#> #   origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>,
#> #   minute <dbl>, time_hour <dttm>
 
identical(dim(rodbc_result), dim(odbc_result))
#> [1] TRUE
rm(rodbc_result, odbc_result, odbc, rodbc)
gc(verbose = FALSE)
#> Warning in .Internal(gc(verbose, reset)): closing unused RODBC handle 1
#>           used (Mb) gc trigger  (Mb) max used  (Mb)
#> Ncells  706729 37.8    1168576  62.5  1024752  54.8
#> Vcells 2041397 15.6   13704358 104.6 17130448 130.7

Writing

Writing the same dataset to the database.

library(nycflights13)
# rodbc does not support writing timestamps natively.
rodbc <- dbConnect(RODBCDBI::ODBC(), dsn = "PostgreSQL")
system.time(dbWriteTable(rodbc, "flights2", as.data.frame(flights[, names(flights) != "time_hour"])))
#>    user  system elapsed 
#>   7.696   4.813  52.925
 
# Now using odbc
odbc <- dbConnect(odbc::odbc(), dsn = "PostgreSQL")
system.time(dbWriteTable(odbc, "flights3", as.data.frame(flights)))
#>    user  system elapsed 
#>   7.302   4.061  27.118

SQL Server

packageVersion("RSQLServer")
#> [1] ‘0.3.0’
 
# Writing
rsqlserver <- dbConnect(RSQLServer::SQLServer(), server = "SQLServer")
system.time(dbWriteTable(rsqlserver, "flights2", as.data.frame(flights)))
#>    user  system elapsed
#> 645.219  12.287 820.806
 
odbc <- dbConnect(odbc::odbc(), dsn = "SQLServer", UID = "testuser", PWD = "test")
system.time(dbWriteTable(odbc, "flights3", as.data.frame(flights)))
#>    user  system elapsed
#>  12.336   0.412  21.802
 
# Reading
system.time(dbReadTable(rsqlserver, "flights", as.data.frame(flights)))
#>    user  system elapsed
#>   5.101   1.289   3.584
 
system.time(dbReadTable(odbc, "flights", as.data.frame(flights)))
#>   user  system elapsed
#>  2.187   0.108   2.298

News

odbc 1.1.6

Features

  • dbConnect() gains a timeout parameter, to control how long a connection should be attempted before timing out. (#139)

  • Full schema support using Id objects available in DBI 8.0 (#91, #120)

Bugfixes

  • SQL inputs are always converted to database encoding before querying (#179).

odbc 1.1.5

  • Fix the return value for dbQuoteIdentifier() when given a length 0 input (#146 @edgararuiz).

odbc 1.1.4

  • Fix multiple transactions with rollback (#136).

  • Add custom sqlCreateTable and dbListTables method for Taradata connections (@edgararuiz)

Features

  • Add custom sqlCreateTable and dbListTables method for Teradata connections (@edgararuiz)

  • Add dbms.name parameter to dbConnect() to allow the user to set the database management system name in cases it cannot be queried from the Driver (#115, @hoxo-m).

Bugfixes

  • Fix multiple transactions with rollback (#136).

  • Fix translation of missing values for integer types (#119).

  • Update PIC flags to avoid portability concerns.

odbc 1.1.3

  • Small patch to fix install errors on CRAN's MacOS machines.

odbc 1.1.2

  • Add bigint parameter to dbConnect() to allow the user to set the behavior when converting 64 bit integers into R types.

  • Fixes for compatibility with the RStudio Connections pane and viewer. (@jmcphers).

  • Define BUILD_REAL_64_BIT_MODE for compatibility with older systems sql.h, to ensure we always use 8 byte pointers.

  • Added temporary-table support for Oracle database using a custom sqlCreateTable (#99, @edgararuiz)

  • Fix regression when binding due to the num_columns variable not being updated by odbc_result::bind_list().

  • Support table creation for Vertica and Redshift (#93, @khotilov).

  • Changed parameter fieldTypes to field.types in functions dbWriteTable() and sqlCreateTable() to be compliant with DBI (#106, @jschelbert).

  • dbSendStatement no longer executes the statement, this is instead done when dbBind() or dbGetRowsAffected() is called. This change brings ODBC into compliance with the DBI specification for dbSendStatement(). (#84, @ruiyiz).

odbc 1.1.1

  • Workaround for drivers which do not implement SQLGetInfo, such as the Access driver. (#78)

  • Fix for installation error for systems without GNU Make as the default make, such as Solaris.

odbc 1.1.0

  • Provide a fall backs for drivers which do not support SQLDescribeParam and those which do not support transactions.

  • sqlCreateTable() gains a fieldTypes argument, which allows one to override a column type for a given table, if the default type is not appropriate.

  • Support for databases with non UTF-8 encodings. Use the encoding parameter to dbConnect() to specify the database encoding.

  • Support for the SQL Connection Pane in the RStudio IDE.

  • Unknown fields no longer throw a warning, instead they signal a custom condition, which can be caught if desired with

    withCallingHandlers(expr, odbc_unknown_field_type = function(e) {...} )

  • Conversion to and from timezones is handled by the cctz library. If the database is in a non-UTC timezone it can be specified with the timezone parameter to dbConnect().

  • Time objects are converted to and from hms objects.

  • 64 bit integers are converted to and from bit64 objects.

  • Support table creation for Impala and Hive Databases (# 38, @edgararuiz).

odbc 1.0.1

  • Fixes for the CRAN build machines

    • Do not force c++14 on windows
    • Turn off database tests on CRAN, as I think they will be difficult to debug even if databases are supported.
  • Added a NEWS.md file to track changes to the package.

odbc 1.0.0

  • Initial odbc release

Reference manual

It appears you don't have a PDF plugin for this browser. You can click here to download the reference manual.

install.packages("odbc")

1.1.6 by Jim Hester, a year ago


https://github.com/r-dbi/odbc


Report a bug at https://github.com/r-dbi/odbc/issues


Browse source code at https://github.com/cran/odbc


Authors: Jim Hester [aut, cre] , Hadley Wickham [aut] , lexicalunit [cph] (nanodbc library) , Google Inc. [cph] (cctz library) , RStudio [cph, fnd]


Documentation:   PDF Manual  


Task views: Databases with R


MIT + file LICENSE license


Imports DBI, methods, Rcpp, blob, bit64, hms

Suggests tibble, DBItest, testthat, covr, magrittr

Linking to Rcpp, BH

System requirements: C++11, GNU make, An ODBC3 driver manager and drivers.


Imported by datrProfile, dbparser, dplyr.teradata, drfit, explore, virtuoso.

Suggested by REDCapR, dbx, healthcareai, implyr, radiant.data, sf.


See at CRAN