HTTP Client

A simple HTTP client, with tools for making HTTP requests and for mocking HTTP requests. The package is built on R6, and takes inspiration from Ruby's 'faraday' gem (https://rubygems.org/gems/faraday). The package name is a play on curl, the widely used command line tool for HTTP; this package is built on top of the R package 'curl', an interface to 'libcurl' (https://curl.haxx.se/libcurl).


Project Status: Active - The project has reached a stable, usable state and is being actively developed.

An HTTP client, taking inspiration from Ruby's faraday and Python's requests

Package API:

  • HttpClient - Main interface to making HTTP requests. Synchronous requests only.
  • HttpResponse - HTTP response object, used for all responses across the different clients.
  • Paginator - Auto-paginate through requests - currently supports a subset of pagination scenarios, with more to come
  • Async - Asynchronous HTTP requests - a simple interface for many URLs - the interface is similar to HttpClient, and all URLs are treated the same.
  • AsyncVaried - Asynchronous HTTP requests - accepts any number of HttpRequest objects - with a different interface than HttpClient/Async due to the nature of handling requests with different HTTP methods, options, etc.
  • HttpRequest - HTTP request object, used for AsyncVaried
  • mock() - Turn on/off mocking, via webmockr
  • auth() - Simple authentication helper
  • proxy() - Proxy helper
  • upload() - File upload helper
  • set curl options globally: set_auth(), set_headers(), set_opts(), set_proxy(), and crul_settings() - see the sketch after this list
  • Writing to disk and streaming: available with both synchronous and asynchronous requests.
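
As a quick illustration, here's a minimal sketch of the global settings helpers (assuming the current API; see ?crul_settings for details):

library(crul)
set_opts(timeout_ms = 5000)        # applied to every subsequent request
set_headers(`x-hello` = "world")   # likewise for default headers
crul_settings()                    # inspect what's currently set
crul_settings(reset = TRUE)        # clear the global settings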

Mocking:

crul integrates with webmockr to mock HTTP requests. Check out the HTTP testing book for more.
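
Here's a minimal sketch of what mocking looks like (assumes webmockr is installed; the stub below is illustrative):

library(crul)
library(webmockr)
mock()                                          # turn mocking on
stub_request("get", "https://httpbin.org/get")  # register a stub
x <- HttpClient$new(url = "https://httpbin.org")
x$get("get")                                    # matched by the stub; no real request made
mock(FALSE)                                     # turn mocking back off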

Caching:

crul also integrates with vcr to cache HTTP requests/responses. Check out the HTTP testing book for more.
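
A minimal sketch of caching with vcr (assumes vcr is installed; the cassette name and directory are illustrative):

library(vcr)
vcr_configure(dir = tempdir())
use_cassette("httpbin_get", {
  x <- crul::HttpClient$new(url = "https://httpbin.org")
  res <- x$get("get")
})
# the first run records the interaction to disk; later runs replay it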

CRAN version

install.packages("crul")

Dev version

devtools::install_github("ropensci/crul")
library("crul")

the client

HttpClient is where to start

(x <- HttpClient$new(
  url = "https://httpbin.org",
  opts = list(
    timeout = 1
  ),
  headers = list(
    a = "hello world"
  )
))
#> <crul connection> 
#>   url: https://httpbin.org
#>   curl options: 
#>     timeout: 1
#>   proxies: 
#>   auth: 
#>   headers: 
#>     a: hello world
#>   progress: FALSE

This creates an R6 object with all the bits and bobs you'd expect for doing HTTP requests. When it prints, it shows any defaults you've set; as you update the object you can see what's been set.

x$opts
#> $timeout
#> [1] 1
x$headers
#> $a
#> [1] "hello world"

You can also pass in curl options when you make HTTP requests, see below for examples.

do some http

The client object created above has HTTP methods that you can call, passing a path as well as query parameters, body values, and any other curl options.

Here, we'll do a GET request on the route /get of our base URL https://httpbin.org (the full URL is then https://httpbin.org/get):

res <- x$get("get")

The response from an HTTP request is another R6 class, HttpResponse, which has slots for the outputs of the request and some functions to deal with the response:

Status code

res$status_code
#> [1] 200

Status information

res$status_http()
#> <Status code: 200>
#>   Message: OK
#>   Explanation: Request fulfilled, document follows

The content

res$content
#>   [1] 7b 0a 20 20 22 61 72 67 73 22 3a 20 7b 7d 2c 20 0a 20 20 22 68 65 61
#>  [24] 64 65 72 73 22 3a 20 7b 0a 20 20 20 20 22 41 22 3a 20 22 68 65 6c 6c
#>  [47] 6f 20 77 6f 72 6c 64 22 2c 20 0a 20 20 20 20 22 41 63 63 65 70 74 22
#>  [70] 3a 20 22 61 70 70 6c 69 63 61 74 69 6f 6e 2f 6a 73 6f 6e 2c 20 74 65
#>  [93] 78 74 2f 78 6d 6c 2c 20 61 70 70 6c 69 63 61 74 69 6f 6e 2f 78 6d 6c
#> [116] 2c 20 2a 2f 2a 22 2c 20 0a 20 20 20 20 22 41 63 63 65 70 74 2d 45 6e
#> [139] 63 6f 64 69 6e 67 22 3a 20 22 67 7a 69 70 2c 20 64 65 66 6c 61 74 65
#> [162] 22 2c 20 0a 20 20 20 20 22 43 6f 6e 6e 65 63 74 69 6f 6e 22 3a 20 22
#> [185] 63 6c 6f 73 65 22 2c 20 0a 20 20 20 20 22 48 6f 73 74 22 3a 20 22 68
#> [208] 74 74 70 62 69 6e 2e 6f 72 67 22 2c 20 0a 20 20 20 20 22 55 73 65 72
#> [231] 2d 41 67 65 6e 74 22 3a 20 22 6c 69 62 63 75 72 6c 2f 37 2e 35 34 2e
#> [254] 30 20 72 2d 63 75 72 6c 2f 33 2e 32 20 63 72 75 6c 2f 30 2e 36 2e 32
#> [277] 2e 39 33 33 36 22 0a 20 20 7d 2c 20 0a 20 20 22 6f 72 69 67 69 6e 22
#> [300] 3a 20 22 32 34 2e 32 31 2e 32 32 39 2e 35 39 22 2c 20 0a 20 20 22 75
#> [323] 72 6c 22 3a 20 22 68 74 74 70 73 3a 2f 2f 68 74 74 70 62 69 6e 2e 6f
#> [346] 72 67 2f 67 65 74 22 0a 7d 0a
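
The content is raw bytes. If you just want the text you can convert it directly with base R, though parse() (below) also handles encodings for you:

rawToChar(res$content)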

HTTP method

res$method
#> [1] "get"

Request headers

res$request_headers
#> $`User-Agent`
#> [1] "libcurl/7.54.0 r-curl/3.2 crul/0.6.2.9336"
#> 
#> $`Accept-Encoding`
#> [1] "gzip, deflate"
#> 
#> $Accept
#> [1] "application/json, text/xml, application/xml, */*"
#> 
#> $a
#> [1] "hello world"

Response headers

res$response_headers
#> $status
#> [1] "HTTP/1.1 200 OK"
#> 
#> $connection
#> [1] "keep-alive"
#> 
#> $server
#> [1] "gunicorn/19.9.0"
#> 
#> $date
#> [1] "Thu, 03 Jan 2019 05:28:11 GMT"
#> 
#> $`content-type`
#> [1] "application/json"
#> 
#> $`content-length`
#> [1] "355"
#> 
#> $`access-control-allow-origin`
#> [1] "*"
#> 
#> $`access-control-allow-credentials`
#> [1] "true"
#> 
#> $via
#> [1] "1.1 vegur"

All response headers, including intermediate headers (e.g., from redirect chains):

res$response_headers_all

And you can parse the content with parse()

res$parse()
#> No encoding supplied: defaulting to UTF-8.
#> [1] "{\n  \"args\": {}, \n  \"headers\": {\n    \"A\": \"hello world\", \n    \"Accept\": \"application/json, text/xml, application/xml, */*\", \n    \"Accept-Encoding\": \"gzip, deflate\", \n    \"Connection\": \"close\", \n    \"Host\": \"httpbin.org\", \n    \"User-Agent\": \"libcurl/7.54.0 r-curl/3.2 crul/0.6.2.9336\"\n  }, \n  \"origin\": \"24.21.229.59\", \n  \"url\": \"https://httpbin.org/get\"\n}\n"
jsonlite::fromJSON(res$parse())
#> No encoding supplied: defaulting to UTF-8.
#> $args
#> named list()
#> 
#> $headers
#> $headers$A
#> [1] "hello world"
#> 
#> $headers$Accept
#> [1] "application/json, text/xml, application/xml, */*"
#> 
#> $headers$`Accept-Encoding`
#> [1] "gzip, deflate"
#> 
#> $headers$Connection
#> [1] "close"
#> 
#> $headers$Host
#> [1] "httpbin.org"
#> 
#> $headers$`User-Agent`
#> [1] "libcurl/7.54.0 r-curl/3.2 crul/0.6.2.9336"
#> 
#> 
#> $origin
#> [1] "24.21.229.59"
#> 
#> $url
#> [1] "https://httpbin.org/get"

curl options

Curl options can be passed to any request method. Here, an intentionally short timeout triggers an error:

cli <- HttpClient$new(url = "http://api.gbif.org/v1/occurrence/search")
cli$get(query = list(limit = 100), timeout_ms = 100)
#> Error in curl::curl_fetch_memory(x$url$url, handle = x$url$handle) :
#>   Timeout was reached
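
Curl options can also be set when creating the client and combined with per-request options. A small sketch (verbose = TRUE prints the full request/response exchange to the console):

x <- HttpClient$new(
  url = "https://httpbin.org",
  opts = list(verbose = TRUE)
)
res <- x$get("get", timeout_ms = 5000)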

Asynchronous requests

The simpler interface allows many requests (many URLs), but they all get the same options/headers, and you have to use the same HTTP method on all of them:

(cc <- Async$new(
  urls = c(
    'https://httpbin.org/',
    'https://httpbin.org/get?a=5',
    'https://httpbin.org/get?foo=bar'
  )
))
res <- cc$get()
lapply(res, function(z) z$parse("UTF-8"))

The AsyncVaried interface accepts any number of HttpRequest objects, which can define any type of HTTP request of any HTTP method:

req1 <- HttpRequest$new(
  url = "https://httpbin.org/get",
  opts = list(verbose = TRUE),
  headers = list(foo = "bar")
)$get()
req2 <- HttpRequest$new(url = "https://httpbin.org/post")$post()
out <- AsyncVaried$new(req1, req2)

Execute the requests

out$request()

Then functions get applied to all responses:

out$status()
#> [[1]]
#> <Status code: 200>
#>   Message: OK
#>   Explanation: Request fulfilled, document follows
#> 
#> [[2]]
#> <Status code: 200>
#>   Message: OK
#>   Explanation: Request fulfilled, document follows
out$parse()
#> [1] "{\n  \"args\": {}, \n  \"headers\": {\n    \"Accept\": \"application/json, text/xml, application/xml, */*\", \n    \"Accept-Encoding\": \"gzip, deflate\", \n    \"Connection\": \"close\", \n    \"Foo\": \"bar\", \n    \"Host\": \"httpbin.org\", \n    \"User-Agent\": \"R (3.5.2 x86_64-apple-darwin15.6.0 x86_64 darwin15.6.0)\"\n  }, \n  \"origin\": \"24.21.229.59\", \n  \"url\": \"https://httpbin.org/get\"\n}\n"                                                                                                                                        
#> [2] "{\n  \"args\": {}, \n  \"data\": \"\", \n  \"files\": {}, \n  \"form\": {}, \n  \"headers\": {\n    \"Accept\": \"application/json, text/xml, application/xml, */*\", \n    \"Accept-Encoding\": \"gzip, deflate\", \n    \"Connection\": \"close\", \n    \"Content-Length\": \"0\", \n    \"Content-Type\": \"application/x-www-form-urlencoded\", \n    \"Host\": \"httpbin.org\", \n    \"User-Agent\": \"libcurl/7.54.0 r-curl/3.2 crul/0.6.2.9336\"\n  }, \n  \"json\": null, \n  \"origin\": \"24.21.229.59\", \n  \"url\": \"https://httpbin.org/post\"\n}\n"

Progress bars

Pass a function that constructs a progress bar to the progress parameter; for now httr::progress() is supported:

library(httr)
x <- HttpClient$new(
  url = "https://httpbin.org/bytes/102400", 
  progress = progress()
)
z <- x$get()
|==============================================| 100%

TO DO

  • ...

Meta

  • Please report any issues or bugs.
  • License: MIT
  • Get citation information for crul in R by running citation(package = 'crul')
  • Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.


News

crul 0.7.0

NEW FEATURES

  • HttpClient gains a retry method: retries any request verb until successful (HTTP response status < 400) or a condition for giving up is met. (#89) (#95) thanks @hlapp
  • HttpClient, HttpRequest, and Async classes gain verb method for doing HTTP requests specifying any of the supported HTTP verbs (#97)
  • HttpClient and Paginator gain a url_fetch method: get the URL that would be sent in an HTTP request without sending the HTTP request. Useful for getting the URL before executing an HTTP request if you need to check something about the URL first. (#92)
  • new vignette for "API package best practices" (#65)
  • Package gains manual files for each HTTP verb to facilitate linking to package documentation for information on each HTTP verb (#98)
  • Intermediate headers (e.g., those in redirect chains) are now given back in a new slot in the HttpResponse class as $response_headers_all as an unnamed list, with each element a named list of headers; the last list in the set is the final response headers that match those given in the $response_headers slot (#60) (#99)

BUG FIXES

  • some dangling file connections were left open - now fixed (#93) (#95)
  • fix url_parse: it lacked checks that the input was a string and that it was length 1 - now fixed (#100) thanks @aaronwolen

DEFUNCT

  • HttpStubbedResponse was removed from the package - it may have been used at some point, but is not used in the package anymore (#88)

crul 0.6.0

NEW FEATURES

  • Async and AsyncVaried now support simple auth, see ?auth (#70)
  • gains new function ok() to ping a URL to see if it's up or not, returns a single boolean (#71) (#73)
  • HttpClient and HttpRequest gain new parameter progress that accepts a function to use to construct a progress bar. For now accepts httr::progress() but will accept other options in the future (#20) (#81)
  • gains a new vignette for curl options (#7)
  • can now set curl options globally using new functions set_auth(), set_headers(), set_opts(), set_proxy(), and crul_settings() (#48) (#85)

MINOR IMPROVEMENTS

  • explicitly import httpcode::http_code (#80)
  • fix vignette names to make them more clear and add numbers to order them (#64)
  • change print function for Async and AsyncVaried to print max of 10 and tell user how many total and remaining not shown (#72)
  • added support to proxy() for socks, e.g. to use with TOR (#79)
  • now when Async and AsyncVaried requests fail, they don't error; instead we capture the error and pass it back in the result. This way any failed requests don't stop progress of the entire async request suite (#74) (#84)

crul 0.5.2

MINOR IMPROVEMENTS

  • Fixed handling of user agent: you can pass a UA string as a curl option or a header. Previously, we were wrongly overwriting the user-supplied UA if given as a curl option, but not if given as a header. This is fixed now. (#63) thanks @maelle and @dpprdan

BUG FIXES

  • Fix to Paginator - it wasn't handling pagination correctly. Fixed to hopefully handle all scenarios now; added more tests (#62)
  • Fixed handling of query parameters. We were using urltools::url_encode to encode strings, but it wasn't encoding correctly in some locales. Using curl::curl_escape fixes the problem. Encoding is done on query values and names (#67) (#68)

crul 0.5.0

NEW FEATURES

  • Gains a new R6 class Paginator to help users automatically paginate through multiple requests. It only supports query parameter based paginating for now. We'll add support later for other types including cursors (e.g., used in Solr servers), and for link headers (e.g., used in the GitHub API). Please get in touch if you find any problems with Paginator. (#56)
  • Async classes Async and AsyncVaried gain ability to write to disk and stream data (to disk or elsewhere, e.g. the R console or to an R object) (#46) thanks @artemklevtsov for the push to do this

MINOR IMPROVEMENTS

  • Improved documentation for auth to indicate that user and pwd are indeed required, and that one can pass NULL to those parameters (similar to an empty string "" in httr::authenticate) when one wants to use e.g. the gssnegotiate method (#43)
  • Fixed query builder so that one can now protect query parameters by wrapping them in I() (#55)

BUG FIXES

  • Fixed bug in head requests with HttpClient when passing query parameter - it was failing previously. Added query parameter back. (#52)

crul 0.4.0

NEW FEATURES

  • file uploads now work, see new function upload() and examples (#25)

MINOR IMPROVEMENTS

  • fixes to reused curl handles - within a connection object only, not across connection objects (#45)
  • crul now drops any options passed in to opts or to ... that are not in set of allowed curl options, see curl::curl_options() (#49)
  • cookies should now be persisted across requests within a connection object, see new doc ?cookies for how to set cookies (#44)
  • gather cainfo and use in curl options when applicable (#51)
  • remove disk and stream from the head method in HttpClient and HttpRequest as no body is returned in a HEAD request

crul 0.3.8

BUG FIXES

  • Fixed AsyncVaried to return async responses in the order that they were passed in. This also fixes this exact same behavior in Async because Async uses AsyncVaried internally. (#41) thanks @dirkschumacher for reporting

crul 0.3.6

  • Note: This version gains support for integration with webmockr, which is now on CRAN.

NEW FEATURES

  • New function auth() to do simple authentication (#33)
  • New function HttpStubbedResponse for making a stubbed response object for the webmockr integration (#4)
  • New function mock() to turn on mocking - it's off by default. If webmockr is not installed but user attempts to use mocking we error with message to install webmockr (#4)

MINOR IMPROVEMENTS

  • Use gzip-deflate by default for each request to make sure gzip compression is used if the server supports it (#34)
  • Change useragent to User-Agent as default user agent header (#35)
  • Now we make sure that user supplied headers override the default headers if they are of the same name (#36)

crul 0.3.4

NEW FEATURES

  • New utility functions url_build and url_parse (#31)

MINOR IMPROVEMENTS

  • Now using markdown for documentation (#32)
  • Better documentation for AsyncVaried (#30)
  • New vignette on how to use crul in realistic scenarios rather than brief examples to demonstrate individual features (#29)
  • Better documentation for HttpRequest (#28)
  • Included more tests

BUG FIXES

  • Fixed put/patch/delete as they weren't passing the body correctly in HttpClient (#26)
  • DRY out code for preparing requests - simplify to use helper functions (#27)

crul 0.3.0

NEW FEATURES

  • Added support for asynchronous HTTP requests, including two new R6 classes: Async and AsyncVaried. The former being a simpler interface treating all URLs with same options/HTTP method, and the latter allowing any type of request through the new R6 class HttpRequest (#8) (#24)
  • New R6 class HttpRequest to support AsyncVaried - this class only defines a request, but does not execute it (#8)

MINOR IMPROVEMENTS

  • Added support for proxies (#22)

BUG FIXES

  • Fixed parsing of headers from FTP servers (#21)

crul 0.2.0

MINOR IMPROVEMENTS

  • Created new manual files for various tasks to document usage better (#19)
  • URL encode paths - should fix any bugs where spaces between words caused errors previously (#17)
  • URL encode query parameters - should fix any bugs where spaces between words caused errors previously (#11)
  • request headers now passed correctly to response object (#13)
  • response headers now parsed to a list for easier access (#14)
  • Now supporting multiple query parameters of the same name, wasn't possible in last version (#15)

crul 0.1.6

NEW FEATURES

  • Improved options for using curl options. Can manually add to list of curl options or pass in via .... And we check that user doesn't pass in prohibited options (curl package takes care of checking that options are valid) (#5)
  • Incorporated fauxpas package for dealing with HTTP conditions. It's a Suggest, so only used if installed (#6)
  • Added support for streaming via curl::curl_fetch_stream. stream param defaults to NULL (thus ignored), or pass in a function to use streaming. Only one of memory, streaming or disk allowed. (#9)
  • Added support for writing to disk via curl::curl_fetch_disk. disk param defaults to NULL (thus ignored), or pass in a path to write to disk instead of using memory. Only one of memory, streaming or disk allowed. (#12)

MINOR IMPROVEMENTS

  • Added missing raise_for_status() method on the HttpResponse class (#10)

BUG FIXES

  • Was importing httpcode but wasn't using it in the package. Now using the package in HttpResponse

crul 0.1.0

NEW FEATURES

  • Released to CRAN.
