Client for the Captricity API

Get text from images of text using the Captricity Optical Character Recognition (OCR) API. Captricity lets you extract text from handwritten forms, such as surveys, and other structured paper documents. It can also output the data as a delimited file that keeps field information intact. For more information, see <https://shreddr.captricity.com/developer/overview/>.



OCR text and handwritten forms using Captricity. Captricity's big advantage over Abbyy Cloud OCR is that it lets users easily specify the position of the text blocks they want to OCR through a simple web-based UI. The quality of the OCR can be checked using compare_txt from recognize.

Installation

To get the latest version on CRAN:

install.packages("captr")

To get the current development version from GitHub:

install.packages("devtools")
devtools::install_github("soodoku/captr", build_vignettes = TRUE)

Using captr

Read the vignette:

vignette("using_captr", package = "captr")

or follow the overview below.

Start by getting an application token and setting it using:

set_token("token")

Then, create a batch using:

create_batch("batch_name")

Once you have created a batch, you need a template ID. Captricity requires a template, which tells it what data to pull from where. Templates can be created using the Web UI. Set the template ID using:

set_template_id("id")

Next, assign the template ID to a batch:

set_batch_template("batch_id", "template_id")

Next, upload image(s) to the batch:

upload_image(batch_id="batch_id", path_to_image="image_path")

Next, check whether the batch is ready to be processed:

test_readiness(batch_id="batch_id")

You may also want to find out how much processing the batch will cost:

batch_price(batch_id="batch_id")

Once you are ready, submit the batch:

submit_batch(batch_id="batch_id")

Captricity's nomenclature can be confusing: once a batch is submitted, it is called a job. The ID for the job can be obtained from the list returned by submit_batch; the field name is related_job_id.
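As a minimal sketch of that lookup, the snippet below pulls related_job_id out of a list. The stand-in list and its value are hypothetical, used so the example runs without a token; in practice the list would come from submit_batch. The helper get_job_id is also hypothetical and not part of captr.

```r
# Stand-in for the list returned by submit_batch(batch_id = "batch_id").
# The value 1234 is made up; only the field name related_job_id
# comes from the text above.
submitted <- list(related_job_id = 1234)

# Hypothetical helper (not part of captr): returns the job ID and
# fails loudly if the field is missing.
get_job_id <- function(res) {
  job_id <- res[["related_job_id"]]
  if (is.null(job_id)) {
    stop("related_job_id not found in the submit_batch result")
  }
  job_id
}

job_id <- get_job_id(submitted)
```

The resulting job_id can then be passed to the job-level functions described next.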

To track progress of a job, use:

track_progress(job_id="job_id")

List all forms (instance sets) associated with a job:

list_instance_sets(job_id="job_id")

If you want to download data from a particular form, use list_instance_sets to get the form (instance_set) ID and run:

get_instance_set(instance_set_id="instance_set_id")

Get a CSV of all your results from a job:

get_all(job_id="job_id")
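The steps from creating a batch to submitting it can be strung together in one function. The sketch below is one possible arrangement, not the package's own interface: the function name run_captricity_job and its arguments are hypothetical, and it assumes that the list returned by create_batch carries the batch ID in a field named id. Calling it successfully requires the captr package, a token already set with set_token, and network access.

```r
# Hypothetical wrapper around the captr calls shown in this overview.
# Returns the job ID of the submitted batch.
run_captricity_job <- function(batch_name, template_id, image_paths) {
  # Check all image paths before touching the API.
  missing_files <- image_paths[!file.exists(image_paths)]
  if (length(missing_files) > 0) {
    stop("Image not found: ", paste(missing_files, collapse = ", "))
  }

  batch <- captr::create_batch(batch_name)
  batch_id <- batch$id  # assumption: the returned list has an `id` field

  captr::set_batch_template(batch_id, template_id)

  for (path in image_paths) {
    captr::upload_image(batch_id = batch_id, path_to_image = path)
  }

  # Readiness and price are queried here; inspect their output before
  # submitting if cost matters.
  captr::test_readiness(batch_id = batch_id)
  captr::batch_price(batch_id = batch_id)

  submitted <- captr::submit_batch(batch_id = batch_id)
  submitted$related_job_id
}
```

A call such as job_id <- run_captricity_job("batch_name", "template_id", "image_path") would then be followed by track_progress(job_id=job_id) and, once the job has finished, get_all(job_id=job_id).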

License

Scripts are released under the MIT License.

Contributor Code of Conduct

The project welcomes contributions from everyone! In fact, it depends on them. To maintain this welcoming atmosphere, and to collaborate in a fun and productive way, we expect contributors to the project to abide by the Contributor Code of Conduct.

News

captr 0.3.0

  • Make returns visible
  • Improved documentation
  • Support the dots for list_jobs()
  • Extensive linting
  • Removed superfluous captr_CHECKAUTH() from a few functions
  • Abstracted out captr_DELETE

captr 0.2.0

  • Using new abstract GET/POST functions: create_batch, get_batch_details, batch_price, list_batch_files, etc.
  • Functions using new abstract GET/POST functions also support the dots (passing optional arguments to curl)
  • Fixed implementation of user_profile
  • More unit tests

captr 0.1.5

  • Abstracted simple GET
  • More cats for functions
  • Support for delete_batch, delete_job

captr 0.1.4

  • More unit tests
  • Support list_batches, list_batch_files, list_instance_sets, list_docs, list_jobs and user_profile
  • Better documentation for main functions

captr 0.1.3

  • Uses environment to store token
  • Better auth function
  • Fixed error in vignette
  • Added basic tests
  • Check if file exists

captr 0.1.2

  • Readme now in the CRAN package
  • Vignette build through knitr + better code
  • Changed license to MIT


0.3.0 by Gaurav Sood, a year ago


http://github.com/soodoku/captR


Report a bug at http://github.com/soodoku/captR/issues


Browse source code at https://github.com/cran/captr


Authors: Gaurav Sood [aut, cre]



Task views: Web Technologies and Services


MIT + file LICENSE license


Imports curl, jsonlite

Suggests testthat, rmarkdown, knitr

