Client for the Captricity API

Get text from images of text using the Captricity Optical Character Recognition (OCR) API. Captricity allows you to get text from handwritten forms (think surveys) and other structured paper documents. It can also output the data as a delimited file, keeping field information intact. For more information, read https://shreddr.captricity.com/developer/overview/.


OCR text and handwritten forms using Captricity. Captricity's big advantage over Abbyy Cloud OCR is that it lets the user easily specify the position of the text blocks they want to OCR through a simple web-based UI. The quality of the OCR can be checked using compare_txt from the recognize package.

To get the latest version from CRAN:

install.packages("captr")

To get the current development version from GitHub:

install.packages("devtools")
devtools::install_github("soodoku/captr", build_vignettes = TRUE)

Read the vignette:

vignette("using_captr", package = "captr")

or follow the overview below.

Start by getting an application token and setting it using:

set_token("token")

Then, create a batch using:

create_batch("batch_name")

Once you have created a batch, you need a template ID. Captricity requires a template, which tells it what data to pull from where. Templates can be created using the web UI. Set the template ID using:

set_template_id("id")

Next, assign the template ID to a batch:

set_batch_template("batch_id", "template_id")

Next, upload image(s) to the batch:

upload_image(batch_id="batch_id", path_to_image="image_path")

Next, check whether the batch is ready to be processed:

test_readiness(batch_id="batch_id")

You may also want to find out how much processing the batch will cost:

batch_price(batch_id="batch_id")

Once you are ready, submit the batch:

submit_batch(batch_id="batch_id")

Captricity excels in nomenclature confusion: once a batch is submitted, it is called a job. The job ID can be obtained from the list returned by submit_batch; the field name is related_job_id.
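
As a minimal sketch, the job ID can be pulled out of the submit_batch result with a small helper. extract_job_id is a hypothetical name, not part of captr; the only thing taken from the package is the related_job_id field described above.

```r
# Hypothetical helper (not part of captr): return the job id stored in the
# list that submit_batch() returns, and fail loudly if the field is absent.
extract_job_id <- function(submission) {
  job_id <- submission[["related_job_id"]]
  if (is.null(job_id)) {
    stop("No 'related_job_id' field found in the submit_batch() result.")
  }
  job_id
}

# Usage (needs a valid token and a batch that is ready to be processed):
# submission <- captr::submit_batch(batch_id = "batch_id")
# job_id <- extract_job_id(submission)
```

Failing early here avoids passing an empty ID on to the job-level functions.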

To track progress of a job, use:

track_progress(job_id="job_id")

List all forms (instance sets) associated with a job:

list_instance_sets(job_id="job_id")

If you want to download data from a particular form, use list_instance_sets to get the form (instance_set) ID, then run:

get_instance_set(instance_set_id="instance_set_id")

Get a CSV of all the results from a job:

get_all(job_id="job_id")
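
The steps above can be strung together roughly as follows. This is a sketch under stated assumptions rather than a definitive recipe: run_captr_job and its arguments are hypothetical, and it assumes the list returned by create_batch stores the batch ID in a field named id. Submission is off by default because submitting a batch may incur a charge.

```r
# Hypothetical wrapper (not part of captr) chaining the calls shown above.
# captr must be installed and the token valid for the calls to succeed.
run_captr_job <- function(token, batch_name, template_id, image_paths,
                          submit = FALSE) {
  # Stop before touching the API if any image file is missing.
  not_found <- image_paths[!file.exists(image_paths)]
  if (length(not_found) > 0) {
    stop("Image file(s) not found: ", paste(not_found, collapse = ", "))
  }

  captr::set_token(token)

  # Assumption: the batch id is stored in the `id` field of the result.
  batch <- captr::create_batch(batch_name)
  batch_id <- batch$id

  captr::set_batch_template(batch_id, template_id)

  for (path in image_paths) {
    captr::upload_image(batch_id = batch_id, path_to_image = path)
  }

  # Show readiness and price so they can be reviewed before submitting.
  print(captr::test_readiness(batch_id = batch_id))
  print(captr::batch_price(batch_id = batch_id))

  job_id <- NULL
  if (submit) {
    submission <- captr::submit_batch(batch_id = batch_id)
    job_id <- submission$related_job_id
  }

  list(batch_id = batch_id, job_id = job_id)
}
```

With submit = TRUE, the returned job_id can then be passed to track_progress and, once the job is done, to get_all.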

Scripts are released under the MIT License.

News

captr 0.2.0

  • Using new abstract GET/POST functions: create_batch, get_batch_details, batch_price, list_batch_files, etc.
  • Functions using the new abstract GET/POST functions also support the dots (...), passing optional arguments through to curl
  • Fixed implementation of user_profile
  • More unit tests

captr 0.1.5

  • Abstracted simple GET
  • More cats for functions
  • Support for delete_batch, delete_job

captr 0.1.4

  • More unit tests
  • Support list_batches, list_batch_files, list_instance_sets, list_docs, list_jobs and user_profile
  • Better documentation for main functions

captr 0.1.3

  • Uses environment to store token
  • Better auth function
  • Fixed error in vignette
  • Added basic tests
  • Check if file exists

captr 0.1.2

  • Readme now in the CRAN package
  • Vignette built through knitr + better code
  • Changed license to MIT


0.2.0 by Gaurav Sood, 6 months ago


http://github.com/soodoku/captR


Report a bug at http://github.com/soodoku/captR/issues


Browse source code at https://github.com/cran/captr


Authors: Gaurav Sood [aut, cre]




Task views: Web Technologies and Services


MIT + file LICENSE license


Imports curl, jsonlite

Suggests testthat, rmarkdown, knitr

