Download Aggregate Trial Information and Results from ClinicalTrials.gov

ClinicalTrials.gov is a registry and results database of publicly and privately supported clinical studies of human participants conducted around the world (see <https://clinicaltrials.gov/> for more information). Users can search for information about and results from those trials. This package provides a set of functions to interact with the search and download features. Results are downloaded to temporary directories and returned as R objects.


Note that the ClinicalTrials.gov API does not require an API key.

The rclinicaltrials package is in early development.

To install from GitHub, use devtools::install_github(), as follows:

install.packages("devtools")
library(devtools)
install_github("sachsmc/rclinicaltrials")

The main function is clinicaltrials_search(). Here's an example of its use:

library(rclinicaltrials)
z <- clinicaltrials_search(query = 'lyme+disease')
str(z)

This gives you basic information about the trials. Before searching or downloading, you can determine how many results will be returned using the clinicaltrials_count() function:

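# A common condition: returns the number of matching studies
clinicaltrials_count(query = "myeloma")
# A nonsense term: shows what the count looks like when nothing should match
clinicaltrials_count(query = "29485tksrw@")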

The query can be a single string which will be passed to the "search terms" field on clinicaltrials.gov. Terms can be combined using the logical operators AND, OR, and NOT. Advanced searches can be performed by passing a vector of key=value pairs as strings. For example, to search for cancer interventional studies:

clinicaltrials_count(query = c("type=Intr", "cond=cancer"))
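
When an advanced search has several terms, it can be convenient to assemble the key=value vector programmatically. The following is a minimal sketch only: build_query() is a hypothetical helper and not part of rclinicaltrials. It simply produces the same kind of character vector shown above.

```r
# Hypothetical helper (NOT part of rclinicaltrials): turn named arguments
# into the "key=value" strings used for advanced searches.
build_query <- function(...) {
  terms <- list(...)
  keys <- names(terms)
  # Every term needs a name, because the name becomes the search key.
  if (length(terms) == 0 || is.null(keys) || any(keys == "")) {
    stop("every search term must be supplied as key = value")
  }
  # Each value must be a single element that can be coerced to a string.
  values <- vapply(terms, as.character, character(1))
  paste0(keys, "=", unname(values))
}

adv_query <- build_query(type = "Intr", cond = "cancer")
print(adv_query)
```

The resulting vector can then be passed as the query argument, for example clinicaltrials_count(query = adv_query).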

The possible advanced search terms are included in the advanced_search_terms data frame which comes with the package. The data frame has the keys, description, and a link to the help webpage which will explain the possible values of the search terms. To open the help page for cond, for instance, run browseURL(advanced_search_terms["cond", "help"]).

head(advanced_search_terms)

To download detailed study information, including results, use clinicaltrials_download(). Downloading many results may take a long time and use a substantial amount of disk space. You can limit the number of studies downloaded with the count argument, which defaults to 20.

y <- clinicaltrials_download(query = 'myeloma', count = 10, include_results = TRUE)
str(y)

This returns a list of dataframes that have a common key variable: nct_id. Optionally, you can get the long text fields and/or study results (if available). Study results are also returned as a list of dataframes, contained within the list.

The data come from a relational database with lots of text fields, so it may take some effort to get the data into a flat format for analysis. For that reason, clinicaltrials_download() returns its results as a list of dataframes. Each dataframe has a common key variable: nct_id. To merge dataframes, use this key. Otherwise, you can analyze the dataframes separately. They are organized into study information, locations, outcomes, interventions, results, and textblocks. The results element, where available, is itself a list of three dataframes: participant flow, baseline data, and outcome data.
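
As a rough illustration of merging on nct_id, the sketch below uses small mock dataframes in place of a real download. The object names, the columns other than nct_id, and the values are all invented for the example; the structure returned by clinicaltrials_download() may differ.

```r
# Mock data standing in for two of the dataframes in a download.
# Only the nct_id key comes from the package documentation; the other
# columns and all values are made up for illustration.
study_info <- data.frame(
  nct_id = c("NCT00000001", "NCT00000002"),
  title  = c("Study A", "Study B"),
  stringsAsFactors = FALSE
)
locations <- data.frame(
  nct_id  = c("NCT00000001", "NCT00000001", "NCT00000002"),
  country = c("United States", "Canada", "France"),
  stringsAsFactors = FALSE
)

# One row per study/location pair, joined on the shared key.
merged <- merge(study_info, locations, by = "nct_id")

# Count how many locations each study has.
sites_per_study <- table(merged$nct_id)
print(merged)
print(sites_per_study)
```

Because a study can have many locations, outcomes, or interventions, merging repeats the study-level rows once per match. Keeping the dataframes separate until a specific analysis needs them avoids that duplication.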

If you have any difficulties, or notice anything strange with the results, please report the issue on the package's GitHub repository (sachsmc/rclinicaltrials).

To view the tutorial for the rclinicaltrials package:

vignette("basics", "rclinicaltrials")

Licensed under the MIT license (see the LICENSE file).

News

  • fixed bug caused by differences in search results based on IDs versus queries
  • more robust searching in download function
  • save memory by using httr::write_disk in clinicaltrials_download
  • fixed Windows problem
  • added treatment arms table
  • fixed bug where download with query returned all results
  • clarified issue with download when count > 100
  • added overall_official field
  • return strings as strings, not factors
  • first release

install.packages("rclinicaltrials")

1.4.7 by Michael C Sachs, 7 months ago


Browse source code at https://github.com/cran/rclinicaltrials


Authors: Michael C Sachs <sachsmc@gmail.com>


MIT + file LICENSE license


Imports httr, XML, plyr

Suggests rmarkdown, devtools, knitr, roxygen2, testthat, dplyr, ggplot2

