tokenizers — by Lincoln Mullen, 2 years ago

Fast, Consistent Tokenization of Natural Language Text

Convert natural language text into tokens. Includes tokenizers for shingled n-grams, skip n-grams, words, word stems, sentences, paragraphs, characters, shingled characters, lines, Penn Treebank, regular expressions, as well as functions for counting characters, words, and sentences, and a function for splitting longer texts into separate documents, each with the same number of words. The tokenizers have a consistent interface, and the package is built on the 'stringi' and 'Rcpp' packages for fast yet correct tokenization in 'UTF-8'.
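
To make "shingled n-grams" concrete, here is a minimal Python sketch of the idea. It is not the package's R interface: the function name `shingled_ngrams` and the whitespace-based word splitting are assumptions made for illustration only.

```python
def shingled_ngrams(words, n):
    """Return every run of n consecutive words, joined by single spaces."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


# Naive word tokenization: lowercase, then split on whitespace.
words = "The quick brown fox".lower().split()

# Each bigram overlaps its neighbour by one word, like roof shingles.
bigrams = shingled_ngrams(words, 2)
print(bigrams)
```

A real tokenizer also has to handle punctuation and Unicode word boundaries, which this sketch ignores.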

bibtex — by James Joseph Balamuta, 2 years ago

Bibtex Parser

Utility to parse a bibtex file.

targets — by William Michael Landau, 2 months ago

Dynamic Function-Oriented 'Make'-Like Declarative Pipelines

Pipeline tools coordinate the pieces of computationally demanding analysis projects. The 'targets' package is a 'Make'-like pipeline tool for statistics and data science in R. The package skips costly runtime for tasks that are already up to date, orchestrates the necessary computation with implicit parallel computing, and abstracts files as R objects. If all the current output matches the current upstream code and data, then the whole pipeline is up to date, and the results are more trustworthy than otherwise. The methodology in this package borrows from GNU 'Make' (2015, ISBN:978-9881443519) and 'drake' (2018).
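
The "skip tasks that are already up to date" idea can be sketched in a few lines of Python. This is only an illustration of the general 'Make'-like technique, not how 'targets' is implemented: the `Pipeline` class, the `fingerprint` helper, and the choice to hash a function's bytecode together with its inputs are all hypothetical.

```python
import hashlib


def fingerprint(*parts):
    """Hash the repr of each part into one hex digest (illustrative only)."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(repr(part).encode("utf-8"))
    return digest.hexdigest()


class Pipeline:
    """Hypothetical in-memory pipeline that reruns a target only when needed."""

    def __init__(self):
        self.store = {}  # target name -> (fingerprint, cached value)
        self.runs = []   # names of targets that were actually executed

    def target(self, name, func, *inputs):
        # A target is up to date if neither its code nor its inputs changed.
        key = fingerprint(func.__code__.co_code, inputs)
        cached = self.store.get(name)
        if cached is not None and cached[0] == key:
            return cached[1]  # skip the costly computation
        value = func(*inputs)
        self.store[name] = (key, value)
        self.runs.append(name)
        return value


def clean(raw):
    return tuple(x for x in raw if x is not None)


def total(values):
    return sum(values)


pipeline = Pipeline()
raw = (1, None, 2)
cleaned = pipeline.target("cleaned", clean, raw)
result = pipeline.target("total", total, cleaned)

pipeline.target("cleaned", clean, raw)        # unchanged: skipped
pipeline.target("cleaned", clean, (1, 2, 3))  # new upstream data: reruns
print(pipeline.runs)
```

A production tool would persist the cache to disk and track dependencies between targets automatically; this sketch does neither.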

osmdata — by Mark Padgham, 2 years ago

Import 'OpenStreetMap' Data as Simple Features or Spatial Objects

Download and import of 'OpenStreetMap' ('OSM') data as 'sf' or 'sp' objects. 'OSM' data are extracted from the 'Overpass' web server (<https://overpass-api.de/>) and processed with very fast 'C++' routines for return to 'R'.

rtweet — by Lluís Revilla Sancho, a year ago

Collecting Twitter Data

An implementation of calls designed to collect and organize Twitter data via Twitter's REST and stream Application Program Interfaces (API), which can be found at the following URL: <https://developer.twitter.com/en/docs>.

piggyback — by Carl Boettiger, 2 years ago

Managing Larger Data on a GitHub Repository

Because larger (> 50 MB) data files cannot easily be committed to git, a different approach is required to manage data associated with an analysis in a GitHub repository. This package provides a simple work-around by allowing larger (up to 2 GB) data files to piggyback on a repository as assets attached to individual GitHub releases. These files are not handled by git in any way, but instead are uploaded, downloaded, or edited directly by calls through the GitHub API. These data files can be versioned manually by creating different releases. This approach works equally well with public or private repositories. Data can be uploaded and downloaded programmatically from scripts. No authentication is required to download data from public repositories.

RefManageR — by Mathew W. McLean, 3 years ago

Straightforward 'BibTeX' and 'BibLaTeX' Bibliography Management

Provides tools for importing and working with bibliographic references. It greatly enhances the 'bibentry' class by providing a class 'BibEntry' which stores 'BibTeX' and 'BibLaTeX' references, supports 'UTF-8' encoding, and can be easily searched by any field, by date ranges, and by various formats for name lists (author by last names, translator by full names, etc.). Entries can be updated, combined, sorted, printed in a number of styles, and exported. 'BibTeX' and 'BibLaTeX' '.bib' files can be read into 'R' and converted to 'BibEntry' objects. Interfaces to 'NCBI Entrez', 'CrossRef', and 'Zotero' are provided for importing references and references can be created from locally stored 'PDF' files using 'Poppler'. Includes functions for citing and generating a bibliography with hyperlinks for documents prepared with 'RMarkdown' or 'RHTML'.

cffr — by Diego Hernangómez, 2 months ago

Generate Citation File Format ('cff') Metadata for R Packages

The Citation File Format version 1.2.0 is a human and machine readable file format which provides citation metadata for software. This package provides core utilities to generate and validate this metadata.

robotstxt — by Pedro Baltazar, 7 months ago

A 'robots.txt' Parser and 'Webbot'/'Spider'/'Crawler' Permissions Checker

Provides functions to download and parse 'robots.txt' files. Ultimately the package makes it easy to check if bots (spiders, crawlers, scrapers, ...) are allowed to access specific resources on a domain.
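
As a rough illustration of what such a permissions check involves, the sketch below parses a 'robots.txt' body and asks whether a bot may fetch two paths. It uses Python's standard-library `urllib.robotparser`, not this package's R functions; the file contents, the bot name `mybot`, and the `example.com` URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real check would download the file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The wildcard rule applies to any bot that has no more specific entry.
allowed = parser.can_fetch("mybot", "https://example.com/index.html")
blocked = parser.can_fetch("mybot", "https://example.com/private/data.csv")
print(allowed, blocked)
```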

phylogram — by Shaun Wilkinson, 7 years ago

Dendrograms for Evolutionary Analysis

Contains functions for developing phylogenetic trees as deeply-nested lists ("dendrogram" objects). Enables bi-directional conversion between dendrogram and "phylo" objects (see Paradis et al. (2004)), and features several tools for command-line tree manipulation and import/export via Newick parenthetic text.
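
The link between a nested-list tree and Newick parenthetic text can be shown with a short Python sketch. It is an illustration only and does not reflect the package's R functions: the `to_newick` helper and the tuple-based tree are assumptions, and branch lengths are left out.

```python
def to_newick(node):
    """Serialize a nested tuple of leaf names as Newick text (no trailing ';')."""
    if isinstance(node, str):
        return node  # a leaf is written as its label
    # An internal node is its children, comma-separated, inside parentheses.
    return "(" + ",".join(to_newick(child) for child in node) + ")"


# A and B are sister taxa; C joins them at the root.
tree = (("A", "B"), "C")
newick = to_newick(tree) + ";"
print(newick)
```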