
Found 35 packages in 0.09 seconds

ff — by Jens Oehlschlägel, a year ago

Memory-Efficient Storage of Large Data on Disk and Fast Access Functions

The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) in main memory - the effective virtual memory consumption per ff object. ff supports R's standard atomic data types 'double', 'logical', 'raw' and 'integer' and non-standard atomic types boolean (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort (2 byte unsigned), single (4 byte float with NAs). For example, 'quad' allows efficient storage of genomic data as an 'A','T','G','C' factor. The unsigned types support 'circular' arithmetic. There is also support for close-to-atomic types 'factor', 'ordered', 'POSIXct', 'Date' and custom close-to-atomic types. ff has native C support for vectors, matrices and arrays with flexible dimorder (column-major order, row-major order and generalizations for arrays), and it also provides an ffdf class not unlike data.frames together with import/export filters for csv files. ff objects store raw data in binary flat files in native encoding, and complement this with metadata stored in R as physical and virtual attributes. ff objects have well-defined hybrid copying semantics, which gives rise to certain performance improvements through virtualization. ff objects can be stored and reopened across R sessions. ff files can be shared by multiple ff R objects (using different data en/de-coding schemes) in the same process or from multiple R processes to exploit parallelism. A wide choice of finalizer options allows working with 'permanent' files as well as creating and removing 'temporary' ff files completely transparently to the user. On certain OS/filesystem combinations, creating the ff files works without notable delay thanks to sparse file allocation.
Several access optimization techniques such as Hybrid Index Preprocessing and Virtualization are implemented to achieve good performance even with large datasets, for example virtual matrix transpose without touching a single byte on disk. Further, to reduce disk I/O, 'logicals' and non-standard data types are stored natively and compactly in binary flat files, i.e. logicals take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access functions, the ff package also provides compatibility functions that facilitate writing code for both ff and ram objects, as well as support for batch processing on ff objects (e.g. as.ram, as.ff, ffapply). ff interfaces closely with functionality from package 'bit': chunked looping, fast bit operations and coercions between different objects that can store subscript information ('bit', 'bitwhich', ff 'boolean', ri range index, hi hybrid index). This makes it possible to work interactively with selections of large datasets and to quickly modify selection criteria. Further high-performance enhancements can be made available upon request.
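
The 2-bit 'quad' storage described above can be pictured with a small, language-neutral sketch. The Python below is an illustration only: the function names (pack_quads, unpack_quads), the code assignment A=0, T=1, G=2, C=3 and the bit order within each byte are assumptions made for the example, not ff's actual on-disk layout or API.

```python
# Illustrative 2-bit packing in the spirit of a 'quad' type.
# ASSUMPTION: code values and bit order are chosen for this example only.
CODES = {"A": 0, "T": 1, "G": 2, "C": 3}
BASES = "ATGC"


def pack_quads(seq: str) -> bytes:
    """Pack a string over A/T/G/C into 2 bits per symbol (4 symbols per byte)."""
    out = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        # Symbol i lives in byte i // 4, shifted left by 2 * (i % 4) bits.
        out[i // 4] |= CODES[base] << (2 * (i % 4))
    return bytes(out)


def unpack_quads(data: bytes, n: int) -> str:
    """Recover the first n symbols from a packed buffer."""
    return "".join(
        BASES[(data[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(n)
    )


seq = "GATTACA"
packed = pack_quads(seq)
assert unpack_quads(packed, len(seq)) == seq
print(len(seq), "symbols stored in", len(packed), "bytes")
```

Stored this way, a sequence needs a quarter of the space of one byte per symbol, which is the kind of saving the description attributes to the compact non-standard types.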

tmapverse — by Martijn Tennekes, 6 months ago

Meta-Package for Thematic Mapping with 'tmap'

Attaches a set of packages commonly used for spatial plotting with 'tmap'. It includes 'tmap' and its extensions ('tmap.glyphs', 'tmap.networks', 'tmap.cartogram', 'tmap.mapgl'), as well as supporting spatial data packages ('sf', 'stars', 'terra') and 'cols4all' for exploring color palettes. The collection is designed for thematic mapping workflows and does not include the full set of packages from the R-spatial ecosystem.

tmap.cartogram — by Martijn Tennekes, a year ago

Extension to 'tmap' for Creating Cartograms

Provides new layer functions to 'tmap' for creating various types of cartograms. A cartogram is a type of thematic map in which geographic areas are resized or distorted based on a quantitative variable, such as population. The goal is to make the area sizes proportional to the selected variable while preserving geographic positions as much as possible.

EmpiricalCalibration — by Martijn Schuemie, a year ago

Routines for Performing Empirical Calibration of Observational Study Estimates

Routines for performing empirical calibration of observational study estimates. By using a set of negative control hypotheses we can estimate the empirical null distribution of a particular observational study setup. This empirical null distribution can be used to compute a calibrated p-value, which reflects the probability of observing an estimated effect size when the null hypothesis is true, taking both random and systematic error into account. A similar approach can be used to calibrate confidence intervals, using both negative and positive controls. For more details, see Schuemie et al. (2013) and Schuemie et al. (2018).
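
The idea of a calibrated p-value can be sketched in a few lines. The Python below is a deliberately simplified illustration, not the package's method: it assumes the empirical null is a normal distribution whose mean and standard deviation are taken directly from the negative-control log effect estimates (ignoring their individual standard errors), and the function names fit_null and calibrated_p are hypothetical. The fitting procedure actually used is described in the cited papers.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev


def fit_null(neg_control_log_rr):
    """ASSUMPTION: summarise systematic error by the mean and SD of the
    negative-control log effect estimates (a simplification)."""
    return mean(neg_control_log_rr), pstdev(neg_control_log_rr)


def calibrated_p(log_rr, se, mu, sigma):
    """Two-sided p-value under a null that combines systematic error
    (mu, sigma) with the random error of the new estimate (se)."""
    null = NormalDist(mu, sqrt(sigma**2 + se**2))
    cdf = null.cdf(log_rr)
    return 2 * min(cdf, 1 - cdf)


# Made-up negative-control estimates showing a positive bias.
mu, sigma = fit_null([0.10, 0.35, 0.25, 0.40, 0.30, 0.20])
print("calibrated p:  ", calibrated_p(0.5, 0.1, mu, sigma))
print("uncalibrated p:", calibrated_p(0.5, 0.1, 0.0, 0.0))
```

With a biased null, an estimate that looks highly significant under the theoretical null (mean 0, no systematic error) gets a much larger calibrated p-value, which is the effect the description refers to.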

tmap.networks — by Martijn Tennekes, 10 months ago

Extension to 'tmap' for Creating Network Visualizations

Provides functions for visualizing networks with 'tmap'. It supports 'sfnetworks' objects natively but is not limited to them. Useful for adding network layers such as edges and nodes to 'tmap' maps. More features may be added in future versions.

CirceR — by Chris Knoll, 2 years ago

Construct Cohort Inclusion and Restriction Criteria Expressions

Wraps the 'CIRCE' (<https://github.com/ohdsi/circe-be>) 'Java' library allowing cohort definition expressions to be edited and converted to 'Markdown' or 'SQL'.

qbinplots — by Edwin de Jonge, a year ago

Quantile Binned Plots

Create quantile binned and conditional plots for Exploratory Data Analysis. The package provides several plotting functions that are all based on quantile binning. The plots are created with 'ggplot2' and 'patchwork' and can be further adjusted.
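
Quantile binning, the technique these plots are built on, splits a variable into bins that hold roughly equal numbers of observations and then summarises another variable within each bin. The Python below is a minimal sketch of that idea; the function name quantile_bins and the choice of the median as the per-bin summary are assumptions for illustration and do not reflect the package's interface.

```python
from statistics import median


def quantile_bins(xs, ys, n_bins):
    """Sort observations by x, cut them into n_bins groups of (nearly)
    equal size, and summarise y within each group."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    bins = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # more bins requested than observations
        bins.append({
            "x_min": chunk[0][0],
            "x_max": chunk[-1][0],
            "n": len(chunk),
            "y_median": median(y for _, y in chunk),
        })
    return bins


xs = list(range(1, 13))
ys = [x * x for x in xs]
for row in quantile_bins(xs, ys, 4):
    print(row)
```

Because every bin holds about the same number of points, sparse regions of x get wide bins and dense regions get narrow ones, unlike equal-width binning.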

zonebuilder — by Robin Lovelace, a year ago

Create and Explore Geographic Zoning Systems

Functions, documentation and example data to help divide geographic space into discrete polygons (zones). The package supports new zoning systems that are documented in the accompanying paper, "ClockBoard: A zoning system for urban analysis", by Lovelace et al. (2022). The functions are motivated by research into the merits of different zoning systems (Openshaw, 1977). A flexible ClockBoard zoning system is provided, which breaks up space by concentric rings and radial lines emanating from a central point. By default, the diameters of the rings grow according to the triangular number sequence (Ross & Knott, 2019), with the first 4 doughnuts (or annuli) measuring 1, 3, 6, and 10 km wide. These annuli are subdivided into equal segments (12 by default), creating the visual impression of a dartboard. Zones are labelled according to distance to the centre and angular distance from North, creating a simple geographic zoning and labelling system useful for visualising geographic phenomena with a clearly demarcated central location such as cities.
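
The ring-and-segment labelling can be illustrated with a short sketch. The Python below generates the triangular numbers 1, 3, 6, 10 and assigns a point to a (ring, segment) pair. It rests on two assumptions that the description does not settle: the triangular numbers are treated as outer ring boundaries in km measured from the centre, and segment 1 is taken to start at due North and run clockwise. The function names are hypothetical and unrelated to the package's API.

```python
from bisect import bisect_left


def triangular_numbers(n):
    """First n triangular numbers: 1, 3, 6, 10, ..."""
    return [k * (k + 1) // 2 for k in range(1, n + 1)]


def zone_label(distance_km, bearing_deg, n_rings=4, n_segments=12):
    """Return (ring, segment), both 1-based, or None if outside all rings.

    ASSUMPTION: ring boundaries are distances from the centre in km.
    ASSUMPTION: segment 1 starts at due North, numbering runs clockwise.
    """
    breaks = triangular_numbers(n_rings)
    ring = bisect_left(breaks, distance_km)  # 0-based ring index
    if ring >= n_rings:
        return None
    segment = int((bearing_deg % 360) // (360 / n_segments)) + 1
    return ring + 1, segment


print(triangular_numbers(4))
print(zone_label(2.0, 95))   # second ring, fourth 30-degree segment
print(zone_label(10.5, 0))   # beyond the fourth ring
```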

Eunomia — by Frank DeFalco, 7 months ago

Standard Dataset Manager for Observational Medical Outcomes Partnership Common Data Model Sample Datasets

Facilitates access to sample datasets from the 'EunomiaDatasets' repository (<https://github.com/ohdsi/EunomiaDatasets>).

CohortMethod — by Martijn Schuemie, 12 days ago

Comparative Cohort Method with Large Scale Propensity and Outcome Models

Functions for performing comparative cohort studies in an observational database in the Observational Medical Outcomes Partnership (OMOP) Common Data Model. Can extract all necessary data from a database. This implements large-scale propensity scores (LSPS) as described in Tian et al. (2018), using a large set of covariates, including for example all drugs, diagnoses, procedures, as well as age, comorbidity indexes, etc. Large-scale regularized regression is used to fit the propensity and outcome models as described in Suchard et al. (2013). Functions are included for trimming, stratifying, (variable and fixed ratio) matching and weighting by propensity scores, as well as diagnostic functions, such as propensity score distribution plots and plots showing covariate balance before and after matching and/or trimming. Supported outcome models are (conditional) logistic regression, (conditional) Poisson regression, and (stratified) Cox regression. Also included are Kaplan-Meier plots that can adjust for the stratification or matching.
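
Matching on propensity scores can be illustrated with a generic greedy one-to-one nearest-neighbour match within a caliper. The Python below is a textbook-style sketch under stated assumptions: the scores are given as plain dictionaries, the function name greedy_match is hypothetical, and the greedy strategy is a common simplification rather than the package's own matching algorithm.

```python
def greedy_match(treated, comparators, caliper):
    """Greedy 1:1 nearest-neighbour matching on propensity score.

    treated, comparators: dict mapping subject id -> propensity score.
    caliper: maximum allowed absolute score difference for a match.
    Returns a list of (treated_id, comparator_id) pairs.
    """
    available = dict(comparators)  # comparators not yet used
    pairs = []
    for t_id, t_ps in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining comparator by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_ps))
        if abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each comparator is used at most once
    return pairs


treated = {"t1": 0.30, "t2": 0.70}
comparators = {"c1": 0.32, "c2": 0.50, "c3": 0.90}
print(greedy_match(treated, comparators, caliper=0.05))
```

Treated subjects with no comparator inside the caliper stay unmatched, which is why covariate balance and sample size are usually inspected after matching.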