Found 1035 packages in 0.03 seconds
Parsing, Applying, and Manipulating Data Cleaning Rules
Please note: active development has moved to packages 'validate' and 'errorlocate'. Facilitates reading and manipulating (multivariate) data restrictions (edit rules) on numerical and categorical data. Rules can be defined with common R syntax and parsed to an internal (matrix-like format). Rules can be manipulated with variable elimination and value substitution methods, allowing for feasibility checks and more. Data can be tested against the rules and erroneous fields can be found based on Fellegi and Holt's generalized principle. Rules dependencies can be visualized with using the 'igraph' package.
Hash R Objects to Integers Fast
Apply an adaptation of the SuperFastHash algorithm to any R object. Hash whole R objects or, for vectors or lists, hash R objects to obtain a set of hash values that is stored in a structure equivalent to the input. See < http://www.azillionmonkeys.com/qed/hash.html> for a description of the hash algorithm.
Data Validation Infrastructure
Declare data validation rules and data quality indicators;
confront data with them and analyze or visualize the results.
The package supports rules that are per-field, in-record,
cross-record or cross-dataset. Rules can be automatically
analyzed for rule type and connectivity. Supports checks implied
by an SDMX DSD file as well. See also Van der Loo
and De Jonge (2018)
Split-Apply-Combine with Dynamic Groups
Estimate group aggregates, where one can set user-defined conditions
that each group of records must satisfy to be suitable for aggregation. If
a group of records is not suitable, it is expanded using a collapsing scheme
defined by the user. A paper on this package was published in the Journal
of Statistical Software
Fast, Robust, and High-Quality Synthetic Data Generation with a Tuneable Privacy-Utility Trade-Off
Synthesize numeric, categorical, mixed and time series data. Data circumstances including mixed (or zero-inflated) distributions and missing data patterns are reproduced in the synthetic data. A single parameter allows balancing between high-quality synthetic data that represents correlations of the original data and lower quality but more privacy safe synthetic data without correlations. Tuning can be done per variable or for the whole dataset.
Data Correction and Imputation Using Deductive Methods
Attempt to repair inconsistencies and missing values in data records by using information from valid values and validation rules restricting the data.
Deductive Correction, Deductive Imputation, and Deterministic Correction
A collection of methods for automated data cleaning where all actions are logged.
Modify Data Using Externally Defined Modification Rules
Data cleaning scripts typically contain a lot of 'if this change that' type of statements. Such statements are typically condensed expert knowledge. With this package, such 'data modifying rules' are taken out of the code and become in stead parameters to the work flow. This allows one to maintain, document, and reason about data modification rules as separate entities.
'Drat' R Archive Template
Creation and use of R Repositories via helper functions to insert packages into a repository, and to add repository information to the current R session. Two primary types of repositories are support: gh-pages at GitHub, as well as local repositories on either the same machine or a local network. Drat is a recursive acronym: Drat R Archive Template.
Processing 'Gen5' 2.06 Exported Data
A collection of functions for processing 'Gen5' 2.06 exported data. 'Gen5' is an essential data analysis software for BioTek plate readers < https://www.biotek.com/products/software-robotics-software/gen5-microplate-reader-and-imager-software/>. This package contains functions for data cleaning, modeling and plotting using exported data from 'Gen5' version 2.06. It exports technically correct data defined in (Edwin de Jonge and Mark van der Loo (2013) < https://cran.r-project.org/doc/contrib/de_Jonge+van_der_Loo-Introduction_to_data_cleaning_with_R.pdf>) for customized analysis. It contains Boltzmann fitting for general kinetic analysis. See < https://www.github.com/yanxianUCSB/gen5helper> for more information, documentation and examples.