Track Changes in Data
A framework that allows for easy logging of changes in data.
Main features: start tracking changes by adding a single line of code to
an existing script. Track changes in multiple datasets, using multiple
loggers. Add custom-built loggers or use loggers offered by other
packages.
Extending 'dendrogram' Functionality in R
Offers a set of functions for extending 'dendrogram' objects in R, letting you visualize and compare trees of 'hierarchical clusterings'. You can (1) adjust a tree's graphical parameters: the color, size, type, etc. of its branches, nodes and labels; (2) visually and statistically compare different 'dendrograms' to one another.
Adapt Numerical Records to Fit (in)Equality Restrictions
Minimally adjust the values of numerical records in a data.frame, such that each record satisfies a predefined set of equality and/or inequality constraints. The constraints can be defined using the 'validate' package. The core algorithms have recently been moved to the 'lintools' package; refer to 'lintools' for a more basic interface and access to a version of the algorithm that works with sparse matrices.
Hash R Objects to Integers Fast
Apply an adaptation of the SuperFastHash algorithm to any R object. Hash whole R objects or, for vectors or lists, hash the elements to obtain a set of hash values that is stored in a structure equivalent to the input. See <http://www.azillionmonkeys.com/qed/hash.html> for a description of the hash algorithm.
Synthesize Data Based on Empirical Quantile Functions and Rank Order Matching
Data is synthesized by inverse transform sampling from the empirical quantile function of each variable, and then copying the rank order structure from the original dataset. The synthesizer method has a tunable parameter that lets one move gradually from realistic and possibly unsafe synthetic data to decorrelated data of less utility.
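The two steps in this description can be sketched in a few lines. The following Python sketch is an illustration only, not the package's implementation: the function names (empirical_quantile, synthesize_column, synthesize) are hypothetical, the empirical quantile function is assumed to be a simple step function, and the tunable decorrelation parameter is left out.

```python
import random


def empirical_quantile(sorted_values, u):
    """Step-function empirical quantile (inverse ECDF) for u in [0, 1)."""
    n = len(sorted_values)
    return sorted_values[min(int(u * n), n - 1)]


def synthesize_column(values, rng):
    """Draw by inverse transform sampling, then copy the original rank order."""
    n = len(values)
    sorted_values = sorted(values)
    # Step 1: inverse transform sampling from the empirical quantile function.
    draws = sorted(empirical_quantile(sorted_values, rng.random()) for _ in range(n))
    # Step 2: rank order matching. order[r] is the position of the r-th
    # smallest original value; that position receives the r-th smallest draw.
    order = sorted(range(n), key=lambda i: values[i])
    synthetic = [None] * n
    for rank, position in enumerate(order):
        synthetic[position] = draws[rank]
    return synthetic


def synthesize(table, seed=0):
    """Synthesize every column of a dict-of-lists table (hypothetical layout)."""
    rng = random.Random(seed)
    return {name: synthesize_column(column, rng) for name, column in table.items()}
```

Because every column is reordered to follow the ranks of its original, rank correlations between columns carry over to the synthetic table, which is what makes the result realistic and potentially unsafe without further decorrelation.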
Split-Apply-Combine with Dynamic Groups
Estimate group aggregates, where one can set user-defined conditions
that each group of records must satisfy to be suitable for aggregation. If
a group of records is not suitable, it is expanded using a collapsing scheme
defined by the user. A paper on this package was published in the Journal
of Statistical Software.
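As an illustration of the idea, here is a minimal Python sketch, not the package's interface. It assumes a collapsing scheme that drops grouping variables from right to left, and the names aggregate_with_collapse and is_suitable, as well as the record layout, are hypothetical.

```python
from statistics import mean


def aggregate_with_collapse(records, keys, value, is_suitable, agg=mean):
    """Aggregate per finest group, coarsening the group until it is suitable.

    Returns {group: (aggregate, depth)}, where depth is the number of
    grouping variables that were still in use when the condition was met.
    """
    results = {}
    finest_groups = {tuple(r[k] for k in keys) for r in records}
    for group in sorted(finest_groups):
        # Assumed collapsing scheme: drop grouping variables from the right.
        for depth in range(len(keys), -1, -1):
            members = [
                r[value]
                for r in records
                if all(r[k] == g for k, g in zip(keys[:depth], group))
            ]
            if is_suitable(members):
                results[group] = (agg(members), depth)
                break
        else:
            # No level of the scheme satisfied the condition.
            results[group] = (None, None)
    return results
```

With a condition such as "at least two records", a group containing a single record would first be expanded to all records sharing its first key, and finally to the whole dataset if that is still too small.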
Data Correction and Imputation Using Deductive Methods
Attempt to repair inconsistencies and missing values in data records by using information from valid values and validation rules restricting the data.
Modify Data Using Externally Defined Modification Rules
Data cleaning scripts typically contain a lot of 'if this change that' type of statements. Such statements are typically condensed expert knowledge. With this package, such 'data modifying rules' are taken out of the code and instead become parameters to the workflow. This allows one to maintain, document, and reason about data modification rules as separate entities.
Deductive Correction, Deductive Imputation, and Deterministic Correction
A collection of methods for automated data cleaning where all actions are logged.
'Drat' R Archive Template
Creation and use of R Repositories via helper functions to insert packages into a repository, and to add repository information to the current R session. Two primary types of repositories are supported: gh-pages at GitHub, as well as local repositories on either the same machine or a local network. Drat is a recursive acronym: Drat R Archive Template.