DataClean — by Xiaorui(Jeremy) Zhu, 6 years ago

Data Cleaning

Includes functions that researchers and practitioners can use to clean raw data and to convert html, xlsx, and txt data files into other formats. It can also be used to manipulate text variables, extract numeric variables from text variables, and carry out other variable-cleaning tasks. The package originated from the author's project on creative performance in online education environments. The resulting paper of that study will be published soon.

editrules — by Edwin de Jonge, 4 years ago

Parsing, Applying, and Manipulating Data Cleaning Rules

Facilitates reading and manipulating (multivariate) data restrictions (edit rules) on numerical and categorical data. Rules can be defined with common R syntax and parsed to an internal (matrix-like) format. Rules can be manipulated with variable elimination and value substitution methods, allowing for feasibility checks and more. Data can be tested against the rules and erroneous fields can be found based on Fellegi and Holt's generalized principle. Rule dependencies can be visualized using the 'igraph' package.
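To make the idea of testing data against edit rules concrete, here is a minimal sketch in Python. It is not the 'editrules' API: the rule names, field names, and records are hypothetical, and it only reports which rules a record violates. It does not attempt Fellegi-Holt error localization, which additionally looks for the smallest set of fields to change.

```python
# Illustrative sketch only: this is NOT the 'editrules' API.
# Rule names, field names, and records are hypothetical.

# Each edit rule is a name plus a predicate that must hold for a valid record.
RULES = {
    # Numerical (linear) restriction.
    "balance": lambda r: r["turnover"] - r["cost"] == r["profit"],
    # Simple range restriction.
    "nonneg_cost": lambda r: r["cost"] >= 0,
    # Mixed categorical/numerical restriction: married implies age >= 16.
    "adult_if_married": lambda r: r["status"] != "married" or r["age"] >= 16,
}


def violated_rules(record, rules=RULES):
    """Return the sorted names of the rules that the record violates."""
    return sorted(name for name, check in rules.items() if not check(record))


records = [
    {"turnover": 100, "cost": 60, "profit": 40, "status": "single", "age": 30},
    {"turnover": 100, "cost": -5, "profit": 40, "status": "married", "age": 12},
]

for index, record in enumerate(records):
    print(index, violated_rules(record))
```

Keeping each rule as a named predicate means the violation report can point at the specific restriction that failed, which is the information an error-localization step would start from.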

bdc — by Bruno Ribeiro, a day ago

Biodiversity Data Cleaning

It brings together several aspects of biodiversity data-cleaning in one place. 'bdc' is organized in thematic modules related to different biodiversity dimensions, including 1) Merge datasets: standardization and integration of different datasets; 2) Pre-filter: flagging and removal of invalid or non-interpretable information, followed by data amendments; 3) Taxonomy: cleaning, parsing, and harmonization of scientific names from several taxonomic groups against locally stored taxonomic databases, using exact and partial matching algorithms; 4) Space: flagging of erroneous, suspect, and low-precision geographic coordinates; and 5) Time: flagging and, whenever possible, correction of inconsistent collection dates. In addition, it contains features to visualize, document, and report data quality, which is essential for making data quality assessment transparent and reproducible. The reference for the methodology is Bruno et al. (2022).

clean — by Matthijs S. Berends, 2 years ago

Fast and Easy Data Cleaning

A wrapper around the new 'cleaner' package, which provides data cleaning functions for classes 'logical', 'factor', 'numeric', 'character', 'currency' and 'Date' to make data cleaning fast and easy. Relying on very few dependencies, it provides smart guessing, but with user options to override anything if needed.

datacleanr — by Alexander Hurley, 9 months ago

Interactive and Reproducible Data Cleaning

Flexible and efficient cleaning of data with interactivity. 'datacleanr' facilitates best practices in data analyses and reproducibility with built-in features and by translating interactive/manual operations to code. The package is designed for interoperability, and so fits seamlessly into reproducible analysis pipelines in 'R'.

cleaner — by Matthijs S. Berends, a year ago

Fast and Easy Data Cleaning

Data cleaning functions for classes logical, factor, numeric, character, currency and Date to make data cleaning fast and easy. Relying on very few dependencies, it provides smart guessing, but with user options to override anything if needed.

fMRIscrub — by Amanda Mejia, a month ago

Scrubbing and Other Data Cleaning Routines for fMRI

Data-driven fMRI denoising with projection scrubbing (Pham et al. (2022)). Also includes routines for DVARS (Derivatives VARianceS) (Afyouni and Nichols (2018)), motion scrubbing (Power et al. (2012)), aCompCor (anatomical Components Correction) (Muschelli et al. (2014)), detrending, and nuisance regression. Projection scrubbing and DVARS are also applicable to other outlier detection tasks involving high-dimensional data.
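As a rough illustration of what DVARS measures, the sketch below computes the root mean square, across voxels, of the change in signal between consecutive time points, then flags large transitions. It is written in Python and is not the 'fMRIscrub' API. The toy data and the fixed cutoff are hypothetical; choosing a defensible cutoff is the harder problem and is not attempted here.

```python
import math

# Illustrative sketch only: this is NOT the 'fMRIscrub' API.
# The toy data and the fixed cutoff below are hypothetical.


def dvars(series):
    """Compute a basic DVARS trace.

    series: list of time points, each a list of voxel intensities.
    Returns one value per transition between consecutive time points:
    the root mean square over voxels of the temporal difference.
    """
    trace = []
    for previous, current in zip(series, series[1:]):
        squared = [(c - p) ** 2 for p, c in zip(previous, current)]
        trace.append(math.sqrt(sum(squared) / len(squared)))
    return trace


def flag_time_points(trace, cutoff):
    """Return indices of time points whose incoming transition exceeds cutoff."""
    return [i + 1 for i, value in enumerate(trace) if value > cutoff]


# Four time points, four voxels; time point 2 is an artificial spike.
toy_series = [
    [10, 10, 10, 10],
    [10, 11, 10, 9],
    [14, 6, 14, 6],
    [10, 10, 10, 10],
]

trace = dvars(toy_series)
print(trace)
print(flag_time_points(trace, cutoff=2.0))
```

Note that a single spike inflates two transitions, the one into it and the one out of it, so both time point 2 and time point 3 are flagged in this toy example.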

bdclean — by Thiloshon Nagarajah, 3 years ago

A User-Friendly Biodiversity Data Cleaning App for the Inexperienced R User

Provides features to manage the complete workflow for biodiversity data cleaning: uploading data, gathering input from users (in order to adjust cleaning procedures), cleaning data, and finally generating various reports and several versions of the data. Facilitates user-level data cleaning, designed for the inexperienced R user. See T Gueta et al. (2018) and T Gueta et al. (2017).

DataCombine — by Christopher Gandrud, 6 years ago

Tools for Easily Combining and Cleaning Data Sets

Tools for combining and cleaning data sets, particularly with grouped and time series data.

framecleaner — by Harrison Tietze, 9 months ago

Clean Data Frames

Provides a friendly interface for modifying data frames with a sequence of piped commands built upon the 'tidyverse' (Wickham et al., 2019). The majority of commands wrap 'dplyr' mutate statements in a convenient way to concisely solve common issues that arise when tidying small to medium data sets. Includes smart defaults and allows flexible selection of columns via 'tidyselect'.