Support Technical Processes Following 'Maelstrom Research' Standards
Functions to support rigorous processes in data cleaning,
evaluation, and documentation across datasets from different studies based
on Maelstrom Research guidelines. The package includes the core functions
to evaluate and format the main inputs that define the process, diagnose
errors, and summarize and evaluate datasets and their associated
data dictionaries. The main outputs are clean datasets and associated
metadata, and tabular and visual summary reports. As described in
Maelstrom Research guidelines for rigorous retrospective data
harmonization (Fortier et al. (2017)).
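A minimal sketch of what such an evaluation workflow might look like; the package name and the function names below are assumptions based on the description, so consult the package reference for the actual interface.

# Minimal sketch; package and function names are assumptions.
library(madshapR)                               # assumed package name

dataset   <- data.frame(id = 1:3, age = c(34, NA, 51))
data_dict <- data_dict_extract(dataset)         # assumed: derive a data dictionary

eval_report    <- dataset_evaluate(dataset, data_dict)   # assumed: diagnose errors
summary_report <- dataset_summarize(dataset, data_dict)  # assumed: tabular summary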
Import, Clean and Update Data from the New Zealand Freshwater Fish Database
Access the New Zealand Freshwater Fish Database from R, with a few functions to clean the data once it is imported.
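A hypothetical usage sketch; the package and function names are assumed from the description rather than taken from its documentation.

# Hypothetical usage; names below are assumptions.
library(nzffdr)                  # assumed package name

fish <- nzffdr_import()          # assumed: download records from the database
fish <- nzffdr_clean(fish)       # assumed: tidy species names, dates, etc.
head(fish)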
Prepare and Explore Data for Palaeobiological Analyses
Provides functionality to support data preparation and exploration for
palaeobiological analyses, improving code reproducibility and accessibility. The
wider aim of 'palaeoverse' is to bring the palaeobiological community together
to establish agreed standards. The package currently includes functionality for
data cleaning, binning (time and space), exploration, summarisation and
visualisation. Reference datasets (i.e. Geological Time Scales <https://stratigraphy.org/chart>)
and auxiliary functions are also provided. Details can be found in:
Jones et al. (2023).
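An illustrative sketch of temporal binning with this package; the bundled example dataset and argument details are assumptions, so check the package documentation.

# Illustrative sketch; argument details are assumptions.
library(palaeoverse)

bins <- time_bins(interval = "Phanerozoic", rank = "stage")       # reference time scale
data(tetrapods)                                                   # example occurrences shipped with the package
occ  <- bin_time(occdf = tetrapods, bins = bins, method = "mid")  # assign occurrences to time bins
head(occ)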
Deductive Correction, Deductive Imputation, and Deterministic Correction
A collection of methods for automated data cleaning where all actions are logged.
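A small worked example of deductive, logged correction; it assumes the companion 'editrules' package for defining the rule, and the exact return-value structure should be checked against the package manual.

# Worked example; assumes 'editrules' for rule definition.
library(editrules)
library(deducorrect)

E   <- editmatrix("x + y == z")              # linear rule the record must satisfy
dat <- data.frame(x = 100, y = -50, z = 150) # violates the rule via a sign error

fix <- correctSigns(E, dat)   # deterministic sign correction
fix$corrected                 # cleaned record (y flipped to 50)
fix$corrections               # log of every action taken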
United States Copyright Office Product Management Division SR Audit Data Dataset Cleaning Algorithms
Intended for use by Business Analysts in the United States Copyright Office Product Management Division. Includes algorithms for the Division's SR Audit Data dataset: the package takes in the SR Audit Data Excel file and reformats the spreadsheet so that its values and variables match the format of the online database. Support functions include clean_str(), which cleans instances of the AUDIT_LOG variable; clean_data_to_excel(), which cleans the dataset and writes the reorganized SR Audit Data to an Excel file; clean_data_to_dataframe(), which cleans the dataset and stores the reorganized SR Audit Data in a data frame; format_from_excel(), which reads the Excel file produced by clean_data_to_excel() and returns the data as a dictionary keyed by FIELD types, with NON-FIELD types as the values; format_from_dataframe(), which does the same for the data frame produced by clean_data_to_dataframe(); and support_function(), which takes the dictionary produced by either format_from_excel() or format_from_dataframe() and returns a data frame formatted to match the original U.S. Copyright Office SR Audit Data online database. The main function, clean_format_all(), takes an Excel file and writes the formatted data to new Excel and text files following the format of the U.S. Copyright Office SR Audit Data online database.
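A sketch of the entry points named in the description; the input file name is a placeholder and the argument details are assumptions.

# Main entry point: reads the spreadsheet, writes reformatted Excel and text files.
clean_format_all("SR_Audit_Data.xlsx")

# Lower-level route, per the description: clean in memory, then reformat.
df   <- clean_data_to_dataframe("SR_Audit_Data.xlsx")
dict <- format_from_dataframe(df)          # dictionary keyed by FIELD types
out  <- support_function(dict)             # data frame matching the online database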
A Grammar of Data Manipulation
A fast, consistent tool for working with data-frame-like objects, both in memory and out of memory.
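A short example of the grammar in practice, using the built-in mtcars data: filter rows, derive a column, then aggregate by group.

library(dplyr)

mtcars %>%
  filter(cyl %in% c(4, 6)) %>%
  mutate(kpl = mpg * 0.425) %>%              # miles per gallon to km per litre
  group_by(cyl) %>%
  summarise(mean_kpl = mean(kpl), n = n())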
Exploratory Data Analysis for FishNet2 Data
Provides processing and summarization of data from FishNet2.net as text and graphical outputs. Allows efficient filtering of information and data cleaning.
Modifying Rules on a Database
Apply modification rules from R package 'dcmodify' to the database, prescribing and documenting deterministic data cleaning steps on records in a database. The rules are translated into SQL statements using R package 'dbplyr'.
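An illustrative sketch of applying a 'dcmodify' rule to a database table; the table and column names are placeholders, and whether the rule updates the table in place should be checked against the package documentation.

# Illustrative sketch; table and column names are placeholders.
library(DBI)
library(dplyr)
library(dcmodify)
library(dcmodifydb)    # assumed package name providing the database method

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "turnover", data.frame(id = 1:2, profit = c(-5, 10)))

m <- modifier(if (profit < 0) profit <- 0)   # deterministic cleaning rule
modify(tbl(con, "turnover"), m)              # rule translated to SQL via 'dbplyr'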
Language Mapping and Geospatial Analysis of Linguistic and Cultural Data
Streamlined workflows for geolinguistic analysis, including: accessing global linguistic and cultural databases, data import, data entry, data cleaning, data exploration, mapping, visualization and export.
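A hypothetical sketch of such a workflow; the package and function names, and the filter argument, are assumptions based on the description.

# Hypothetical sketch; names below are assumptions.
library(glottospace)                        # assumed package name

glottodata <- glottoget("glottobase")       # assumed: load a global language database
sa <- glottofilter(glottodata, continent = "South America")  # assumed filter
glottomap(sa)                               # assumed: map the selected languages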
A Magical Framework for Collaborative & Reproducible Data Analysis
A comprehensive data analysis framework for NIH-funded research that streamlines workflows for both data cleaning and preparing NIH Data Archive ('NDA') submission templates. Provides unified access to multiple data sources ('REDCap', 'MongoDB', 'Qualtrics') through interfaces to their APIs, with specialized functions for data cleaning, filtering, merging, and parsing. Features automatic validation, field harmonization, and memory-aware processing to enhance reproducibility in multi-site collaborative research, as described in Mittal et al. (2021).
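A generic illustration of the kind of API call such a framework wraps: the plain REDCap record-export request, not the package's own interface. The URL and token are placeholders.

# Raw REDCap export via its API (placeholders for URL and token).
library(httr)

resp <- POST("https://redcap.example.org/api/",
             body = list(token   = "YOUR_API_TOKEN",
                         content = "record",
                         format  = "csv"),
             encode = "form")
records <- read.csv(text = content(resp, as = "text"))
head(records)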