Sites, Population, and Records Cleaning Skills
Supports data cleaning tasks, including 1) generating datasets for time-series and case-crossover analyses from raw hospital records, 2) linking individuals to an areal map, and 3) selecting cases living within a buffer of a given size around a site. For more information, please refer to Zhang W, et al. (2018).
GIS Integration
Designed to facilitate the preprocessing and linking of GIS (Geographic Information System) databases <https://www.sciencedirect.com/topics/computer-science/gis-database>, the R package 'GISINTEGRATION' prepares GIS data for advanced spatial analyses. It simplifies intricate procedures such as data cleaning, normalization, and format conversion, so that the data are ready for precise and thorough analysis.
Client for 'GenderAPI.io'
Provides an interface to the 'GenderAPI.io' web service (<https://www.genderapi.io>) for determining gender from personal names, email addresses, or social media usernames. Functions are available to submit single or batch queries and retrieve additional information such as accuracy scores and country-specific gender predictions. This package simplifies integration of 'GenderAPI.io' into R workflows for data cleaning, user profiling, and analytics tasks.
Economics and Pricing Tools
Functions to aid in microeconomic and macroeconomic analysis and in the handling of price and currency data. Includes extraction of relevant inflation and exchange rate data from the World Bank API, data cleaning/parsing, and standardisation. Inflation adjustment calculations follow Principles of Macroeconomics by Gregory Mankiw et al. (2014). Current and historical end-of-day exchange rates for 171 currencies are taken from the European Central Bank Statistical Data Warehouse (2020).
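As a rough illustration of the inflation adjustment mentioned above, the following Python sketch rescales a price by the ratio of two price-index values. It is not the package's API: the function name `adjust_for_inflation` and the index values are hypothetical and chosen only for the example.

```python
def adjust_for_inflation(price, index_from, index_to):
    """Convert a price from one year's price level to another's.

    Hypothetical helper: scales the price by the ratio of the price
    index in the target year to the index in the source year.
    """
    if index_from <= 0 or index_to <= 0:
        raise ValueError("price indices must be positive")
    return price * index_to / index_from


# Made-up index values, for illustration only.
cpi = {2010: 100.0, 2020: 125.0}

# A price of 80 at the 2010 price level, expressed at the 2020 price level.
price_2020 = adjust_for_inflation(80.0, cpi[2010], cpi[2020])
print(price_2020)
```

In practice the index series would come from a published source such as the World Bank data named in the description.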
Ecological Tolerance Indices
Computes the Road Tolerance Index (RTI) and the Human Footprint Tolerance Index (HFTI) for species occurrence data. It automates data cleaning and integrates spatial data (roads and human footprint) to produce reproducible tolerance metrics for biodiversity and conservation research. The HFTI calculation is based on the global human footprint dataset by Mu et al. (2022).
Automatic Database Normalisation for Data Frames
Automatic normalisation of a data frame to third normal form, with the intention of easing the process of data cleaning. (Usage to design your actual database for you is not advised.) Originally inspired by the 'AutoNormalize' library for 'Python' by 'Alteryx' (<https://github.com/alteryx/autonormalize>), with various changes and improvements. Automatic discovery of functional or approximate dependencies, normalisation based on those, and plotting of the resulting "database" via 'Graphviz', with options to exclude some attributes at discovery time, or remove discovered dependencies at normalisation time.
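The dependency discovery described above rests on a simple test: a set of attributes determines another attribute if no two rows agree on the former but differ on the latter. Below is a minimal Python sketch of that test on a toy table. It is not the package's algorithm, and the function name `holds` and the column names are hypothetical.

```python
def holds(rows, determinant, dependent):
    """Return True if the `determinant` columns functionally determine `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = row[dependent]
        # setdefault stores the first value seen for this key and returns it;
        # a later row with the same key but a different value breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Toy data, for illustration only.
rows = [
    {"order_id": 1, "customer": "ann", "city": "Leeds"},
    {"order_id": 2, "customer": "bob", "city": "York"},
    {"order_id": 3, "customer": "ann", "city": "Leeds"},
]

print(holds(rows, ["customer"], "city"))   # customer determines city here
print(holds(rows, ["city"], "order_id"))   # city does not determine order_id
```

A dependency that holds, such as customer to city in this toy table, is the kind of evidence a normaliser would use to split the attributes into a separate relation.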
Prepare and Explore Data for Palaeobiological Analyses
Provides functionality to support data preparation and exploration for palaeobiological analyses, improving code reproducibility and accessibility. The wider aim of 'palaeoverse' is to bring the palaeobiological community together to establish agreed standards. The package currently includes functionality for data cleaning, binning (time and space), exploration, summarisation and visualisation. Reference datasets (i.e. Geological Time Scales <https://stratigraphy.org/chart/>) and auxiliary functions are also provided. Details can be found in: Jones et al. (2023)
Client for 'GenderAPI.io' Phone Number Validation and Formatter API
Provides an interface to the 'GenderAPI.io' Phone Number Validation & Formatter API (<https://www.genderapi.io>) for validating international phone numbers, detecting number type (mobile, landline, Voice over Internet Protocol (VoIP)), retrieving region and country metadata, and formatting numbers to E.164 or national format. Designed to simplify integration into R workflows for data validation, Customer Relationship Management (CRM) data cleaning, and analytics tasks. Full documentation is available at <https://www.genderapi.io/docs-phone-validation-formatter-api>.
Flexible Dictionary-Based Cleaning
Provides flexible dictionary-based cleaning that allows users to specify implicit and explicit missing data, regular expressions for both data and columns, and global matches, while respecting ordering of factors. This package is part of the 'RECON' (<https://www.repidemicsconsortium.org/>) toolkit for outbreak analysis.
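To make the idea of dictionary-based cleaning concrete, here is a small Python sketch that maps raw values to standard ones through a lookup table and turns explicit missing markers into `None`. It does not reproduce the package's interface or its regular-expression and factor-ordering features. The function `clean_column`, the dictionary, and the missing markers are all hypothetical.

```python
def clean_column(values, dictionary, missing=("", "na", "unknown")):
    """Recode values via `dictionary`; map missing markers to None.

    Matching is done on a trimmed, lower-cased copy of each string.
    Values with no dictionary entry are returned unchanged.
    """
    cleaned = []
    for value in values:
        key = value.strip().lower() if isinstance(value, str) else value
        if key is None or key in missing:
            cleaned.append(None)
        else:
            cleaned.append(dictionary.get(key, value))
    return cleaned


# Toy dictionary and data, for illustration only.
yes_no = {"y": "yes", "yes": "yes", "n": "no", "no": "no"}
raw = [" Y", "yes", "N", "unknown", "maybe"]

cleaned = clean_column(raw, yes_no)
print(cleaned)
```

Keeping the recoding rules in a dictionary, rather than in code, lets the same function clean many columns with different rule sets.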
Draw Stratified Samples from the VADIR Database
Allows researchers to draw stratified samples from the U.S. Department of Veterans Affairs/Department of Defense Identity Repository (VADIR) database according to a variety of population characteristics. The VADIR database contains information for all veterans who were separated from the military after 1980. The central utility of the present package is to integrate data cleaning and formatting for the VADIR database with the stratification methods described by Mahto (2019) <https://CRAN.R-project.org/package=splitstackshape>. Data from VADIR are not provided as part of this package.