Normalisation of Multiple Variables in Large-Scale Datasets

The robustness of many statistical techniques applied in the social sciences, such as factor analysis, rests upon the assumption of item-level normality. However, real data often fail to meet this assumption. The Box-Cox transformation (Box & Cox, 1964) <http://www.jstor.org/stable/2984418> provides an optimal transformation for non-normal variables. Yet, for large datasets of continuous variables, its application in current software programs is cumbersome, with analysts having to take several steps to normalise each variable. We present an R package, 'normalr', that enables researchers to conveniently apply optimal transformations to multiple variables in a dataset. The package enables users to quickly and accurately: (1) anchor all of their variables at 1.00, (2) select the desired precision with which the optimal lambda is estimated, (3) apply each unique exponent to its variable, (4) rescale resultant values to within their original X1 and X(n) ranges, and (5) obtain original and transformed estimates of skewness, kurtosis, and other inferential assessments of normality.
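
The first four steps can be illustrated for a single variable in base R. This is only a rough sketch, not normalr's implementation: the helper names `skewness()` and `boxcox_sketch()` are hypothetical, "anchoring at 1.00" is assumed to mean shifting the variable so that its minimum equals 1, and the lambda grid (range and step) is an arbitrary choice standing in for the "desired precision". Lambda is chosen by maximising the standard Box-Cox profile log-likelihood for an intercept-only model.

```r
# Hypothetical helper (not part of normalr): moment-based sample skewness.
skewness <- function(x) {
  centred <- x - mean(x)
  mean(centred^3) / mean(centred^2)^1.5
}

# Hypothetical helper (not part of normalr): Box-Cox sketch for one variable.
boxcox_sketch <- function(x, lambdas = seq(-5, 5, by = 0.01)) {
  # (1) Anchor: shift the variable so that its minimum is 1 (assumption).
  anchored <- x - min(x) + 1
  n <- length(anchored)
  sum_log <- sum(log(anchored))

  # Box-Cox transform; the log branch covers lambda (numerically) equal to 0.
  transform_one <- function(lambda) {
    if (abs(lambda) < 1e-8) log(anchored) else (anchored^lambda - 1) / lambda
  }

  # (2) Profile log-likelihood over the grid; the grid step sets the precision.
  loglik <- vapply(lambdas, function(lambda) {
    z <- transform_one(lambda)
    -n / 2 * log(mean((z - mean(z))^2)) + (lambda - 1) * sum_log
  }, numeric(1))
  best <- lambdas[which.max(loglik)]

  # (3) Apply the selected exponent.
  z <- transform_one(best)

  # (4) Rescale linearly back onto the original [min(x), max(x)] range.
  rescaled <- (z - min(z)) / (max(z) - min(z)) * (max(x) - min(x)) + min(x)

  list(lambda = best, values = rescaled)
}

res <- boxcox_sketch(mtcars$hp)
res$lambda
# (5) Compare skewness before and after the transformation.
c(before = skewness(mtcars$hp), after = skewness(res$values))
```

Because the final rescaling is linear, it changes the range of the values but not their skewness or kurtosis, so step (4) does not undo the normalisation achieved in step (3).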



The normalr package allows you to normalise multiple variables in large-scale datasets.

Installation

normalr is available from CRAN. Install it with:

install.packages("normalr")

You can also install normalr from GitHub with:

# install.packages("devtools")
devtools::install_github("kcha193/normalr")

Examples

The following example uses normalr to normalise the 11 continuous variables in the mtcars dataset.

library(normalr)
 
normaliseData(mtcars, getLambda(mtcars))
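
To judge which columns depart most from normality, before or after transforming, one option is a per-variable summary of skewness, kurtosis, and a Shapiro-Wilk test. The sketch below uses only base R; the helpers `skewness()`, `excess_kurtosis()`, and `normality_summary()` are hypothetical names and not part of normalr, whose own diagnostics may be computed differently.

```r
# Hypothetical helpers (not part of normalr): moment-based shape statistics.
skewness <- function(x) {
  centred <- x - mean(x)
  mean(centred^3) / mean(centred^2)^1.5
}

excess_kurtosis <- function(x) {
  centred <- x - mean(x)
  mean(centred^4) / mean(centred^2)^2 - 3
}

# One row per numeric column: skewness, excess kurtosis, Shapiro-Wilk p-value.
normality_summary <- function(dat) {
  data.frame(
    variable  = names(dat),
    skewness  = vapply(dat, skewness, numeric(1)),
    kurtosis  = vapply(dat, excess_kurtosis, numeric(1)),
    shapiro_p = vapply(dat, function(x) shapiro.test(x)$p.value, numeric(1)),
    row.names = NULL
  )
}

before <- normality_summary(mtcars)
before
```

Running the same function on a transformed copy of the data gives a side-by-side view of how much each variable's shape has changed.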

Shiny apps

The following example runs a Shiny application that uses normalr to normalise any dataset the user supplies.

library(normalr)
 
normalrShiny()

The shiny app is also available from shinyapps.io.

News

normalr 1.0.0

  • remove the ddR package, because it is being archived.
  • add the rlang package to check that variables are numeric.
  • add the citation for the published paper.

normalr 0.0.3

  • embed a test dataset for the paper.

normalr 0.0.2

  • add the Shiny app, launched with normalrShiny().
  • fix the handling of negative data with a negative lambda.

normalr 0.0.1

  • First release with three main functions:
    • getLambda() computes the lambda values based on the dataset.
    • normalise() normalises a numeric vector with a specific lambda value.
    • normaliseData() applies the normalisation to a dataset with multiple continuous variables.


1.0.0 by Kevin Chang, a year ago


https://github.com/kcha193/normalr


Report a bug at https://github.com/kcha193/normalr/issues


Browse source code at https://github.com/cran/normalr


Authors: Kevin Chang [aut, cre], Matthew Courtney [aut]




GPL license


Imports MASS, parallel, purrr, magrittr, rlang, shiny

Suggests testthat, covr

