Projection based methods for preprocessing,
exploring and analysis of multivariate data used in chemometrics.
S. Kucheryavskiy (2020)
mdatools is an R package for preprocessing, exploring and analysing multivariate data. It provides methods commonly used in chemometrics and was created for an introductory PhD course on chemometrics given at the Section of Chemical Engineering, Aalborg University.
The general idea of the package is to collect the most widespread chemometric methods and give them a similar "user interface", so that a user who knows how to make a model and visualise results for one method can easily do the same for the others.
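The shared interface can be illustrated with a minimal sketch. It assumes that mdatools is installed, that it ships the people example dataset, and that pca() and pls() objects both support summary() and plot(). The choice of column 4 as the response and of four components is an illustrative assumption, and fit_demo_models() is a hypothetical helper name.

```r
# Minimal sketch: the same "fit, summarise, plot" steps for two methods.
# fit_demo_models() is a hypothetical helper, not part of mdatools.
fit_demo_models <- function() {
  # Load the example dataset assumed to be bundled with mdatools
  data("people", package = "mdatools", envir = environment())

  x <- people[, -4]               # predictors: every column except the 4th
  y <- people[, 4, drop = FALSE]  # response: the 4th column, kept as a matrix

  list(
    pca = mdatools::pca(people, ncomp = 4, scale = TRUE),
    pls = mdatools::pls(x, y, ncomp = 4, scale = TRUE)
  )
}

# The guard keeps the script harmless when mdatools is not installed
if (requireNamespace("mdatools", quietly = TRUE)) {
  models <- fit_demo_models()

  # Identical calls, different methods
  summary(models$pca)
  plot(models$pca)

  summary(models$pls)
  plot(models$pls)
}
```

Only the constructor call differs between the two models; the follow-up calls are the same, which is the point of the common interface.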
For more details and examples, read the Bookdown tutorial.
A new minor release (0.9.1) is available from both GitHub and CRAN (since 07.07.2018).
The latest major release (0.9.0) brings a set of new features, including methods for computing critical limits for PCA/SIMCA residuals, an adjusted residuals plot, and randomized algorithms for fast PCA decomposition of datasets with a large number of rows. The text of the tutorial has been amended accordingly and now also includes a new chapter with a detailed explanation of how the critical limits are calculated.
A full list of changes is available here
The package is available from CRAN via the usual installation procedure. However, because CRAN policy restricts the number of submissions (one every 3-4 months), only major releases are published there (with a delay of 2-3 weeks after the GitHub release, as more thorough testing is needed). To get the latest release, please use the GitHub sources. You can download the source package file and install it using the install.packages command. For example, if the downloaded file is mdatools_0.9.1.tar.gz and it is located in the current working directory, just run the following:
install.packages('mdatools_0.9.1.tar.gz')
If you have the devtools package installed, the following command will install the latest release from the master branch of the GitHub repository (do not forget to load the devtools package first):
install_github('svkucheryavski/mdatools')
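The two installation routes can also be wrapped in a small script. This is only a sketch: install_mdatools() and installed_version() are hypothetical helper names introduced for illustration, and the CRAN mirror URL is an assumption. The calls they wrap (install.packages, devtools::install_github, utils::packageVersion) are standard.

```r
# Hypothetical helper (not part of mdatools): choose an installation route.
install_mdatools <- function(from = c("cran", "github")) {
  from <- match.arg(from)
  if (from == "cran") {
    # The mirror URL is an assumption; any CRAN mirror works
    install.packages("mdatools", repos = "https://cloud.r-project.org")
  } else {
    if (!requireNamespace("devtools", quietly = TRUE)) {
      stop("install the devtools package first")
    }
    # Calling with the devtools:: prefix avoids having to load devtools
    devtools::install_github("svkucheryavski/mdatools")
  }
}

# Hypothetical helper: installed version as a string, or NA if absent.
installed_version <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NA_character_)
  }
  as.character(utils::packageVersion(pkg))
}

# Installation is not run here because it changes the local library:
# install_mdatools("github")

# Report what is currently installed (NA if mdatools is missing)
print(installed_version("mdatools"))
```

Checking the version afterwards is a quick way to confirm whether the CRAN or the GitHub release ended up in the library.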
- opacity parameter for semi-transparent colors
- plotExtreme() method for SIMCA models
- setResLimits() method for PCA/SIMCA models
- plotProbabilities() method for SIMCA results
- getConfusionMatrix() method for classification results
- plotPrediction() for PLS results
- plotPrediction() for PLS results
- pls.getRegCoeffs() now also returns standard error and confidence intervals calculated for unstandardized variables
- summary() for object with regression coefficients (regcoeffs)
- mdaplot for data frame with one or more factor columns, the factors are now transformed to dummy variables (previously this led to an error)
- mdaplots when using factor with more than 8 levels for color grouping led to an error
- pca with wrong calculation of eigenvalues in NIPALS algorithm
- lab.cex and lab.col are now also applied to colorbar labels
- docs folder
- mdaplot() and mdaplotg() were rewritten completely and are now easier to use (check the tutorial)
- ('d') for density scatter plot
- xlas and ylas in plots to rotate axis ticks
- (plotBiplot())
- (cgroup) if there is no test set
- prep.autoscale() now does not scale columns with a coefficient of variation below a given threshold
- (prep.norm)
- getRegcoeffs was added to PLS model
- cgroup for plots can now work with factors correctly (including ones with text levels)
- lab.col and lab.cex for changing color and font size for data point labels
- ?randtest
- ?crossval
- roxygen2 package
- classres class for representation and visualisation of classification results
- xticklabels and yticklabels to mdaplot and mdaplotg functions
- simca and simcares classes for one-class SIMCA model and results
- simcam and simcamres classes for multiclass SIMCA model and results
- plsda and plsdares classes for PLS-DA model and results
- selectNumComp(model, ncomp) instead of pls.selectncomp(model, ncomp), test.x and test.y instead of Xt and yt; finally, separate logical arguments center and scale are used instead of the previously used autoscale. By default scale = F and center = T.
- (?pls)
- mdaplot or mdaplotg functions, which extend the basic functionality of R plots. For example, they allow making color groups and a colorbar legend, calculating limits automatically depending on the elements on a plot, making an automatic legend and many other things.
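
Several of the plotting options listed above (cgroup, lab.col, lab.cex) can be combined in a single mdaplot() call. The following is a minimal sketch, assuming mdatools is installed and ships the people example dataset. The column choices and the show.labels argument are assumptions made for illustration, and demo_color_groups() is a hypothetical helper name.

```r
# Minimal sketch: scatter plot with color grouping and point labels.
# demo_color_groups() is a hypothetical helper, not part of mdatools.
demo_color_groups <- function() {
  # Load the example dataset assumed to be bundled with mdatools
  data("people", package = "mdatools", envir = environment())

  mdatools::mdaplot(
    people[, 1:2],         # first two columns are used as x and y
    type = "p",            # scatter plot
    cgroup = people[, 7],  # color the points by the values of column 7
    show.labels = TRUE,    # label each data point (assumed argument)
    lab.col = "gray40",    # color of the data point labels
    lab.cex = 0.7          # font size of the data point labels
  )
}

# The guard keeps the script harmless when mdatools is not installed
if (requireNamespace("mdatools", quietly = TRUE)) {
  demo_color_groups()
}
```

With a numeric cgroup the plot is expected to include a colorbar legend, and the limits are computed from the plotted points rather than set by hand.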