A collection of functions for top-down exploratory data analysis of spectral data obtained via nuclear magnetic resonance (NMR), infrared (IR) or Raman spectroscopy. Includes functions for plotting and inspecting spectra, peak alignment, hierarchical cluster analysis (HCA), principal components analysis (PCA) and model-based clustering. Robust methods appropriate for this type of high-dimensional data are available. ChemoSpec is designed with metabolomics data sets in mind, where the samples fall into groups such as treatment and control. Graphical output is formatted consistently for publication quality plots. ChemoSpec is intended to be very user friendly and help you get usable results quickly. A vignette covering typical operations is available.
NEWS file for package ChemoSpec Exploratory Chemometrics for Spectroscopy Bryan A. Hanson DePauw University, Greencastle Indiana USA URL: github.com/bryanhanson/ChemoSpec
Changes in version 4.4.1 2016-12-28 + check4Gaps overhauled and simplified. + Improved the way sumSpectra guesses at a tol value to send to check4Gaps. + Improved the way check4Gaps uses the tol value. + plotSpectraJS gains a "which" argument. + Removed function readJDX. There is now a package by that name which is much more powerful and will be used when needed. + Removed CITATION file. + Added Mike Bostock as a ctb in Authors@R. + Updated package references in vignette.
Changes in version 4.3.139 2016-08-06 + Further tweaks to namespace.
Changes in version 4.3.130 2016-08-05 + removeFreq needed a drop = FALSE argument for the case where there was just one sample. Reported by Daniel Montiel-Chicharro. + NOTE: check4Gaps may not identify gaps when only 1 spectrum is present. Need to follow up. + More clean up of namespace due to now using roxygen.
Changes in version 4.3.128 2016-07-21 + Cleaned up a lot of formatting & style issues in documentation. Merged documentation on selected functions. Miscellaneous improvements in documentation. + removeSample now checks for additional data. + sampleDistPlot was not plotting (!). Fixed. Also was not passing ... Fixed. + Converted documentation to roxygen2.
Changes in version 4.3.37 2016-07-18 + plotSpectra gains a showGrid argument, and any grid drawn is now underneath the spectra. Suggested by Jason James.
Changes in version 4.3.34 2016-07-15 + removeSample needed a drop = FALSE argument so one could remove all samples but one. + files2SpectraObject inherits the behavior of files2SpectraObject2 and is the new default. Message added to function warning of changes. + plotSpectraJS help file gains a note about NMR spectra. + No changes yet to readJDX: needs further testing.
Changes in version 4.3.30 2016-04-07 + Failed to export files2SpectraObject2: fixed it. + Some behind-the-scene changes to readJDX, working towards being able to import more formats.
Changes in version 4.3.28 2016-04-02 + Added function files2SpectraObject2 which offers much more flexibility for file importing. This new behavior will replace the files2SpectraObject function some time in summer 2016. Please test the new function!
Changes in version 4.3.22 2016-03-27 + Removed the help file about building a Spectra object by hand, and added a new function matrix2SpectraObject.
Changes in version 4.3.19 2016-03-26 + Added help file explaining how to build a Spectra object from a data matrix.
Changes in version 4.3.17 2016-03-24 + Improved error message from plotScores. + sumSpectra was not printing the group summary. Fixed. + The method of setting the optional tol argument in sumSpectra was improved. In truth, it looks like the old method was not passed to check4Gaps. Logic fixed. + New normalization option added to normSpectra: zero2one norms each spectrum separately to a [0...1] scale. Suggested by Nacho Pachon Jimenez. + Added function sgfSpectra to provide a convenient way to apply Savitzky-Golay filters to a Spectra object. + files2SpectraObject can now pass arguments properly through the ... mechanism. + Added another method for baseline correction to baselineSpectra ("linear"). + Fixed some logic in surveySpectra which led to warnings from lattice if by.gr = TRUE (long standing problem which was due to how groups with fewer than 3 members were handled). Plots no longer have empty panels for levels with insufficient data. As is often the case, there was more than one problem under the hood: groups were not being removed correctly and scales were specified incorrectly. + Added function plotSpectraDist which plots the distance between a reference spectrum and all the other spectra. Suggested by Juan Carlos Torres. + Minor updates to vignette. + Built and checked against R 3.4 r70368.
Changes in version 4.1.15 2015-07-23 + Labeling in surveySpectra2 changed (removed "relative"). + getMaxCovByFreq renamed to sortCrossPeaks. The function now ignores (flexibly) peaks near the diagonal, where the covariance is actually the variance of the peak. The column names of the return value were made clearer. Rd file adjusted accordingly. + collapseRowsOrCols is a helper function which is currently not exported or documented. + Function cleanSTOCSYpeaks added. + corSpectraJS now uses exCon2 rather than exCon. + Examples using corSpectraJS modified to not require js and V8. + Documentation of plotSpectraJS, covSpectraJS argument minify expanded to clarify the dependence on js and V8. + In functions corSpectra, covSpectra and covSpectraJS argument C was renamed to R. + hcaSpectra, hcaScores and plotHCA gain a leg.loc argument similar to plotScores. + plot2Loadings now returns a data frame containing the loadings. + sPlotSpectra now returns the frequency along with cov and cor. + labelExtremes, which serves several higher level functions, now formats frequencies to 2 decimal places. Previous method was giving too many digits. + sumSpectra now tries to guess a reasonable value for tol which it passes to check4Gaps, if none is provided. + NAMESPACE adjusted to conform to R 3.3 devel, coming to a mirror near you.
Changes in version 4.0.6 2015-06-19 + Package description re-written at CRAN's request to explain acronyms.
Changes in version 4.0.5 2015-06-18 + Internal workings of hmapSpectra changed due to changes in the underlying function in package seriation. + The reference grid draw by corSpectra if pmode = "contour" and drawGrid = TRUE is now draw behind the contours, not on top. + If using getMaxCovByFreq please see under Details in the help page.
Changes in version 4.0.4 2015-06-08 + Improved y-axis tick label formatting in corSpectra.
Changes in version 4.0.3 2015-06-07 + Fixed error in getMaxCovByFreq - the value of Quan was not being applied to the correct column. + covSpectra gains an argument yFree, which defaults to FALSE. In this case, the y axis is scaled to the range of the entire covariance matrix. If FALSE, the y axis is scaled to the data specified by argument freq. + For corSpectra, with pmode = "contour" and "image" only, the calculation of tick marks was improved. + corSpectra gains a drawGrid argument, which draws a faint gray grid on the plot to help with identifying shifts (for pmode = "contour" only).
Changes in version 4.0.2 2015-05-27 + Small tweaks to layout of plotSpectraJS + Added function covSpectraJS. + Added function surveySpectra2. + The method of computing colors in covSpectra and covSpectraJS is now based upon the average correlation value between two contiguous points. Previously it had been based on point n and not n & n + 1. + The output of getMaxCovByFreq now includes the signed covariance values. + Fixed some small internal issues in plotSpectraJS that were not noticeable to the end user. + Fixed the mouse / cursor tracking in plotSpectraJS.
Changes in version 4.0.1 2015-03-16 + Changes to imports and namespace to reduce overhead. + Fixed example in hmapSpectra.Rd. + Added R >= 3.1 due to needing anyNA in dependency speaq.
Changes in version 4.0-0 2015-03-10 + Added function plotSpectraJS for interactive plotting of spectra. + Added function corSpectra. + Added function covSpectra. + Added function sampleDistSpectra. + Added function getMaxCovByFreq. + Added cosine distance as an option to rowDist. + Revised evalClusters to use either NbClust or clusterCrit packages. + Several functions were renamed for consistency. YOUR CODE MAY BREAK. LoopThruSpectra -> loopThruSpectra, specSurvey -> surveySpectra, baselineSpec -> baselineSpectra, binBuck -> binSpectra, classPCA -> c_pcaSpectra, robPCA -> r_pcaSpectra, bootPCA -> cv_pcaSpectra, aovPCA -> aov_pcaSpectra. + Data set CuticleIR was removed. + Data sets alignMUD, metMUD1, metMUD2 added. + Removed alignTMS, calcSN, and modified files2SpectraObject. Spectra can now be aligned using clupaSpectra. + Removed functions that import Bruker files as they have not had enough testing. + First argument in baselineSpectra was Spectra, changed to spectra for consistency. + Fixed a bug in the title of loopThruSpectra. + Many updates to the documentation. + Revised package description. + Moved many full imports to importFrom to narrow namespace. + IDPmisc moved to imports; needed only for rfbaseline + Fixed a bad URL in plotScree.Rd. + Changed formatting of NEWS file so that the built-in reader parses it better.
Changes in version 3.0-1 2015-01-21 + Fixed a namespace conflict from baseline.
Changes in version 3.0-0 2015-01-20 + sumSpectra now checks for additional data and reports if it finds them. + removeGroup now checks for additional data and reports the indices to be removed in case the user wants to manually complete the editing. + Added probabalistic quotient normalization (PQN) to normSpectra, and made it the default. + Functions hcaSpectra, hcaScores, and plotHCA have been reorganized internally as far as which function does what. The return values are now different! Your code may break! + Preliminary version of evalClusters added. + At Last: NMR peak alignment is now available in function clupaSpectra, made possible by the availability of the package speaq. speaq must be installed from Bioconductor. + Updated URLs in Rd files. + Updated CITATION file for CRAN changes. + Vignette reorganized and updated. + Removed the last frequency in SrE.IR as the intensity values were all zero.
Changes in version 2.0-4 2014-09-29 + Fixed importing of csv files with integer frequency values in files2SpectraObject (thought this was fixed before).
Changes in version 2.0-3 2014-07-25 + removeGroup was not exported, now it is! + plotScores was given a new argument, leg.loc, to allow flexibility in placement of the legend.
Changes in version 2.0-2 2014-05-31 + This version is on CRAN. + hcaSpectra now returns the dendrogram object. + Fixed a stray graphic parameter in aovPCAscores. + Improved some documentation.
Changes in version 2.0-1 2014-02-04 + With format = "dx" in files2SpectraObject, now allow any of the following extensions: dx, DX, jdx, JDX.
Changes in version 2.0.0 2013-12-07 + normSpectra modified to avoid integer overflow. + Modified plotScree & plotScree2 so that data sets with less than 10 components plot. + Freshened up the README.md file. + To facilitate using expressions in plot titles, argument 'title' has been removed from all functions that use base or lattice graphics and now one should use 'main'. The 'title' argument remains for the rgl graphics functions. THIS WILL BREAK EARLIER CODE, hence the move to a new major version. + Semantic versioning is now in effect. See semver.org for details. + getManyCsv has been removed (deprecated for one year now). Use files2SpectraObject instead.
Changes in version 1.61-4 2013-11-20 + files2SpectraObject now returns the Spectra object so it is no longer necessary to read the object back into the workspace. + Added a new processing option to files2SpectraObject. Files written with Bruker Topspin command convbin2asc can now be read into a Spectra object. The format is 'Bascii'. When debug = TRUE the messages are clean and consistent; messages from other formats need to be cleaned up. alignTMS does not yet work!!!! <-- NOTE + findTMS and calcSN gained a debug argument.
Changes in version 1.61-3 2013-09-20 + Corrected vignette engine typo.
Changes in version 1.61-2 2013-09-11 + Added knitr as vignette engine at CRAN request. + Added .Rbuildignore at CRAN request (visible version is R.buildignore). + Removed leftover vignette in inst/doc.
Changes in version 1.61-1 2013-09-02 + Made sure plot labels were bold if an expression was passed to it.
Changes in version 1.61-0 2013-08-28 + Added ability to use expressions in plot titles. + Modified readJDX to accept EU and Latin American style dx files which have ',' instead of '.' as the decimal point.
Changes in version 1.60-9 2013-08-16 + Added dependency on R >= 3.0 to pass all CRAN checks.
Changes in version 1.60-8 2013-08-14 + Finally the vignette suitable for CRAN!
Changes in version 1.60-7 2013-08-14 + Removed BuildVignettes from DESCRIPTION per CRAN.
Changes in version 1.60-6 2013-08-13 + Moved vignette to vignettes directory for CRAN. + Set BuildVignettes: FALSE in DESCRIPTION.
Changes in version 1.60-5 2013-08-13 + Modified chkSpectra so that it reports on any extra data in the Spectra object. This will be useful with hyperChemBridge. + Added bug reports to DESCRIPTION + Fixed some import issues for CRAN
Changes in version 1.60-4 2013-04-02 + Fix hypTestScores for CRAN; remove attach/detach.
Changes in version 1.60-3 2013-03-29 + Ready for R 3.0.0 + Devel version merged in + Removed Suggests: rggobi
Changes in version 1.60-2 2013-02-04 + Various documentation tweaks and improvement to github site. + Removed plotScoresG due to continuing lack of universal ggobi availability. + Built using R 3.0 RC
Changes in version 1.60-1 2012-12-28 + Added function readJDX which can read the IR implementation of the JCAMP-DX standard. + Added function readBrukerTxt which can read NMR txt files from Bruker instruments. + Added function files2SpectraObject which is a greatly enhanced means of importing more spectral formats. + Added functions calcSN and findTMS which work with the above functions. + Modified getManyCsv so that it warns about deprecation when run.
Changes in version 1.51-2 + Added ... to sumSpectra so one can pass argument tol through to check4Gaps. Otherwise, for some data sets a gap is detected between each data point, which is annoying at best. + Made modifications so that ChemoSpec works with mclust 4.0. These were begun in 1.51-1 but not completed until this version.
Changes in version 1.51-1 2012-07-11 + Added ByteCompile: TRUE to description + Added code to getManyCsv to detect frequency values that are integer. Such cases are converted to numeric with a notice. + Modifed coordProjCS to conform to the update mclust package (v 4.?). Thanks to Luca Scrucca for the heads up.
Changes in version 1.51-0 2012-05-23 + First build under R 2.15.0. + Removed mvbutils as a dependency. It was only being used in the vignette, which is no longer being built automatically. So it was not needed, and people were occasionally running into problems due to it. + Added IDPmisc as an import. This was being requested at various points. Removed from Suggests. + Added new function removeGroup. See the help page. + removeSample had chkSpectra added right before returning the new spectra object. + Removed hardwired xlim and ylim from sPlotSpectra. Some labels may now be cut off with the defaults, but this allows zooming into the interesting parts. + Added function plotScree2. See the help page. + Many small improvements to the documentation. + Updated the vignette.
Changes in version 1.50-2 2012-01-04 + Added IDPmisc to Suggests: in DESCRIPTION (needed to pass CRAN checks)
Changes in version 1.50-1 2012-02-02 + Added full support for baseline correction via baselineSpec, which is a simple wrapper for the very nice baseline package. + Completely updated the vignette; switched the data demo set to SrE.IR.
Changes in version 1.49-1 2011-12-23 + Currently on Github devel branch only (never pushed to CRAN) + Added new function LoopThruSpectra + Changed plotSpectra so that the y axis shows and the default ylim allows for values below zero. + Made the NAMESPACE exports explicit.
Changes in version 1.48-5 2011-11-09 + Consider this a pre-release version. Will go to Github, not CRAN + Made some small changes for consistency on package checking.
Changes in version 1.48-4 2011-11-2 + Removed an outdated dependency on ggplot2 to pass CRAN and a few other tiny fixes. + Fixed a boo-boo in the vignette which plotted the wrong graph for Example 4. Why doesn't anyone tell me about these!?
Changes in version 1.48-3 2011-10-30 + A few changes to package structure for compatibility with R 2.14.
Changes in version 1.48-2 2011-10-19 + Some small user-invisible changes to make ChemoSpec compatible with R 2.14.0 due out in a few days.
Changes in version 1.48-1 2011-10-19 + NOTE: With these changes, some old code may break! Just a little� + Thanks to Roberto Canteri (RC) of Fondazione Bruno Kessler for bringing some small bugs to my attention, and making some very useful suggestions. Details below. + Added a method argument to normSpectra to pave the way for further options. + Converted sPlotSpectra to use base graphics (instead of ggplot2) so that existing functions could be used to label points extreme points. Thus an argument "tol" was added. + Fixed a few help file issues. + Fixed a bug in plotHCA (a mis-placed parenthesis; noted by RC) + Added a function rowDist, written and suggested by RC, to give more options for distance calculations when doing HCA. Changed functions hcaScores and hcaSpectra to use this new function, which required that the old argument method be renamed c.method and a new argument called d.method be added. The full range of distance methods and clustering methods can now be employed. + Changed some aspects of plotScores to handle the cases where the number of members of the group was 3 or less differently (suggested by RC). I don't think there are any changes to what the user sees, except possibly some warnings may have improved. + Changed groupNcolor so that it does not need to re-read the files in the directory to assign the groups. This may make it easier to semi-manually build a Spectra object. + Changed groupNcolor to issue a warning if no match was found for a gr.crit value among the file names, or vice versa. + Fixed a bug in check4Gaps wherein the FALSE answer, i.e. no gaps, was only set properly if silent = FALSE. Oddly, this caused the wrong answer in the help file example but had not been detected until now! + Changed the binData code, as suggested by RC, to chop off points differently. Previously, there had been a limit of removing 50 points. Now there is no limit, and the number of points removed is reported to the console. Tested pretty thoroughly. + Made a small change to sumGroups (grep -> which) at the recommendation of RC.
Changes in version 1.47-1 2011-07-28 + Fixed a few things needed to pass CRAN checks.
Changes in version 1.47-0 2011-07-26 + Fixed a problem with splitSpectraGroups which caused an error when only one instruction was passed to the function. + Also modified splitSpectraGroups to allow a new set of colors to be assigned based upon the splitting instructions (when only one splitting instruction is given). This is a different way of changing colors than conColScheme. + Changed plotScores so that the key to the type of ellipses drawn and the description of the type of PCA conducted are both in the upper lefthand corner of the plot. This avoids having the two pieces of text overlap which occurred when physically small plots were created. + Added a centering argument to classPCA. This is for compatibility with the aovPCA series of functions which need to carry out PCA but have already centered the data matrix. The default is still to center so no earlier code should break and users really don't need to do anything different. + Added functions aovPCA, avgFacLvls, aovPCAscores and aovPCAloadings, written by Matt Keinsley. These functions carry out ANOVA-PCA as described by Harrington. See the help pages. + Added function sPlotSpectra which implements an s-plot which is a useful adjunct to loadings plots (written by Matt Keinsley). + Data set gasNIR was removed.
Changes in version 1.46-4 2011-03-26 + Fixed a few MORE problems giving CRAN trouble!
Changes in version 1.46-3 2011-03-24 + Fixed a few little problems giving CRAN trouble
Changes in version 1.46-2 2011-03-24 + Improved the help file for getManyCsv. + Something is up with the orthogonal distance plot or calculation. Need to fix. + Fixed a few little problems giving CRAN trouble
Changes in version 1.46-1 2010-12-21 + Fixed a typo which caused an error in an example.
Changes in version 1.46 2010-8-27 + Modified classPCA and robPCA so that they no longer normalized samples by rows. Now users must normalize manually if desired, using normSpectra. + Added a function splitSpectra groups which allows one to use the existing spectra$groups entries to create new groups, stored in new elements of the Spectra object. + Added a function hypTestScores which facilitates carrying out hypothesis tests on PCA results. + Added function hmapSpectra which creates a seriated heat map of samples and spectral data. + WARNING: Several of the vignette examples seem to be behaving strangely. May be a problem with CuticleIR, the code seems OK. + 51 Functions total.
Changes in version 1.45 2010-08-04 + No changes; version 1.44 didn't seem to build correctly.
Changes in version 1.44 2010-07-13 + When using robPCA with choice "mad" a warning is issued which comes from the latest version of package pcaPP. It only concerns performance (speed) and may be ignored. + ggobi is on "life support" and hence development of plotScoresG is on hold until ggobi comes back to life or is replaced. Related to this, some of the color schemes presented in the vignette and discussed in relationship to ggobi appear to have been changed. + Added warnings to specSurvey if there are 1-3 group members, as this makes interpreation of the descriptive statistics tricky. If there are 3 or fewer group members, the survey results for this group are dropped from the display and a warning is issued. + Made significant changes to removeFreq. There was a problem with how NMR data were handled; this led to a significant re-write. Everything seems to test out now. + Added two data sets: SrE.IR and SrE.NMR, see the documentation. + Fixed a problem with the subtitle in plotLoadings. + Fixed a few typos of course. + Still 48 functions.
Changes in version 1.43 2009-12-27 + A revised version of specSurvey was implemented which permits one to use not only sd, but also sem, iqr and mad to examine the data sets. In addition, one can look at each group of spectra separately, or analyze the entire data set. + Fixed a axis labeling error in plotScoresRGL & plotScores3D + Fixed an error in getManyCsv in which the entire data matrix in the final Spectra object originated from the first spectrum. This had been introduced when format = "csv2" was added in 1.42. + 48 functions total
Changes in version 1.42 2009-12-21 + Added optional outlier labeling to plotScoresRGL. + Added robust or classical confidence ellipses as an option to plotScoresRGL and plotScores3D. + Further development of plotScoresG is on hold until rggobi is rebuilt for R 2.10. When it returns, will add use.sym and confidence ellipses to it. + Added mclustSpectra, a wrapper to mclust methods. + Added mclust3D, a function which permits 3D display of groups identified by mclust. Either robust or classical confidence ellipses are displayed. mclust3dSpectra is the front end in the package. + Several support functions added (labelExtremes3d, normVec, makeEllipsoid). + Added a format option "csv2" to getManyCsv for folks in the EU. Thanks to Richard Pena bringing this to my attention. + Updated all help files. + Expanded vignette. + 43 Functions total
Changes in version 1.41 2009-12-09 + Completed implementation of the use.sym argument to plotting routines which require it. + 36 Functions total.
Changes in version 1.40 2009-12-09 + Added the first version of the vignette, which lacks detailed references. + Added man page colorSymbol.Rd to collect all recommendations about color and its use in one place. + Added a function to convert color schemes of an existing Spectra object into a new color scheme (conColScheme). + Expanded the creation and use of Spectra objects to include plotting symbols and alternate plotting symbols for each group, randomly assigned. These are needed for b/w plots and also for viewing by color-blind individuals. The relevant functions now have argument use.sym = FALSE which controls this feature. + Fixed an error in how colors and groups were matched up - it was being done differently in different functions. Function sumGroups now handles this task for consistency. + Updated the help files. + The GitHub site now gets a .tar.gz source file for Mac users, and a .zip file for Windows users. + Also included is a file with installation instructions.
Changes in version 1.31 2009-11-30 + Added function plotScoresRGL which does interactive 3D plots of scores using the rgl package. This is an alternative to plotScoresG which uses ggobi and rggobi. Necessitates new dependency on rgl. + Added data set gasNIR.
Changes in version 1.30 2009-11-25 + Fixed certain .Rd files so the reference manual prints correctly. + Fixed a problem in plotSpectra having to do with the placement of the label if the frequency range was descending. + Removed default xrange in plotSpectra so that users can specify their own xlim via the ... concept. + Added function binData and split trimNbin into removeFreq and binBuck; in turn, this required adding a check4Gaps function. + Changed sumSpectra to report the number of frequency points, report any gaps in the data, and report the frequency unit/data point correctly. Also set it up so that users can pass ... arguments. + Added a function to normalize Spectra objects by the sum of each data row (other normalization schemes can be added later). + Added a helper function isWholeNo. + Revised the Spectra object description standard so that unit is only two characters long (NMR = logical was eliminated as it was not used anywhere). + Added and uppdated Rd files.
Changes in version 1.2 2009-10-14 + Added plotScoresG to create ggobi visualizations of the pca results. + Made color schemes consistent with ggobi color schemes. + Clarified and expanded selected documentation files.
Changes in version 1.1 2009-09-23 + Modified colLeaf so that the leaf label size would vary with the number of leaves to be plotted. + Added ... to plot arguments in HCA so that user can pass additional plotting parameters. + Added function plotSpectra which was overlooked in the first build! + Added a description of Spectra objects to the man pages. + Added function plotScores3D to produce simple 3D plots. + Added plotScoresG to create ggobi visualizations of the pca results. + Made color schemes consistent with ggobi color schemes. + Clarified and expanded selected documentation files.
Changes in version 1.0 2009-09-08 + First Release