Compositional analysis of differentially expressed proteins in
cancer and cell culture proteomics experiments. The data include lists of up-
and down-regulated proteins in different cancer types (breast, colorectal,
liver, lung, pancreatic, prostate) and laboratory conditions (hypoxia,
hyperosmotic stress, high glucose, 3D cell culture, and proteins secreted in
hypoxia), together with amino acid compositions computed for protein sequences
obtained from UniProt. Functions are provided to calculate compositional metrics
including protein length, carbon oxidation state, and stoichiometric hydration
state. In addition, phylostrata (evolutionary ages) of protein-coding genes are
compiled using data from Liebeskind et al. (2016).
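As a rough illustration of one of the metrics named above, the following sketch computes the average oxidation state of carbon (ZC) from an elemental formula. It assumes the commonly used expression ZC = (-h + 3n + 2o + 2s + z) / c for a species with formula CcHhNnOoSs and net charge z. The names carbon_oxidation_state, aa_formulas, and peptide_formula are hypothetical helpers written for this example and are not part of canprot.

```r
# Average oxidation state of carbon for a formula C_c H_h N_n O_o S_s with charge z.
# Hypothetical helper for illustration only; not a canprot function.
carbon_oxidation_state <- function(c, h, n, o, s = 0, z = 0) {
  (-h + 3 * n + 2 * o + 2 * s + z) / c
}

# Elemental counts of a few neutral amino acids (rows) by element (columns)
aa_formulas <- rbind(
  Gly = c(C = 2, H = 5,  N = 1, O = 2, S = 0),
  Ala = c(C = 3, H = 7,  N = 1, O = 2, S = 0),
  Cys = c(C = 3, H = 7,  N = 1, O = 2, S = 1),
  Leu = c(C = 6, H = 13, N = 1, O = 2, S = 0)
)

# Formula of a peptide from named amino acid counts: sum the amino acid
# formulas, then remove one H2O for each peptide bond
peptide_formula <- function(counts) {
  total <- colSums(aa_formulas[names(counts), , drop = FALSE] * counts)
  n_bonds <- sum(counts) - 1
  total["H"] <- total["H"] - 2 * n_bonds
  total["O"] <- total["O"] - n_bonds
  total
}

# Dipeptide Gly-Ala has the formula C5H10N2O3
gly_ala <- peptide_formula(c(Gly = 1, Ala = 1))
zc_gly_ala <- carbon_oxidation_state(
  gly_ala[["C"]], gly_ala[["H"]], gly_ala[["N"]], gly_ala[["O"]], gly_ala[["S"]]
)
```

Removing H2O leaves the numerator of this expression unchanged, so under this definition ZC of a peptide depends only on its amino acid composition and not on the number of peptide bonds.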
Datasets are collected here for differentially (up- and down-) expressed proteins identified in proteomic studies of cancer and in cell culture experiments. Tables of amino acid compositions of proteins are used for calculations of chemical composition, projected into selected basis species. Plotting functions are used to visualize the compositional differences and thermodynamic potentials for proteomic transformations.
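The projection of a chemical formula into basis species can be sketched as a small linear-algebra problem. The example below assumes a basis of glutamine, glutamic acid, cysteine, H2O, and O2 (taken here to be what the "QEC" label in setbasis() stands for) and solves for the number of each basis species needed to form alanine. The names basis and project_formula are hypothetical and written only for this illustration; the package itself may perform the calculation differently.

```r
# Elemental composition of the assumed basis species (rows) by element (columns)
basis <- rbind(
  Gln = c(C = 5, H = 10, N = 2, O = 3, S = 0),  # glutamine
  Glu = c(C = 5, H = 9,  N = 1, O = 4, S = 0),  # glutamic acid
  Cys = c(C = 3, H = 7,  N = 1, O = 2, S = 1),  # cysteine
  H2O = c(C = 0, H = 2,  N = 0, O = 1, S = 0),
  O2  = c(C = 0, H = 0,  N = 0, O = 2, S = 0)
)

# Solve t(basis) %*% coeffs = formula for the stoichiometric coefficients.
# Hypothetical helper for illustration only; not a canprot function.
project_formula <- function(formula, basis) {
  coeffs <- solve(t(basis), formula[colnames(basis)])
  setNames(as.numeric(coeffs), rownames(basis))
}

# Alanine, C3H7NO2
alanine <- c(C = 3, H = 7, N = 1, O = 2, S = 0)
ala_coeffs <- project_formula(alanine, basis)
print(round(ala_coeffs, 3))
```

In this sketch the coefficient on H2O is the quantity that a hydration-state metric would be built from, and the coefficient on O2 plays the analogous role for oxidation state. Both depend on which basis species are chosen.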
The manual (help pages) and vignettes can be viewed at http://chnosz.net/canprot/html/00Index.html.
First install the devtools package from CRAN:
install.packages("devtools")
Then install canprot from GitHub:
devtools::install_github("jedick/canprot")
To install the package including the vignettes:
devtools::install_github("jedick/canprot", build_vignettes = TRUE)
You may need to re-run this command one or more times. Note that it installs additional R packages as dependencies, and pandoc is also required.
Replace data(canprot) with automatic loading of data when package loads, into an environment that is now an exported object ('canprot').
Because of similar changes in CHNOSZ, we now need library(CHNOSZ) in more places in examples and vignettes.
New function get_comptab() merges and replaces ZC_nH2O() and CNS(), and adds capability to calculate standard molal volumes.
Add protein length ('nAA') as variable in get_comptab().
Add 'mfun' argument to get_comptab() to choose median or mean.
Add 'vars' argument to xsummary() to choose variables to tabulate.
In pdat_ functions, add =NT tag for datasets involving comparisons with normal tissue.
Use precomputed colors to remove colorspace dependency.
DESCRIPTION: Add KernSmooth to Suggests to avoid R CMD check error (it is needed for smoothScatter() in basis_comparison.Rmd).
Add basis_comparison.Rmd and potential_diagrams.Rmd.
New functions groupplots() to make potential diagrams for groups of datasets and mergedplot() to merge those diagrams.
First release on CRAN.
New function CNS() calculates proteomic differences of elemental abundances per residue.
Modify diffplot() to accept output from either ZC_nH2O() or CNS().
Change "AA" and "AA4" in setbasis() to "QEC" and "QEC4"; add "QEC+" (basis including H+).
New export: get_colors().
Plot text labels in diffplot().
Return values in rankplot() and xsummary().
Change chemical activities in setbasis("AA") (use setbasis("AA4") for old ones).
Move protein expression data to extdata/expression/[condition name]/.
Add LXM+16 dataset for colorectal cancer.
Add datasets from 17 studies for pancreatic cancer.
Add datasets from 20 studies for hypoxia or 3D culture.
Add datasets from 13 studies for hyperosmotic stress.
Add 'updates_file' argument to check_ID() and protcomp().
Rename stabplot() to rankplot().
Initial upload to GitHub.
Package development began on 2016-07-03, based on code and data in Supplemental Information Dataset S1 of Dick, 2016 (http://doi.org/10.7717/peerj.2238).
Exported functions (in approximate order of development): "protcomp", "check_ID", "get_pdat", "ZC_nH2O", "CLES", "xsummary", "rankdiff", "stabplot", "Ehplot", "pdat_CRC", "remove_entries", "diffplot", "lapply_canprot".
Datasets in 'canprot' environment: human_base.Rdata (21006 proteins), human_additional.Rdata (71173 proteins), human_extra.csv (72 proteins), uniprot_updates.csv (26 proteins).
Datasets in inst/extdata: AKP+10.csv, BPV+11.csv, JCF+11.csv, JKMF10.csv, KKL+12.csv, KWA+14.csv, KYK+12.csv, LPL+16.csv, MCZ+13.csv, MRK+11.csv, PHL+16.csv, STK+15.csv, UNS+14.csv, WDO+15.csv, WKP+14.csv, WOD+12.csv, WTK+08.csv, XZC+10.csv, YLZ+12.csv, ZYS+10.csv.
Vignettes: data_sources.Rmd, summary_table.Rmd, stability_plots.Rmd.