Proteomics Data Analysis Functions
This collection of functions supports the analysis of mass-spectrometry proteomics experiments and is mostly dedicated to (bottom-up) quantitative (XIC) data.
FASTA-formatted proteomes (e.g. from the UniProt Consortium <10.1093>) can be read with automatic parsing, extracting multiple annotation types (such as species origin, abbreviated gene names, etc.).
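The parsing step can be pictured with a minimal sketch in Python. It is not the package's own reader: the function names `read_fasta` and `parse_uniprot_header` are hypothetical, the sequences are toy fragments, and the second entry is made up. It only assumes the usual UniProt header layout (`>db|accession|entry_name description OS=... GN=...`).

```python
import re

def read_fasta(lines):
    """Collect (header, sequence) pairs from FASTA-formatted lines."""
    records, header, seq = [], None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any
            if header is not None:
                records.append((header, "".join(seq)))
            header, seq = line, []
        else:
            seq.append(line)
    if header is not None:
        records.append((header, "".join(seq)))
    return records

def parse_uniprot_header(header):
    """Split a UniProt-style header into annotation fields (None if it does not match)."""
    m = re.match(r"^>?([^|\s]+)\|([^|]+)\|(\S+)\s*(.*)$", header)
    if m is None:
        return None
    db, accession, entry_name, rest = m.groups()
    # Split in front of two-letter keys such as OS=, OX=, GN=, PE=, SV=
    parts = re.split(r"\s(?=[A-Z]{2}=)", rest)
    fields = dict(p.split("=", 1) for p in parts[1:])
    return {
        "db": db,
        "accession": accession,
        "entry_name": entry_name,
        "protein_name": parts[0],
        "species": fields.get("OS"),
        "gene": fields.get("GN"),
    }

# Toy input: sequences are placeholders, the second entry is invented
example = [
    ">sp|P02768|ALBU_HUMAN Albumin OS=Homo sapiens OX=9606 GN=ALB PE=1 SV=2",
    "MKWVTF",
    "ISLLFL",
    ">sp|X00000|TOY_YEAST Toy protein OS=Saccharomyces cerevisiae GN=TOY1",
    "MSTNPK",
]
records = read_fasta(example)
annotations = [parse_uniprot_header(h) for h, _ in records]
```

Keeping the species field as its own annotation is what later allows multi-species samples to be split into sub-groups.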
Quantitative proteomics measurements frequently contain multiple NA values, due to physical absence of given peptides in some samples, limitations in sensitivity or other reasons.
The functions provided here help to inspect the data graphically, to investigate the nature of NA values via their respective replicate measurements, and to guide or confirm the choice of NA replacement by low random values.
Dedicated filtering and statistical testing using the framework of package 'limma' <10.18129> can be run, enhanced by multiple rounds of NA-replacements to provide robustness towards rare stochastic events.
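The idea of repeated NA replacement can be sketched as follows. This is an illustration under stated assumptions, not the package's procedure: replacement values are drawn from a normal distribution centered on a low quantile of all observed values, a plain difference of group means stands in for the 'limma' test, and all names and parameters (`low_value_params`, `fill_na`, `repeated_imputation_diff`, `quantile`, `sd_scale`) are hypothetical.

```python
import random
import statistics

def low_value_params(pool, quantile=0.05, sd_scale=0.5):
    """Center and spread of the replacement distribution, from all observed values."""
    observed = sorted(v for v in pool if v is not None)
    center = observed[int(quantile * (len(observed) - 1))]
    spread = sd_scale * statistics.stdev(observed)
    return center, spread

def fill_na(values, center, spread, rng):
    """Replace None (NA) by random low values; observed values stay untouched."""
    return [rng.gauss(center, spread) if v is None else v for v in values]

def repeated_imputation_diff(group_a, group_b, pool, n_rounds=50, seed=1):
    """Average the group difference over several independent rounds of NA replacement."""
    rng = random.Random(seed)
    center, spread = low_value_params(pool)
    diffs = []
    for _ in range(n_rounds):
        a = fill_na(group_a, center, spread, rng)
        b = fill_na(group_b, center, spread, rng)
        diffs.append(statistics.mean(a) - statistics.mean(b))
    return statistics.mean(diffs)

# Toy log2 intensities for one protein; None marks an NA
group_a = [20.1, 19.8, 20.3]
group_b = [None, None, 14.9]
pool = [14.2, 15.0, 16.5, 18.0, 19.8, 20.1, 20.3, 21.0, 22.4, 24.0, 14.9]
mean_diff = repeated_imputation_diff(group_a, group_b, pool)
```

Averaging over many rounds means that a single unlucky draw of replacement values has little influence on the final result.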
Multi-species samples, as frequently used in benchmark tests (e.g. Navarro et al 2016 <10.1038>, Ramus et al 2016 <10.1016>), can be run with special options separating the data into sub-groups during normalization and testing.
As an example, the data set from Ramus et al 2016 <10.1016> is provided, quantified by MaxQuant (Tyanova et al 2016 <10.1038>), ProteomeDiscoverer, OpenMS (<10.1038>) and Proline (Bouyssie et al 2020 <10.1093>).
Subsequently, ROC curves (Hand and Till 2001 <10.1023>) can be constructed to compare multiple analysis approaches.
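How a ROC comparison works can be shown with a small sketch. It assumes that the truly changing proteins are known (as in a spike-in benchmark) and that each approach yields one p-value per protein. The function names `roc_points` and `auc` and the toy p-values are hypothetical, and ties between p-values are not treated specially.

```python
def roc_points(p_values, is_true_change):
    """(FPR, TPR) pairs obtained by relaxing the p-value threshold step by step."""
    n_pos = sum(is_true_change)
    n_neg = len(is_true_change) - n_pos
    # Most significant proteins first
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if is_true_change[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Toy benchmark: the first two proteins are the spiked-in (truly changing) ones
truth = [True, True, False, False]
approach_1 = [0.001, 0.01, 0.2, 0.8]   # ranks both true changes first
approach_2 = [0.3, 0.01, 0.2, 0.8]     # ranks one true change behind a false one
auc_1 = auc(roc_points(approach_1, truth))
auc_2 = auc(roc_points(approach_2, truth))
```

An approach whose curve lies closer to the top-left corner (larger area) separates true from false changes better on that benchmark.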