An approach to identify a metabolic biomarker signature from metabolic data by discovering metabolites that predict survival and by classifying patients into risk groups.
Classifiers are constructed as a linear combination of predictive/important metabolites, prognostic factors and treatment effects if necessary.
Several methods are implemented to reduce the dimension of the metabolomics matrix, such as the principal component analysis of Wold et al. (1987).

R package: a biomarker validation approach for predicting survival using a metabolic signature. This package develops biomarker signatures for metabolic data. It contains a set of functions and cross-validation methods to validate and select biomarkers when the outcome of interest is survival. The package takes a metabolite matrix as its main input, can also handle prognostic factors, and can serve as a biomarker validation tool.

- It can be used with any form of high-dimensional/omics data, such as metabolic data or a gene expression matrix. If you do not have data, it can simulate a hypothetical scenario of high-dimensional data based on the desired biological parameters.
- It develops a signature from the high-dimensional data that can be used for other purposes.
- It also employs data reduction techniques such as PCA, PLS and Lasso.
- It classifies subjects into low- and high-risk groups based on the signatures.
- It incorporates subject prognostic information to enhance the biomarker for classification.
- It gives information about the survival rate of subjects depending on the classification.

You can install the released version of MetabolicSurv from CRAN with:

    install.packages("MetabolicSurv")

Apart from survival prediction and classification, MetabolicSurv can also generate an artificial metabolomic profile matrix, survival data (survival time and censoring indicator) and clinical covariates, referred to here as prognostic factors, for use in further analysis or for other purposes. Since there are few publicly available metabolic profile matrices, the package can first be used to simulate each of these datasets, which are required to evaluate the other basic and advanced functions in the package.

    library(MetabolicSurv)
    Data <- MSData(nPatients = 200, nMet = 3000, Prop = 0.5)
    Metdata <- Data$Mdata
    Survdata <- Data$Survival
    Censordata <- Data$Censor
    Progdata <- Data$Prognostic

The code above simulates metabolomic, survival and prognostic data for 200 patients, with 3000 metabolites in the metabolomic profile matrix, assuming that the proportion of low-risk patients is 0.5. The proportion can be adjusted depending on whether equal or unequal group sizes are assumed, based on biological findings or an educated guess. The metabolomic profile matrix is stored in Metdata, the survival time in Survdata, the censoring information in Censordata and the prognostic factors/clinical covariates in Progdata.

Problem of interest:

"Given a set of subjects with known risk scores and prognostic features, how can we use this information to obtain their risk of surviving, and which group does each subject belong to?"

    ## Loading the package
    library("MetabolicSurv")

    ## Loading one of the inbuilt data
    data(DataHR)
    names(DataHR)

    ## This function does Classification, Survival Estimation and Visualization
    Result = EstimateHR(Risk.Scores = DataHR[, 1], Data.Survival = DataHR[, 2:3],
                        Prognostic = DataHR[, 4:5], Plots = FALSE, Quantile = 0.50)

    ## Survival information
    Result$SurvResult

    ## Group information
    Result$Riskgroup
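
To make the idea behind the call above more concrete, here is a minimal sketch of quantile-based risk classification followed by a Cox model. It is not the package's implementation: it uses only the `survival` package, and the simulated data and the variable names (`risk_score`, `age`, `time`, `status`) are assumptions made for illustration. The direction of the split (scores above the cutoff are labelled high risk) is also an assumption.

```r
library(survival)  # recommended package that ships with standard R installations

set.seed(1)
n <- 200

# Hypothetical inputs, simulated only for illustration
risk_score <- rnorm(n)                      # e.g. a signature-based risk score
age        <- rnorm(n, mean = 60, sd = 8)   # one prognostic factor

# Survival times whose hazard increases with the risk score
time   <- rexp(n, rate = 0.1 * exp(0.8 * risk_score))
status <- rbinom(n, size = 1, prob = 0.8)   # 1 = event observed, 0 = censored

# Classify subjects at the chosen quantile of the risk score (0.50 = median)
cutoff <- quantile(risk_score, probs = 0.50)
group  <- factor(ifelse(risk_score > cutoff, "high", "low"),
                 levels = c("low", "high"))

# Cox model: hazard ratio of high vs. low risk, adjusted for the prognostic factor
fit <- coxph(Surv(time, status) ~ group + age)
hr  <- exp(coef(fit))["grouphigh"]

table(group)  # group sizes
hr            # estimated hazard ratio (high vs. low)
```

Changing `probs` in the sketch plays the same role as the `Quantile` argument in the call above: it moves the cutoff and therefore the sizes of the two risk groups.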

| Category | Functions | Description |
|---|---|---|
| Basic | MSpecificCoxPh | Metabolite by metabolite Cox proportional hazard analysis |
| | SurvPcaClass | Classifier based on first PCA |
| | SurvPlsClass | Classifier based on first PLS |
| | Majorityvotes | Classification for Majority Votes |
| | Lasoelacox | Wrapper function for glmnet |
| | MSData | Generate Artificial Metabolic Survival Data |
| Advanced | CVLasoelacox | Cross Validations for Lasso Elastic Net predictive models and Classification |
| | CVSim | Cross-validation for Top $K_{1}, \ldots, K_{n}$ metabolites |
| | CVPcaPls | Cross-validations for PCA and PLS based methods |
| | CvMajorityvotes | Cross-validation for majority votes |
| | MetFreq | Frequency of Selected Metabolites from the Metabolite specific Cross Validation |
| | QuantileAnalysis | Sensitivity of the quantile used for classification |
| | Icvlasoel | Inner and outer cross-validations for shrinkage methods |
| | DistHR | Null distribution of the estimated HR |
| | SIMet | Sequentially increase the number of top $K$ metabolites |
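
As a rough illustration of what a "classifier based on first PCA" can look like, the sketch below reduces a simulated metabolite matrix to its first principal component, uses that component as the only covariate in a Cox model, and splits subjects at the median of the resulting score. This is an assumption-laden sketch in base R plus `survival`, not the code behind SurvPcaClass: the matrix orientation (patients in rows), the simulated data and all variable names are hypothetical.

```r
library(survival)

set.seed(2)
n_pat <- 100
n_met <- 50

# Hypothetical metabolite matrix: rows = patients, columns = metabolites
X      <- matrix(rnorm(n_pat * n_met), nrow = n_pat)
time   <- rexp(n_pat, rate = 0.1)
status <- rbinom(n_pat, size = 1, prob = 0.8)  # 1 = event, 0 = censored

# Reduce the matrix to its first principal component
pc1 <- prcomp(X, center = TRUE, scale. = TRUE)$x[, 1]

# Use the first component as the single covariate in a Cox model
fit <- coxph(Surv(time, status) ~ pc1)

# Risk score = estimated coefficient times the component; split at the median
score <- as.numeric(coef(fit)) * pc1
group <- ifelse(score > median(score), "high", "low")

table(group)
```

Because the simulated metabolites carry no real signal here, the groups are only meaningful as a demonstration of the mechanics; with real data the same steps would normally sit inside a cross-validation loop, as the cross-validation functions in the table suggest.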