Automated Transcriptome Classifier Pipeline: Comprehensive Transcriptome Analysis

An unsupervised, fully automated pipeline for transcriptome analysis, with a supervised option to identify characteristic genes from predefined subclasses. We rely on the 'pamr' <http://www.bioconductor.org/packages//2.7/bioc/html/pamr.html> clustering algorithm to cluster the data and then draw a heatmap of the clusters with the most significant and the least significant genes according to the 'pamr' algorithm. The resulting heatmaps are easy to grasp and show, for each cluster, which genes define it most strongly.


Transcriptomics is the large-scale identification of gene expression across multiple samples. Gene expression mirrors functional aspects and carries important information about biological functions and pathway activation. Its analysis can uncover molecular functions on the one hand and improve the classification of large cohorts for better clinical understanding on the other. This tool aims to provide a standard pipeline that integrates classification and functional aspects and generates a visual output combining transcriptomic data, clinical information and Gene Set Enrichment Analysis.

The pipeline was designed to address the following aspects:

Reproducibility: Analyses need to be easily reproducible by external researchers.

Easy-to-Use: The pipeline was designed to be user-friendly and usable by non-expert users.

Compatibility: The pipeline should work with array-based transcriptomic data as well as RNA sequencing outputs. For further clinical interpretation, external traits need to be easy to integrate into the analysis.

How to install the package from GitHub

Install with devtools

install.packages("devtools")
library(devtools)
install_github("falafel19/AutoPipe")

Unsupervised Cluster Analysis

Description

A function for unsupervised clustering of the data.

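library(AutoPipe)

# Load data with Entrez gene IDs in rownames and samples in colnames
# ('rna' is the example dataset also used in the supervised example)
data(rna)
data <- rna
dim(data)

# Optional: read in clinical information with samples in rownames

UnSuperClassifier(data, clinical_data=NULL, thr=2)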
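
The input layout described in the comments above (Entrez gene IDs in rownames, samples in colnames, and optional clinical information with samples in rownames) can be sketched with a small synthetic example. This is only an illustration in base R: the object names `expr` and `clinical`, the clinical columns `age` and `sex`, and all values are hypothetical and not part of the package.

```r
# Hypothetical toy data, for illustration only
set.seed(1)
n_genes <- 6
n_samples <- 4

# Expression matrix: genes in rows, samples in columns
expr <- matrix(rnorm(n_genes * n_samples), nrow = n_genes, ncol = n_samples)
rownames(expr) <- c("1", "2", "9", "10", "12", "13")  # Entrez gene IDs as character strings
colnames(expr) <- paste0("sample_", seq_len(n_samples))

# Clinical information: one row per sample, sample names in rownames
# (the columns 'age' and 'sex' are assumed examples of external traits)
clinical <- data.frame(
  age = c(54, 61, 47, 70),
  sex = factor(c("f", "m", "m", "f")),
  row.names = colnames(expr)
)

# Sanity checks before running any analysis
stopifnot(identical(rownames(clinical), colnames(expr)))  # same samples, same order
stopifnot(!anyDuplicated(rownames(expr)))                 # gene IDs are unique
```

A real expression matrix would contain thousands of genes. Objects laid out this way would presumably be passed as the first argument and as `clinical_data` in the call shown above.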

Produce a Heatmap using a Supervised Clustering Algorithm

Description

This function produces a heatmap using a supervised clustering algorithm chosen by the user. The mean silhouette width is plotted in the top right corner and the silhouette width of each sample is shown on top. On the right side of the plot, the n highest and lowest scoring genes for each cluster are added, with the corresponding pathways next to them (see Details).
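
To make the silhouette quantities mentioned above concrete, here is a minimal sketch using `pam()` and `silhouette()` from the 'cluster' package (listed among the package imports). It is not the pipeline's internal code: the toy matrix `toy` and all variable names are hypothetical, and the samples are placed in rows here because `pam()` clusters rows.

```r
library(cluster)  # provides pam() and silhouette()

set.seed(42)
# Hypothetical toy data: 20 samples (rows) in 2 well-separated groups, 5 features
toy <- rbind(
  matrix(rnorm(10 * 5, mean = 0), nrow = 10),
  matrix(rnorm(10 * 5, mean = 5), nrow = 10)
)
rownames(toy) <- paste0("sample_", seq_len(nrow(toy)))

fit <- pam(toy, k = 2)   # partition the samples around 2 medoids
sil <- silhouette(fit)   # one row per sample: cluster, neighbor, sil_width

sample_widths <- sil[, "sil_width"]  # silhouette width of each sample
mean_width <- mean(sample_widths)    # mean silhouette width of the clustering

# Samples with a negative width lie closer to a neighbouring cluster
# than to the cluster they were assigned to
poorly_assigned <- rownames(sil)[sample_widths < 0]
```

A silhouette width close to 1 indicates a sample that sits well inside its cluster, while a negative value indicates a sample that would fit a neighbouring cluster better.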

## load the AutoPipe and org.Hs.eg.db libraries
library(AutoPipe)
library(org.Hs.eg.db)
## load data
data(rna)
me=rna

## calculate the best number of clusters
res<-TopPAM(me, max_clusters = 8, TOP=1000)

me_TOP=res[[1]]
number_of_k=res[[3]]

## Compute the top genes of each cluster; with "TRw", samples with a negative silhouette width can be cut off

File_genes=Groups_Sup(me_TOP, me=me, number_of_k,TRw=-1)

groups_men=File_genes[[2]]
me_x=File_genes[[1]]

# groups_men contains information on each sample and its cluster; this can be adapted for a supervised analysis

o_g<-Supervised_Cluster_Heatmap(groups_men = groups_men, gene_matrix=me_x, method="PAMR",show_sil=TRUE,print_genes=TRUE, TOP = 1000,GSE=TRUE,plot_mean_sil=TRUE,sil_mean=res[[2]])

#Validate with Consensus Cluster or tSNE

cons_clust(me_x,max_clust=8, TOP=1000)
AutoPipe.tSNE(me=me_x)
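
The "calculate the best number of clusters" step in the example above can be illustrated with one common approach: run PAM for several candidate values of k and keep the k with the highest mean silhouette width. Whether `TopPAM` uses exactly this criterion is an assumption here. The sketch relies only on `pam()` from the 'cluster' package, and the toy data and variable names are hypothetical.

```r
library(cluster)  # provides pam()

set.seed(7)
# Hypothetical toy data: 3 well-separated groups of 10 samples (rows), 4 features
toy <- rbind(
  matrix(rnorm(10 * 4, mean = 0), nrow = 10),
  matrix(rnorm(10 * 4, mean = 10), nrow = 10),
  matrix(rnorm(10 * 4, mean = 20), nrow = 10)
)

# Mean silhouette width for each candidate number of clusters
candidate_k <- 2:6
mean_sil <- sapply(candidate_k, function(k) pam(toy, k = k)$silinfo$avg.width)
names(mean_sil) <- candidate_k

# Keep the k with the highest mean silhouette width
best_k <- candidate_k[which.max(mean_sil)]
```

On real expression data the maximum is usually much less pronounced than on this toy example, which is one reason to validate the chosen clustering with a second method such as consensus clustering or tSNE.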


Authors

D. H. Heiland & K. Daka, Translational Research Group, Medical Center Freiburg, University of Freiburg

Install from CRAN

install.packages("AutoPipe")

0.1.6 by Karam Daka, 6 months ago


Browse source code at https://github.com/cran/AutoPipe


Authors: Karam Daka [cre, aut] , Dieter Henrik Heiland [aut]



GPL-3 license


Imports cluster, pamr, siggenes, annotate, fgsea, org.Hs.eg.db, RColorBrewer, ConsensusClusterPlus, Rtsne, clusterProfiler, msigdbr

