The msigdbr R package provides the Molecular Signatures Database (MSigDB) gene sets typically used with the Gene Set Enrichment Analysis (GSEA) software (Subramanian et al. 2005).
Details and examples are described in the vignette.
The package can be installed from CRAN.
install.packages("msigdbr")
Load package.
library(msigdbr)
Check the available species.
msigdbr_show_species()
#>  [1] "Bos taurus"               "Caenorhabditis elegans"   "Canis lupus familiaris"
#>  [4] "Danio rerio"              "Drosophila melanogaster"  "Gallus gallus"
#>  [7] "Homo sapiens"             "Mus musculus"             "Rattus norvegicus"
#> [10] "Saccharomyces cerevisiae" "Sus scrofa"
Retrieve all human gene sets.
m_df = msigdbr(species = "Homo sapiens")
head(m_df)
#> # A tibble: 6 x 9
#>   gs_name    gs_id gs_cat gs_subcat human_gene_symb… species_name entrez_gene gene_symbol
#>   <chr>      <chr> <chr>  <chr>     <chr>            <chr>              <int> <chr>
#> 1 AAACCAC_M… M126… C3     MIR       ABCC4            Homo sapiens       10257 ABCC4
#> 2 AAACCAC_M… M126… C3     MIR       ACTN4            Homo sapiens          81 ACTN4
#> 3 AAACCAC_M… M126… C3     MIR       ACVR1            Homo sapiens          90 ACVR1
#> 4 AAACCAC_M… M126… C3     MIR       ADAM9            Homo sapiens        8754 ADAM9
#> 5 AAACCAC_M… M126… C3     MIR       ADAMTS5          Homo sapiens       11096 ADAMTS5
#> 6 AAACCAC_M… M126… C3     MIR       AGER             Homo sapiens         177 AGER
#> # ... with 1 more variable: sources <chr>
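Many enrichment tools expect gene sets as a named list rather than a long data frame with one gene per row. The sketch below shows one way such a conversion might look in base R. To stay runnable without the package, it uses a small hand-built data frame (toy_df) in place of the real result; the gene set names TOY_SET_A and TOY_SET_B are hypothetical, while the column names and genes are taken from the printed output above.

```r
# Toy stand-in for the long-format data frame returned by msigdbr():
# one row per gene, using the column names shown in the output above.
# The gene set names are hypothetical placeholders, not real MSigDB sets.
toy_df <- data.frame(
  gs_name     = c("TOY_SET_A", "TOY_SET_A", "TOY_SET_A", "TOY_SET_B", "TOY_SET_B"),
  gene_symbol = c("ABCC4", "ACTN4", "ACVR1", "ADAM9", "AGER"),
  entrez_gene = c(10257L, 81L, 90L, 8754L, 177L),
  stringsAsFactors = FALSE
)

# Split the gene symbols by gene set name. The result is a named list
# with one character vector of gene symbols per gene set.
gene_sets <- split(toy_df$gene_symbol, toy_df$gs_name)

# Count the genes in each set.
set_sizes <- lengths(gene_sets)
```

Splitting on entrez_gene instead of gene_symbol would give the same structure keyed by Entrez Gene IDs, which may be preferable when the downstream tool matches on numeric identifiers.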
Retrieve mouse hallmark collection gene sets.
m_df = msigdbr(species = "Mus musculus", category = "H")
head(m_df)
#> # A tibble: 6 x 9
#>   gs_name    gs_id gs_cat gs_subcat human_gene_symb… species_name entrez_gene gene_symbol
#>   <chr>      <chr> <chr>  <chr>     <chr>            <chr>              <int> <chr>
#> 1 HALLMARK_… M5905 H      ""        ABCA1            Mus musculus       11303 Abca1
#> 2 HALLMARK_… M5905 H      ""        ABCB8            Mus musculus       74610 Abcb8
#> 3 HALLMARK_… M5905 H      ""        ACAA2            Mus musculus       52538 Acaa2
#> 4 HALLMARK_… M5905 H      ""        ACADL            Mus musculus       11363 Acadl
#> 5 HALLMARK_… M5905 H      ""        ACADM            Mus musculus       11364 Acadm
#> 6 HALLMARK_… M5905 H      ""        ACADS            Mus musculus       11409 Acads
#> # ... with 1 more variable: sources <chr>
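Because the collection appears in the gs_cat column of every row, a data frame that was retrieved without a category can presumably also be narrowed down afterwards by filtering on that column. The following is a minimal base R sketch of that idea on a hand-built data frame (toy_df); the gene set names and the row assignments are hypothetical, and only the column names and the category codes "H" and "C3" come from the output shown above.

```r
# Toy stand-in for a multi-collection result: the gs_cat column holds the
# collection code for each row. Gene set names here are hypothetical.
toy_df <- data.frame(
  gs_name     = c("TOY_HALLMARK_SET", "TOY_HALLMARK_SET", "TOY_MOTIF_SET"),
  gs_cat      = c("H", "H", "C3"),
  gene_symbol = c("Abca1", "Abcb8", "Abcc4"),
  stringsAsFactors = FALSE
)

# Keep only the rows that belong to the hallmark collection.
hallmark_df <- toy_df[toy_df$gs_cat == "H", ]

# Tabulate rows per collection to see what the full data frame contains.
rows_per_cat <- table(toy_df$gs_cat)
```

Filtering at retrieval time with the category argument avoids building the full data frame first, so the post hoc filter is mainly useful when several collections are needed from one call.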