Analysis of Codon Data under Stationarity using a Bayesian Framework

AnaCoDa is a collection of models for analyzing genome-scale codon data in a Bayesian framework. It provides visualization routines and checkpointing for model fits. The currently published models, which analyze gene data for selection on codon usage based on Ribosome Overhead Cost (ROC), are ROC (Gilchrist et al. (2015)) and ROC with phi (Wallace & Drummond (2013)). In addition, 'AnaCoDa' contains three currently unpublished models. The FONSE (First order approximation On NonSense Error) model analyzes gene data for selection on codon usage against nonsense error rates. The PA (PAusing time) and PANSE (PAusing time + NonSense Error) models use ribosome footprinting data to estimate ribosome pausing times, without and with nonsense error rates respectively.


  • AnaCoDa is a collection of codon models.
  • The release version can be obtained from ...

Examples: Running models

Example 1: Using codon data in the form of CDS in fasta format with one mixture (ROC)

The following example illustrates how to estimate parameters under the ROC model for a given set of protein-coding genes, assuming the same mutation and selection regime for all genes.

library(AnaCoDa)
# Read protein-coding sequences (CDS) from a FASTA file
genome <- initializeGenomeObject(file = "genome.fasta")
# One mixture: every gene is assigned to mixture 1
parameter <- initializeParameterObject(genome = genome, sphi = 1, num.mixtures = 1, gene.assignment = rep(1, length(genome)))
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width=50)
model <- initializeModelObject(parameter = parameter, model = "ROC")
# Run the sampler
runMCMC(mcmc = mcmc, genome = genome, model = model)
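
As a rough guide to run length, the settings above can be read as follows, assuming that `samples` is the number of retained draws and `thinning` is the number of iterations between retained draws. This reading is an assumption worth checking against the reference manual, and `total.iterations` is an illustrative name, not an AnaCoDa object.

```r
# Sketch, assuming `samples` counts retained draws and `thinning` is the
# number of MCMC iterations between two retained draws.
samples <- 5000
thinning <- 10

# Total number of iterations the sampler would perform under that reading.
total.iterations <- samples * thinning
print(total.iterations)
```

Under this reading, the settings in Example 1 correspond to 50,000 iterations, so halving `thinning` would halve the run time while keeping the same number of stored samples.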

Example 2: Using codon data in the form of CDS in fasta format with one mixture (FONSE)

The following example illustrates how to estimate parameters under the FONSE model for a given set of protein-coding genes, assuming the same mutation and selection regime for all genes.

genome <- initializeGenomeObject(file = "genome.fasta")
parameter <- initializeParameterObject(genome = genome, sphi = 1, num.mixtures = 1, gene.assignment = rep(1, length(genome)), model = "FONSE")
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width=50)
model <- initializeModelObject(parameter = parameter, model = "FONSE")
runMCMC(mcmc = mcmc, genome = genome, model = model)

Example 3: Using codon data in the form of Ribosome footprints with one mixture (PA)

The following example illustrates how to estimate parameters under the PA model for a given set of protein-coding genes, assuming the same mutation and selection regime for all genes.

genome <- initializeGenomeObject(file = "rfpcounts.tsv", fasta = FALSE)
parameter <- initializeParameterObject(genome = genome, sphi = 1, num.mixtures = 1, gene.assignment = rep(1, length(genome)), model = "PA")
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width=50)
model <- initializeModelObject(parameter = parameter, model = "PA")
runMCMC(mcmc = mcmc, genome = genome, model = model)
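
Examples 1 to 3 differ only in the input file and the model name, so the shared steps could be collected in a small helper. The sketch below is one way to do that for a single mixture. `fit_codon_model` is a hypothetical name, not an AnaCoDa function, and passing `model` to `initializeParameterObject` reflects the assumption that the parameter object should be created for the same model that is fitted.

```r
# Hypothetical helper (not part of AnaCoDa): wraps the steps shared by
# Examples 1-3 for a single mixture.
fit_codon_model <- function(file, model = "ROC", fasta = TRUE,
                            samples = 5000, thinning = 10,
                            adaptive.width = 50) {
  # Fail early if the input file or the package is missing.
  if (!file.exists(file)) {
    stop("input file not found: ", file)
  }
  if (!requireNamespace("AnaCoDa", quietly = TRUE)) {
    stop("package 'AnaCoDa' is not installed")
  }
  library(AnaCoDa)

  genome <- initializeGenomeObject(file = file, fasta = fasta)
  # Assumption: the parameter object is created for the model being fitted.
  parameter <- initializeParameterObject(
    genome = genome, sphi = 1, num.mixtures = 1,
    gene.assignment = rep(1, length(genome)), model = model
  )
  mcmc <- initializeMCMCObject(samples = samples, thinning = thinning,
                               adaptive.width = adaptive.width)
  model.object <- initializeModelObject(parameter = parameter, model = model)
  runMCMC(mcmc = mcmc, genome = genome, model = model.object)

  # Return the objects needed for later inspection.
  list(genome = genome, parameter = parameter, mcmc = mcmc)
}
```

With this helper, Example 2 would reduce to `fit_codon_model("genome.fasta", model = "FONSE")` and Example 3 to `fit_codon_model("rfpcounts.tsv", model = "PA", fasta = FALSE)`.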

Examples: Advanced examples

  • The examples above illustrate that all models are called in the same way. The following examples use the default ROC model for illustration purposes.

Example 4

  • This example uses multiple mixture distributions, with each gene initially assigned to a mixture distribution at random. The mixture assignment of each gene will be estimated. As the example below shows, only the arguments passed to the parameter object have to be adjusted to reflect a change in the number of assumed mixture distributions.
genome <- initializeGenomeObject(file = "genome.fasta")
parameter <- initializeParameterObject(genome = genome, sphi = c(1,2,3), num.mixtures = 3, gene.assignment = sample(1:3, length(genome), replace=TRUE))
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width=50)
model <- initializeModelObject(parameter = parameter, model = "ROC")
runMCMC(mcmc = mcmc, genome = genome, model = model)
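
Because the initial assignment in Example 4 is drawn with `sample()`, each run starts from a different configuration. The base R sketch below shows one way to make that starting point reproducible and to inspect it before fitting. `n.genes` is a hypothetical stand-in for `length(genome)`, and the seed value is arbitrary.

```r
set.seed(42)        # arbitrary seed, chosen only for illustration
n.genes <- 1000     # hypothetical; stands in for length(genome)
num.mixtures <- 3

# Draw one mixture label per gene, as done for gene.assignment above.
gene.assignment <- sample(1:num.mixtures, n.genes, replace = TRUE)

# Count how many genes start in each mixture.
print(table(gene.assignment))
```

The resulting vector can then be passed as `gene.assignment` to `initializeParameterObject`.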

Example 5

  • This example is based on the previous one, but instead of estimating the assignment of each gene to one of the three mixture distributions, we fix the mixture assignment to the initial assignment.
genome <- initializeGenomeObject(file = "genome.fasta")
parameter <- initializeParameterObject(genome = genome, sphi = c(1,2,3), num.mixtures = 3, gene.assignment = sample(1:3, length(genome), replace=TRUE))
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width=50, est.mix = FALSE)
model <- initializeModelObject(parameter = parameter, model = "ROC")
runMCMC(mcmc = mcmc, genome = genome, model = model)
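
With `est.mix = FALSE` the initial assignment is kept for the whole run, so a randomly drawn assignment is worth recording alongside the results. The base R sketch below writes the assignment to a file and reads it back. The file location, the column names, and `n.genes` are illustrative assumptions, not AnaCoDa conventions.

```r
n.genes <- 1000     # hypothetical; stands in for length(genome)
gene.assignment <- sample(1:3, n.genes, replace = TRUE)

# Store the fixed assignment next to the analysis output.
assignment.file <- tempfile(fileext = ".csv")
write.csv(data.frame(gene = seq_len(n.genes), mixture = gene.assignment),
          assignment.file, row.names = FALSE)

# Read it back, for example to reuse the same assignment in a later run.
restored <- read.csv(assignment.file)$mixture
print(all(restored == gene.assignment))
```

In a real analysis the temporary file would be replaced by a path in the project directory.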

News

CHANGES IN AnaCoDa 0.1.2

BUG FIXES

  • fixed a bug where the scaling of observed phi values was used inconsistently, causing problems with estimates of Aphi and Sepsilon

NEW FEATURES

  • Added SCUO calculation and improved getCSPEstimates to include reference codons

CHANGES IN AnaCoDa 0.1.1

BUG FIXES

  • fixed problem with getCSPEstimates where log scaling was falsely enabled

  • fixed problem where the grouplist was not stored by writeParameterObject

NEW FEATURES

  • Added functions to calculate the Codon Adaptation Index, Effective Number of Codons and selection coefficients.

  • Allow setting initial phi values based on observed phi values stored in the genome object.

install.packages("AnaCoDa")

0.1.3.0 by Cedric Landerer, 2 months ago


https://github.com/clandere/AnaCoDa


Browse source code at https://github.com/cran/AnaCoDa


Authors: Cedric Landerer [aut, cre], Gabriel Hanas [ctb], Jeremy Rogers [ctb], Alex Cope [ctb], Denizhan Pak [ctb]


Documentation:   PDF Manual  


GPL (>= 2) license


Depends on Rcpp, methods, mvtnorm

Suggests knitr, Hmisc, VGAM, coda, testthat, lmodel2

Linking to Rcpp

