Genotype Calling with Uncertainty from Sequencing Data in Polyploids and Diploids

Read depth data from genotyping-by-sequencing (GBS) or restriction site-associated DNA sequencing (RAD-seq) are imported and used to make Bayesian probability estimates of genotypes in polyploids or diploids. The genotype probabilities, posterior mean genotypes, or most probable genotypes can then be exported for downstream analysis. 'polyRAD' is described by Clark et al. (2019). A variant calling pipeline for highly duplicated genomes is also included and is described by Clark et al. (2020).


Changes in v1.0

Genotype likelihoods are now estimated under a beta-binomial distribution rather than the binomial distribution. This change was made so that real sequencing data would be modeled more accurately; even in diploid heterozygotes, the read depths of the two alleles often differ substantially from a 1:1 ratio, due to many underlying issues with sequencing data that would be difficult to model individually. Compared with the binomial, the beta-binomial assigns a higher probability to read depth ratios that differ from the true allele copy ratio. In a practical sense, this means reduced certainty in the estimation of allele copy number from read depth alone, and an increased importance of genotype prior probabilities. The exact shape of the beta-binomial distribution is determined by an overdispersion parameter, which the user can optimize using the TestOverdispersion function.
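The effect of overdispersion can be illustrated with a minimal sketch in Python. This is not polyRAD code. The function names are hypothetical, and the parametrization (alpha = p * overdispersion, beta = (1 - p) * overdispersion) and the overdispersion value of 9 are assumptions chosen for illustration. The sketch compares the probability of seeing 4 reads of one allele out of 20 in a diploid heterozygote under each distribution.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def binom_pmf(k, n, p):
    """Binomial probability of k reads of an allele out of n total reads."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def betabinom_pmf(k, n, p, overdispersion):
    """Beta-binomial probability of k reads out of n.

    Assumed parametrization (for illustration only):
    alpha = p * overdispersion, beta = (1 - p) * overdispersion.
    Requires 0 < p < 1; lower overdispersion values give a wider spread.
    """
    a = p * overdispersion
    b = (1 - p) * overdispersion
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


# Diploid heterozygote: expected allele frequency among reads is 0.5.
n_reads, allele_reads = 20, 4
p_binom = binom_pmf(allele_reads, n_reads, 0.5)
p_bb = betabinom_pmf(allele_reads, n_reads, 0.5, overdispersion=9)

print(f"binomial:      {p_binom:.5f}")
print(f"beta-binomial: {p_bb:.5f}")
```

Under these assumptions the beta-binomial gives the skewed 4:16 observation a noticeably higher probability than the binomial does, so the same reads provide weaker evidence against the heterozygous genotype.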

When using linkage disequilibrium to update genotype priors, markers are now weighted by the square of Pearson's correlation coefficient; previously, the unsquared coefficient was used. This applies to both mapping populations and diversity panels, and results in improved genotyping accuracy.
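A small Python sketch can show how squaring the correlation changes the relative weights. It is only an illustration: the genotype vectors and marker names are made up, the helper function is hypothetical, and normalizing the weights to sum to one is an assumption rather than a description of how polyRAD combines them.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up allele copy numbers for eight diploid individuals.
target = [0, 1, 2, 1, 0, 2, 1, 2]
neighbors = {
    "marker_A": [0, 1, 2, 1, 0, 2, 1, 1],  # strongly correlated with target
    "marker_B": [1, 1, 2, 0, 0, 1, 2, 2],  # more weakly correlated
}

r = {name: pearson(target, geno) for name, geno in neighbors.items()}
r_squared = {name: value**2 for name, value in r.items()}


def normalize(weights):
    """Scale weights so they sum to one (an assumption for comparison)."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


share_r = normalize(r)
share_r2 = normalize(r_squared)

for name in neighbors:
    print(f"{name}: share with r = {share_r[name]:.2f}, "
          f"share with r^2 = {share_r2[name]:.2f}")
```

In this toy example, squaring shifts relative weight toward the more strongly correlated marker, which is the general effect of weighting by the squared coefficient rather than the coefficient itself.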

The functions Export_polymapR, readTASSELGBSv2, RemoveHighDepthLoci, AddGenotypePriorProb_Even, and TestOverdispersion have been added.

This version of polyRAD is incompatible with RADdata objects generated by previous versions of polyRAD due to a change in format of the depthSamplingPermutations slot. This slot was changed to simplify the estimation of genotype likelihood.



1.5 by Lindsay V. Clark, 5 months ago


Authors: Lindsay V. Clark [aut, cre] , U.S. National Science Foundation [fnd]

Documentation:   PDF Manual  

GPL (>= 2) license

Imports fastmatch, pcaMethods, Rcpp, stringi

Depends on methods

Suggests rrBLUP, Rsamtools, GenomeInfoDb, Biostrings, GenomicRanges, VariantAnnotation, SummarizedExperiment, S4Vectors, IRanges, BiocGenerics, knitr, rmarkdown, GenomicFeatures, qqman, ggplot2, adegenet

Linking to Rcpp

Suggested by polymapR.
