Genotype Calling with Uncertainty from Sequencing Data in
Polyploids and Diploids

Read depth data from genotyping-by-sequencing (GBS) or restriction
site-associated DNA sequencing (RAD-seq) are imported and used to make Bayesian
probability estimates of genotypes in polyploids or diploids. The genotype
probabilities, posterior mean genotypes, or most probable genotypes can then
be exported for downstream analysis. 'polyRAD' is described by Clark et al.
(2019) . A variant calling pipeline for highly
duplicated genomes is also included and is described by Clark et al. (2020)
.

News

Changes in v1.0

Genotype likelihoods are now estimated under a beta-binomial distribution
rather than the binomial distribution. This change was made so that real
sequencing data would be accurately modeled; even in diploid heterozygotes,
read depth of two alleles is often very different from a 1:1 ratio, due to
many underlying issues with sequencing data that would be difficult to model.
Under the beta-binomial with respect to the binomial, there is an increased
probability of read depth ratios that differ from the true allele copy
ratio. In a practical sense, this means reduced certainty in the estimation of
allele copy number from read depth alone, and an increased importance of
genotype prior probabilities. The exact shape of the beta-binomial
distribution is determined by an overdispersion parameter, which the user can
optimize using the TestOverdispersion function.

When using linkage disequilibrium to update genotype priors, the square of
Pearson's correlation coefficient is now used for weighting markers, where
Pearson's correlation coefficient was used previously without being squared.
This applies to both mapping populations and diversity panels, and results
in improved genotyping accuracy.

The functions Export_polymapR, readTASSELGBSv2, RemoveHighDepthLoci,
AddGenotypePriorProb_Even, and TestOverdispersion have been added.

This version of polyRAD is incompatible with RADdata objects generated by
previous versions of polyRAD due to a change in format of the
depthSamplingPermutations slot. This slot was changed to simplify the
estimation of genotype likelihood.