Read depth data from genotyping-by-sequencing (GBS) or restriction
site-associated DNA sequencing (RAD-seq) are imported and used to make Bayesian
probability estimates of genotypes in polyploids or diploids. The genotype
probabilities, posterior mean genotypes, or most probable genotypes can then
be exported for downstream analysis. 'polyRAD' is described by Clark et al.
Genotype likelihoods are now estimated under a beta-binomial distribution rather than the binomial distribution. This change was made so that real sequencing data would be accurately modeled; even in diploid heterozygotes, read depth of two alleles is often very different from a 1:1 ratio, due to many underlying issues with sequencing data that would be difficult to model. Under the beta-binomial with respect to the binomial, there is an increased probability of read depth ratios that differ from the true allele copy ratio. In a practical sense, this means reduced certainty in the estimation of allele copy number from read depth alone, and an increased importance of genotype prior probabilities. The exact shape of the beta-binomial distribution is determined by an overdispersion parameter, which the user can optimize using the TestOverdispersion function.
When using linkage disequilibrium to update genotype priors, the square of Pearson's correlation coefficient is now used for weighting markers, where Pearson's correlation coefficient was used previously without being squared. This applies to both mapping populations and diversity panels, and results in improved genotyping accuracy.
The functions Export_polymapR, readTASSELGBSv2, RemoveHighDepthLoci, AddGenotypePriorProb_Even, and TestOverdispersion have been added.
This version of polyRAD is incompatible with RADdata objects generated by previous versions of polyRAD due to a change in format of the depthSamplingPermutations slot. This slot was changed to simplify the estimation of genotype likelihood.