Genotype Quality Control with 'PLINK'

Genotyping arrays enable the direct measurement of an individual's genotype at thousands of markers. 'plinkQC' facilitates genotype quality control for genetic association studies as described by Anderson and colleagues (2010). It makes 'PLINK' basic statistics (e.g. missing genotyping rates per individual, allele frequencies per genetic marker) and relationship functions accessible from 'R' and generates a per-individual and per-marker quality control report. Individuals and markers that fail the quality control can subsequently be removed to generate a new, clean dataset. Removal of individuals based on relationship status is optimised to retain as many individuals as possible in the study.


plinkQC

plinkQC is an R/CRAN package for genotype quality control in genetic association studies. It makes PLINK basic statistics (e.g. missing genotyping rates per individual, allele frequencies per genetic marker) and relationship functions easily accessible from within R and allows for automatic evaluation of the results.
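As a rough illustration of the two statistics named above, the following base R sketch computes per-individual missing genotyping rates and per-marker allele frequencies from a small, made-up genotype matrix. It does not call PLINK or plinkQC; the matrix, the object names and the 0.03 threshold are assumptions chosen for illustration only.

```r
# Toy genotype matrix (made-up data): rows are individuals, columns are markers.
# Entries count copies of one allele (0, 1 or 2); NA marks a missing genotype.
geno <- matrix(
  c(0, 1,  2,
    1, NA, 2,
    0, 1,  NA,
    2, 1,  2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ind", 1:4), paste0("snp", 1:3))
)

# Missing genotyping rate per individual: fraction of markers without a call
imiss <- rowMeans(is.na(geno))

# Frequency of the counted allele per marker, using only non-missing genotypes
# (each called genotype contributes two alleles)
freq <- colSums(geno, na.rm = TRUE) / (2 * colSums(!is.na(geno)))

# Minor allele frequency: the smaller of the two allele frequencies
maf <- pmin(freq, 1 - freq)

# Illustrative threshold only; choose values appropriate for your study
imiss_threshold <- 0.03
fail_individuals <- names(imiss)[imiss > imiss_threshold]

print(round(imiss, 3))
print(round(maf, 3))
print(fail_individuals)
```

In this toy example, ind2 and ind3 each lack one of three genotype calls and would therefore exceed the illustrative missingness threshold.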

Full documentation is available at http://HannahVMeyer.github.io/plinkQC/.

plinkQC generates a per-individual and per-marker quality control report. A step-by-step guide on how to run these analyses is part of the package documentation.

Individuals and markers that fail the quality control can subsequently be removed with plinkQC to generate a new, clean dataset.
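The workflow described above might look roughly like the following sketch. The function names perIndividualQC, perMarkerQC and cleanData are those of the package, but the argument names and the example dataset ("data" under extdata) are written from memory of the package documentation and should be treated as assumptions to check against the reference manual for your installed version. The wrapper run_qc_sketch is hypothetical; it does nothing unless plinkQC is installed and a PLINK executable exists at the given path.

```r
# Hypothetical wrapper: runs the QC steps only if plinkQC and PLINK are available
run_qc_sketch <- function(path2plink, qcdir = tempdir()) {
  if (!requireNamespace("plinkQC", quietly = TRUE) || !file.exists(path2plink)) {
    message("plinkQC or PLINK not found; nothing was run.")
    return(NULL)
  }

  # Assumption: example PLINK files data.bed/.bim/.fam ship with the package
  indir <- system.file("extdata", package = "plinkQC")
  name <- "data"

  # Per-individual QC; the ancestry check is switched off here because it
  # needs reference datasets (argument names are assumptions, see lead-in)
  fail_individuals <- plinkQC::perIndividualQC(
    indir = indir, qcdir = qcdir, name = name,
    do.run_check_ancestry = FALSE,
    do.evaluate_check_ancestry = FALSE,
    path2plink = path2plink,
    interactive = FALSE, verbose = TRUE
  )

  # Per-marker QC
  fail_markers <- plinkQC::perMarkerQC(
    indir = indir, qcdir = qcdir, name = name,
    path2plink = path2plink,
    interactive = FALSE, verbose = TRUE
  )

  # Remove failing individuals and markers to create the clean dataset;
  # the ancestry filter is skipped to match the per-individual step above
  ids <- plinkQC::cleanData(
    indir = indir, qcdir = qcdir, name = name,
    filterAncestry = FALSE,
    path2plink = path2plink, verbose = TRUE
  )

  list(fail_individuals = fail_individuals,
       fail_markers = fail_markers,
       ids = ids)
}

# Usage with a placeholder path; replace with the full path to your PLINK binary
res <- run_qc_sketch(path2plink = "/path/to/plink")
```

Wrapping the calls in a guard keeps the sketch safe to source on machines without PLINK; in a real analysis the three calls would normally be run directly.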

plinkQC facilitates an ancestry check for study individuals based on comparison to reference datasets. The processing of the reference datasets is documented in detail in the package vignettes on the 1000 Genomes and HapMap III references.

Removal of individuals based on relationship status via plinkQC is optimised to retain as many individuals as possible in the study.
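To make the idea of retaining as many individuals as possible concrete, here is a minimal base R sketch. It uses a simple greedy heuristic (repeatedly remove the individual involved in the most related pairs), which is not necessarily the algorithm plinkQC implements; the IDs, the column names and the function greedy_unrelated are hypothetical.

```r
# Hypothetical pairs of related individuals, e.g. pairs whose relatedness
# estimate exceeds a chosen threshold
related <- data.frame(
  IID1 = c("A", "C", "D", "E"),
  IID2 = c("B", "B", "B", "F"),
  stringsAsFactors = FALSE
)

# Greedy heuristic: drop the individual that appears in the most remaining
# pairs until no related pair is left (ties resolved by alphabetical order)
greedy_unrelated <- function(rel) {
  removed <- character(0)
  while (nrow(rel) > 0) {
    counts <- table(c(rel$IID1, rel$IID2))
    worst <- names(counts)[which.max(counts)]
    removed <- c(removed, worst)
    rel <- rel[rel$IID1 != worst & rel$IID2 != worst, , drop = FALSE]
  }
  removed
}

removed_greedy <- greedy_unrelated(related)

# Naive alternative for comparison: always drop the first member of each pair
removed_naive <- unique(related$IID1)

print(removed_greedy)
print(removed_naive)
```

Here the naive rule drops four of the six individuals, whereas the greedy rule drops only two, because removing the individual related to three others resolves three pairs at once.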

Installation

The current GitHub version of plinkQC is 0.2.1. It can be installed via:

# Requires the devtools package: install.packages("devtools")
library(devtools)
install_github("HannahVMeyer/plinkQC")

The current CRAN version of plinkQC is 0.2.0. It can be installed via:

install.packages("plinkQC")

A log of version changes is given in the News section below.

News

plinkQC 0.2.1

minor changes

  • Fix path check bug in checkPlink
  • Include test data in build!

plinkQC 0.2.0

major changes

  • All system calls to plink are conducted with sys::exec_wait. This should solve platform-dependent issues, mainly compatibility with Windows.
  • Make path construction compatible with Windows.
  • path2plink now requires the full path to the plink executable; tilde expansion and a bare pointer to the containing directory are not supported.
  • Fix bug in return of cleanData function: list now contains keep and fail IDs.
  • Fix bug in return of maf computation: if fail.IDs does not exist, set fail_samples to zero.
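The path2plink requirement listed above can be met by expanding the path in R before passing it on. The following base R sketch shows one way to do this; the location "~/bin/plink" and the directory name "qc_results" are placeholders, not paths used by the package.

```r
# Hypothetical location of the PLINK executable; adjust to your system
path2plink_raw <- "~/bin/plink"

# Expand a leading tilde so that a full path is passed on
path2plink <- path.expand(path2plink_raw)

# Build other paths with file.path rather than pasting separators by hand
qcdir <- file.path(tempdir(), "qc_results")

# Check up front that the path points to an existing file, not a directory
plink_found <- file.exists(path2plink) && !dir.exists(path2plink)
if (!plink_found) {
  message("No PLINK executable found at: ", path2plink)
}
```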

minor changes

  • IBD-fail.IDs now saved without column names to be consistent with other xxx-fail.IDs files.
  • Include additional progress messages in cleanData()
  • Remove default double-specification of mafTh and macTh
  • Use checkPlink to return the correct path2plink, and export checkPlink to make it directly accessible to the user.

plinkQC 0.1.1

major changes

  • run_check_relatedness will only save IBD estimates of individuals whose estimates are higher than the threshold.

minor changes

  • Fix examples in vignettes 1000 Genomes and HapMap III reference.
  • Change file access in function examples
  • Add additional checks in check_ancestry and fix missing refSamplesFile test

plinkQC 0.1.0

  • Added a NEWS.md file to track changes to the package.

Version: 0.2.2 by Hannah Meyer, 2 months ago

Homepage: https://github.com/HannahVMeyer/plinkQC

Report a bug at https://github.com/HannahVMeyer/plinkQC/issues

Browse source code at https://github.com/cran/plinkQC

Authors: Hannah Meyer [aut, cre]

Documentation: PDF Manual

License: MIT + file LICENSE

Imports: methods, optparse, data.table, R.utils, ggplot2, ggforce, ggrepel, cowplot, UpSetR, dplyr, sys

Suggests: testthat, knitr, rmarkdown

System requirements: plink (1.9)

