
Calculate genetic distances
gen_dist.RdCalculate genetic distances
Usage
gen_dist(
gen = NULL,
dist_type = "euclidean",
plink_file = NULL,
plink_id_file = NULL,
npc_selection = "auto",
criticalpoint = 2.0234
)Arguments
- gen
path to vcf file, a
vcfRtype object, or a dosage matrix- dist_type
the type of genetic distance to calculate (options:
"euclidean"(default),"bray_curtis","dps"for proportion of shared alleles (requires vcf),"plink", or"pc"for PC-based)- plink_file
if
"plink"dist_type is used, path to plink distance file (typically ".dist"; required only for calculating plink distance)- plink_id_file
if
"plink"dist_type is used, path to plink id file (typically ".dist.id"; required only for calculating plink distance)- npc_selection
if
dist_type = "pc", how to perform K selection (options:"auto"for automatic selection based on significant eigenvalues from Tracy-Widom test (default), or"manual"to examine PC screeplot and enter no. PCs into console)- criticalpoint
if
dist_type = "pc"used withnpc_selection = "auto", the critical point for the significance threshold for the Tracy-Widom test within the PCA (defaults to 2.0234 which corresponds to an alpha of 0.01)
Details
Euclidean and Bray-Curtis distances calculated using the ecodist package: Goslee, S.C. and Urban, D.L. 2007. The ecodist package for dissimilarity-based analysis of ecological data. Journal of Statistical Software 22(7):1-19. DOI:10.18637/jss.v022.i07. Proportions of shared alleles calculated using the adegenet package: Jombart T. and Ahmed I. (2011) adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics. doi:10.1093/bioinformatics/btr521. For calculating proportions of shared alleles, missing values are ignored (i.e., prop shared alleles calculated from present values; no scaling performed)