Abstract

Quantification of the genetic distance between populations is instrumental in many genetic research initiatives, and a large number of formulas for this purpose have been proposed. However, selecting an appropriate measure for assessing genetic distance between real-world human populations that diverged as a result of mechanisms that are not fully known can be challenging. We compared results from nine widely used genetic distance measures applied to high-density whole-genome SNP genotype data obtained on individuals from 51 world populations. Using population trees and generalized analysis of molecular variance, we found that contradictory inferences could be drawn from analyses that used different distance measures. We grouped the distance measures by the similarity and consistency of their values using concordance, consistency, and Procrustes analyses. Overall, the Cavalli-Sforza and Edwards distance measure differed the most from the other measures. Wright's FST for diploid data, the Latter and Reynolds distances, and Nei's minimum distance each yielded values that were most consistent with the other eight distance measures in terms of ordering populations based on genetic distance. The Cavalli-Sforza and Edwards distance and Nei's geometric distance were the least consistent. Simulation studies showed that the Cavalli-Sforza and Edwards distance is relatively more sensitive in distinguishing genetically similar populations and that the Reynolds genetic distance provides the highest sensitivity for highly divergent populations. Finally, our study suggests that using the Cavalli-Sforza and Edwards distance may provide less power for studies concerning human migration history.
