Ancestry-constrained phylogenetic analysis supports the Indo-European steppe hypothesis
Abstract

Discussion of Indo-European origins and dispersal focuses on two hypotheses. Qualitative evidence from reconstructed vocabulary and correlations with archaeological data suggest that Indo-European languages originated in the Pontic-Caspian steppe and spread together with cultural innovations associated with pastoralism, beginning c. 6500–5500 BP. An alternative hypothesis, according to which Indo-European languages spread with the diffusion of farming from Anatolia, beginning c. 9500–8000 BP, is supported by statistical phylogenetic and phylogeographic analyses of lexical traits. The time and place of the Indo-European ancestor language therefore remain disputed. Here we present a phylogenetic analysis in which ancestry constraints permit more accurate inference of rates of change, based on observed changes between ancient or medieval languages and their modern descendants, and we show that the result strongly supports the steppe hypothesis. Positing ancestry constraints also reveals that homoplasy is common in lexical traits, contrary to the assumptions of previous work. We show that lexical traits undergo recurrent evolution due to recurring patterns of semantic and morphological change.