Final obstruent voicing in Lakota: Phonetic evidence and phonological implications
Final obstruent devoicing is common in the world’s languages and constitutes a clear case of parallel phonological evolution. Final obstruent voicing, in contrast, is claimed to be rare or nonexistent. Two distinct theoretical approaches crystallize around obstruent voicing patterns. Traditional markedness accounts view these sound patterns as consequences of universal markedness constraints prohibiting voicing, or favoring voicelessness, in final position, and predict that final obstruent voicing does not exist. In contrast, phonetic-historical accounts explain skewed patterns of voicing in terms of common phonetically based devoicing tendencies, allowing for rare cases of final obstruent voicing under special conditions. In this article, phonetic and phonological evidence is offered for final obstruent voicing in Lakota, an indigenous Siouan language of the Great Plains of North America. In Lakota, oral stops /p/, /t/, and /k/ are regularly pronounced as [b], [l], and [ɡ] in word- and syllable-final position when phrase-final devoicing and preobstruent devoicing do not occur.
final voicing, final devoicing, markedness, Lakota, rare sound patterns, laboratory phonology
1. Final obstruent devoicing and final obstruent voicing in phonological theory
There is wide agreement among phonologists and phoneticians that many of the world’s languages show evidence of final obstruent devoicing (Iverson & Salmons 2011). Like many common sound patterns, final obstruent devoicing has two basic instantiations: an active form, involving alternations, and a passive form, involving static distributional constraints. In languages with active final obstruent devoicing, voiced obstruents like /b/, /d/, and /g/ are pronounced as voiceless [p], [t], and [k] in word- or syllable-final position. One language with this pattern is Czech (Šimáčková, Podlipský, & Chládková 2012), as illustrated in Table 1, where words are given in orthographic form, followed by International Phonetic Alphabet (IPA) transcriptions in square brackets.1
In languages with static final obstruent voicelessness, the contrast between voiced and voiceless obstruents is neutralized in favor of the voiceless series in word- or syllable-final position, though there is no synchronic evidence of productive alternations. This pattern is illustrated by the representative Basque data in Table 2. Though Basque has a contrast between voiced /b/, /d/, /g/ and voiceless /p/, /t/, /k/, the voicing contrast is possible only in word-initial and word-medial position (Egurtzegi 2013).2 In word-final position, of the oral stops, only /p/, /t/, and /k/ are attested, though [p] is restricted to sound-symbolic words, [t] is rare, and [k] is primarily found in a handful of highly productive suffixes (e.g. -ak plural, -k ergative, -tik ablative). Forms in Table 2 are written in the standard Basque orthography, which is phonemic: word-final /p/, /t/, and /k/ represent [p], [t], and [k], respectively.
Sound patterns similar to Czech and Basque active and static final obstruent devoicing have been described for many other languages. Blevins (2006a) lists over a dozen modern Indo-European languages with final obstruent devoicing, including Bulgarian, Catalan, Dutch, Lithuanian, Polish, Russian, and Zaza. At the same time, she illustrates that the sound pattern has clearly evolved independently in unrelated languages around the world, from Afar, a Cushitic language spoken in the Horn of Africa, to Awara, a Finisterre-Huon language of Morobe Province, Papua New Guinea.3 Further, as new languages are described, new cases of final obstruent devoicing continue to be discovered: acoustic analysis confirms the sound pattern in Camuno, a Gallo-Romance language of Valcamonica (Cresci 2014), and in Ganza, an Omotic language of Ethiopia and the Sudan (Smolders 2016).4
Final obstruent devoicing has received a great deal of attention in the phonological literature, since the understanding of this common sound pattern has important implications for phonological theory, as summarized in Iverson & Salmons 2011. One important area of research focuses on explanations for the sound pattern itself and asks why final obstruent devoicing is a common sound pattern crosslinguistically. Two distinct theoretical approaches offer two very different answers to this question.
Under traditional markedness accounts inspired by Trubetzkoy (1939) (e.g. Wetzels & Mascaró 2001) and modern optimality treatments (e.g. Kager 1999, Lombardi 1999), final obstruent devoicing is viewed as a direct consequence of universal phonological markedness constraints. Traditionally, voiced obstruents are marked, voiceless obstruents are unmarked, and final devoicing, as neutralization, constitutes a shift to the unmarked. In modern optimality terms, a markedness constraint prohibiting voicing in obstruents combines with positional markedness or faithfulness constraints. As components of universal grammar, these markedness constraints determine that obstruent voicing will be generally disfavored, and particularly disfavored in final (or noninitial) position.5 The same kinds of markedness accounts make explicit predictions that final obstruent voicing should not exist (Kiparsky 2006, 2008): a language with /p/, /t/, /k/ regularly pronounced as [b], [d], and [ɡ] in word- or syllable-final position is ruled out, since, under any analysis, the voiced obstruents are marked in contrast to their voiceless counterparts.
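The prediction can be made concrete with a toy optimality-style evaluation. The following is a minimal sketch, not drawn from the works cited: the two constraints are deliberately simplified, and the names `star_final_voiced`, `ident_voice`, and `winner`, as well as the inputs `tab` and `tap`, are hypothetical and chosen only for illustration. Under these assumptions, no ranking of the two constraints selects a candidate with final voicing for an input ending in a voiceless stop.

```python
from itertools import permutations

VOICED_STOPS = set("bdg")


def star_final_voiced(inp, out):
    """Markedness: one violation if the output ends in a voiced stop."""
    return 1 if out[-1] in VOICED_STOPS else 0


def ident_voice(inp, out):
    """Faithfulness: one violation per segment that differs from the input."""
    return sum(1 for a, b in zip(inp, out) if a != b)


def winner(inp, candidates, ranking):
    """Return the candidate with the best violation profile.

    Profiles are compared lexicographically, so a violation of a
    higher-ranked constraint outweighs any number of lower-ranked ones.
    """
    return min(candidates, key=lambda c: tuple(con(inp, c) for con in ranking))


CONSTRAINTS = [star_final_voiced, ident_voice]
CANDIDATES = ["tab", "tap"]

# For input /tab/, the ranking decides between devoicing and faithfulness.
devoicing = winner("tab", CANDIDATES, [star_final_voiced, ident_voice])
faithful = winner("tab", CANDIDATES, [ident_voice, star_final_voiced])

# For input /tap/, the voiced candidate loses under every ranking,
# because it violates both constraints at once.
outputs_for_tap = {
    winner("tap", CANDIDATES, list(r)) for r in permutations(CONSTRAINTS)
}
```

In this toy system the candidate [tab] for input /tap/ is harmonically bounded: it incurs a superset of the violations of [tap], which is the sense in which such accounts rule out final voicing.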
In contrast to markedness theories, phonetic-historical approaches to final obstruent devoicing, like evolutionary phonology (Blevins 2004, 2006a,b, 2008, 2015, 2017), attempt to explain the recurrent sound pattern as phonologized instances of natural phonetic processes. Under these accounts, inspired by the Neogrammarian tradition and the early work of John J. Ohala (e.g. 1981, 1983), final obstruent devoicing is common crosslinguistically because of the way we speak and the way we perceive speech. More specifically, phrase-final laryngeal gestures, phrase-final lengthening, and final consonant nonrelease, along with perception and phonologization of these articulatory routines, can all give rise to voicelessness, or perception of voicelessness, in final obstruents, yielding sound patterns like those illustrated in Tables 1 and 2 above. Within evolutionary phonology, nothing prohibits sound patterns of final obstruent voicing, though they are expected to be rare, due to the articulatory and perceptual factors just mentioned that yield devoicing (Blevins 2006a,b).
The debate between markedness and phonetic-historical approaches has led to a special interest in languages that may show evidence of word- or syllable-final voicing. Lezgian, a Nakh-Daghestanian language, is one of these. In Lezgian, there is a contrast between plain voiceless, voiceless aspirated, voiced, and glottalized stops, with plain voiceless stops alternating with voiced stops word-finally. Yu (2004) provides acoustic and phonological evidence for a synchronic process of final obstruent voicing and lengthening. In an attempt to maintain predictions of markedness theory, Kiparsky (2006, 2008) offers an alternative analysis of the Lezgian sound pattern: final voiced stops are taken as basic, analyzed as phonologically voiced geminate stops, and degeminated and devoiced in syllable onsets. Another language with possible word-final voicing of obstruents is Somali (Blevins 2006a,b). However, there is a great deal of variation in how final obstruents are pronounced, and Kiparsky (2006, 2008) chooses to analyze Somali final stops as lenis unaspirated, in contrast to aspirated stops that occur syllable-initially. Though Iverson and Salmons (2011:1638) conclude that Kiparsky’s proposed markedness universal ‘does not ultimately hold up empirically’, the absence of any constraint against final obstruent voicing within evolutionary phonology has led us to search for more convincing examples of this sound pattern. In this context, we offer the present study of Lakota, an indigenous Siouan language of the Great Plains.
Our central argument is that Lakota has a true synchronic process of syllable-final obstruent voicing. This argument is supported by phonological and phonetic evidence that Lakota voiceless oral stops /p/, /t/, and /k/ are regularly pronounced as [b], [l], and [ɡ], respectively, in syllable-final position. Section 2 provides an introduction to the Lakota language, its speakers, and the phonology of the language, with a focus on the distribution of obstruent voicing, and a brief summary of earlier analyses. Acoustic analyses of Lakota segments in different positions of the word are presented in §3, demonstrating a voicing contrast in prevocalic position and a neutralization of oral stops to the voiced series in syllable-final position. Other patterns of note are optional phrase-initial devoicing, gradient phrase-final devoicing, regressive devoicing of oral stops followed by voiceless segments, categorical syllable-final fricative devoicing, and presonorant oral stop voicing, though our acoustic analysis is focused on showing that /p/ and /k/ undergo final voicing. In §4 we suggest that final obstruent voicing in Lakota is a continuation of an earlier coarticulatory sound change that voiced *p, *t, *k to [b], [d], [ɡ] intervocalically before final unstressed vowels concomitant with devoicing and loss of those vowels. This sound change was followed by a shift of *d > l in Lakota. Under this account, the historical origins of final stop voicing are tied to retiming of the final vowel gesture. Section 5 summarizes the implications of this study for phonological theory.
2. Lakota obstruent voicing patterns
2.1. A brief introduction to the Lakota language
Lakota (a.k.a. Lakhota) is an endangered indigenous language of North America. Today, it is mainly spoken on five reservations in North and South Dakota. The number of fluent speakers of Lakota has been declining steadily since the 1950s, and intergenerational transmission of the language ended during the 1960s, with a very small and decreasing number of isolated families continuing to speak Lakota to their children up to the 1990s. According to the Lakota Language Consortium, since that time, the number of first language speakers has decreased from approximately 6,000 to about 2,000 speakers today (Ullrich 2018:33).
Lakota is a member of the Siouan language family,6 and within Siouan, it is usually classified as a member of the Mississippi Valley subgroup. Siouan languages were spoken primarily in the Great Plains, and in the Ohio and Mississippi valleys. Lakota is a member of a dialect continuum that includes five distinguishable languages, namely: Lakȟóta, the subject of this study, Western Dakhóta (a.k.a. Yankton-Yanktonai), Eastern Dakhóta (a.k.a. Santee-Sisseton), Assiniboine Nakhóta, and Stoney Nakhóta. Some phonological differences between these languages are illustrated in Table 3.
Where Lakota has /l/, it regularly corresponds to /d/ in all languages but Assiniboine, where it corresponds instead to /n/. (In final position /-n/ has diffused to former /-d/ dialects.) For some morphemes, however, like the Lakota suffix /-la/ illustrated in the word for ‘lizard’, unexpected n-forms occur in Yanktonai, Yankton, and Sisseton, which are d-dialects. (In most instances, these can be attributed to lexical diffusion.) Note also that Lakota /bl-/ corresponds to Assiniboine /mn-/, while Lakota /gl-/ corresponds to Yanktonai /gd-/, Yankton /kd-/, Sisseton-Santee /hd-/, and Assiniboine /kn-/. More will be said about these correspondences in our discussion of the evolution of final obstruent voicing in §4. In the subsections that follow we focus solely on Lakota.
The Lakota language can be divided into two dialects: Northern Lakota, represented by speakers of the Standing Rock Reservation and parts of the Cheyenne River Reservation, and Southern Lakota, spoken by the Oglála and Sičháŋǧu tribes, who reside on the Pine Ridge and Rosebud Reservations, respectively, and by some speakers from Cheyenne River (Ullrich 2018:38). Since these two dialects show virtually no phonological variation and are characterized by only a small number of lexical variants, they are treated as one for the purposes of this study.
Descriptive and teaching grammars of Lakota include Buechel 1939, Boas & Deloria 1941, Rood & Taylor 1976, 1996, Ingham 2003, the grammar section of Ullrich 2008, and Ullrich & Black Bear 2016. This study makes extensive use of the New Lakota dictionary (Ullrich 2008, 2011, 2019), a publication of the Lakota Language Consortium. The app version of the dictionary contains over 40,000 entries and includes not only full forms of words and thousands of audio files, but also truncated word forms, which are important to this study.
2.2. Segment inventory and orthography
The Lakota vowel system is shown in Table 4, and the consonant inventory is shown in Table 5, following Rood and Taylor (1996) and Ullrich (2011). Here and throughout, we use the orthography of the New Lakota dictionary (NLD; Ullrich 2011, 2019), sometimes called the ‘Standard Lakota orthography’. Where orthographic symbols in these tables differ from symbols of the IPA, the IPA symbol is given in square brackets.
Lakota has a basic five-vowel system /i, u, e, o, a/, along with three nasalized vowels, including high /iŋ/, /uŋ/ and nonhigh /aŋ/. Another vowel symbol in use in the NLD is <A>; this symbol represents a root/stem-final vowel that alternates between /a/, /e/, and /iŋ/ and is often subject to deletion in morphologically complex forms. In addition, the acute accent is used to mark a stressed vowel in this orthography.7 It should also be noted that in addition to the three contrastively nasalized vowels shown in Table 4, Lakota also has allophonically nasalized vowels that are due to coarticulation with a preceding or following nasal consonant (Scarborough et al. 2015). In our study of properties of oral stop consonant voicing, we do not measure stops in the context of nasalized vowels, since these stops are typically nasal and voiced. Given that our study is limited to a discussion of (oral) stops adjacent to oral vowels, it provides only a partial picture of voicing alternations in Lakota.
Table 5 shows the consonant inventory of Lakota, with obstruents at the top of the table and sonorants at the bottom.
Several comments are in order regarding the consonant contrasts in Table 5. Note that all Lakota sonorants (/m, n, l, w, y/) are voiced. In contrast, the nonlaryngeal fricatives all have voiced and voiceless counterparts. For the oral stops and affricates, there appear to be at least five contrastive laryngeal series: voiceless unaspirated, voiceless aspirated, voiceless with velar fricative release, voiceless ejective, and voiced. However, the voiceless aspirated, voiceless with velar fricative release, and voiceless ejective series have alternative analyses as clusters of plain voiceless obstruent plus /h/, /ȟ/, and /ʼ/, respectively (Boas & Deloria 1941:5). Since the focus of this study is the plain voiceless and voiced series of obstruents, we leave the analysis of these other laryngeal series open and focus on contrastive (and noncontrastive) obstruent voicing in Lakota.
2.3. The distribution of obstruent voicing
Let us begin by observing a voicing contrast in fricatives. As illustrated in Table 6, the pairs /s/ vs. /z/, /š/ vs. /ž/, and /ȟ/ vs. /ǧ/ contrast in prevocalic (onset) position, but not word- or syllable-finally, where, as in Czech, Basque, and many other languages, there is neutralization to the voiceless series, illustrated by the bold characters in the final column. The fricative devoicing pattern exemplified in Table 6 is most obvious in full and contracted (cont) pairs like čháǧa ‘ice’ vs. čháȟ (cont). Contracted forms are the primary source of obstruent codas in Lakota and are discussed further in §2.4. What matters here is that the voicing contrast in Lakota fricatives is unremarkable from a typological perspective. There is a voicing contrast that occurs in single-member onsets at three distinct points of articulation, and that contrast is neutralized in final position to the voiceless series.
Since it is rare to have a voicing contrast in fricatives without having a voicing contrast in oral stops, we expect Lakota to show a series of voiced stops. However, the voiced series of oral stops appears to have only a single member, /b/. Though there are very few contrasts between /b/ and /p/, data like that in example 1 argue that voicing is contrastive for prevocalic bilabial stops in Lakota.8 In addition to potentially native roots, like bá ‘to blame someone’ and bú ‘make a deep noise’, there is at least one likely loan, bébela ‘baby’ (<< Fr. bébé), which also suggests that Lakota /b/ was contrastive at the time of borrowing. If there were no voicing contrast between /b/ and /p/ in the language, we would expect the word to be borrowed as /pepe…/.
(1) The /b/ vs. /p/ contrast in Lakota
a. bá ‘to blame sb.’ (not widely known) vs. pa- ‘by pushing’
b. bébela ‘baby’ (<< Fr. bébé) vs. -pi plural
c. bú ‘make a deep noise’ vs. pu- ‘by pressure’
d. ábela ‘scattered’, ábeya ‘scattering’ vs. apé ‘leaf’
e. kabú ‘to play the drum’ (ka- ‘by hitting’, bu ‘make a deep noise’) vs. kapúza ‘to become dry in the wind’ (ka- vbz, púzA ‘to be dry’)
f. hibú ‘I am coming’ (archaic form of 1sg of hiyú ‘to start coming’) vs. ipáblaye ‘rolling pin’
In other positions within the word, the pronunciation of /p/ as [b] is predictable, as discussed further below. Though the voicing contrast for /p/ vs. /b/ has a low functional load, it is supported by the data in 1.
For the velar stop /k/, the situation is different, and this is why /g/ is in parentheses in Table 5. Though [ɡ] is a common and predictable allophone of /k/, there are no contexts where /k/ and /g/ contrast. In example 2, we illustrate two of the three positions where /k/ has predictable allophones: prevocalically (2a), where it is voiceless unaspirated, and before sonorant consonants /l, m, n, w/, where it is voiced (2b). Notice that unlike /b/ in bébela, in Lakota spakéli, the English prevocalic [ɡ] of [spəˈɡɛɾi] ‘spaghetti’ is borrowed as /k/ (in 2a); only when [ɡ] is in presonorant position in source words like magnet or anglais is it borrowed as [ɡ] (in 2b).
(2) /k/ with predictable [k] and [ɡ] allophones
a. prevocalic [k]: akábu ‘to drum on sth.’, kibá ‘to regret’, -lake ‘very, really’, spakéli ‘spaghetti’ (<< Eng.)
b. presonorant syllable-initial [ɡ]: glalú ‘to fan one’s own’, gmá ‘walnut’, gnúni ‘to lose one’s own’ (< ki-núni), gwéza ‘rippled, ridged’, magnéta ‘magnet’ (<< Eng.), šagláša ‘English’ (<< Fr. les Anglais)
Like /k/, Lakota /t/ also lacks a voiced counterpart. Recall from Table 3 that where other closely related languages show /d/, the corresponding sound in Lakota is /l/. Internal to Lakota, there is also evidence that /l/ is, in some sense, the ‘voiced’ counterpart of /t/. For example, consider the data in Table 7, where contracted forms of words with medial /p/, /t/, and /k/ show final [b], [l], and [ɡ], respectively. When the preceding vowel is nasalized, the voiced consonants [b] and [ɡ] may be realized as [m] and [ŋ], respectively, or as partially nasalized [mb]/[mp] and [ŋɡ]/[ŋk], respectively, while [l] is often nasalized and lenited in the same context. For example, compare núŋpa [ˈnʊ̃.pa] ‘two, twice’ with shortened forms núŋm [ˈnʊ̃m], núŋp [ˈnʊ̃mp] and reduplicated núŋmnuŋpa [ˈnʊ̃mə.ˈnʊ̃.pa] ‘by twos, two each’. We analyze these variants as nasalized instances of [b] and [ɡ], and we focus on oral contexts in most of the discussion that follows.
If the alternations in Table 7 represent a unified process, the expected pronunciation of /t/ is [d], not [l]. A further piece of evidence that Lakota [l] is, in some sense, the voiced counterpart of /t/ comes from place of articulation. As illustrated in Table 5, /t/ (along with /th/, /tȟ/, and /tʼ/) has a dental place of articulation, while /s/, /z/, and /n/ are alveolar. Since /l/ is a sonorant like /n/, and sometimes considered a continuant like /s/ and /z/, it might be expected to have an alveolar place of articulation. However, like /t/, it is dental. In §4 we suggest that the voicing alternations in Table 7 and the /d/ correspondent of /l/ in other Siouan languages support a historical sound change of *d > l in Lakota. In other words, the seemingly unnatural synchronic change of /p, t, k/ to [b, l, ɡ] as opposed to [b, d, ɡ] in Lakota is due to telescoping of two sound changes, obstruent voicing followed by *d > l.9
A final context where obstruent voicing is predictable is in word-initial consonant clusters. Given that word-initial position defines the beginning of a syllable at the beginning of an utterance, we take word-initial phonotactics to define (at least partially) syllable-initial phonotactics. Word-initially, any consonant in Table 5 (except [ɡ], which, recall, is not contrastive) can constitute a single-member prevocalic syllable onset. Attested word-initial clusters, shown in Table 8, are highly restricted:10 (i) they are limited to two consonants, #C1C2; (ii) C2 can be either a plain voiceless obstruent or a sonorant; (iii) if C2 is a plain voiceless obstruent, C1 is also a plain voiceless obstruent; (iv) if C2 is a sonorant, C1 is either a voiceless fricative, a voiced oral stop [b] or [ɡ], or [m], which we interpret as a nasalized [b] before [n]. In other words, singleton onsets show voicing contrasts in Lakota, but within consonant clusters, obstruents may not contrast in voicing. Independent of obstruent voicing, notice that there are no sequences of identical consonants (**), there are no fricative clusters (***), and, though C2 may be an affricate, there are no clusters with an affricate in C1 position and a consonant other than the laryngeals /h/, /ʼ/ in C2 position (*$). Lakota words exemplifying the initial consonant clusters in Table 8 are shown in 3.11
(3) Illustration of word-initial consonant clusters in Table 8
a. /p/-initial: pté ‘buffalo cow’, pčéčela ‘to be short’, psá ‘reed, straw’, pšíŋ ‘onion’, (pȟá ‘to be bitter’), mní ‘water’, blé ‘lake’, (pʼé ‘American elm’)
b. /t/-initial: tké ‘to be heavy’, (tȟápa ‘ball’), (tʼéča ‘to be lukewarm’)
c. /k/-initial: kpá ‘to be gouged out’, kté ‘to kill sb./sth.’, kčeyá ‘to broil sth. over coals’, ksúyeya ‘to hurt or injure sb.’, kšú ‘to bead sth.’, (kȟákA ‘to clack, clatter’), gmá ‘walnut’, gnákA ‘to lay sth. by’, glá ‘to loathe sth./sb.’, (kʼá ‘to dig sth.’)
d. /s/-initial: spáyA ‘to be wet’, stákA ‘tired (of bodypart)’, sčú ‘be shy’, ská ‘to be clear white’, smiyáŋyaŋ ‘bare of any outside layer’, sní ‘it is cold’, slá ‘it is greasy’, swaká ‘to be frayed at the edge’, (sʼa ‘as a habit’)
e. /š/-initial: špáŋ ‘to be burned by heat or cold’, štákA ‘to be melting’, ščépȟaŋ (v. sčépȟaŋ) ‘sister-in-law’, škátA ‘to play’, šmeyá ‘deeply’, šnížA ‘to shrink, shrivel’, šlayá ‘being bare’, šwokÁ ‘it is overflowing’, (šʼákA ‘it is strong/powerful’)
f. /ȟ/-initial: ȟpáyA ‘to lie’, ȟtálehaŋ ‘yesterday’, ȟčá ‘to blossom’, ȟmiyáŋ ‘crookedly’, ȟná ‘to groan, snort’, ȟwayÁ ‘to cause sb. to be sleepy/bored’, ȟlí ‘to be muddy or slimy’, (ȟʼé ‘it is rough’)
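The restrictions on word-initial clusters stated before example 3 can be sketched as a simple well-formedness check. This is a minimal illustration under stated assumptions, not part of the original analysis: segments are given in the NLD orthography, clusters involving the laryngeals /h/, /ȟ/-release, and /ʼ/ are left out, and the function name `satisfies_onset_conditions` is hypothetical. The check encodes only the necessary conditions listed in the text; it does not reproduce accidental gaps in Table 8, so a cluster that passes is not thereby claimed to be attested.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so that accented letters compare reliably."""
    return unicodedata.normalize("NFC", text)


VOICELESS_FRICATIVES = {nfc(c) for c in ("s", "š", "ȟ")}
PLAIN_VOICELESS_STOPS = {"p", "t", "k"}
AFFRICATES = {nfc("č")}
PLAIN_VOICELESS_OBSTRUENTS = (
    PLAIN_VOICELESS_STOPS | AFFRICATES | VOICELESS_FRICATIVES
)
SONORANTS = {"m", "n", "l", "w", "y"}
VOICED_STOPS = {"b", "g"}


def satisfies_onset_conditions(c1, c2):
    """Check a two-consonant onset against the conditions stated in the text."""
    c1, c2 = nfc(c1), nfc(c2)
    if c1 == c2:
        return False  # no sequences of identical consonants
    if c1 in AFFRICATES:
        return False  # no affricate in C1 (laryngeal C2 is not modeled here)
    if c1 in VOICELESS_FRICATIVES and c2 in VOICELESS_FRICATIVES:
        return False  # no fricative clusters
    if c2 in PLAIN_VOICELESS_OBSTRUENTS:
        # condition (iii): C1 must also be a plain voiceless obstruent
        return c1 in PLAIN_VOICELESS_OBSTRUENTS
    if c2 in SONORANTS:
        # condition (iv): voiceless fricative, voiced stop, or [m]
        return c1 in VOICELESS_FRICATIVES or c1 in VOICED_STOPS or c1 == "m"
    return False  # condition (ii): C2 is a plain voiceless obstruent or sonorant
```

For instance, the onsets of pté, blé, mní, and sní pass the check, while a hypothetical onset such as pl- fails, since a plain voiceless stop may not precede a sonorant.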
While words like those in 3 support the analysis of clusters in Table 8 as syllable-initial clusters, further support for these as true complex onsets comes from distinct phonetic realizations of the same phonological clusters in word-medial position, across a syllable boundary. In Table 9, word-initial and word-medial complex onsets are compared with word-medial heterosyllabic clusters.
In C1C2 onsets where C2 is an obstruent, C1 is consistently voiceless, but when heterosyllabic, the same clusters may show voiced codas [b], [l], [ɡ], as in the final column of Table 9. Also, note that the possibility of a medial coda consonant, followed by an onset consonant cluster, predicts VC1.C2C3V sequences in the language, provided that C1 is [l], [s], [ʃ], [χ], [b], or [ɡ] and C2C3 is one of the clusters shown in Table 8. Some examples of triconsonantal clusters are: pȟelmná ‘to smell of fire’, lešmná ‘to smell of urine’ (cf. pȟéta ‘fire’, léžA ‘to urinate’, mná ‘to have a particular smell’), škalškátA ‘to play frolicking’ (cf. škátA ‘to play’), and wetȟábskala ‘white blood cells’ (cf. tȟápa ‘ball’, ská ‘white’).
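The predicted VC1.C2C3V parse can be sketched as follows. This is a minimal illustration, assuming segments are supplied one by one in the NLD orthography (with š for [ʃ], ȟ for [χ], and g for [ɡ]); the function name `syllabify_ccc` is hypothetical, and the sketch checks only the coda restriction, not whether C2C3 is one of the clusters in Table 8.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so that accented letters compare reliably."""
    return unicodedata.normalize("NFC", text)


# Surface codas named in the text, in orthographic form.
ALLOWED_CODAS = {nfc(c) for c in ("l", "s", "š", "ȟ", "b", "g")}


def syllabify_ccc(segments):
    """Parse a medial three-consonant cluster as coda + complex onset.

    Returns a pair (coda, onset), where onset is a list of two segments.
    Raises ValueError if the cluster is not three segments long or if
    the first segment is not one of the permitted codas.
    """
    segments = [nfc(s) for s in segments]
    if len(segments) != 3:
        raise ValueError("expected exactly three consonants")
    coda, onset = segments[0], segments[1:]
    if coda not in ALLOWED_CODAS:
        raise ValueError(f"{coda!r} is not a permitted coda")
    return coda, onset


# Medial clusters from pȟelmná, lešmná, and wetȟábskala.
parses = [syllabify_ccc(c) for c in (["l", "m", "n"], ["š", "m", "n"], ["b", "s", "k"])]
```

Because only one consonant may close a syllable, the parse is deterministic: the first consonant is the coda and the remaining two form the onset of the following syllable.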
Note that the syllable structure described for Lakota is not typologically unusual. The majority of Lakota syllables are open, ending in a vowel. Assuming two major classes of consonants, obstruents with low sonority and sonorants with high sonority, one can view Lakota as a language that weakly adheres to a general sonority sequencing principle: within the syllable, there is a sonority plateau or rise to the nucleus, and an optional sonority fall after the nucleus (i.e. an optional coda). Under this analysis, Lakota single-member onsets are unrestricted, onset clusters are those shown in Table 8, and coda consonants are restricted to [l], [s], [ʃ], [χ], [b], and [ɡ]. However, what has yet to be detailed is the origin of these coda consonants. With only a few exceptions, all Lakota roots, stems, and words end in vowels. As a consequence, the majority of coda consonants are found only in derived forms, where truncation or final vowel loss may occur under compounding, derivation, or inflection. A discussion of truncation is offered in §2.4.
Before examining the status of derived coda consonants, let us summarize the distribution of voicing for obstruents that we have seen thus far. Voicing is contrastive for all fricatives in Lakota, with evidence of syllable-final fricative devoicing in Table 6. For oral stops, the situation is different. Of the oral stops, /p/, /t/, and /k/, only /p/ has a voiced obstruent counterpart, /b/. In contrast to fricative devoicing, oral stops appear to be voiced in the same contexts, independent of whether voicing is contrastive. In syllable-final position, /p/, /t/, and /k/ are realized as [b], [l], and [ɡ], respectively (Table 7). Syllabification of tautosyllabic vs. heterosyllabic consonant clusters may be signaled by differences in obstruent voicing (and manner), as shown in Table 8 and Table 9, where, for example, tautosyllabic /pt/, /ps/, and /pn/ are realized as [pt], [ps], and [mn], but the same heterosyllabic sequences can be realized as [b.t], [b.s], and [b.n]. Section 3 of this article supports these observations about obstruent voicing with acoustic measurements. In §2.4 we describe the truncation process that gives rise to the majority of syllable codas. Since our argument is that oral stops undergo voicing in syllable-final position, it is important to understand how truncation gives rise to syllable codas in the language. In §2.5, we briefly review other analyses of the voicing pattern in Lakota obstruents and highlight differences between our proposal and those of earlier scholars.
2.4. Truncation and syllable codas
Lakota syllables may be of the form CV, CCV, CVC, or CCVC, with representative examples in Table 10. Note that onsets appear to be obligatory, at least in careful speech: syllables that are vowel-initial phonologically are pronounced with an initial glottal stop in careful speech: á [ˈʔa] ‘armpit’, alí [ʔaˈli] ‘to climb up’, aá [ʔaˈʔa] ‘to be moldy’. Notice also that three of the four examples of word-final consonants in Table 10 are marked (cont), indicating that they are ‘contracted’ forms. Indeed, the majority of syllable codas in Lakota are the result of contraction or truncation, as described below.13
Recall from Table 6 and Table 7 data suggesting that the set of syllable codas is limited to voiceless fricatives [s], [ʃ], [χ] and to the voiced consonants [b], [l], and [ɡ]. However, of all the morphemes described in the NLD, only a few appear to be truly consonant-final. Two productive suffixes that appear to be consonant-final are -kel, a derivational affix meaning ‘somewhat, rather, fairly, kind of, sort of’, and -š, a suffix used with a number of word categories to express adversative (opposition or contrast) or emphasis. Some examples of words with these suffixes are given in 4. Since the suffixes occur word-finally and both /l/ and /š/ are possible word-final codas, these suffixes show no evidence of alternations in voicing or manner and simply support the observations we have made regarding syllable structure up to this point.
(4) Codas in the lexicon: adverbial suffixes -kel and -š
a. -kel ‘somewhat, rather’
|apȟé/apȟékel||‘to wait’/‘kind of waiting’|
|ečhá/ečhákel||‘naturally of such quality’/‘by nature, naturally’|
|naȟmá/naȟmákel||‘to hide sth., sb.’/‘hiding in a way’|
|pasí/pasíkel||‘to research’/‘kind of researching’|
|yasú/yasúkel||‘to judge’/‘passing judgment hastily’|
b. -š ‘adversative; emphatic’
|iyé/iyéš||‘he/she/it’/‘at least him/her/it’|
|naké/nakéš||‘finally now’/‘now at last’|
|miyé/miyéš||‘I, me, it is me’/‘at least I’|
|waná/wanáš||‘now, already’/‘now indeed, at last’|
Another set of free morphemes that appear to be consonant-final are a small class of adverbs ending in -b, including: itkób ‘in the direction toward sb. who is approaching’, óčib ‘by degrees, slowly, step by step, little by little’ (red óčibčib), and sakhíb ‘together’. Other b-final adverbs appear to be contracted forms of words ending in -pȟa: akáb ‘extra, overflowing, on top of’ from akápȟa ‘on the outside of, on top’; hakáb ‘afterward’ from hakápȟa ‘to be the following’; hútawab, hútab ‘downstream’ from hútawapȟa ‘somewhat farther downstream’; ȟeyáb from ȟeyápȟa ‘out of the way, removed’; isáŋm from isáŋpȟa ‘further than’; ób ‘with them (more than one), together with them’ from ópȟa ‘to join in something, to be a member of something’; and watób ‘by boat’ from watópȟa ‘to travel by boat’ (< wáta ‘boat’ + opȟÁ ‘to go by way of sth.’). On this basis, we hypothesize that adverbs like itkób historically derive from words ending in -pȟa, though synchronically, there is not always evidence of a longer form.14
In contrast to the coda consonants just mentioned, the majority of closed syllables in Lakota are the result of a process generally referred to as ‘truncation’. Under truncation, a word (or stem) of a specific phonological shape undergoes final vowel loss, and the consonant that is rendered in final position may alternate predictably, depending on its quality and position. For example, the word tópa ‘four’ has a truncated form tób, which can occur as an independent word or as the first member of a derived word, as in tóbkiya ‘in four ways, four places’ and tóbtopa ‘by fours’ (red). Truncation regularly occurs in two kinds of word-formation processes: prefixal reduplication, where the prefixed element can be viewed as a truncated base, and compounding, where the first element in the compound is a truncated base. With other word-formation processes, as detailed in Ullrich 2018, morphosyntactic properties may determine whether truncation takes place. Another important finding of Ullrich (2018) is that truncated forms can function as independent phonological words in various syntactic constructions. As a consequence, truncation may give rise to word-medial or word-final coda consonants. In the remainder of this section, we focus on the phonology associated with regular truncation, since truncation is the primary source of coda consonants in Lakota. We refer to morphological environments for truncation simply as ‘complex words’, including in this category nominal compounds and prefixal reduplication, as well as a host of derived verb forms.
Our understanding of truncation has four phonological components. First, the phonological conditioning of vowel loss (5a); second, the alternations in voicing, discussed earlier, that result when a consonant is in coda position (5b);15 third, a dissimilatory process that applies to sequences of coronal consonants derived by reduplication (5c); and finally, an optional resyllabification of consonants into complex onsets that can result in derived ejectives, aspirates, or onset clusters (5d). This last process is supported by two facts: (i) the only consonant clusters that appear to show optional resyllabification are clusters that are allowed word-initially; (ii) the devoicing patterns can only be explained by resyllabification since a medial stop coda in VC.CV is typically voiced, as we show in §3.
(5) Understanding truncation as prosodic morphology
a. Truncation: If a Lakota form ends in /...VCfVf /, where Cf is a possible coda consonant, then:
i. …VCfVf → …VCf when it is the first member of a complex word.
ii. …VCfVf → …VCf in isolation, provided that Vf is unstressed (optional).
b. Coda voicing constraints: In syllable coda position:
i. Fricatives devoice: ǧ → ȟ, ž → š, z → s.
ii. Oral stops and affricates voice: p → b, t → l, k → g, č → l.
c. Dissimilation (in reduplication only/morphophonemic): Heterosyllabic lateral + coronal consonant clusters dissimilate:
i. l.T → g.T, where T is a coronal consonant. (See §4 for further discussion.)
d. Optional resyllabification (fast speech, variable): In VC1.C2V where C1C2 is a possible syllable onset:
The patterns described in 5 are illustrated in Table 11 with relevant reduplicated and compound forms. In this table ‘n.a.’ means ‘not applicable’, ‘—’ indicates a predicted but unattested form, and ‘(?)’ indicates a resyllabified form that may be indistinguishable from the original complex word, since no regressive voicing assimilation distinguishes the complex onset from the coda-onset CC cluster. Notice that, in word-medial position, triconsonantal clusters like /šmn/ and /ȟsn/ are found. Since only a single consonant is allowed in the coda, all medial CCC clusters must be syllabified as C.CC, with a simple coda followed by a complex onset.
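The core of 5a and 5b can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not part of the analysis itself: the function name `truncate`, the vowel set `FINAL_VOWELS`, and the helper `nfc` are hypothetical, the stress condition in 5a.ii is ignored, and aspirated stops (as in ópȟa) and nasalized vowels are not handled.

```python
import unicodedata

def nfc(s):
    """Normalize to NFC so that accented letters compare reliably."""
    return unicodedata.normalize("NFC", s)

# Coda alternations from (5b), written in the orthography used in the text.
CODA_MAP = {nfc(k): nfc(v) for k, v in {
    "p": "b", "t": "l", "k": "g", "č": "l",  # (5b.ii) oral stops and affricate voice
    "ǧ": "ȟ", "ž": "š", "z": "s",            # (5b.i) fricatives devoice
}.items()}

# Assumption for this sketch: plain final vowels only; "A" is the
# ablauting vowel as written in citation forms such as glépA.
FINAL_VOWELS = set("aeiouA")

def truncate(word):
    """Drop a final vowel (5a) and apply the coda alternation (5b).

    Words that do not end in a mapped consonant plus a plain vowel are
    returned unchanged; this sketch makes no claim about them.
    """
    w = nfc(word)
    if len(w) >= 3 and w[-1] in FINAL_VOWELS and w[-2] in CODA_MAP:
        return w[:-2] + CODA_MAP[w[-2]]
    return w

if __name__ == "__main__":
    for form in ["tópa", "glépA", "oglákA", "ǧópA"]:
        print(form, "->", truncate(form))
```

The four forms in the usage loop are the ones cited in the text (tób, gléb, oglág, ǧób). Dissimilation (5c) and resyllabification (5d) would need syllable structure and are left out of the sketch.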
Due to truncation in reduplication and compounding, triconsonantal sequences are not uncommon, and they support our view that voiced [b], [l], and [ɡ] are codas, since word-initial clusters are limited to two consonants (see Table 8 and n. 13): aǧúyabskuyela ‘Danish, cake’ from aǧúyapi ‘bread’ + skúyela ‘sweet’; wetȟábskala ‘white blood cells’ from wé ‘blood’ + tȟápa ‘ball’ + ská-la ‘white’; tȟabškátA, tȟabškál ‘playing basketball’ from tȟápa ‘ball’ + škátA ‘to play’; pȟelmná ‘to smell of fire’ from pȟéta ‘fire’ + mná ‘to smell of sth.’; wašílȟpaya ‘garbage, trash’ from wa- indef.obj + šíčA ‘bad’ + ȟpáyA ‘to lie, be lying’.
It is worth stressing here that the phonological properties detailed in 5 for truncation are regular and, as far as we can tell, productive. New compounds—like those in Table 11 for čhaȟ.sní.yaŋ ‘ice cream’, aǧúyabskuyela ‘Danish (pastry)’, and wetȟábskala ‘white blood cells’—follow the same patterns as arguably older compounds that refer to indigenous culture items like tȟalʼágnake ‘rigid goldenrod’ (a plant used to lay meat on) and čhabsíŋte ‘beaver tail’ (used to comb hair). A further argument for the productivity of final obstruent voicing is that it occurs not only in the lexical truncation processes described here, but also in postlexical vowel dropping described as a feature of rapid speech by Rood and Taylor (1996:447): ‘Also characteristic of rapid speech is the dropping of unstressed word-final vowels … In these examples, note that p and k are voiced to b and g when they come to stand before a consonant’.
Clearly Lakota is not a language like Czech or Basque in which all obstruents are devoiced in the coda. Devoicing is found for the fricatives /z/, /ž/, and /ǧ/ (5b), but the oral stops /p/ and /k/ become voiced [b] and [ɡ], respectively, while /t/ and the affricate /č/ are both pronounced as [l]. Since the existence of phonological coda obstruent voicing is debated, §3 of this article provides acoustic evidence for the patterns we have just described. More specifically, we demonstrate that the sounds transcribed as [b] and [ɡ] are often voiced, that they have the closure duration, burst properties, and low energy values of oral stops, and that where they are voiced, their voicing is best viewed as a consequence of the voicing of coda stops. Before turning to this evidence, we briefly review previous analyses of Lakota voicing patterns and highlight details that distinguish our approach from previous ones.
2.5. Previous analyses of Lakota voicing patterns
The analysis presented above agrees in most respects with that presented by Rood and Taylor (1985, 1996). They describe /b/ as a marginal phoneme, and discuss [b] and [ɡ] as positional variants of /p/ and /k/, respectively, when final vowels are dropped or words are reduplicated. In particular, they say that ‘[w]hen vowel dropping (of any origin except possibly the fast speech phenomena …) places /p t č k/ in word-final position or at an internal boundary between linguistic elements, these become [b], [l], [l], [ɡ], respectively’ (Rood & Taylor 1996:449).16
In contrast, Rankin (2001) and Rood (2016) take a very different view of the oral stop voicing process from historical and theoretical perspectives, respectively. Under both accounts, oral stops are lenited to sonorants in syllable-final position. Rankin (2001:5) suggests a sound law whereby syllable-final stops become sonorants by first becoming nasals, and then shifting to oral stops after oral vowels. Rood (2016) instead uses a theoretical device, the feature [sonorant voice] (Rice 1993), which is assigned to oral stops in coda position. Rood’s argument that voiced stops [b] and [ɡ] are sonorants seems to have four parts. First, since /t/ and /č/ are realized as [l], a sonorant, in the coda, [b] and [ɡ] should be sonorants too. Second, since [b] alternates with [m] in Lakota in nasal contexts, it should be a sonorant. Third, consonant lenition is common in coda position crosslinguistically, so ‘[s]ince our target sounds are in coda position, we should therefore look for a way to declare their voicing to be lenition’ (Rood 2016:246). Finally, Rood claims that, like sonorants in many of the world’s languages, voicing of [b] and [ɡ] (but not [l]) is variable. Evidence for this variability ‘comes from the informal observation that [b] and [ɡ] in coda position often seem to match the voice phonetics of the following consonant’ (Rood 2016:249). Since Rood treats [b] and [ɡ] as sonorants, not obstruents, Lakota does not violate the universal markedness constraints discussed in §1 that give rise to common obstruent devoicing and prohibit obstruent voicing in the coda.
The most important difference between our account and all of the previous studies of Lakota voicing we are aware of, including those just mentioned, is that we provide acoustic evidence in §3 for the impressionistic descriptions of a range of voicing patterns in the language, including voicing of oral stops in coda position.
3. Phonetic analysis of Lakota stop voicing patterns
Despite its endangered status, Lakota is well documented in comparison to most indigenous languages of North America. This is especially true where audio recordings are concerned. The third author of this article has made recordings of over 400 native speakers between 1992 and 2018, including hundreds of hours of narratives and dialogues, and has also collected recordings of several dozen speakers from other sources. Given that there are only about 2,000 speakers today, this corpus may represent the speech of 10–20% of the Lakota speech community. While we have listened to some of these recordings and analyzed voicing patterns in running speech from a handful of them, the central study of voicing in this article is based not on recordings of natural running speech, but on studio recordings of native speakers that form part of the database of the NLD, published by the Lakota Language Consortium (Ullrich 2011). Before saying more about this dictionary, a general comment is in order. Language documentation can take many forms, but a general piece of advice to those working on endangered languages is to create documentation that can be used for multiple purposes. Even if one has no interest in phonetics or phonology, creating high-quality audio recordings allows future scholars to do research in these areas. In this context, the NLD, a descriptive lexicographic work with multiple purposes, including language documentation, language pedagogy, and language revitalization, is exemplary. The high-quality recordings of the NLD are of great value to the scientific community, and without them, the current research would not be possible.
The NLD currently offers on its smartphone app approximately 52,000 sound files from eight native speakers representing 28,000 dictionary headwords, with about 95% of the audio files found in pairs, spoken by the same male and female native speakers. Though the audio files in the dictionary application are compressed OGG files, not ideal for acoustic analysis, the Lakota Language Consortium was extremely kind to offer us access to a subset of the original uncompressed audio WAV files. These recordings, made in a professional sound studio with control room constructed specifically for the Lakota Dictionary project, are of very high quality. The detailed phonetic analysis of Lakota stop voicing presented in this section is based entirely on these recordings. For almost every word examined, there are two tokens: one spoken by Ben Black Bear, Jr. from the Rosebud Reservation, indicated by (M) after the token, and one spoken by Iris Eagle Chasing from the Cheyenne River Reservation, indicated by (F) after the token. Many people consider these two speakers to be the most competent and literate native speakers of Lakota.
The NLD recordings were made by prompting speakers with words as they are written in the dictionary. A possible complication for this study is that pronunciation of voiced vs. voiceless stops in the coda could have been influenced by spelling. While this possibility cannot be ruled out, two observations suggest that pronunciations of the two speakers are natural and not influenced by spelling. First, the patterns we discuss below do not always follow spelling conventions: for example, in sabsápa <b> is typically pronounced as voiceless though it is spelled with the symbol for a voiced stop (an alternative spelling is sapsápa). Second, both speakers have high levels of phonological awareness and commented when spellings did not fit their phonological intuitions. Both speakers were encouraged to say words in the way that was most natural to them. After recording sessions, if a speaker was uncomfortable with an audio file because it did not sound natural or did not feel right to them, that audio file was deleted.
Another issue that could influence pronunciation is that words recorded for the dictionary were spoken at a relatively slow rate, clearly and in isolation or as part of two-word phrases. Our results must be interpreted, then, as results relating to the phonology of clear speech.
The corpus compiled specifically for this study has a total of 611 words: 304 distinct words with two tokens each, spoken by the male and female speakers, plus two distinct words spoken only by the male speaker, plus one distinct word spoken only by the female speaker. From these 611 words, a database of oral stops was created including: 631 voiceless stops (<p> = 150, <t> = 196, <k> = 285); 584 voiced stops (<b> = 285, <g> = 299); and 14 ejectives (<pʼ> = 6, <tʼ> = 4, <kʼ> = 4). In addition, we included 111 instances of glottal stop, as we were particularly interested in the realization of coda stops before a glottal stop (see below). Of the 1,215 oral voiceless and voiced stops (excluding ejectives and glottal stops), 225 intervocalic tokens (between oral vowels) were used to establish voicing categories (see below). Stops in contact with a nasalized vowel were discarded from the analysis to avoid effects related to nasalization. The remaining 841 oral stops that were not intervocalic were subject to analysis based on the established voicing categories. Given our initial hypotheses that final coda voicing might be masked by phrase-final devoicing and possible assimilation to following voiceless obstruents, an attempt was made to include tokens where this masking effect might be absent, including word-final stops that were not phrase-final and syllable-final stops followed by glottal stop. All words were orthographically transcribed following the NLD spelling conventions. The transcriptions were then converted to SAMPA for automatic segmentation of the audio files using the WebMAUS application (Kisler et al. 2017) set for ‘Language independent (SAMPA)’, and subsequently hand-corrected as needed.
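As a quick consistency check, the token counts reported in this paragraph can be re-added. The short Python sketch below uses only the figures stated above; the variable names are ours and purely illustrative.

```python
# Word tokens: 304 distinct words recorded by both speakers,
# plus 2 words from the male speaker only and 1 from the female speaker only.
word_tokens = 304 * 2 + 2 + 1

# Stop counts by orthographic symbol, as reported in the text.
voiceless = {"p": 150, "t": 196, "k": 285}
voiced = {"b": 285, "g": 299}
ejectives = {"p'": 6, "t'": 4, "k'": 4}

n_voiceless = sum(voiceless.values())
n_voiced = sum(voiced.values())
n_ejectives = sum(ejectives.values())
n_plain_stops = n_voiceless + n_voiced  # excludes ejectives and glottal stops

if __name__ == "__main__":
    print("word tokens:", word_tokens)
    print("voiceless stops:", n_voiceless)
    print("voiced stops:", n_voiced)
    print("ejectives:", n_ejectives)
    print("voiceless + voiced:", n_plain_stops)
```

The sums reproduce the totals given in the text (611 words, 631 voiceless stops, 584 voiced stops, 14 ejectives, 1,215 plain oral stops).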
Given our central interest in determining whether Lakota shows final obstruent voicing of /p/ and /k/, the first part of the phonetic study, summarized in §3.1, was focused on the question of whether /p/ and /k/ show phonetic voicing word-finally and, more generally, in syllable-final position. With positive evidence of syllable-final voicing, we turned our attention to the question of whether these voiced segments still had properties of oral stops. This was necessary in order to rule out interpretations of voicing as a secondary feature of lenition, where voiced codas might be interpreted as fricatives or glides. In §3.2 we present evidence that coda [b] and [ɡ] have acoustic properties of oral stops, including significant closure durations, absence of fricative noise, release bursts, and low energy levels typical of oral stops. Together, acoustic evidence for voiced [b] and [ɡ] in the coda and acoustic evidence for oral stop production of these segments support the view that Lakota has a sound pattern of oral stop coda voicing for /p/ and /k/.
3.1. Evidence for voicing in oral stops: an analysis of auto-correlation coefficients
The corpus for this study was 876 oral stops extracted from the 611 word files from the original NLD audio WAV recordings. The central goal was to determine if there is a bimodal distribution of voicing in the data (voiceless vs. voiced), as described in phonological analyses, and, if so, to determine the distribution of each category.
For each segment in the corpus, the auto-correlation (AC) peaks were calculated with the program EMU (Harrington 2010 for the legacy version, Winkelmann 2017 for the R package ‘emuR’), using the ESPS method with a frame spacing of 10 ms, a window length of 7.5 ms, and pitch ranges of 60–400 Hz for the male speaker and 90–600 Hz for the female speaker. This yielded a total of 6,921 measurements with a voicing coefficient between 0 and 1 at each time point, 0 standing for no correlation (voiceless) and 1 for perfect correlation (voiced). Then, the median value of the AC coefficients of all measured time points within a given stop was calculated for all of the stops in the corpus.17
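To make the measure concrete, the following Python sketch computes a per-frame periodicity peak and takes its median over a segment. It is a simplified stand-in, not the ESPS algorithm as implemented in EMU: it uses a plain normalized cross-correlation, and the function names `nccf_peak` and `median_ac` are hypothetical. Only the 7.5 ms window, the 10 ms frame spacing, and the speaker-specific pitch ranges are taken from the text.

```python
import math
import statistics

def nccf_peak(x, start, n, lag_min, lag_max):
    """Peak normalized cross-correlation of x[start:start+n] over a lag range.

    Returns a value between 0 (no periodicity found) and about 1 (periodic).
    """
    ref = x[start:start + n]
    e0 = sum(v * v for v in ref)
    best = 0.0
    for k in range(lag_min, lag_max + 1):
        seg = x[start + k:start + k + n]
        if len(seg) < n:
            break  # ran past the end of the signal
        ek = sum(v * v for v in seg)
        if e0 == 0.0 or ek == 0.0:
            continue  # silent frame: leave the peak at 0
        r = sum(a * b for a, b in zip(ref, seg)) / math.sqrt(e0 * ek)
        best = max(best, r)
    return best

def median_ac(x, fs, f0_min, f0_max, win_s=0.0075, step_s=0.010):
    """Median of the per-frame peaks across one segment (samples in x)."""
    n = int(round(win_s * fs))
    step = int(round(step_s * fs))
    lag_min = int(fs / f0_max)
    lag_max = int(fs / f0_min)
    starts = range(0, len(x) - lag_max - n + 1, step)
    peaks = [nccf_peak(x, s, n, lag_min, lag_max) for s in starts]
    return statistics.median(peaks) if peaks else 0.0

if __name__ == "__main__":
    fs = 16000
    # Synthetic 120 Hz tone standing in for a voiced closure.
    tone = [math.sin(2 * math.pi * 120 * i / fs) for i in range(1600)]
    print(round(median_ac(tone, fs, 60, 400), 3))
```

A periodic signal yields a median near 1, while aperiodic noise yields a much lower value, which is the contrast the coefficient is meant to capture.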
The statistical analysis was carried out in R (R Core Team 2019). We ran a binomial linear mixed-effects model that used the AC coefficients (continuous variable: probability of voicing) to predict the voicing label (binary categorical variable: voiceless or voiced) with the R function glmer() (from the R package ‘lme4’; Bates et al. 2015). As random effects, we had intercepts for speakers. The intervocalic context was the only clear phonological context where the voiced vs. voiceless opposition appeared to be secure. Given this, we trained the binomial linear mixed-effects model on intervocalic stops (225 tokens). Figures 1a,b provide waveforms and spectrograms illustrating the intervocalic voiced vs. voiceless contrast for /b/ vs. /p/ (here and throughout, <p:> in the transcription line indicates pause). An AUROC test (‘area under the receiver operating characteristic’ curve; see Fawcett 2006) of the model predictions performed with the function auc() (R package ‘pROC’; Robin et al. 2011) resulted in 0.8871 (0.5 being the chance threshold and 1 being the point of perfect prediction). A model with random intercepts for speakers and words showed a lower AUROC coefficient (0.8655) than the model with random effects only for ‘speaker’, and it was thus disregarded. The model showed no singularities or any other obvious deviations from the model assumptions. Based on the voicing values of intervocalic stops, we predicted response values between 0 and 1 (0 being fully voiceless and 1 fully voiced, with the categories having been divided by 0.5) for the full data set grouped by the phonological context of each of the analyzed stops.
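For readers unfamiliar with the AUROC statistic, the Python sketch below shows what it measures: the probability that a randomly chosen voiced token receives a higher predicted value than a randomly chosen voiceless token. It is not the pROC implementation used in the study, the function names are hypothetical, and the toy scores in the usage example are invented for illustration only.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted values between 0 and 1
    labels: 1 for voiced, 0 for voiceless
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one token of each category")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def voicing_category(predicted, threshold=0.5):
    """Divide predicted values into two categories at 0.5, as in the text."""
    return "voiced" if predicted >= threshold else "voiceless"

if __name__ == "__main__":
    toy_scores = [0.1, 0.4, 0.35, 0.8]  # invented values, not study data
    toy_labels = [0, 0, 1, 1]
    print(auroc(toy_scores, toy_labels))
    print([voicing_category(s) for s in toy_scores])
```

On this scale 0.5 corresponds to chance and 1 to perfect separation, so the reported value of 0.8871 indicates that the AC coefficient separates the two intervocalic categories well but not perfectly.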
In addition to the intervocalic stops used to train the model, the analysis of phonetic voicing of Lakota stops was applied to three different general phonological contexts: word-initial stops, word-internal morpheme-final stops that constitute word-internal codas, and word-final stops.
Lakota word-initial stops /p/ vs. /b/ are considered to have a voicing contrast in the literature, and thus we analyzed word-initial voiced and voiceless stops separately by using annotation labels that follow the NLD transcription of each word. A Hartigans’ dip test (Hartigan & Hartigan 1985) performed on word-initial stops transcribed as voiceless (102 tokens in our data set) suggested a unimodal distribution (p-value = 0.8041), with results summarized in Figure 2a. However, applying the same test to word-initial stops that are written as <b> or <g> in the dictionary strongly suggested a bimodal distribution (p-value < 2.2 × 10⁻¹⁶).18 Figure 2b shows word-initial stops that are written as voiced <b> or <g> in the dictionary: while some of these are clearly voiced, most were produced without voicing, according to our model. Since all word-initial stops in this corpus were taken from words spoken in isolation, we interpret this result as evidence for optional devoicing of phrase-initial stops.
In contrast to word-initial stops, word-medial morpheme-final stops in Lakota function as codas and are thought to have predictable voicing, though there is debate as to if and where final obstruent voicing occurs (see §2). For the word-medial morpheme-final stops in our corpus that are analyzed as codas (361 tokens), Hartigans’ dip test suggests a bimodal distribution (p < 2.2 × 10⁻¹⁶), as shown by the histogram in Figure 2c. However, dividing this single context into three subcontexts based on the nature of the following segment yields different distributions.
Recall from §2 that morpheme-final stops preceding [p, t, k, tʃ, s, ʃ, χ, h] can be resyllabified as complex onsets, in which case the entire cluster is expected to be voiceless (see example 5d). Figure 2d shows hypothesized resyllabified morpheme-final stops (203 tokens; p = 0.3001) with values close to 0, supporting the analysis of morpheme-final coda stops as typically voiceless in this context. At the same time, over forty tokens maintain voicing, suggesting that, although regressive devoicing (with resyllabification) is the norm in this context, it is not obligatory. (An alternative analysis without resyllabification is also possible: regressive devoicing optionally yields voiceless clusters in these contexts.) To highlight the possibility of a voiced coda preceding a voiceless consonant [p, t, k, tʃ, s, ʃ, χ, h], we offer the spectrograms in Figure 3.
Complementary contexts are those where morpheme-final stops in coda position precede [l, m, n, j, w] or glottal stop. Let us first consider codas followed by one of the voiced sonorants [l, m, n, j, w]. Under our analysis, these codas undergo coda stop voicing (5b.ii) and are expected to be voiced, since there is no contextual devoicing that occurs in this context. Figure 2e shows morpheme-final coda stops before sonorants (123 tokens; p = 0.9697), with values close to 1 and unimodal distribution, supporting an analysis of final stops as voiced in this context. Spectrograms in Figure 4 offer examples of this sound pattern. Though regressive voicing from following [l, m, n, j, w] might be suggested, the results summarized in Figure 2f above and Figure 2g below provide further support for a general process of coda stop voicing.
Figure 2f shows morpheme-final coda stops before glottal stop (thirty-five tokens; p = 8.266 × 10⁻⁵), suggesting bimodal distribution. While the majority of tokens are fully voiced, there are ten fully voiceless tokens, which we interpret as instances of the optional fusion/resyllabification of [b.ʔ], [ɡ.ʔ] to onset [pʼ], [kʼ], respectively, included under 5d, and illustrated in Table 11 by variants tȟog.ʼí.yA, tȟo.kʼí.yA ‘to speak a foreign language’. To illustrate the sequence of voiced stop in the coda followed by glottal stop, we offer the spectrograms in Figure 5. Since glottal stop is voiceless, the voicing of oral stops in the coda preceding glottal stop must have another source. We argue that the source is coda stop voicing (5b.ii).
The results tabulated in Figs. 2c–f support our analysis in §2. Oral stops /p/ and /k/ are voiced in the syllable coda: when followed by a voiceless obstruent or /h/, they are usually devoiced and resyllabified, forming complex onsets; when followed by glottal stop, optional fusion may give rise to voiceless ejectives.
Further support for the process of obstruent voicing in the coda is found in the last distributional category of word-final (and phrase-final) oral stops. The dip test performed on word-final stops (150 tokens) suggested a bimodal distribution (p < 2.2 × 10⁻¹⁶). Further, Fig. 2g shows that, among these, there are more stops categorized as voiced than as voiceless. Since all stops in word-final position are analyzed as phonological instances of /p/ or /k/, voicing in this context can be interpreted as a consequence of coda voicing (5b.ii). Another interesting aspect of word-final stops visible in Fig. 2g is that there is a greater proportion of tokens in the intermediate prediction values. Given that all of our word-final tokens were also phrase-final, we interpret this as evidence of a gradient phrase-final devoicing process. In phrase-medial position, where a word-final stop is followed by something other than a voiceless obstruent, it is voiced, as predicted by 5b.ii. To highlight the common occurrence of voiced oral stops word-finally, we offer the spectrograms in Figure 6. Further discussion of the acoustic properties of coda [b] and [ɡ] is offered in §3.2.
Overall, our acoustic analysis of voicing in Lakota stops suggests that stop voicing is contrastive intervocalically and word-initially, but is noncontrastive elsewhere. The distribution of phonetic voicing in Lakota oral stops supports an analysis of phonological syllable-final voicing of /p/ and /k/ to [b] and [ɡ], respectively (5b.ii). In addition to coda voicing, four distinct devoicing processes are observable in the data. First, in phrase-initial position, there is often devoicing of (underlying) voiced stops (Fig. 2b). Second, in phrase-final position, coda voicing can be obscured by gradient phrase-final devoicing (Fig. 2g). Third, in phrase-medial position, a regular process of regressive obstruent devoicing is triggered by the class of voiceless sounds [p, t, k, tʃ, s, ʃ, χ, h], which, under our analysis, can be related to resyllabification (see example 5d). A final optional process that can yield voicelessness is fusion/resyllabification of [b.ʔ], [ɡ.ʔ] to onset ejectives [pʼ], [kʼ], respectively (5d). While all of these processes (with the possible exception of phrase-initial devoicing) have been described in the literature, this is the first study presenting acoustic evidence in support of syllable-final oral stop voicing together with these devoicing processes.
3.2. Evidence that [b] and [ɡ] are oral stops, not fricatives or glides
The general question of whether processes of final obstruent voicing exist is complicated by the existence of many sound patterns of final lenition or weakening where voicing is coupled with reduced stricture. For example, though Blevins (2006a) suggests that Tundra Nenets may show a pattern of final obstruent voicing, Kiparsky (2006:228–29) argues that Tundra Nenets /p, t, k/ vs. /b, d/ should be treated as a ‘contrast of tenseness, not of voicing’, noting that ‘/b/ and /d/ are lax, and articulated with various kinds of lenition’. In order to determine whether Lakota has a sound pattern of coda voicing for /p/ and /k/, then, it is necessary to show not only that these sounds are voiced, as we have in §3.1, but also that the voiced segments are oral stops. Here we present acoustic evidence of significant closure duration, absence of fricative noise, release bursts, and low energy levels that are all characteristic of oral stops in contrast to fricatives, taps/flaps, and glides.
Where hand-eye inspection of waveforms and spectrograms is mentioned, these waveforms and spectrograms are extracted from the NLD data. All waveforms, spectrograms, and annotations were plotted with the computer program Praat (Boersma & Weenink 2019) and were exported as 600 dpi PNG files.
As detailed in §2, oral stops and oral fricatives contrast in Lakota, but due to coda stop voicing and coda fricative devoicing, the only environment where voiced and voiceless stops and fricatives contrast, and where accurate measures of duration can be made, is intervocalic position. Figure 7 shows durational measurements for all of the obstruents in our database in intervocalic position. Note that the horizontal line within each box marks the median, while the circle marks the mean. The durational measurements have been speaker-normalized by converting them to z-scores and back to milliseconds for easier comparison. Three outliers (> 3 SDs) have been removed under the assumption that they originated from hesitation or similar effects, resulting in a total of 348 intervocalic obstruents analyzed.
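The speaker normalization mentioned above can be sketched as follows. The text does not state how z-scores were mapped back to milliseconds, so the back-transform used here (rescaling by the pooled mean and standard deviation across speakers) is an assumption, and the function name and the toy durations are hypothetical.

```python
import statistics

def normalize_durations(durations_by_speaker):
    """Z-score durations within each speaker, then map back to milliseconds.

    Assumption: the back-transform uses the mean and standard deviation of
    all tokens pooled across speakers. Each speaker needs at least two
    tokens with non-identical durations.
    """
    pooled = [d for vals in durations_by_speaker.values() for d in vals]
    grand_mean = statistics.mean(pooled)
    grand_sd = statistics.stdev(pooled)
    normalized = {}
    for speaker, vals in durations_by_speaker.items():
        mean = statistics.mean(vals)
        sd = statistics.stdev(vals)
        normalized[speaker] = [(d - mean) / sd * grand_sd + grand_mean
                               for d in vals]
    return normalized

if __name__ == "__main__":
    # Invented closure durations in ms, for illustration only.
    toy = {"M": [60.0, 80.0, 100.0], "F": [90.0, 110.0, 130.0]}
    for speaker, vals in normalize_durations(toy).items():
        print(speaker, [round(v, 1) for v in vals])
```

After this step both speakers share the same mean and spread, so remaining differences between consonant categories are not an artifact of one speaker's overall speech rate.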
Voiced stops have the shortest closure durations of all obstruents, with averages ranging from 62 ms for [ɡ] to 77 ms for [b]. Plain voiceless stops show longer closure durations, averaging 110–113 ms, with voiced fricatives slightly longer (99–120 ms), and voiceless fricatives (excluding /h/) the longest of all (148–163 ms). These durational measurements are unremarkable and are consistent with interpreting sounds transcribed as [b] and [ɡ] as voiced oral stops, since flaps or taps would be expected to show shorter closure durations averaging around 20 ms. We performed a linear mixed-effects analysis of the duration measurements of intervocalic consonants, with random intercepts by Speaker and Word, using the nonnormalized values to avoid singularity. A Tukey post-hoc comparison with Benjamini-Hochberg adjustment for multiple testing showed significant differences between [b] and [ɡ] and any other given consonant at the 0.001 level, as shown by Table 12.
However, since the measurements in Fig. 7 are taken only from intervocalic [b] and [ɡ], checks are necessary to ensure that the same segments in syllable coda show similar closure durations. Since we could not automate this process due to the phonotactics, we resorted to hand-eye-checking spectrograms of fully voiced tokens from data like that presented in Fig. 2e, Fig. 2f, and Fig. 2g. Waveforms and spectrograms in Figs. 4–6 exemplify our findings: voiced coda consonants transcribed as [b] and [ɡ] in the NLD orthography that were measured as fully voiced by the acoustic analysis in §3.1 show closure durations falling within the central range of the distribution bars for voiced stops in Fig. 7. There is no evidence of significant temporal reduction or articulatory undershoot that might be associated with flapping/tapping or general lenition.
Absence of fricative noise during closure phase of voiced oral stops
To determine whether voiced codas were produced as stops vs. fricatives or glides, we performed two tasks. One was an automated measure of spectral energy, described below under ‘Overall low spectral energy of voiced oral stops’. The other was hand-eye-checking of spectrograms of fully voiced tokens from data presented in Figs. 2e,f and Fig. 2g, looking for noise above the voicing bar that would indicate incomplete stop closure. Spectrograms in Figs. 4–6 exemplify our findings: voiced coda consonants transcribed as [b] and [ɡ] in the NLD orthography that were measured as fully voiced by the acoustic analysis in §3.1 show silent closure durations consistent with oral stop production. These stops do not show aperiodic components in the spectrum during the closure phase that would indicate frication, nor do they show formant structure indicative of vowel-like productions. There is no evidence of significant noise at frequencies above the fundamental that might be associated with incomplete oral closure, frication, or general lenition.
Presence of release bursts for voiced oral stops
A salient characteristic of (released) oral stops is their release bursts. Release bursts are not generally found in flaps/taps and are absent in fricatives and glides. To check for release bursts, we resorted to hand-eye-checking of spectrograms of fully voiced tokens from data presented in Figs. 2e,f and Fig. 2g, looking for a spike of noise at the end of the closure that would indicate release of intraoral air pressure. Spectrograms in Fig. 4 and Fig. 6 exemplify positive findings: voiced coda consonants transcribed as [b] and [ɡ] in the NLD orthography that were measured as fully voiced by the acoustic analysis in §3.1 sometimes show visible release bursts before sonorants and in word-final position before pause. Note that in Fig. 4a, the [ɡ] of sagyéla (F) ‘in a dried, hard, or stiff condition’ shows a double burst: this is not uncommon before sonorants, where preceding voiced stops are often described as being followed by a short open transition. Presence of release bursts in medial and final codas was taken as positive evidence that the sounds measured as fully voiced instances of [b] and [ɡ] were indeed oral stops, and not fricatives or glides.
The finding of release bursts in phrase-final position, as illustrated in Fig. 6 for [b] in ǧób ‘snoring’, [ɡ] in oglág ‘telling one’s own, relating’, [b] in gléb ‘vomiting’, and [b] in ób ‘with them’ was interesting for two reasons. First, as just mentioned, it supported the analysis of these sounds as voiced oral stops. Second, it focused our attention more on properties of the release. As detailed in §2, like the majority of Lakota words with final consonants, these words are contracted forms that have undergone final vowel loss: ǧób from ǧópA ‘to snore’, oglág from oglákA ‘to tell one’s own (as a story, decision, name)’, gléb from glépA ‘to vomit’, and ób from ópȟa ‘to take part’. Though our findings remain preliminary, formant structure in the final release burst of these tokens is consistent with production of a final voiceless or partially devoiced vowel. In §4, we discuss aspects of the diachrony of final stop voicing in Lakota and suggest an earlier stage (attested in Assiniboine) of intervocalic voicing concomitant with final vowel devoicing. Voiceless release of Lakota final voiced stops could continue this earlier hypothesized sound pattern.
Overall low spectral energy of voiced oral stops
If the coda consonants measured as fully voiced in §3.1 (Figs. 2e,f and Fig. 2g) are oral stops, as opposed to flaps/taps, fricatives, glides, or other sonorant sounds, then they will be expected to have lower overall energy envelopes than these other sound types. In order to assess this factor, we used quantitative measures across the entire data set. First, we applied a high-pass filter (350 Hz) to the audio data with the R package ‘wrassp’ (Winkelmann et al. 2017) to remove the lower frequencies, where most of the energy of voiced stops is expected to be concentrated. Then, we performed a root-mean-square analysis (rms), also with the R package wrassp, to the resulting filtered data. The rms values were speaker-normalized (z-normalization) and converted back to decibels. We excluded intervocalic voiced stops from the analysis, because there seems to be consensus on their status as stops, and because our focus was whether voiced consonants that result from the proposed process of stop coda voicing (5b.ii) were produced as oral stops or with some other manner of articulation. Our corpus of purported voiced coda stops included 299 tokens. These were compared with the entire set of approximants and voiced fricatives in our database, which included 219 tokens. Relevant results of our measurements are plotted in Figure 8.
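The two signal-processing steps described above (high-pass filtering at 350 Hz, then a root-mean-square level in decibels) can be sketched in Python as follows. This is not the wrassp implementation: the first-order filter is a deliberately simple stand-in whose characteristics differ from whatever filter design wrassp applies, the decibel reference of 1.0 is arbitrary, and the function names are hypothetical.

```python
import math

def high_pass(x, fs, cutoff_hz=350.0):
    """First-order (RC-style) high-pass filter; a simple stand-in only."""
    if not x:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / fs
    alpha = rc / (rc + dt)
    y = [x[0]]
    for i in range(1, len(x)):
        # Pass changes in the input, let slow components decay.
        y.append(alpha * (y[-1] + x[i] - x[i - 1]))
    return y

def rms_db(x, ref=1.0):
    """Root-mean-square level of x in dB relative to ref (arbitrary here)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return 20.0 * math.log10(max(rms, 1e-12) / ref)

if __name__ == "__main__":
    fs = 16000
    n = int(0.2 * fs)
    low = [math.sin(2 * math.pi * 100 * i / fs) for i in range(n)]    # voicing-bar range
    high = [math.sin(2 * math.pi * 2000 * i / fs) for i in range(n)]  # above the cutoff
    print(round(rms_db(high_pass(low, fs)), 1))
    print(round(rms_db(high_pass(high, fs)), 1))
```

The design rationale follows the text: a fully closed voiced stop has most of its energy in the low-frequency voicing bar, so after filtering it should retain little energy, whereas fricatives and approximants keep substantial energy above the cutoff.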
Table 13 shows the results for the linear mixed-effects model constructed to test the intensity differences (in decibels; dB) between coda voiced stops and the approximants and voiced sibilants in the language. We added random slopes and random intercepts by speaker and word. Nonnormalized values were used in order to avoid singularity due to the incorporation of random effects by speaker into the model. Results shown are for Tukey post-hoc comparisons with α-level compensation for multiple comparisons using Benjamini-Hochberg adjustment.
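For readers unfamiliar with the Benjamini-Hochberg adjustment mentioned above, the step can be sketched in a few lines. This is a generic illustration of the standard procedure on hypothetical p-values, not the authors' analysis script; the function name is an assumption, and the mixed-effects model and Tukey comparisons themselves are not reproduced.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values from four pairwise comparisons.
example = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A comparison is then treated as significant when its adjusted value falls below the chosen α-level.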
The comparison shows a categorical difference in amplitude between voiced stops in the coda and all voiced fricatives and approximants, and no statistically significant difference between [b] and [ɡ]. Average spectral amplitude of voiced stops is below 60 dB (55 and 57 dB), while voiced fricatives have averages of 64 dB, and the averages for glides are 66 and 74 dB. These results suggest that [b] and [ɡ] form a category of low-energy sounds, consistent with their production as oral (voiced) stops, distinct from the category of voiced fricatives and voiced approximants.
3.3. Final voicing as alternation
As reviewed in §2, many researchers describe voicing alternations for Lakota when a stem ending in …VTV- (T a voiceless oral stop or affricate) is produced as consonant-final. For example, sákA ‘to be dry, to be dried until hard or stiff’ has medial [k], but is reduplicated as sagsákA, where truncation (5a) results in /k/ being pronounced as [ɡ] in coda position (5b.ii). In this section, acoustic analysis of alternating consonants in morpheme-alternant pairs supports these descriptions: intervocalic instances of /k/ and /p/ are typically voiceless, while phrase-final and preconsonantal instances of the same consonants in the same morphemes are often voiced in nonvoicing contexts, namely word-finally and before voiceless consonants and glottal stop. Since our tokens are taken from recordings made for dictionary purposes, we have limited numbers of these pairs, making a statistical analysis unsuitable. However, the acoustic measurements presented below demonstrate that the voicing alternations described for Lakota are attested. Spectrograms and waveforms of each word in Figures 9 and 10 are accompanied by a plot showing the details of the voicing of each relevant segment by means of AC coefficients as a function of time. As discussed in §3.1, AC coefficients go from 0 (completely voiceless) to 1 (completely voiced), with 0.5 as the voicing threshold. Time in the x-axis has been normalized to the duration of each segment to facilitate the visualization of the proportion of voicing. Note that the last time point in the plot might be affected by the voicing of the following segment.
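The way AC coefficients translate into a proportion of voicing over a time-normalized segment can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the coefficient values are invented, and a frame is counted as voiced only when its coefficient is strictly above the 0.5 threshold (the text does not say how a value of exactly 0.5 is treated).

```python
def voiced_proportion(ac_coefficients, threshold=0.5):
    """Fraction of frames whose AC coefficient exceeds the threshold."""
    if not ac_coefficients:
        raise ValueError("segment has no frames")
    voiced = sum(1 for c in ac_coefficients if c > threshold)
    return voiced / len(ac_coefficients)


def normalized_time(n_frames):
    """Evenly spaced time points from 0 to 1 across one segment."""
    if n_frames < 1:
        raise ValueError("segment has no frames")
    if n_frames == 1:
        return [0.0]
    return [i / (n_frames - 1) for i in range(n_frames)]


# Invented coefficients for one stop closure: voicing dies out halfway.
closure = [0.9, 0.8, 0.4, 0.2]
points = list(zip(normalized_time(len(closure)), closure))
share = voiced_proportion(closure)
```

Normalizing time in this way lets segments of different durations be compared on the same 0 to 1 axis.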
In Figs. 9 and 10, the comparison between voiced and voiceless stops shows that, whereas prototypical intervocalic voiceless stops show a decrease in voicing that follows carryover voicing from the previous vowel (and spans most of the duration of the stop), word-final voiced stops do not show any significant decrease in voicing until the end of the segment. This comparison can also be made in reduplicated forms, where the first morpheme appears in its reduced form (with coda stop voicing) preceding the full form (with an intervocalic voiceless stop). Figure 11 shows the reduplicated form ȟobȟópA ‘to be extremely attractive’ as produced by the female speaker.
3.4. Summary of acoustic analysis
Our acoustic analysis of voicing during oral stop closure presented in §3.1 is compatible with the view that Lakota oral stops /p/ and /k/ are voiced in coda position. This voicing is sometimes obscured by devoicing processes, including phrase-final gradient devoicing, assimilation to voicelessness before voiceless obstruents and /h/ (which may be coupled with resyllabification into onset), and devoicing that occurs with optional fusion with a following glottal stop, yielding an ejective. In order to support the view that voiced /p/ and /k/ are oral stops, §3.2 presented an array of acoustic properties consistent with stop production. Sounds that were categorized as fully voiced codas in §3.1 were shown in §3.2 to have significant closure durations in the normal range for voiced stops, absence of fricative noise during closure, release bursts, and low energy levels that are all characteristic of oral stops in contrast to fricatives, taps/flaps, glides, or other sonorant sounds. Section 3.3 presented morphologically related word pairs with voiceless and voiced alternants, highlighting the differences in voicing in a given stop with regard to its phonological context (intervocalic vs. word-final). In sum, there is acoustic evidence that Lakota has a sound pattern of oral stop coda voicing, supporting the impressionistic descriptions of earlier researchers.
4. Hypothesized origins of Lakota obstruent voicing patterns
Recall from §1 that certain phonological markedness accounts like that of Kiparsky (2006, 2008) predict the nonexistence of synchronic sound patterns of final obstruent voicing. In contrast, evolutionary phonology (Blevins 2006a,b) suggests that such patterns may exist, but may be extremely rare due to the numerous phonetic factors that result in devoicing of obstruents in word-final position. One of Kiparsky’s (2006) arguments against the evolutionary approach involves historical pathways of change. He suggests that, were there no grammatical markedness constraint against final obstruent voicing, it could easily evolve by the succession of two independently common sound changes: (i) intervocalic voicing: VTV > VDV, and (ii) final vowel loss: VDV# > VD#. Since, he argues, no clear cases of this kind are in evidence, the absence of synchronic final obstruent voicing supports the existence of phonological markedness constraints which demand that unmarked (voiceless) obstruents are preferred in positions of neutralization. We have shown above that there is acoustic support for a synchronic sound pattern of final voicing in Lakota. While this, in itself, is enough to call markedness accounts into question, we continue to be interested in the question of why sound patterns of this type are uncommon, and how they might evolve.
To this end, we offer below some hypotheses regarding the evolution of Lakota final obstruent voicing. After introducing the Proto-Siouan sound system in §4.1, we suggest Lakota /l/ < *d in §4.2. With this sound change established, the synchronic alternation of /p/, /t/, /k/ with [b], [l], [ɡ] can be seen to reflect a uniform historical voicing of oral stops /p/, /t/, /k/ > [b], [d], [ɡ] prior to the *d > l sound change. In §4.3, we suggest that this voicing process was similar to the historical pathway suggested by Kiparsky (2006): where he suggests intervocalic voicing followed by final vowel loss, we suggest intervocalic stop voicing concomitant with vowel reduction as a consequence of anticipatory coarticulation of the final vowel gesture.
4.1. The Proto-Siouan sound system
Proto-Siouan is reconstructed by Rankin, Carter, and Jones (1998) with the consonant inventory shown in Table 14, a system that also underlies the Comparative Siouan dictionary (Rankin et al. 2015), from which all reconstructions in this section are taken, unless noted otherwise. The Proto-Siouan consonant system in Table 14 consists minimally of a series of voiceless unaspirated stops and fricatives, *p, *t, *k, *s, *š, *x, laryngeals *ʔ and *h, three sonorants, *w, *r, and *y (which represent labiovelar, central rhotic, and palatal approximants, respectively), and two ‘funny’ resonants *W and *R, which are similar to *w and *r but more obstruent-like (Larson 2016).19
Given the numerous clusters in Proto-Siouan and the clear status of many postaspirates from heteromorphemic C+h clusters (Rankin et al. 1998:2), it is possible that all of the sounds in parentheses in Table 14 constitute historical clusters or, in the case of preaspirates *hp, *ht, *hk, allophonic variants of voiceless unaspirated stops intervocalically before accented vowels (Rankin et al. 1998:1, Larson 2016). Apart from *W and *R, which we return to below, the most notable feature of this inventory is its lack of nasal stops, though /n/ and /m/ are contrastive sounds in all Dakotan languages.20 Nasalization in Proto-Siouan is hypothesized to be a feature of vowels only, with vowels *i, *e, *a, *o, *u, *į, *ą, *ų, and their long counterparts. By the time of Mississippi Valley Siouan (MVS), resonants preceding nasalized vowels were all pronounced as nasal stops, and subsequent to this point, were phonologized as nasal stops when vowel nasalization was lost.21 While Proto-Siouan might seem odd in having nasalized vowels but no nasal stops, the same pattern exists in Mandan, and at least one non-Siouan language of North America had a similar pattern. Eyak, an extinct Na-Dené language, had no distinct nasal stops, but did have nasalized vowels: phonetic [m] occurred as a variant of /w/ before nasalized vowels, while phonetic [n] occurred as a variant of /l/ in the same contexts (Maddieson 2013).
With the possible exception of *W and *R, there is no contrastive voicing in Proto-Siouan. Though marginal, the protophonemes *W and *R are sounds thought to have been like *w and *r, but more obstruent-like and with different reflexes: where *w is usually continued as /w/ or /m/ in most daughter languages, *W is continued as /w/, /b/, /mb/, or /p/, showing more obstruent-like behavior; and where *r is usually continued as /r/, /k/, /n/, /ð/, or /y/, *R is continued as /r/, /l/, /d/, /nd/, or /t/, also showing more obstruent-like behavior. Given this, it would not be unreasonable to interpret Proto-Siouan *W as weak [b] (a voiced oral labial stop or tap with short closure duration) and Proto-Siouan *R as a weak [d] (a voiced oral dental stop or tap with short closure duration), which might suggest a limited incipient voicing contrast for the bilabial and dental stops alone.
Earlier researchers (e.g. Rankin 2001, Larson 2016, Rood 2016) have focused on the singleton consonant reflexes of *W and *R as evidence of their obstruent-like phonetics in the protolanguage. However, an additional piece of evidence for obstruent-like status of *W and *R is their behavior in clusters and, more specifically, the apparent assimilation in voicing that occurs when a voiceless obstruent preceded one of these sounds. Crosslinguistically, voice assimilation in obstruent clusters is common, while voice assimilation between a sonorant and a preceding obstruent is rare (Mielke 2008, 2013). If *W and *R are treated phonetically as voiced obstruents, we can better understand the source of the unusual process of presonorant voicing in Lakota.
Recall that, in Lakota, /p/ and /k/ are pronounced as voiced [b] and [ɡ], respectively, before sonorants /w, l, m, n/: [bl] from /pl/; and [ɡw], [ɡl], [ɡn], and [ɡm] from /kw/, /kl/, /kn/, and /km/, respectively (example 2 and Table 8). In the nonshaded rows of Table 15, these Lakota clusters are compared with cognate clusters in other dialects.
The first shaded row illustrates l/d/n correspondences independent of cluster phonotactics, and the second shaded row shows that fricatives do not assimilate in voicing to a following consonant.23 Of specific interest are the bolded singleton consonants and consonant clusters in Table 15. In the shaded cells, singleton *R and cluster *WR are continued as voiced singletons and clusters (whether sonorants or obstruents), respectively, in all dialects. In the nonshaded cells, bolded singletons are the result of dialectal voicing processes. In Lakota, as already discussed, the pattern is one of presonorant voicing, as in gwéza < *kWéza. However, in Yanktonai, the pattern appears to be one of regressive voice assimilation from a voiced oral stop to a preceding oral stop, as in gbéza < *kbéza < *kWéza. (Sisseton-Santee and Assiniboine show no evidence of obstruent voicing in parallel contexts.) In cases where a preform contains an initial *WR cluster, voicing is continued in both consonants independently, as in bdaská, mnaská < *WRaska. Returning to the typological observation above, crosslinguistically, presonorant voicing in obstruent-sonorant (OR) consonant clusters is extremely rare. For example, in Indo-European, where OR clusters are reconstructed, there is no evidence in any of the 300+ languages of initial *kl > gl. In contrast, voicing assimilation between obstruents in tautosyllabic clusters is the norm crosslinguistically (Mielke 2008, 2013), with notable exceptions in languages like Hebrew, Khasi, and Tsou (Kreitman 2008, Blevins 2010). Given these facts, a view of *W and *R as phonetically voiced obstruents is indirectly supported by their role in triggering voice assimilation in word-initial clusters like those illustrated in Table 15. 
We conclude that one possible view of Proto-Dakota *r and *w (or, as written here, *R and *W) is that these sounds were voiced, oral dental and labial stops *d and *b, respectively, with short closure durations and weak bursts.
4.2. Lakota /l/ from *d
If Proto-Dakota *r/*R was pronounced something like [d], and Lakota /l/ is a regular reflex of this sound, then a sound change of *[d] > [l] is motivated.
Some support for Lakota /l/ < *d can be found in data related to the process of coda voicing. Recall that in the modern language, /p/, /t/, /k/ are pronounced as [b], [l], [ɡ], respectively, in the coda when phrase-final devoicing and/or assimilation to a following voiceless obstruent are absent. If voicing had begun as a natural phonetic process, the voiced counterparts of /p/, /t/, /k/ would be [b], [d], [ɡ], not [b], [l], [ɡ].
Evidence for earlier coda *[d] might be found in what could be interpreted as old compound forms. Consider the compound lotkhú ‘the underjaw of animals’ (Dakota dotkhú), from loté ‘throat’ + khúl ‘down, below, underneath’. Recall that the expected coda form of /t/ is [l], so we expect **lolkhúl, not the attested lotkhú. The attested form is consistent with earlier /t/ > [d] in the coda, followed by local regressive devoicing under resyllabification (5d). Loss of final /l/ of /khul/ may also be an indicator of the age of the compound. Another word that may represent an old compound is the interjection lotkȟúŋkešni ‘oh, by the way; incidentally; I forgot to mention; to go back to the subject’, with a variant lolkȟúŋkešni. While the etymology of this term is unclear, final -kešni appears to be from /-kA-šni/ ‘kind.of-not’, while one might speculate that -kȟuŋ- could be the root of kȟuŋyáŋ ‘quickly, promptly’, an archaic term that usually begins a command. Here again, [t] can be interpreted as a devoiced/resyllabified instance of earlier *d, with the [l]-variant a continuation of coda *[d].24 A final example of this kind is the pair of variants pȟelʼížaŋžaŋ, pȟetížaŋžaŋ ‘lamp, light’ from pȟéta ‘fire’ + ižáŋžaŋ ‘to be lit, give light’. Here, [t] in pȟetížaŋžaŋ appears to be a reduction of [.tʼ] < [d.ʼ], showing the stage prior to *d > l, while [l] in pȟelʼížaŋžaŋ is the modern (post *d > l) version of the same compound.
Another potential indicator of *d as the source of Lakota /l/ occurs in reduplicated forms. Recall that there is evidence for coronal dissimilation in reduplication: by rule 5c, l.T → g.T, where T is a coronal consonant: from lúta ‘red’, we have luglúta ‘red’ (inan.pl). If we are correct in hypothesizing Lakota l < *d, the historic dissimilations (after coda voicing) are *d.d > g.d (> gl), *dš > gš, *ds > gs, and *dč > gč. In contrast, assuming l < *l (or any coronal sonorant) implies dissimilations like *l.l > g.l, *l.š > gš, *l.s > gs, and *l.č > gč. Since dissimilation of place is more likely when the target consonants share manner features, and since a simple change of place (under place dissimilation) is more natural than a change of place and manner, the first set of dissimilatory changes seems more likely than the second set. Since forms with reduplicated coda [ɡ] for /l/ are arguably old (in some cases replaced with productive [l] forms), they are consistent with an earlier stage of the language where the voiced coda form of /t/ was [d]. Under this analysis, sótA ‘clean, clear’ has the old reduplicated form soksóta (< *sog-sóta < *sod-sóta), supported by the fact that the base sótA ‘clean, clear’ is obsolete in modern Lakota. In contrast, yusótA ‘to use something up, expend’ shows a productive reduplication yusólsota, where, in the synchronic phonology, coda /t/ is realized as [l].
If we are correct in hypothesizing an earlier stage of Lakota where /p/, /t/, /k/ were realized as [b], [d], [ɡ], respectively, in the coda with evidence for the persistence of voiced stops [b] and [ɡ] presented in §3, then a subsequent sound change *d > l is needed to account for the distribution of /l/ in the language. This sound change appears to be context-free and is supported by two kinds of evidence discussed above: compounds where first members have [t]- and [l]-final variants, and reduplicated forms where first members have [ɡ]- and [l]-final variants. In both cases, obstruent-final variants reflect a stage of the language before the *d > l sound change: in compounds, *d > [t] in (resyllabified) complex onsets, and in reduplicated forms, *d > [ɡ] under dissimilation with a following coronal obstruent. In sum, many instances of Lakota [l], including most of those in morpheme-initial and intervocalic position of inherited roots, are direct reflexes of Proto-Siouan *r or *R, which, we suggest, merged as *R, a voiced dental [d]-like obstruent, pre-Lakotan *d, which underwent *d > l.25 Other instances of surface [l] are continuations of Proto-Dakota *t, pre-Lakota *t, which underwent regular voicing (see §4.3) to *d, and subsequent *d > l.
4.3. Precursors of final voicing in intervocalic coarticulatory voicing
If Lakota coda /l/ < *d, then there appears to be evidence of a historical obstruent voicing taking /p, t, k/ to [b, d, ɡ]. Since final obstruent voicing is exceedingly rare crosslinguistically, and since we have evidence in all cases that these stops were historically medial, not final, it seems reasonable to investigate the possibility that synchronic coda voicing is somehow a transform of an earlier intervocalic voicing process.
As we demonstrated in §2, syllable-final voicing in Lakota is associated with truncation (5a). Since Proto-Dakota had only open syllables, the closed syllables created by truncation are new structures, not subject to any preexisting constraints on codas in the language. Nevertheless, there are conditions on truncation: in Lakota, truncation occurs in noncompound forms only when the final vowel is preceded by a single consonant (not a cluster) and when that single consonant is an obstruent or /l/ (not /n/, /w/, or /y/).26 The picture becomes more complicated when truncated forms are compared across dialects, with representative data in Table 16, where ‘?’ marks suspect forms from the published literature that require audio verification.
Truncated forms in other dialects show voiced codas, though, as in Lakota, the voicing can be allophonic (as for /g/), phonemic (for /p/ to [b]), or phonemic and coupled with a shift from obstruent to sonorant (for /t/ to [n]). Two notes are in order regarding the Assiniboine data. First, there is a regular rule of coda nasalization: /p/ and /t/ become [m] and [n], respectively, when they are word-final, or when followed by a sonorant consonant; /k/ is unaffected, since there is no velar nasal in the language (Cumberland 2005:70–71). Second, there is some evidence that Assiniboine /n/ and /m/ were historically postploded nasals [nd] and [mb] when followed by oral vowels (Cumberland 2005:25–26). Where the (albeit functionally weak) contrast between intervocalic /t/ and intervocalic /l, d, n/ (< *R) is neutralized in Lakota and Assiniboine, the contrast is maintained in d-dialects by /d/ > [d] but /t/ > [n], where, perhaps, as in Assiniboine, [n] continues an earlier [nd]. Though there have clearly been distinct paths of phonologization of voicing for coda consonants, we suggest a phonetic change of /p, t, k/ > [b, d, ɡ] for Dakotan generally when intervocalic before final unstressed (voiceless) vowels, supported by modern Assiniboine sound patterns, as discussed below.
The phonetic realization of /p, t, k/ > [b, d, ɡ] for Dakotan that we hypothesize is one that is grounded in coarticulation of a weak final vowel. We suggest that the weak articulation of the vowel is, in part, a manifestation of the anticipation of articulatory features from the vowel to the preceding consonant. Part of this anticipation yields voicing of the consonant. In other words, the weaker the articulation of the final vowel, the more vowel-like (in terms of voicing/sonority) the articulation of the preceding consonant. Ultimately, the final vowel may be devoiced, as described below for Assiniboine, or lost altogether, as described above for Lakota. Under our analysis, it appears that something close to Kiparsky’s sequence of sound changes involving intervocalic voicing followed by final vowel loss has occurred. What makes Lakota special, or unusual, is that when the final vowel is not reduced, there is no evidence of intervocalic voicing. It is only when final vowels are significantly reduced, or lost altogether, that voicing of the once-intervocalic derived coda is realized.
Our analysis of Lakota voicing as a modified version of intervocalic voicing is consistent with facts about the phonetics of closely related languages, and with our own data, where some voiceless stops show voicing between vowels. Many Siouan languages show intervocalic voicing of (voiceless) stops in running speech. One of these is Assiniboine (Nakota), as described by Cumberland (2005).27 In Assiniboine, the voiceless unaspirated stops are voiced intervocalically, but voiceless elsewhere (Cumberland 2005:18). Of particular interest to this study is Cumberland’s (op. cit.) description of words like /thoka/ ‘enemy’, spoken in isolation. She transcribes this word as [toga] where the intervocalic /k/ of /thoka/ is voiced, and the final vowel of the same word is devoiced. Acoustic data is provided to support this analysis, and Cumberland is explicit in detailing a rule of word-final vowel devoicing, and in saying that ‘voiceless vowels will still trigger intervocalic voicing so that, even when the vowel is virtually inaudible, evidence of its presence may be seen in a preceding obstruent’ (Cumberland 2005:78). Our phonetic study of Lakota revealed similar ‘intermediate’ stages in some tokens where final voiced obstruents were released into voiceless vowels. Spectrograms illustrating final voiceless vocalic release can be reviewed in Fig. 6. Though truncation typically eliminates final weak vowels in Lakota, we suggest that obstruent voicing, as a historical process, reflects an earlier stage of the language where voicing was noncontrastive and intervocalic stops were voiced before unstressed, reduced final vowels, a pattern extended to all intervocalic stops in Assiniboine. As noted above, what makes Lakota unusual is that when final vowels are not reduced, there is no evidence of intervocalic voicing. This is why we view historical obstruent voicing in Lakota as a phonetic process concomitant with final vowel reduction and loss.
While we have only touched the surface of the history of obstruent voicing in the Dakotan languages, the evidence in §§2 and 3 strongly suggests a synchronic sound pattern whereby Lakota /p/ and /k/ are voiced to [b], [ɡ] in syllable-final position. We attribute the full set of alternations, including /t/ pronounced as [l], to an earlier sound change of intervocalic /p, t, k/ > [b, d, ɡ] concomitant with reduction of final unstressed vowels, including vowel devoicing and loss. This sound change was followed by a context-free change of *d > l in Lakota, supported by data presented in §4.2.
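The ordering of the hypothesized changes summarized above can be made concrete with a toy derivation. This is only a schematic sketch: the input forms are simplified ASCII spellings loosely based on stems cited in the article (accents, aspiration, and ablaut vowels are omitted), the function and variable names are hypothetical, and the conditions on truncation are reduced to a single rule, so the output should not be read as a reconstruction.

```python
VOICED = {"p": "b", "t": "d", "k": "g"}
VOWELS = set("aeiou")


def derive(form):
    """Return the schematic stages of the hypothesized derivation."""
    stages = [form]

    # Step 1: voice a stop that stands between a vowel and a weak final vowel.
    if (len(form) >= 3 and form[-1] in VOWELS
            and form[-2] in VOICED and form[-3] in VOWELS):
        form = form[:-2] + VOICED[form[-2]] + form[-1]
    stages.append(form)

    # Step 2: lose the final vowel after the newly voiced stop (simplified).
    if len(form) >= 2 and form[-1] in VOWELS and form[-2] in "bdg":
        form = form[:-1]
    stages.append(form)

    # Step 3: context-free *d > l, the change specific to Lakota.
    form = form.replace("d", "l")
    stages.append(form)
    return stages


for stem in ("topa", "sota", "saka"):
    print(" > ".join(derive(stem)))
```

Because step 3 applies only after voicing, the toy derivation yields a lateral for the coronal stop and voiced stops for the labial and velar, which mirrors the [b], [l], [ɡ] asymmetry discussed in the text.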
5. Summary and implications for phonological theory
In §2 we described Lakota sound patterns that appear to involve a synchronic process whereby stems ending in …VTV (T a voiceless unaspirated stop or affricate) can be pronounced as …VD (D a voiced consonant). Under this process, /p, t, k/ are pronounced as [b, l, ɡ] in syllable-final position. Our laboratory phonology study in §3 provides phonetic evidence that /p/ and /k/ undergo voicing and maintain their obstruent quality. The segments reported as [b] and [ɡ] in syllable-final position in the Lakota literature are not lenited segments: they are typically produced with significant closure duration, are voiced for the duration of closure when not in a devoicing environment, and have bursts consistent with the production of oral stops. In cases where voicing was partial or absent, it was explained by context-sensitive devoicing: phrase-finally there is gradient devoicing, while before voiceless obstruents and /h/, voiced obstruents tend to be devoiced. The Lakota voicing process is an unusual one from a phonological perspective, as it is neither wholly neutralizing nor wholly allophonic. In the case of /t/ to [l], a robust contrast exists, so final voicing can be seen as neutralizing, though, of course, this alternation can be viewed in terms other than final voicing. In the case of /p/ to [b], a weak contrast exists between /p/ and /b/; here voicing is neutralizing, but barely. In contrast, for /k/ to [ɡ], voicing is purely allophonic: Lakota, like other Dakotan languages, has /k/, but no /g/. Given the nonuniformity of the phonological alternations involved, the psychological reality of the process of final voicing may also be nonuniform. In this context, it is interesting to note that most orthographies represent [b] as <b> and [ɡ] as <g> (e.g. tópa, tób ‘four’; íyotakA ‘sit down’, íyotagkhiyA ‘to cause someone to sit down’) (Rood & Taylor 1985:7) and, further, that ‘the most widely used transcription systems do not consistently write these instances of /p/ and /k/ with b and g when a voiceless obstruent follows’, suggesting that speakers could be aware of the contextual devoicing described in §3 as well. Our historical discussion in §4 provided support for a historical pattern of intervocalic /p, t, k/ > [b, d, ɡ] before final unstressed vowels, concomitant with devoicing and/or loss of these vowels, as still found in modern Assiniboine and some tokens in our Lakota database. Under this analysis, Lakota underwent a later *d > l change, distinguishing it from the d-dialects.
The historical source of Lakota final obstruent voicing in an earlier intervocalic stop voicing process, followed by final vowel loss, is of theoretical import. Kiparsky (2006) adopts a general (violable) universal constraint prohibiting obstruent voicing in syllable codas on the basis of the following argument: if this general universal constraint did not exist, how could we explain the fact that, of the many languages with intervocalic voicing and the many languages with final weak vowel loss, not one shows a progression whereby intervocalic voicing of /p, t, k/ to [b, d, ɡ] is followed by final vowel loss, yielding a sound pattern where only voiced stops are found word-finally? In earlier work, Blevins (2006a,b) suggested Somali as such a case modulo more recent final devoicing, but the argument from Lakota is stronger. In Lakota, we have clear phonetic data supporting voicing of /p/ to [b] and /k/ to [ɡ] in the coda (see §3), with comparative evidence from Assiniboine (Cumberland 2005) attesting intermediate stages of intervocalic voicing with final vowels intact, with final vowels devoiced, and with final vowels lost. Further, we can see why final voicing is precarious and rarely attested. In Lakota, the process was rendered nonuniform by the *d > l change and is currently evidenced only where later processes of phrase-final devoicing and regressive devoicing are not found. We are led to conclude that Kiparsky’s (2006) argument is flawed. Consistent with evolutionary phonology (Blevins 2004, 2006a,b, 2015), Lakota illustrates a possible but fleeting instance of syllable-final obstruent voicing. In languages where intervocalic voicing of plain voiceless stops is the norm (Kakadelis 2018), loss of final vowels can yield final-voicing sound patterns as well. However, synchronic sound patterns of this kind may be rare due to the high frequency of final obstruent devoicing and voice assimilation, two processes with well-studied phonetic bases.
We conclude that the Lakota language, as currently spoken, shows evidence of a rare pattern of syllable-final obstruent voicing. The oral stops /p/ and /k/ are voiced to [b] and [ɡ] in syllable-final position, while /t/ is pronounced as [l] in the same position. Since we have examined oral stops only in the context of oral vowels, sound patterns in the context of nasalized vowels may differ and are deserving of further study. Obstruent devoicing is also in evidence. Fricatives /z, ž, ǧ/ are devoiced to [s, ʃ, χ] in syllable-final position. In addition, there is evidence of variable initial stop devoicing, gradient phrase-final devoicing, and fairly regular regressive devoicing of [b] and [ɡ] before voiceless obstruents and /h/. Our conclusion is consistent with earlier descriptions of Lakota phonology (e.g. Rood & Taylor 1985, 1996, Ullrich 2008, 2011, 2018, 2019, Ullrich & Black Bear 2016), building on these in three ways. First, we support the sound pattern of syllable-final stop voicing with acoustic analyses of syllable-final stops in the language. Second, we use the same acoustic evidence to argue for gradient phrase-final devoicing and regressive devoicing before voiceless obstruents. Third, we offer a preliminary historical explanation for the final voicing process and for its rarity: final voicing is a consequence of earlier, conditioned intervocalic voicing, preserved only when the final vowel was reduced or lost. Finally, we highlight the importance of the Lakota sound patterns to phonological theory. Traditional markedness accounts predict that such sound patterns do not and should not exist. In contrast, phonetic-historical accounts like evolutionary phonology explain skewed patterns of voicing in terms of common phonetically based voicing and devoicing tendencies, allowing for rare cases of final-obstruent voicing like that found in Lakota.
Like many indigenous languages of the Americas, Lakota is endangered. The acoustic component of this study serves to highlight the central role of endangered language documentation in phonological description and theory (Blevins 2007) and the continued importance of producing high-quality recordings in language documentation, as exemplified by the NLD. Independent of its scientific merit, we hope that this study of one indigenous language of the Great Plains will inspire other researchers to bring important data from indigenous languages to bear on central issues in linguistic theory and encourage colleagues around the world to continue their excellent work in high-quality language documentation.
The Graduate Center, CUNY
365 Fifth Avenue
New York, NY 10016
Campus de la Nive
Château-Neuf, 15 Place Paul Bert
64100 Bayonne, France
2620 N Walnut St Ste. 810
Bloomington, IN 47404
revision invited 3 October 2019;
revision received 18 November 2019;
accepted 21 December 2019]
* We are grateful to the Lakota Language Consortium for allowing us to access and analyze sound files from the New Lakota dictionary for the purposes of this study, and to Ben Black Bear, Jr. and Iris Eagle Chasing, whose contributions to the New Lakota dictionary made this research possible. We also want to thank Language co-editor Megan Crowhurst, associate editor Khalil Iskarous, and three anonymous referees for their insightful comments, which have greatly improved this article. The first author pursued this work under the auspices of the Endangered Language Initiative, generously supported by the Provost’s Office of The Graduate Center, CUNY. The second author would like to thank Christopher Carignan, Aitor Egurtzegi, and Raphael Winkelmann for their helpful suggestions. The work of the second author was supported, in part, by an ERC Advanced Grant to Jonathan Harrington.
1. Throughout this article, we use the following abbreviatory conventions: acc: accusative case, adv: adverb, adverbial, C: consonant (nonsyllabic segment), caus: causative, cont: contracted (or truncated) form, diss.: dissimilation, Eng.: English, e.o.: ‘everyone’, Fr.: French, gen: genitive case, inan: inanimate, IPA: International Phonetic Alphabet, n.a.: not applicable, MVS: Mississippi Valley Siouan, NLD: New Lakota dictionary, pl: plural, prs: present, red: reduplication, resyllab.: resyllabification, sb.: ‘somebody’, sg: singular, sth.: ‘something’, suff: suffix, V: vowel (syllabic segment), vbz: verbalizer, VOT: voice onset time, wd.: word, <p:>: pause (in phonetic transcriptions), ‘.’ (period): used to mark syllable boundary.
2. Basque dialects differ as to whether /p, t, k/ are voiceless unaspirated or voiceless aspirated in onset position. In aspirating dialects, aspiration is limited to prevocalic position.
3. The description of Ingush, a Nakh-Daghestanian language of the Caucasus, reported to have final obstruent devoicing in Blevins 2006a, should be slightly modified. Following Nichols (2011:80), Ingush has only partial devoicing of voiced obstruents in word-final position. Acoustically, neutralization with the voiceless series is incomplete, though some speakers cannot perceive a contrast (Nichols 2011:8).
4. Final obstruent devoicing can be viewed as one of five common domain-final laryngeal sound patterns. The other four are: final deaspiration, as in Marathi (Houlihan & Iverson 1979); final aspiration, as in Sierra Popoluca (Elson 1947, de Jong Boudreault 2009); final deglottalization, as in Ganza (Smolders 2016); and final glottalization, as in Standard Thai (Henderson 1964, Harris 2001). In this study, we restrict discussion to final obstruent devoicing in languages with true obstruent voicing contrasts. In true voicing languages, the voiced series show significant voiced closure durations, and the voiceless series have insignificant VOTs. See Jansen 2004 and Beckman, Jessen, & Ringen 2013 for further discussion of true voicing languages, and Kakadelis 2018 for analysis of domain-final laryngeal sound patterns in ‘no-voicing’ languages.
6. The Siouan language family is also sometimes referred to as Siouan-Catawban to include the more distantly related Catawban languages. See Ullrich 2018:33–35 for a brief summary of language relationships within this family, and Parks & DeMallie 1992 on Lakota-Dakota dialects.
8. Though /bu/ and /pu/ contrast in Lakota, there is no /wu/, leading some to suggest that bu < *wu historically. However, since bá ‘to blame sb.’ and wa- contrast (cf. wa-1 indefinite object marker, wa-2 ‘with a knife’, wa-3 1sg for class I verbs), the contrastive status of /b/ seems secure in some vocalic contexts. Note that Lakota wa-1 corresponds to ba- in Dakota dialects and to ma- in the two Nakota languages.
9. Within evolutionary phonology, synchronic patterns of this kind are expected. In other frameworks, the synchronic voicing of /p, t, k/ to [b, l, ɡ] might be viewed as odd, since [b, l, ɡ] do not appear to form a natural class. However, with /l/ specified as [−continuant], one can view the spell-out of a voiced, coronal, non-continuant in Lakota as [l]. For other examples where /l/ shows [−continuant] behavior, or patterns essentially as /d/, see Mielke 2008.
10. For the purposes of this discussion, we follow Table 5 in treating aspirated, velar aspirated, and ejective oral stops as either single segments or clusters, indicated here by parentheses in Table 8. Whether or not these are treated as clusters, the same restrictions can be seen to be in effect.
11. There are a small number of potential initial CCC clusters limited to fast-speech forms: kčhí, kičhí ‘with sb.’; kčhíčho, kičhíčho ‘to invite e.o.’; kčhíčhopi, kičhíčhopi ‘a feast, party’; kčhíšnala, kičhíšnala ‘with him/her/it alone’; kčhízA, kičhízA ‘to fight e.o.’; kčhó, kičhó ‘to invite sb., call to’. However, if /čh/ is an aspirated affricate, then these can all be analyzed as CC clusters.
12. Note: though /h/ and /ȟ/ contrast word-initially (e.g. há ‘the skin or hide of sb., sth.’ vs. ȟÁ ‘to bury sb., sth.’; hé ‘animal horns or antlers’ vs. ȟé ‘mountain, mountain ridge’), they do not contrast after word-initial/syllable-initial stops or affricates before oral vowels. In this position, [h] is found after [tʃ] and before high vowels [i] and [u]; [χ] is found before nonhigh back/central vowels [a] and [o], and before nasalized /aŋ/ and /uŋ/; [h] and [χ] are in free variation before [e]; and [h] is found before nasalized /iŋ/, unless it is the ablaut vowel, in which case [χ] is found, resulting in a limited case of contrast. Alternations support these phonotactics. For example, we find phíŋkpa ‘the top of anything’ from pȟá ‘the principal part of sth.’ + íŋkpa ‘tip’, where pȟ → [ph] / __ i.
13. Another source of coda consonants is glottal stop insertion at the end of statements after vowel-final words: alí [ʔaˈliʔ], [ʔaˈli] ‘he climbed up on it’.
14. Further work is needed to determine whether truncation of forms ending in …pȟV, …kȟV, …tȟa is generally possible, limited to adverbs, or lexically determined. At present, examples with …kȟV and …tȟa are limited to adverbs anúŋkȟa/anúŋg ‘on both sides’ and tókȟa/tog ‘how, how is it?’, while examples with …pȟV include the adverbs mentioned in the text, as well as verbs ičhápȟA/ičháb ‘to get accidentally stabbed’, mničhópȟA/mničhób ‘to wade in water’.
15. Truncation, voicing, and regressive and phrase-final devoicing (see §3.1) result in voiceless oral stop/nasal stop variants when the nuclear vowel is nasalized, since a voiced stop after a nasalized vowel is realized as a (voiced) nasal stop. For example, truncated variants of núŋpa ‘two’ are núŋp [nʊ̃p] and núŋm [nʊ̃m] < [nʊ̃b].
16. Rood and Taylor (1996:449) continue: ‘When a nasalized vowel precedes these sounds, they may further shift to a nasal consonant’. Recall that we are restricting our attention to oral syllables in this study, partly for reasons of time, but also because our preliminary findings are that stops are sometimes only partially nasalized in this context, and measuring their acoustic properties is more complex than examining those following oral vowels.
17. We preferred AC coefficients over other metrics (percentage of voicing into closure or absolute voicing duration) for the following reasons. Measured across the whole stop closure, AC coefficients provide information comparable to that provided by percentage of voicing into closure, with the additional benefit of reducing the researcher's degrees of freedom and thus the risk of a false positive. We selected the median over the mean, in part to avoid skewed results due to potential artifacts or subpar segment alignments. The results of this method should not differ greatly from those of methodologies in other works on voicing in endangered languages (e.g. Coetzee & Pretorius 2010), where closure duration and voicing into closure are measured (often by hand, which can complicate the marking of boundaries), and the percentage of voicing into closure is then calculated. In such work, segments with a voicing-over-closure percentage of over 50% are considered voiced, and those below 50%, voiceless. In our study, the median of all measurements of a given stop (taken every 10 ms) marks the 50% point of the distribution: if the median is over 0.5, the segment is considered voiced, while segments with a median below 0.5 are considered voiceless.
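The decision rule in this note can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in the study: the function names and the sample values are hypothetical, and treating a median of exactly 0.5 as voiceless is an assumption, since the note specifies only values over and below 0.5. The sketch also computes the share of measurements above the threshold, as a rough analogue of percentage of voicing into closure, to show why the two criteria tend to agree.

```python
from statistics import median

# Cutoff stated in the note; the handling of exactly 0.5 is an assumption.
VOICING_THRESHOLD = 0.5


def classify_stop(ac_coefficients):
    """Label one stop closure from AC coefficients sampled every 10 ms."""
    if not ac_coefficients:
        raise ValueError("need at least one measurement")
    if median(ac_coefficients) > VOICING_THRESHOLD:
        return "voiced"
    return "voiceless"


def fraction_voiced_frames(ac_coefficients):
    """Share of measurements above the cutoff (analogue of % voicing into closure)."""
    if not ac_coefficients:
        raise ValueError("need at least one measurement")
    above = sum(c > VOICING_THRESHOLD for c in ac_coefficients)
    return above / len(ac_coefficients)


# Hypothetical measurements, invented for illustration only.
mostly_voiced = [0.9, 0.8, 0.85, 0.3, 0.7]
mostly_voiceless = [0.6, 0.2, 0.1, 0.15, 0.05]

for name, values in [("A", mostly_voiced), ("B", mostly_voiceless)]:
    print(name, classify_stop(values), fraction_voiced_frames(values))
```

In the first hypothetical token a single low measurement does not change the label, which reflects the robustness to artifacts that motivates the use of the median over the mean.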
18. Initial [b] and [ɡ] are rare in Lakota. There were only thirty-eight tokens in our database, including words with initial <gl> clusters, where voicing of [ɡ] is predictable and allophonic, as described in §2.
20. In contrast, Mandan appears to have inherited the Proto-Siouan pattern without change: it has phonetic [n] and [m] only as allophones of nonnasal consonants adjacent to nasalized vowels.
22. For the purposes of this discussion, we have replaced standard Proto-Dakota *r and *w with *R and *W to signal to the reader that these sounds may have had obstruent-like properties. Proto-Siouan *pr- and *wr- (from vowel syncope) merge as MVS *[b]r, and *wr merges with MVS *[b]r in Proto-Dakota, where [b] could be represented as *p, *b, *w, or *W, since there is no contrast in this context. We write these as *WR here.
23. Since our interest is voiced obstruents, forms like Lakota-Dakota mní ‘water’ < *Wni < Proto-Siouan *wa-rį́:, where nasal assimilation takes *bnį > mní in Proto-Dakota, are not included in 5.
24. These [l]/[t] coda variants in preconsonantal position should not be confused with [lʼ]/[t] variants in compounds whose second members are vowel-initial. In arguably new compounds like Lakȟóta-iyÀ ‘to speak Lakota’ and Lakȟóta-iyàpi ‘Lakota language’, where the second member is vowel-initial, there are two common variants, one with intervocalic [lʼ] (often lenited to [l]) and one with [t], as in, for example, Lakȟól’iyà, Lakȟótiyà ‘to speak Lakota’. In these cases, the variant with [lʼ] is considered older and more formal than that with [t].
25. In a few cases, Dakota /l/ continues Proto-Dakota *y-.
26. Here again we see /l/ patterning with the obstruents, suggesting historical *d. For prevocalic /m/, there is only one example where truncation may be observed: the proper noun Iktómi ‘the trickster of Lakota myths’ can be contracted to Iktó.
27. Recent work on the Stoney variant of Nakhóta by the third author of this article suggests a similar sound pattern, with regular voicing of intervocalic stops. There is at least one language, Gitksan, where an acoustic study demonstrates voiced allophones of voiceless unaspirated stops in prevocalic position (Rigsby & Ingram 1990).