Peaks Beyond Phonology: Adolescence, Incrementation, and Language Change
What is the mechanism by which a linguistic change advances across successive generations of speakers? We explore this question by using the model of incrementation provided in Labov 2001 and analyzing six current changes in English. Extending Labov's focus on recent and vigorous phonological changes, we target ongoing morphosyntactic(-semantic) and discourse-pragmatic changes. Our results provide a striking validation of the incrementation model, confirming its value as a key to understanding the evolution of linguistic systems. However, although our findings reveal the predicted peak in the apparent-time progress of a change and corroborate the female tendency to lead innovation, there is no absolute contrast between men and women with respect to incrementation. Instead, quantitative differences in the social embedding of linguistic change correlate with the rate of the change in the speech community.
language change, adolescence, incrementation, vernacular reorganization
A recurrent finding in studies of language change is that innovations initially spread slowly, reach a maximum rate at mid-course, and then slow down as they near completion (Altmann et al. 1983, Bailey 1973, Kroch 1989, Labov 1994: 65-67, 2001; see also Weinreich et al. 1968). This trajectory has been enshrined in the form of the now-familiar S-curve, a statistical curve in which the frequencies are cumulative (see Figure 1).
In accounting for this pattern, two issues arise. The first concerns the manner in which a change comes to be transmitted across successive generations, known as the TRANSMISSION PROBLEM (Labov 2001:416ff.). The second concerns the mechanism by which a change advances, known as the INCREMENTATION PROBLEM (see Labov 2001). The challenge is to understand how transmission and incrementation are implemented within the speech community. Studies of diachronic change have shown that linguistic change is typically discontinuous (Joseph & Janda 2003:20), often proceeding chaotically in fits and starts, bursts of change (Lass 1997:304), and exaggerations (Janda 1999:329). At the same time, it must necessarily be the case that:
as the members of each identifiable generation recreate language for their own use, language is continuously being integrated into a society that is not uniform in terms of age but still takes in new members seamlessly from new entries into it (i.e. new individuals).
In this way, the progression of a change cannot be divorced from the social context in which it is embedded; transmission and incrementation are intricately entwined. Studies of synchronic change attempt to model this process in living speech communities. Such endeavors have revealed that when data from younger groups (e.g. preadolescents and adolescents) are included in an apparent-time analysis, there is a distinctive, repeating pattern: a peak in usage of incoming forms occurs among speakers who are approximately seventeen years of age. If this is the natural trajectory of incrementation in language change, how can it be explained?
In early research it was assumed that the apparent-time trajectory 'would continue in the same direction as the age group became younger and younger' (Labov 2001: 454). This pattern is what Labov refers to as 'the monotonic function of age' (2001: 171). Empirical data have not borne this out. When studies have included preadolescents and adolescents among the age cohorts under investigation, the results have revealed a crest in the curve of change (e.g. Ash 1982, Cedergren 1973, 1988) rather than the expected continued upswing. The frequency of incoming (i.e. innovative) forms is highest among adolescents; preadolescents are consistently found to use incoming forms less frequently, not more frequently, than their immediate elders, while postadolescents also use the same forms less frequently. The difference in usage between these critical age groups creates a peak in the apparent-time trajectory. A paradigmatic example comes from Cedergren's milestone study of (ch) lenition in Panama Spanish (Cedergren 1973), replicated more than a decade later (Cedergren 1988). It is this peak, illustrated in Figure 2 from Cedergren 1988, that forms the point of departure for the current investigation.
When the adolescent peak was observed for a number of changes, it was initially suggested that the incoming forms had reached their limit and were receding (see Labov 2001:454-55). However, the recurrence of the drop-off and its coinciding peak within similar age cohorts across a variety of studies suggested a more principled explanation. The foundational work on the issue was done by Labov (2001), who proposed a model of linguistic change based on the logistic function (Verhulst 1845). This model is intended to capture the logistic trajectory of change, evidenced by the S-curve, so as to account for the linear advance of change across adult cohorts, that is, monotonicity (Labov 2001:447, 454). Because a model of language change must also account for the peak in the apparent-time curve, however, something supplementary is essential.
Labov's (2001) argumentation draws on empirical evidence from ongoing sound change in Philadelphia, an urban community in the United States. A question that needs to be asked is whether Labov's model has broader applicability and, in particular, can be extended to other components of the grammar. As such, we consider ongoing morphosyntactic(-semantic) and discourse-pragmatic changes in another urban community in North America, Toronto, Canada. Our data confirm that the incrementation model is applicable to changes beyond phonology, thereby adding to the building picture that an adolescent peak may be 'a general requirement of change in progress' (Labov 2001:455). At the same time, the results suggest that certain assumptions of the model may have to be revisited. The data thus provide a new layer of findings and observations to add to the explanatory value of this 'first approximation' (Labov 2001:454) of the basic model.
The discussion is organized as follows. We first provide context for the issues raised in this analysis, offering some essential background, and then review two possible models of language change, uniform incrementation and logistic incrementation. The latter draws on the logistic function, a cumulative distribution function that produces an S-curve. This is the model argued for in Labov 2001 and we focus here on its assumptions and predictions. As part of that discussion we review the evidence for the adolescent peak and discuss theoretical implications of its existence. We then outline basic issues in variationist methodology. At this point we introduce the variables used in the current study, describing in detail the procedures followed for their analysis. We then present our findings and discuss the results in relation to Labov's model.
2.1. Apparent Time.
APPARENT TIME is a theoretical construct, one of a set of methodological tools that has provided the basis for a synchronic approach to understanding language change. Analytically, it functions as a surrogate for real time, enabling diachrony to be viewed from a synchronic perspective. In an apparent-time study, generational differences are compared at a single point and are used to make inferences about how a change may have taken place in the (recent) past.1 Age differences are assumed to be temporal analogues, reflecting historical stages in the progress of the change. The technique has been in use since the early 1900s (e.g. Gauchat 1905, Hermann 1929) and has become a keystone of variationist sociolinguistics (Bailey 2002, Bailey et al. 1991, Chambers et al. 2002, Labov 1963, 1966).
The apparent-time construct 'relies on the assumption that individual vernaculars remain stable throughout the course of an adult lifetime' (Bailey 2002:320). The linguistic behavior of any adult cohort can only be interpreted as reflecting a distinct stage of a change if it is assumed that usage has remained essentially fixed or static over the course of the lifetimes of those individuals. There is an increasing body of research, however, documenting ongoing change throughout the lifespan. This type of change can only be ascertained by comparing data from two or more periods, achieved via either a trend study or a panel study. A TREND STUDY involves resampling the same age range of speakers in the same speech community at different points in time (Bailey 2002, Sankoff 2006). Large-scale contemporary trend studies are rare but some important resampling enterprises have been undertaken, allowing for comparisons of the same community in real time. The communities in question include Panama City, Panama (Cedergren 1988), Montreal, Canada (Blondeau 2001, Sankoff & Blondeau 2007, Sankoff et al. 2001), Martha's Vineyard, US (Blake & Josey 2003, Pope et al. 2007), Norwich, England (Trudgill 1988), and various locales in Finland and Sweden (Nahkola & Saanilahti 2004, Nordberg 1975, Nordberg & Sundgren 1998, Paunonen 1996). A PANEL STUDY involves resampling the same speakers. This means that individuals must be followed for an extended period of time, making panel studies an even rarer commodity in the field (though see, for example, Brink & Lund 1979, Nahkola & Saanilahti 2004, Palander 2005, Robson 1975, Sankoff & Blondeau 2007, Tagliamonte 2007). It is also possible to use diachronic data to investigate the issue of change throughout the lifespan. In this case, researchers rely on written documents in the historical record produced by the same individual over their lifetime (e.g. Nevalainen & Raumolin-Brunberg 2003:83-109, Raumolin-Brunberg 2005, 2009).
The studies cited here are situated in distinct temporal, geographic, and linguistic locales and they focus on different levels of the grammar (phonology, morphology, and discourse pragmatics), yet they converge in demonstrating that individuals are capable of linguistic adjustment throughout adulthood. The critical question is, what is the nature of this adjustment?
Trend studies have consistently established an increase in the rate of occurrence of incoming forms. Cedergren (1988) reported that in the thirteen years intervening between her initial and follow-up investigations in Panama, the frequency of (ch) lenition had increased among speakers between the ages of forty and seventy years. Tagliamonte and D'Arcy (2007b) demonstrated that in a period of just seven years, speakers in the Canadian province of Ontario had substantially increased their use of quotative be like. Panel studies similarly show individuals changing the frequency of features involved in change. In the Swedish town of Eskilstuna, Nordberg and Sundgren (1998) found that speakers who had been under the age of forty-five years in 1967 (Nordberg 1975) had continued to decrease their use of the local plural neuter suffix -ena in favor of the standard form -en. Blondeau's (2001) study of Montreal French revealed that speakers interviewed in 1971, 1984, and 1995 had continuously increased their use of simple personal pronouns (on, tu, vous). Sankoff (2004) performed a case study of two of the boys involved in the British documentary series Seven up, filmed in seven-year increments beginning in 1963 when the children were seven years old. She found that both boys had made 'some significant phonetic . . . alterations to their speech after adolescence' (Sankoff 2004:136). There is thus strong consensus from both trend and panel studies that individuals can shift the frequency of linguistic features well into adulthood. It seems, therefore, that the assumption of postadolescent linguistic stability that underlies much sociolinguistic research may not reflect the actual situation as accurately as initially believed (see also Labov 2001:446-47).
Nonetheless, adult participation in change does not undermine the utility of apparent time as a heuristic of ongoing change (Boberg 2004:266, Sankoff & Blondeau 2007: 32). Studies aimed at testing the empirical validity of apparent time as an analytical construct (e.g. Bailey 2002, Bailey et al. 1991) have concluded that it remains a viable tool in sociolinguistic research, functioning as 'an excellent surrogate for real time evidence' (Bailey 2002:329). There are a number of reasons for this. First, if speakers shift in accordance with ongoing change during their adult lives, then apparent time in fact underestimates the rate of change. Second, it seems that generational change (change in successive generations of speakers) and communal change (change by individuals as they age) function in concert. In Nahkola and Saanilahti's (2004) study of fourteen changes in a rural Finnish town in 1986 and 1996, ten supported the apparent-time prediction: the changes had increased among the same speakers in the ten-year interval, but they had advanced even further among the subsequent generation (see also Cedergren 1988). Third, and perhaps most critically, changes in adulthood appear to have a restricted influence; it remains the case that individuals are more capable of change earlier in life than later (e.g. Palander 2005). Sankoff (2004:136) stresses that although the boys in the Seven up documentaries had made adjustments to their speech, neither had 'made himself over linguistically'. It also seems that change is more often the exception, representing small-scale increments among a small proportion of adults (e.g. Brink & Lund 1979, Sankoff et al. 2001) with many not participating in ongoing change (e.g. Raumolin-Brunberg 2009). Further, in the small number of studies that have considered different types of variables, it is clear that the nature of the change and its stage of development are critical for determining lifelong adjustments. 
For example, only changes with relatively high rates of use (and thus robust variability) in the earlier of the periods investigated lead to ongoing change at a later period (Nahkola & Saanilahti 2004:89, Raumolin-Brunberg 2009). Thus, virtually all of the evidence for ongoing linguistic change in adulthood derives from an increase in frequency of forms. Little is known about the grammar underlying the frequency effects, such as linguistic constraints on variation. Apparent-time research on the development of quotative be like has provided a hint that while adults are able to increase their rate of use of this incoming form, they do not participate in ongoing grammatical readjustments that mark the form's entrenchment in the quotative system (Tagliamonte & D'Arcy 2007b:213).
In sum, the building evidence indicates that individuals can increase their FREQUENCY of usage. Extenuating factors such as the state of development of the change, the level of use and the extent of variability, and the surrounding social embedding of the change and the implications of its use for social advancement, however, all play a part in speakers' participation in linguistic change during adulthood. Moreover, despite questions concerning the validity of apparent time (Bailey 2002, Bailey et al. 1991, Tillery & Bailey 2003), even allowing for ongoing change across the lifespan, both trend and panel studies continue to support the apparent-time construct as 'well grounded' (Sankoff 2004:137). It remains a 'powerful lens for interpreting the past' (Sankoff 2006:115). Key for our argumentation here is that apparent time is an empirically sound and effective tool for modeling linguistic change synchronically. In the discussion that follows, we continue to assume that individuals attain a critical threshold of constancy of their grammar in early adulthood, at least with respect to the operation of constraints on variation.
2.2. Gender Asymmetry, Vernacular Reorganization, and Stabilization.
A point easily taken for granted in language change is that rates of use of particular linguistic forms may differ among subsequent generations of speakers. Grandparents, parents, and their children talk differently, a pattern that perpetuates until such time as a change either comes to completion across all generations or the variation between forms stabilizes. The compelling question underlying this rather simple observation is, how does this happen?2 A somewhat limited view might rely on language-internal factors (as defined by various theoretical approaches) as the forces behind transmission, for example, the effects of frequency and analogy (as in exemplar theory; see Bybee 1985, Johnson 1997, Pierrehumbert 2001), the pressure to maintain symmetry (as in Martinet's approach to sound change; see Martinet 1952, 1955), the maximization of simplicity or transparency (as in classical generative historical linguistics; see Kiparsky 1968, 1971, Lightfoot 1979), the reordering of constraints (as in optimality theory; see, for example, Nagy & Reynolds 1997), or the resetting of parameters (see Labov 2001: 261-63). Such a view, however, fails to consider the social context of language use. Change does not occur in a vacuum. Indeed, one of the most consistent findings of sociolinguistic research has been the gender-asymmetric nature of the process.
A general sociolinguistic finding is that women lead linguistic change (Labov 1990). This reflects an overarching tendency that holds for the majority of recorded changes. Labov's (1990:218-19, 2001:284) survey of sound change suggests that in the few instances where men are at the forefront, the changes are relatively isolated ones (i.e. they are restricted to individual sounds and do not involve the system as a whole, such as in the case of chain shifts). The generalization of female dominance in language change is somewhat more complicated, however, in that men do not simply follow their female counterparts. Gender asymmetry develops early in the progression of a change, and the confluence of evidence from a number of studies in a range of speech communities (e.g. Buenos Aires (Wolf & Jiménez 1979), Philadelphia (Labov 2001), Seoul (Chae 1995)) suggests that the underlying cause is that once a change becomes associated with women, men either retreat from or resist the incoming form. The resultant split in the linguistic behavior of men and women is most exaggerated when the rate of change is most vigorous (i.e. during the upswing of the S-curve). During this period, men are often a full generation behind, displaying levels similar to women of the previous generational cohort. This gap has been observed since early dialectological work by Gauchat (1905), who focused on ongoing diphthongization and monophthongization in the Swiss village of Charmey, and has been corroborated in a number of subsequent studies, including devoicing of [ž] (Wolf & Jiménez 1979), raising of (oh) (Chae 1995), fronting of (aw) (Chambers & Hardwick 1985), fronting of (eyC) and (aw) (Labov 2001:306), and so forth. Once established, gender asymmetry remains visible until the change nears completion, at which point the difference between men and women is gradually and successively reduced (Labov 2001:306-9).
A more subtle kind of gender asymmetry arises in the formation of the vernacular. A strong and recurrent finding from dialect research is that children acquire the vernacular of their primary caretaker (Kerswill 1996b, Kerswill & Williams 2000, Labov 2001). In most cases, this caretaker is a woman (most often the mother). Hence, the vernacular that children first speak is that of their female caretakers.3 Since it is clear that children must at some point come to speak differently from these caretakers in those cases where changes progress, it follows that children must at some point focus on a norm that is different from the one they have acquired.4 This changing of the vernacular is known as VERNACULAR REORGANIZATION (Labov 2001:415) and it is a critical element in the incrementation of linguistic change. Without reorganization, a change will not progress for the simple reason that children and their parents not only speak alike but will also continue to do so. Vernacular reorganization must therefore occur in those cases where a linguistic change advances. This raises two age-related issues: When does reorganization begin and when does it end?
The beginnings of vernacular reorganization have been examined in some detail, most notably in the work of Payne (1980) in King of Prussia, Pennsylvania (in the greater Philadelphia area) and of Kerswill and Williams in Milton Keynes, in southern England (Kerswill 1994, 1995, 1996a,b, Kerswill & Williams 2000). The mounting evidence indicates that shifts begin to take place in the vernacular after the age of four years. Consider, for example, Figure 3, from Kerswill & Williams 1994 (cited in Kerswill & Williams 2000). This figure charts the use of various phonetic realizations of the (ou) variable (i.e. the fronting of [æ I] in Milton Keynes). While the older children in the sample, the eight- and twelve-year-olds, appear to be converging on a fronted diphthong with a front rounded off-glide (i.e. ), the youngest children display a distribution similar to that of their caregivers, favoring a realization with centralized onsets and back rounded off-glides (i.e. ).5 Such results suggest that the older children have moved away from this norm (Kerswill 1996a:193), a shift that is first apparent with the eight-year-olds. As summarized by Labov (2001:427), while the four-year-olds reflect the dialect of their caretakers and 'are not yet registering any strong linguistic influence from the speech community', the eight-year-olds show 'a striking departure from this pattern'. From these and other similar results, we can conclude that vernacular reorganization is already underway by age eight and its inception must be situated in the window between this time and age four.6
What seems less clear is the point at which vernacular reorganization ceases. We know that certain types of changes are easily incorporated into the vernaculars of speakers of any age, such as the adoption of new lexical items (Kerswill 1996a:179), but the language-acquisition literature has clearly indicated that there are limits to what can be learned beyond a certain age in terms of the phonological, morphological, and syntactic components (and their interfaces) of the mental grammar. In the dialect-acquisition literature, research by Payne (1980), Trudgill (1986), Chambers (1992), and others has clearly illustrated that the ability to master local norms is highly correlated with the age of exposure to the second dialect. The general consensus arising from this work is that under the age of seven, children 'will almost certainly acquire a new dialect perfectly', but over the age of fourteen 'almost certainly will not' (Chambers 2003: 179).7 Evidence such as this suggests that, after a period of lability, the grammatical system stabilizes and the vernacular becomes fixed.
3. Incrementation: Models of Language Change.
Incrementation refers to the process by which 'successive cohorts and generations of children advance a change beyond the level of their caretakers and role models' (Labov 2007:346), resulting in transmission over many generations. Investigation of the mechanism responsible for pushing change forward in a continuous direction invokes two particular models that are discussed and tested by Labov (2001). The first is uniform incrementation. The second is logistic incrementation.8 These models differ in significant respects, yet in Labov's theory of incrementation both are considered to play important roles in language change. They share the same underlying assumptions regarding gender asymmetry, vernacular reorganization, and vernacular stabilization. Specifically, these models are designed to capture the incrementation of female-dominated changes, since this appears to be the default option. The models therefore incorporate the generation gap between men and women and assume that while children of both sexes inherit their initial increment (i.e. rate of use) from their mothers, only girls increase this increment via vernacular reorganization. The onset of this shift is set at the age of four years. It is further assumed that this process then continues until a point of stabilization, which is set at seventeen years of age. As Labov (2001:447) is quick to note, however, the age of seventeen is not intended as a substantive statement. Rather, seventeen serves as a convenient notation in the model. As becomes clear, the age of stabilization has crucial ramifications for predictions regarding the particular point at which the peak appears on the slope of change.
3.1. Uniform Incrementation.
Given the three key assumptions (reorganization, incrementation, and stabilization), uniform incrementation proceeds in the following way. Female children will acquire their caretaker's vernacular and use it until the age of four. They will then begin to 'undergo the influence of the change' (Labov 2001:448), and each subsequent year, up to age seventeen, will see a uniform increase in the unit of change (e.g. shift along the F1 or F2 parameters, frequency, etc.). At the age of stabilization, the reorganized vernacular stabilizes, presumably both in terms of frequency and constraints. Then, when these women become mothers themselves, this vernacular is passed to the next generation of speakers, who also undergo the same process of reorganization, incrementation, and stabilization. The pattern of change resulting from this rather mechanical process would resemble the step-wise trajectory in Figure 4, reproduced from Labov 2001:449, figure 14.2. In this figure, a generation is set at twenty-five years, incrementation is modeled via yearly increments of one, and the time span is set at one hundred years. These time units are arbitrary and should not be considered as meaningful for any purpose other than that of exemplification.
In this trajectory, the change reaches a maximum that is maintained for approximately one generation until it increases again in the following generation. A uniform pattern of incrementation such as this IS confirmed in empirical sociolinguistic research, but its generality is limited. It is not, for example, characteristic of data from female speakers. Rather, Labov (2001:449) observes that uniform incrementation, when it occurs, appears to be restricted to males, where he attributes the progression of change to the generational influence of female caretakers. That is, males inherit the increment of their caretaker but do not participate in its advancement as do females.
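The step-wise pattern described in this section can be illustrated with a minimal sketch. It is not Labov's implementation: the start year 1900, the function names, and the choice to count the first increment at age six (so that a seventeen-year-old has accrued twelve) are assumptions made for illustration. The generation length of twenty-five years and the yearly increment of one follow the arbitrary values given in the text.

```python
GENERATION = 25    # caretaker-to-child gap in years (the text's arbitrary value)
REORG_AGE = 5      # assumed: the first increment is counted at age six
STABLE_AGE = 17    # age at which the vernacular is taken to stabilize
START_YEAR = 1900  # hypothetical first year of the change (assumption)
STEP = 1           # uniform yearly increment


def uniform_level(birth_year, survey_year):
    """Level of the change for a speaker born in birth_year, observed in survey_year."""
    if birth_year + STABLE_AGE <= START_YEAR:
        return 0  # this cohort stabilized before the change began
    # The caretaker is one generation older and already stable at the child's birth.
    inherited = uniform_level(birth_year - GENERATION, birth_year)
    last_age = min(survey_year - birth_year, STABLE_AGE)
    # One uniform step per year of participation, counted only once the change is under way.
    gained = sum(
        STEP
        for age in range(REORG_AGE + 1, last_age + 1)
        if birth_year + age > START_YEAR
    )
    return inherited + gained


def apparent_time_profile(survey_year, ages):
    """Map each age to the level of the corresponding cohort in survey_year."""
    return {age: uniform_level(survey_year - age, survey_year) for age in ages}


if __name__ == "__main__":
    for age, level in apparent_time_profile(2000, range(5, 82, 4)).items():
        print(age, level)
```

Under these assumptions the apparent-time profile consists of plateaus separated by rises, and the level never falls as speaker age decreases, so no peak emerges.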
3.2. Logistic Incrementation.
Given the lack of generality of uniform incrementation, a model other than that in Fig. 4 is necessary to account for the incrementation of change across time. Moreover, uniform incrementation, with its step-wise progression, is not compatible with the S-curve of linguistic change seen in Fig. 1. The logistic progress of a change is responsible for this curve (e.g. Bailey 1973, Kroch 1989, Labov 1994, 2001, Weinreich et al. 1968), created by the differing rates at which it advances. After initially spreading quite slowly, the rate of change increases to a maximum at midpoint before slowing down again as the change nears completion. This type of distribution can be generated by the logistic expression (also known as the Verhulst model or the logistic growth curve; see Verhulst 1845), which is given in 1.9 As implemented by Labov to model language change, K1 in the equation in 1 is the maximum possible change in a unit of time, K2 is the limits of the change, N0 is the starting point, r is the rate of change, and t is time (Labov 2001:450).
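For orientation only, the standard textbook form of the Verhulst logistic curve is given below. This is not necessarily the parameterization in 1: it uses a single limit K where Labov's version distinguishes K1 and K2, and that simplification is an assumption made here. N0, r, and t are the starting point, the rate of change, and time, as in the text.

```latex
% Standard logistic growth (Verhulst); K is the limit of the change.
\frac{dN}{dt} = r N \left(1 - \frac{N}{K}\right),
\qquad
N(t) = \frac{K}{1 + \dfrac{K - N_0}{N_0}\, e^{-rt}}
```

The curve starts at N0 when t = 0, rises fastest when N = K/2, and approaches K as t grows, which yields the S-shape described above.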
To illustrate how this equation generates a logistic curve, the values of the expressions in 1 can be arbitrarily set, after Labov 2001, as follows. As before, it is assumed that the relevant unit of measurement is the year, which is simply an expedient unit and should not be interpreted as significant. With this in mind, N0 is set at year one, t equals time in years, and K1 and K2 are set at one hundred. In other words, our hypothetical change progresses over one hundred years, though any other period is possible. Figure 1 was generated using the increments produced from this equation, the values of which are given by Labov (2001:452, table 14.1). In modeling incrementation in this way, the progress of change at any point is determined by the sum of the increments for each year it has been ongoing. At year ten, for example, n in Fig. 1, the change has a level (or frequency) of 2.65, reached by adding the increments from years two through ten to the initial level in year one. The values are reproduced here from Labov 2001: 452 in Table 1.10
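The summing of yearly increments can be sketched as follows. This is a hedged illustration, not a reproduction of Labov's table 14.1: it uses the standard logistic form with a single limit `K`, and the growth rate `R = 0.1` is an assumed value, so the levels it yields (including the one for year ten) will not match the figures cited above. The function names are hypothetical.

```python
import math

K = 100.0   # limit of the change (the text sets K1 and K2 at one hundred)
N0 = 1.0    # starting level in year one
R = 0.1     # growth rate: an assumed value, not taken from Labov 2001


def logistic_level(year):
    """Level of the change in a given year, with year 1 as the starting point."""
    t = year - 1
    return K / (1 + ((K - N0) / N0) * math.exp(-R * t))


def yearly_increments(last_year):
    """Increment added in each of the years 2..last_year."""
    return {
        year: logistic_level(year) - logistic_level(year - 1)
        for year in range(2, last_year + 1)
    }


def cumulative_level(year):
    """Starting level plus the sum of the increments for years 2..year."""
    return N0 + sum(yearly_increments(year).values())


if __name__ == "__main__":
    increments = yearly_increments(100)
    fastest_year = max(increments, key=increments.get)
    print("level at year 10:", round(cumulative_level(10), 2))
    print("year with the largest increment:", fastest_year)
```

The increments are small at the outset, largest near the midpoint, and small again toward the end, so their running sum traces the S-curve.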
The mechanics of this logistic model are more complicated when transmission is factored in. The assumptions built into the model stipulate that children first acquire their caretaker's vernacular, which in the case of young girls then undergoes reorganization continuing from after year four until stabilization at seventeen.11 In the case of our hypothetical change, tracking its progress at twenty-five-year intervals results in the slopes shown in Figure 5 (from Labov 2001:453, figure 14.5).
The apparent-time perspective in Fig. 5 differs substantially from that in Fig. 4. No longer evident is the step-wise progression of uniform incrementation that characterizes men more so than women. Instead, we see a linear curve, in line with the trajectories that are typical of female speakers in sociolinguistic research (Labov 2001:Ch. 9). Strikingly, these curves do not continue along an upward trajectory as speaker age decreases. Instead, exactly as observed in natural data, they are punctuated by a peak, thereby providing some corroboration that the model conforms to the actual situation (see Labov 2001:455).
The question remains, however, as to how the peaks are produced. The answer falls out from the assumptions of the model. For every year a change has been ongoing, its level becomes successively greater. This means that from year to year, children acquire a larger increment than older children did.12 During vernacular reorganization-when speakers participate in the incrementation process-they carry the change further than speakers from the previous cohort, who necessarily began at a lower level. This accounts for the continued progression of the change, and explains why the slope rises in Fig. 5 from right to left. In other words, the increase from older to younger speakers reflects the monotonic association of frequency with age that is characteristic of change in progress (Labov 2001:460). The drop-off after the maximum, toward the youngest speakers, is attributable to greater length of participation in the change. Observe that the peaks are situated at seventeen, the age of stabilization. By this time, speakers have participated in the change longer than their younger peers have: seventeen-year-olds have had twelve years of incrementation, thirteen-year-olds have had eight years, nine-year-olds have had four years, and so forth. Consequently, the seventeen-year-olds have accrued more increments. In other words, the cause is purely mathematical; the peak is due to the progress of the change in time as each new cohort reaches its maximum. Come the age of seventeen the vernacular stabilizes and incrementation ends. Not yet having reached this critical age, the younger age groups continue to participate and the sum of their increments eventually reaches a new height.
What can be deduced from the above discussion is that by the age of stabilization, every cohort reaches a higher level of change than their immediate elders and younger peers. This is visible in Fig. 5. First, a peak is apparent for every curve, though for reasons to be discussed shortly, it is more marginal in some cases (e.g. 2000) than in others (e.g. 1950). Second, taking the peak from 1950 as an example, this maximum will be surpassed by the next group to reach seventeen, represented here by the thirteen-year-olds. To see this, a straightforward comparison with the curve from 1975 is necessary. The seventeen-year-olds from 1950 correspond to the forty-one-year-olds in 1975. The levels are identical. The reason is simple: having reached the age of stabilization, the seventeen-year-olds of 1950 ceased to participate in the change. Incrementation was halted and thereafter the level remained constant. In contrast, the thirteen-year-olds of 1950 correspond to the thirty-seven-year-olds of 1975. In 1950, they still had another four years of incrementation to undergo. By the time they reached the age of seventeen, they had surpassed the previous maximum through the addition of greater increments. Had a survey been conducted in 1954, a new peak would have been visible. The seventeen-year-olds of 1950 would have then been on the down-slope. This can be seen by comparing the level of the thirteen-year-olds in 1950 with the level of the thirty-seven-year-olds in 1975: it is now above that of the forty-one-year-olds.
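The mechanism behind the peak, and the cohort comparison just described, can be illustrated with a simplified simulation. It is a sketch under stated assumptions rather than Labov's model: the start year 1900, the growth rate `R = 0.1`, the single limit `K`, the function names, and the convention that the first increment is counted at age six are all introduced here for illustration. Because each speaker in this sketch accrues at most twelve increments, the absolute levels are not meaningful and do not reproduce figure 14.5; only the shape of the profile is of interest.

```python
import math

K, N0, R = 100.0, 1.0, 0.1  # limit, starting level, assumed growth rate
START_YEAR = 1900           # hypothetical first year of the change (assumption)
GENERATION = 25             # caretaker-to-child gap in years
REORG_AGE = 5               # assumed: the first increment is counted at age six
STABLE_AGE = 17             # age at which incrementation stops


def community_level(year):
    """Logistic level of the change in the community in a calendar year."""
    t = max(year - START_YEAR, 0)
    return K / (1 + ((K - N0) / N0) * math.exp(-R * t))


def increment(year):
    """Increment available to participating speakers in a calendar year."""
    return community_level(year) - community_level(year - 1)


def speaker_level(birth_year, survey_year):
    """Inherited level plus the increments accrued up to survey_year or age 17."""
    if birth_year + STABLE_AGE <= START_YEAR:
        return N0  # this cohort stabilized before the change began
    # The caretaker is one generation older and already stable at the child's birth.
    inherited = speaker_level(birth_year - GENERATION, birth_year)
    last_age = min(survey_year - birth_year, STABLE_AGE)
    gained = sum(
        increment(birth_year + age) for age in range(REORG_AGE + 1, last_age + 1)
    )
    return inherited + gained


def apparent_time_profile(survey_year, ages):
    """Map each age to the level of the corresponding cohort in survey_year."""
    return {age: speaker_level(survey_year - age, survey_year) for age in ages}


if __name__ == "__main__":
    for year in (1950, 1975):
        profile = apparent_time_profile(year, range(5, 62, 4))
        print(year, {age: round(level, 1) for age, level in profile.items()})
```

Under these assumptions the highest level in both the 1950 and the 1975 profiles falls at age seventeen. The cohort born in 1933, aged seventeen in 1950, has the same level in both surveys because its incrementation has stopped, whereas the cohort born in 1937, aged thirteen in 1950, lies below it in 1950 but has overtaken it by 1954.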
As noted earlier, the discovery of a peak in apparent time marked 'an idiosyncratic or at least unexpected feature' (Chambers 2003:223) in the trajectory of change. Ash (1982) suggested that the peak might indicate that the change was receding. Cedergren (1988:53) alluded to the social importance of the incoming form in the 'linguistic marketplace' (Bourdieu & Boltanski 1975), the type of deductive change considered adaptive (see Andersen 1973). In this case, the explanation put forward was that the young adults, spurred by their sensitivity to the social importance of the new variant, were using it hypercorrectly.13 Another possible interpretation is that linguistic change proceeds through socially motivated exaggeration as successive generations of speakers push innovating forms beyond the model set by the earlier generations (Janda 1999), again implicating the difference between innovations that are adaptive and emerge from the speech community and those that evolve from the linguistic system (see Andersen 1973:789-90). There may also be a sociolectal retrenchment toward adult norms following the adolescent years (Chambers 2003:195). In this latter view, the peak simply reflects age grading. As we have just seen, however, the peak in apparent time is created by the logistic incrementation of linguistic change. The 'drop-off' that occurs among younger speakers (e.g. the five-, nine-, and thirteen-year-olds in Fig. 5) is not in fact a drop at all. Rather, their lower levels fall out from the shorter periods they have been participating in the incrementation process. They simply have not had time to amass the increments that their immediate elders have.
To provide empirical support for his incrementation model, of which the peak in apparent time is an integral aspect, Labov (2001:458) offers the findings from nine ongoing female-dominated sound changes in Philadelphia. The results are unequivocal: among women a peak is visible for every variable, eight among thirteen-to-sixteen-year-olds and one among seventeen-to-twenty-nine-year-olds. Such results provide strong support for the model exemplified in Fig. 5 above. Among men, no peak is apparent for five of the sound changes. Instead, the coefficients continue to rise to the youngest speakers as predicted.
A theoretically interesting and important corollary of the peak is the assumption that the phonology stabilizes around seventeen years of age. It is length of participation (or lack thereof) in the incrementation process that accounts for the peak's existence. It is the age of stabilization, however, that accounts for its position in the apparent-time slope. In Fig. 5, the peak is consistently situated among the seventeen-year-olds. While older speakers have become 'linguistic adults, WITH A STABILIZED PHONOLOGY, [the younger speaker] is still advancing' (Labov 2001:463, emphasis added). The seventeen-year-olds mark the transition point between these two stages in the development of the vernacular. Above the age of stabilization, speakers have ceased to increase their increments. Below the age of stabilization, speakers have not participated for as long and so have not yet surpassed the seventeen-year-olds. Stated plainly, the peak appears at the point of vernacular stabilization, whatever age this may be. If stabilization settled, for example, at twenty-one, then this is where the peak would occur. Can it be extrapolated from empirical evidence that other components of the mental grammar stabilize at the same time the phonology does? Evidence for this would be a peak similarly situated in apparent-time data for changes beyond phonology.
Is it reasonable to assume, however, that a peak will be apparent for all changes or that all changes will stabilize at the same point in an individual's life? Because a peak in the apparent-time trajectory is an inherent aspect of the incrementation of a linguistic change in the speech community, it should be theoretically impossible for one NOT to manifest in apparent-time data if a change has been caught at the appropriate phase. Returning once more to Fig. 5, of the four slopes shown here, it is the peak for the year 1950 that is most clearly defined. Given the one-hundred-year limit, 1950 is precisely the midpoint of this hypothetical change. This is when the rate of change is at its maximum. As the change progresses and its rate slows, the peaks become successively smaller until the final year, 2000, where the curve departs little from a straight line. In short, the prominence of the peak relates directly to the rate of change: the faster the rate, the greater the peak. The converse is also true. The peak is therefore dependent on the rate of change: the increments must be large enough to create one. We can therefore predict that for very slow-moving changes, such as those nearing completion, a peak may not be visible because the increments may be too small.
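The link between peak prominence and rate of change can be illustrated numerically. The sketch below assumes, purely for illustration, that the levels of stabilized cohorts follow a logistic curve with its midpoint in 1950, and it measures prominence as the gap between the cohort stabilizing in the survey year and the cohort that stabilized four years earlier. The function names, the steepness value, and the four-year step are hypothetical choices, not values taken from Labov 2001.

```python
import math


def s_curve(year, midpoint=1950, rate=0.1):
    """Assumed logistic level of the cohort stabilizing in `year`."""
    return 1.0 / (1.0 + math.exp(-rate * (year - midpoint)))


def peak_prominence(survey_year, step=4):
    """Gap between the peak cohort and the next-older age group."""
    return s_curve(survey_year) - s_curve(survey_year - step)


for year in (1925, 1950, 1975, 2000):
    print(year, round(peak_prominence(year), 4))
```

With these settings the gap is widest at the midpoint of the change and narrowest in the final year, which is the pattern described for the 1950 and 2000 curves.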
To summarize, the logistic-incrementation model proposed by Labov (2001) not only provides a mathematical model that is consistent with the S-curve of linguistic change, but it also predicts a peak among females near the age of stabilization. Indeed, Labov considers the peak 'a general requirement of change in progress' (2001:455). In contrast, no peak is expected for males, who are assumed to inherit the increment of their female caretaker but not to participate in moving it forward (Labov 2001:457). Instead, male index values are predicted to rise steadily as each group acquires a successively higher increment, but no allowance is made for individual incrementation components (Labov 2001:457, figure 14.8).
4. Methodological Detail.
The above discussion raises a number of questions. The specific issues that concern us include the following: (i) Will the adolescent peak be evident in changes in morphosyntax and discourse pragmatics?, (ii) If so, what does it tell us about stabilization in these components of the grammar?, and (iii) Does participation in ongoing language change differ according to gender? If not, what interpretation does this suggest? To answer these questions, we consider evidence from six changes that are currently in progress in Toronto, Canada (though they are not necessarily restricted to this locale).14 Three are morphosyntactic(-semantic) innovations: have to for expressing deontic modality, have for marking stative possession, and going to for encoding future temporal reference. The remaining three are discourse-pragmatic innovations: like as a discourse marker, be like as a quotative complementizer, and so as an intensifying predicate adverb. Each of these forms has previously been subjected to full-scale quantitative study as part of an ongoing research project tracking changes in the Toronto speech community. In this section we first describe the archive from which our data were drawn. As a backdrop to our analysis we then introduce the variationist method, its practice, and implications for the study of features beyond the phonology. In the following section (§5) we detail the method applied in each of the six analyses that provide the substance for the current investigation. Since morphosyntactic(-semantic) and discourse-pragmatic variables differ in a number of respects from phonological ones, we situate each change in its sociohistorical context, discuss the variable forms, and provide the necessary information concerning the circumscription of the variable contexts. We also discuss the operation of any social and linguistic constraints on the use of the individual forms.
The data were drawn from the Toronto English Archive (Tagliamonte 2003-2006). This database comprises more than 350 hours of recorded spontaneous speech, representing approximately 1.8 million words. The speakers, all of whom were born and raised in metropolitan Toronto, range between the ages of nine and ninety-two years and are stratified by gender as well as by socioeconomic factors such as occupation and education. The data were collected following standard sociolinguistic interviewing techniques (Labov 1970, 1971), and individual speakers are represented by anywhere from one to three hours of informal conversational data. The discourse is varied but is uniformly lively and interactive and includes numerous narrative sequences. These materials thus capture contemporary vernacular Canadian English as spoken in Toronto, the major urban center of the country.
Table 2 displays the 152-participant sample that forms the baseline for the current investigation. To ensure direct comparability with Labov 2001, the speakers have been categorized by age following the protocol established there; the most critical divisions occur below the age of stabilization. Labov (1994:47) argued that it is 'the youngest stratum of speakers that will give us evidence on the state and direction of a linguistic change in progress' and concluded that apparent-time data must include speakers as young as eight years old (Labov 1994:49). The youngest speakers in the present study are nine; the under-thirteen group consists of nine-to-twelve-year-olds.
Every attempt has been made to ensure that the sample constitution was as robust and as representative as possible. As Table 2 indicates, most cells comprise six to eighteen speakers. Importantly, the critical nine-to-twenty-nine-year-old cells remain identical for each variable, containing the full sample shown in Table 2 for these age groups.15 Moreover, the pool of individual speakers remains constant across variables. This means that this investigation is based on a consistent comparison of six different variables as used by the same set of speakers.
4.2. The Variationist Method.
The analytic approach adopted in this research falls within the framework of empirical linguistics known as variationist sociolinguistics, part of the 'descriptive-interpretative' strand of modern linguistic research (Sankoff 1988a:142-43). Studies employing this methodology are based on the premise that the features of a given speech community, be they structural, phonological, lexical, or discursive, may vary in a systematic way (Young & Bayley 1996:254). These 'variables' reflect alternatives with the same referential value (meaning) in running discourse (Sankoff 1988a:142-43) and as such are members of the same structured set in the grammar (Wolfram 1993:195). In its most basic definition, the linguistic variable is 'two or more ways of saying the same thing' (Labov 1972, Sankoff 1980b). The choices, however, are not random and when examined quantitatively, systemic trends and patterns emerge that reveal 'structural and ordered' heterogeneity (Weinreich et al. 1968:100-101).
The importance of variationist analysis lies in its ability to model these choices quantitatively, to assess statistically the simultaneous, multidimensional factors impacting on them, and to evaluate their relative strength and significance. Once an adequate number of choices has been taken into account, generally accepted to be in the range of thirty tokens per environment (Guy 1980, 1993), the selection of one variant or another can be modeled statistically (Cedergren & Sankoff 1974, Labov 1969).16 A foundational concept is the PRINCIPLE OF ACCOUNTABILITY. In variationist sociolinguistics, accountability means that every variant that is part of the variable context, whether the variants are realized or unrealized elements, must be included for analysis. In other words, the variant forms under investigation (in the case at hand, incoming variants in the community) are analyzed along with all the other variants involved in the linguistic system of which they are a part. In conducting an accountable study, the analyst must report variable features as a proportion of the total number of environments where the variant had the potential to occur, whether it actually occurred in the environment or not.
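The arithmetic behind accountable reporting is simple, and a short sketch may help fix the idea. The example below uses invented quotative tokens for a single hypothetical age group; the function name `variant_rate`, the token counts, and the label "zero" for an unrealized variant are all illustrative assumptions rather than figures from the Toronto data. The point is only that the denominator includes every token in the variable context, not just occurrences of the incoming form.

```python
from collections import Counter

MIN_TOKENS = 30  # rule-of-thumb threshold per environment cited above


def variant_rate(tokens, target):
    """Return (proportion of `target`, N) over every token in the context."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return None, 0
    return counts[target] / total, total


# Hypothetical tokens: realized and unrealized variants alike are counted.
tokens = (["be like"] * 22 + ["say"] * 10 + ["think"] * 4
          + ["go"] * 3 + ["zero"] * 1)

rate, n = variant_rate(tokens, "be like")
reliable = n >= MIN_TOKENS
print(f"be like: {rate:.0%} (N = {n}), enough tokens: {reliable}")
```

Reporting the rate together with N, as the tables in variationist studies do, lets the reader judge whether an environment meets the token threshold.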
A critical assumption underlying the principle of accountability is that variants differ relatively little in terms of function; they have the same referential value. However, when the linguistic variable lies beyond phonology, as with all the linguistic features under investigation here, the variants may not have obvious similarities based on function alone. This requires a full understanding of the area of grammar that is participating in the variation, what is referred to as the variable context (i.e. the linguistic environment where the variation occurs). Circumscribing the variable context is a painstaking task, requiring the analyst to 'ascertain which structures of forms may be considered variants of each other and in which contexts' (Sankoff 1982:681). The latter procedure involves a 'long series of exploratory manoeuvres' (Labov 1969:728-29) in which the methodological particulars of the study (what has been excluded, what has been included, and how each instance was treated) are detailed following the scientific method. Only those tokens that are part of the variable context, that is, are comparable in function, are included for analysis.
Variants undergoing change, particularly when a change involves the development of grammatical meaning, may have different lexical sources as well as different histories in the language. For example, the alternation between the will future and the going to future (§5.5) has distinct verbs as the sources: the synthetic form comes from the Old English verb willan while the periphrastic future has its origin in the motion verb 'to go'. Such dissimilarities make it impossible to derive the variants from any meaning-preserving grammatical rule. At the same time, variable forms may retain semantic nuances that make them functionally distinct, and thus invariant, in certain contexts. Others still may become entrenched in formulaic utterances and endure long past their productive life in the vernacular (e.g. Shall we go?). In other words, linguistic variables beyond phonology present complications for the variationist model of analysis. The extension of the model to nonphonological variables has thus marked an important development of quantitative sociolinguistics, one that has involved advancements of considerable depth in two areas: (i) methods for circumscribing the context and (ii) the study of 'layered' forms involved in grammatical change (Hopper 1991, Hopper & Traugott 1993).
At the crux of the issue is that linguistic variables beyond phonology require methods for circumscribing the variable context that extend beyond semantic equivalence. No two forms can have identical meaning, even in lexical choices (e.g. pail vs. bucket). The argument for equivalence becomes increasingly strained as we move from considering morphological alternations (e.g. slowly vs. slow), to syntactic options (e.g. there are two vs. there's two), to discourse-pragmatic features (like I don't know vs. I don't know), the last of which are defined in part by their lack of referential meaning. In practice, however, when hundreds to thousands of individual occurrences are taken into account, as is typically the case in variationist research (see for example Table 4), different forms are inevitably found to be used interchangeably in some contexts even though they may have distinct referential meanings elsewhere. Indeed, the linguistic variable may be defined as the task of 'separating out the functionally equivalent from the inferentially possible' (Weiner & Labov 1983:33). Through detailed methodological argumentation, Sankoff (1973), Sankoff and Thibault (1980), Laberge (1980), Sankoff and Thibault (1981), and Weiner and Labov (1983) have demonstrated that the linguistic variable need not be confined to cases in which the variants mean precisely the same thing. Instead, the linguistic variable may have functional equivalence in discourse. It is this form/function asymmetry that implicates the role of the linguistic variable in language change (Sankoff & Thibault 1981:684). For example, Sankoff and Thibault (1981) argued that when discourse alternatives coexist over time, the equivalence can be expected to become embedded within the grammar. Sankoff's research on Tok Pisin demonstrated that variable discourse features could morph into grammatical markers (Sankoff 1980a, 1990). 
Subsequently, variationist techniques have been applied to a full range of grammatical changes across a number of different languages: tense/aspect phenomena (Poplack & Tagliamonte 1998, 1999, Schwenter 1994, Tagliamonte 2000, Torres-Cacoullos 1999), morphosemantic change (Krug 2000, Nevalainen 1997, Nevalainen & Rissanen 2002, Tagliamonte 2000, 2002b, 2003, 2004), and discourse-pragmatic change (Méndez-Naya 2003, Nevalainen & Rissanen 2002, Palander-Collin 1997, Thompson & Mulac 1991). Among other advances, these studies have championed the value of extending the study of linguistic variation to grammatical change.
5. The Linguistic Variables.
Each variable contributing to the current investigation has already been subject to a large-scale quantitative analysis employing variationist methods (D'Arcy 2005, 2006, 2007, 2008, Tagliamonte & D'Arcy 2004, 2007a,b, Tagliamonte et al. 2007). The findings arising from these studies have established that in Toronto, the linguistic systems involved are shifting in apparent time such that one variant is increasing in frequency while the others wane. These changes are not simple innovations made by an individual speaker; they reflect the community's increasing adoption of an innovation (Joseph & Janda 2003:78, Milroy 1999:223). We also note that in each of these studies we have discovered considerable internal linguistic conditioning. The analysis of these internal (structural) factors plays a critical role in the interpretation of the results, but in this article we abstract away from these aspects of the variation to focus on the frequency of forms according to speaker age and sex, the key elements of the incrementation model we are testing. Nonetheless, the analyses on which we report here have been configured consistent with our previous work. In what follows, we provide an overview of each linguistic variable and the methods that were applied to its analysis, leaving aside the details of the internal linguistic patterns. Further information can be obtained from the relevant papers.
5.1. Quotative be like.
Within the English quotative system, be like represents one of a number of quotative options. Also part of this structured set are the verbs say, think, and go, among other more infrequent verbs (e.g. ask, realize, yell, etc.). The use of be like exemplified in 2 marks a dramatic and recent ongoing change in English. It originated in North America early in the 1980s (e.g. Butters 1982:149, Joseph & Schourup 1982/83, Schourup 1982) and since that time has been rising steadily in frequency. In the analyses that follow, be like is reported as a proportion of the total number of all quotative verbs used in the interview materials, amounting to 5,553 tokens (see Table 4 for details).
(2) a. We're like, 'What's going on?'
She's like, 'Oh.'
We're like, 'What's "Oh"?' (N/∂/m/26)17
b. She's sitting there and she's like, 'Oh my god!'
She's like, 'That's your boyfriend?'
And I'm like, 'Yeah.'
She's like, 'Oh, he was a cool one at Lawrence.' (3/T/f/18)
c. I was like 'Oh-my-God! That's Cam's sister!'
And then I was like 'Oh-my-God!'
And then I was like 'Oh-my-God, is he here?'
And then I was like all freaked out. I was like 'Oh no!'
But it's just like 'Oh-my-God, now it all comes back to me.' (I/~/f/27)
Be like has been the subject of numerous investigations (e.g. Blyth et al. 1990, Buchstaller 2001b, 2006, 2009, Cukor-Avila 2002, D'Arcy 2004, Ferrara & Bell 1995), and by most accounts it is being propelled by young women. It is also increasing very quickly in vernacular use. In a trend study of Canadians in their twenties, be like has risen from 13 percent of all quotatives in 1995 (Tagliamonte & Hudson 1999) to 58 percent of all quotatives in 2002 (Tagliamonte & D'Arcy 2004). The same trajectory of advancement has been corroborated in a small panel study (Tagliamonte 2007). Therefore, quotative be like should provide an ideal test site for Labov's model of logistic incrementation.
5.2. Discourse Marker like.
Having garnered considerable attention of late (Andersen 1997, 2001, Buchstaller 2001a,b, 2002, 2006, 2009, Meehan 1991, Miller & Weinert 1995, Romaine & Lange 1991, Underhill 1988), the use of like in discourse is another strong candidate for the examination of incrementation. The historical record indicates that like has been performing pragmatic functions for at least two centuries (see D'Arcy 2005:4, 2007:401, n. 9). A comprehensive study of discourse like across the generations in Toronto revealed that it has been entering the grammar gradually and systematically through regular processes of language change (D'Arcy 2007, 2008). Of the many syntactic positions in which like occurs in discourse, the clause-initial context, where it functions as a discourse marker, is perhaps the most critical for the assessment of change. This context is exemplified in 3. This particular position, the syntactic adjunct slot, is one in which discourse markers generally occur in English (see Kiparsky 1995, Traugott 1997). It is also the most frequent position for discourse uses of like (Andersen 2001, Levey 2004, Romaine & Lange 1991, Tagliamonte 2005, Underhill 1988). We define this context here as the left periphery of CP, the functional projection that dominates the clause (for discussion see D'Arcy 2005:75).
(3) a. My other cat always sleeps and like we almost never see him. (3/V/m/11)
b. You know, like the people were very very friendly. (N/V/f/60)
c. Like my uncle's sister married this guy, George-J. (N/‡/m/85)
An important methodological challenge that arises in the study of a feature such as discourse marker like is how to circumscribe the variable context. Because the meaning of like in this configuration is pragmatic, motivated by the speaker's desire to clarify, elaborate, exemplify, and so on (D'Arcy 2007), it is impossible to establish which CPs unambiguously present potential adjunction sites and which do not. To circumvent this issue, seventy-five CPs were systematically extracted from each interview, whether they contained like, some other discourse marker, or no marker at all (D'Arcy 2005:76). Only declarative CPs were included in the analysis; instances of interrogatives and imperatives were excluded due to their low frequency of occurrence in the interview materials (total combined N = 55). Also excluded were tokens where like did not occur, but had it surfaced it would function as a conjunction meaning 'as if/though' (e.g. I really feel ø it's gone (N/u/f/49)). Finally, there are two contexts where like as a marker either fails to occur or it occurs so rarely that the context must be omitted from statistical analysis. These are within an enumeration, as in 4, and in response to a direct question, shown in 5.
(4) a. But like he's got so many things
that don't fall into the stereotype:
Like he's good at ah putting together cars,
* he's a carpenter,
* he's good with tools. (I/8/m/32)
b. There's too many 'Bedfords' in my life:
ø I live on Bedford Street.
* I work at the Bedford Academy.
* I went to Bedford Public School. (N/O/m/24)
(5) a. Q: Do you have any friends that are going to go in there as well or?
A: ø I have a few. (3/H/m/12)
b. Q: What happened to you during the blackout here in the Beach?
A: ø I was actually here working at the rec center. (N/fi/m/37)
c. Q: Really? And what else?
A: Like one of my cats meows so much. (3/V/m/11)
As demonstrated by example 4a, like may mark the first clause in a list, where it means 'for example', but it never occurs within the body of the enumeration, indicated here by asterisks. Consequently, in these types of sequences only the first clause was retained. In the case of question-answer sequences, like is not categorically precluded in response to a direct question, demonstrated by 5c, but it almost never surfaces in this context (3%, N = 193).18 Consequently, good practice dictates that adjacency pairs of this nature be removed from variationist analysis (e.g. Guy 1988).
In the analyses of discourse marker like presented here, we focus on matrix CPs and report its proportion out of 3,363 clauses as defined above.
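The exclusion criteria just described amount to a filter applied before any rate is calculated. The following sketch shows one way such a filter could be written; the `Clause` record, its field names, and the sample clauses are hypothetical and are not drawn from the Toronto English Archive or from the coding scheme actually used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clause:
    clause_type: str                 # "declarative", "interrogative", ...
    marker: Optional[str] = None     # "like", another marker, or None
    enumeration_body: bool = False   # non-initial clause of a list
    answers_question: bool = False   # direct answer to a question
    as_if_slot: bool = False         # like would mean 'as if/though'


def in_variable_context(c: Clause) -> bool:
    """Keep only declarative clauses outside the excluded contexts."""
    return (c.clause_type == "declarative"
            and not c.enumeration_body
            and not c.answers_question
            and not c.as_if_slot)


def like_rate(clauses):
    """Proportion of retained clauses introduced by like, plus N."""
    kept = [c for c in clauses if in_variable_context(c)]
    if not kept:
        return None, 0
    hits = sum(c.marker == "like" for c in kept)
    return hits / len(kept), len(kept)


sample = [
    Clause("declarative", marker="like"),
    Clause("declarative"),
    Clause("declarative", marker="you know"),
    Clause("declarative", enumeration_body=True),
    Clause("declarative", marker="like", answers_question=True),
    Clause("interrogative"),
    Clause("declarative", as_if_slot=True),
    Clause("declarative", marker="like"),
]
rate, n = like_rate(sample)
print(f"like: {rate:.0%} of {n} clauses")
```

Note that a token of like in an excluded context (here, the answer to a direct question) is dropped along with the unmarked clauses in that context, so the numerator and denominator are circumscribed in the same way.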
Both quotative be like and discourse like are changes that are overt and vigorous. There are also ongoing changes that have been in progress for hundreds of years and that are moving at a much slower rate. The next linguistic features are of this type. The first involves the use of have to encode stative possessive meaning.
5.3. Stative Possessive have.
In English, the verb to have operates in two distinct senses. It can function as a dynamic verb where it is roughly equivalent to 'receive', 'take', or 'experience' (e.g. have breakfast/a dance, etc., Quirk et al. 1985:132) or it can function as a stative verb where it indicates possession (e.g. have blue eyes/a blue car, etc.). When have encodes stative possessive meaning it alternates with two other forms, have got and got, as in 6.
(6) a. The latest edition of the Reader's Digest has kid's language.
[. . .] And I've got one [kid] in my house. (N/T/m/64)
b. It has some strength and it's got some character. (I/2/f/54)
c. This weekend uh I've actually got a pretty busy-
I have a couple of projects to do
and I got a test in math and law coming up. (3/m/m/15)
The oldest form, have, dates to the late tenth century (Crowell 1955:280, Jespersen 1961b:47-54, Visser 1963-73:1475, 2002-4). Use of have got with stative possessive meaning begins around the end of the sixteenth century and subsequently increases throughout the Early Modern period (Kroch 1989, Noble 1985). The most recent of the stative forms is got alone, which is attested since the mid-nineteenth century (c. 1849).
Previous research has shown that variation among the forms used for stative possession is restricted to contexts in which they carry present-tense morphology (LeSourd 1976, Quinn 2009, Tagliamonte 2003). The variable context for stative possession is further circumscribed in Toronto English, restricted to present-tense affirmative clauses rather than simply present tense (Tagliamonte et al. 2007). Thus, every token of have, have got, and got (including the morphological variants has and 've/'s got) was extracted from the interview materials when it occurred in the affirmative present tense and unambiguously encoded a possessive meaning. In the analyses that follow, have is reported as a proportion of the total number of stative possessive forms, 2,587, in the data.
Contrary to what has been reported for varieties such as British English and New Zealand English, where have got is increasing (e.g. Kroch 1989, Noble 1985, Quinn 2009, Tagliamonte 2003), the trajectory of change in Toronto is one in which have is on the rise (Tagliamonte et al. 2007). This corresponds with the current situation in North American English more generally, where high rates of have are amply attested. Indeed, the elevated frequency of have has been present in Canada as long as can be ascertained using apparent-time data. Among the oldest speakers in the sample, have is by far the most frequent form for encoding stative possession (65%, N = 499). In each consecutive generation use of have got and got is successively reduced, displaying a classic monotonic pattern of change in progress (Labov 2001:171). That the rates of have are consistently in excess of 65 percent entails that the Toronto English data capture a somewhat late phase in the development of this change, yet it remains one in which the apparent-time construct captures ongoing incrementation.
The following linguistic variable also involves have, but in a different function. In this case, our focus is the system of deontic modality in which have to expresses the meaning of obligation or necessity.
5.4. Modal have to.
The English system of modality in which obligation/necessity is encoded is also the site of ongoing linguistic change (Tagliamonte 2004, Tagliamonte & D'Arcy 2007a, Tagliamonte & Smith 2006). Variation involves the forms have to, have got to, got to, and must, illustrated in 7.
(7) a. You have to like run to the other side
and you have to catch the flag,
but if they catch you then you have to go to like a jail.19 (3/V/m/11)
b. I said 'You have to come up.'
I said 'You must come up.'
And um [. . .] I said 'I've gotta go.' (N/s/f/52)
c. You've got to have a thousand to two thousand dollars worth of equipment to be able to do it and you got to know how to operate that equipment. (N/œ/m/62)
As with the forms used for stative possession, each of these variants represents a successive stage in the development of the system, that is, grammatical layering. Must is the oldest, dating back to the Old English period (Traugott 1999, Warner 1993). Alternation between have to and must can be found as early as Chaucer (1386-1400) (Brinton 1991:34) and was established by the Late Middle English to Early Modern English period (c. 1400-1500) (Crowell 1955:69, Krug 2000:54, Traugott 1999:8). Got to is a much more recent addition, believed to have entered the paradigm just over a century ago (Jespersen 1961a, Krug 2000, Traugott 1999, Visser 1963-73). Investigation of deontic modality in Toronto has revealed that the system of obligation/necessity has undergone almost complete specialization to have to (Tagliamonte & D'Arcy 2007a). Thus, we have caught this change near the end point of its trajectory.
Like stative possession, the system of deontic modality in Toronto English is also more circumscribed than it is in other varieties of English. Furthermore, have to appears to be more grammaticalized than has been found elsewhere. The net effect is that the envelope of variation is quite restricted, limited to present-tense, affirmative, declarative utterances where variability between must and its semi- or quasi-modal equivalents have to, have/'ve/'s got to, got to is possible. In the following analyses have to is reported as a proportion out of the total number of these forms, 1,174, in the data.
In contrast to possessive have and modal have to, other longitudinal changes do not appear to be moving to completion at such high rates of specialization. The next change has long roots in the language as well, but unlike the previous two, it remains in a state of robust variation.
5.5. Future Temporal going to.
The use of going to (and its variants gointa, gonna, gon, and first-person singular 'mena) rather than will or shall to express future temporal reference, as in 8, is another developing feature of contemporary English (Poplack & Tagliamonte 1999, Tagliamonte 2002a).
(8) a. Like, if I'm gonna go downtown,
it's gonna be to dance. (N/G/f/26)
b. Yeah he's going to pick me up tomorrow after classes.
We're going to go out for lunch downtown.
That's going to be cool. (3/F/f/17)
c. So you're always gonna have your waves and influences
so it's gonna constantly change.
Never gonna be static; can't be static. (I/1/m/51)
The future meaning of going to is attested from the 1400s and is believed to be continuing to increase contemporaneously (Mair 1997). In the current materials, however, the (major) variants in this system, going to and will, both hover around 50 percent in terms of overall frequency. This suggests a number of possibilities: the rise of going to may be progressing very slowly and/or slowing down, or perhaps the division of labor between will and going to is solidifying.20 This provides yet another profile in which to examine the incrementation model at the community level.
Applying variationist techniques to the study of tense and aspect features is inherently difficult. Because future time is expressed in English by morphological forms also denoting other (nonfuture) temporal, modal, and/or aspectual meanings, it is necessary to take temporal reference as the starting point for circumscribing the variable context. We restrict our inclusion criteria to clear predictions about states or events transpiring after speech time (see Poplack & Tagliamonte 1999) and exclude all nontemporal readings (e.g. hypotheticals, counterfactuals, etc.). We constrain our analysis further by excluding all future-in-the-past contexts (e.g. He was gonna get sucked up (I/1/m/51)). In the analyses that follow, going to is reported as a proportion of future contexts as defined here, totaling 2,561 tokens.
5.6. Intensifier so.
The final linguistic feature we examine is perhaps the newest in the set of changes we have identified in Toronto. This is the shift toward use of predicate intensifier so from the earlier favorite really (Ito & Tagliamonte 2003, Tagliamonte 2006, 2008), as in 9.
a. Yeah he's really good. He's so weird though. (N/r/f/22)
b. She was like so sure and so careful. (4/h/m/16)
c. It's so hard to get into a company. Like, it's so hard. (I/&/f/21)
The intensifier system has been studied extensively (e.g. Bauer & Bauer 2002, Bolinger 1972:18, Labov 1985, Lorenz 2002, Nevalainen & Rissanen 2002, Partington 1993, Peters 1994, Quirk et al. 1985:590). This area of grammar lends itself to a wide number of variants, making it susceptible to rapid change and recycling as forms are replaced by newly coined expressions, for example, pure (Macaulay 2007), and sometimes nonce expressions as the following example from the television show Friends demonstrates: I mean, isn't that just kick-you-in-the-crotch, spit-on-your-neck fantastic? (Tagliamonte & Roberts 2005). This makes intensifiers a particularly useful choice for the study of linguistic innovation.
Use of intensifiers is generally regarded as both a teenage phenomenon (e.g. Bauer & Bauer 2002, Macaulay 2007, Paradis 2000, Stenström 1999, 2000) and a feature of female discourse (Jespersen 1922:249-50, Stoffel 1901:101). Women are also credited with leading in the recycling of the intensifier system (Jespersen 1922:250). These strong sociolinguistic correlates render intensifying adverbs another important test case for the incrementation model. One of the issues in dealing with intensifiers in an accountable (quantitative) way, however, is that it is difficult to find where they could have occurred, but did not. Building on the fact that the vast majority of intensifiers occur with adjectival heads (Bäcklund 1973), we restricted our analysis to this context. Further, only those adjectives that could potentially be modified by an intensifier were included, for example, I'm_Italian vs. I was_lucky (2/c/m/16) (see also Ito & Tagliamonte 2003, Tagliamonte 2008). Our method also involved excluding numerous contexts. Sentence constructions that do not permit intensifier use (e.g. comparatives and superlatives), as in 10a, and constructions involving the lexical items too and so when their function was other than intensification, as in 10b, were removed from analysis.
a. Does it sound softer to sound nicer? (N/M/f/31)
b. He was so stealth [sic] we almost ran into him. (2/c/m/16)
Finally, our analysis was restricted to affirmative tokens. The rationale behind this is that intensifiers that occur within the scope of negation do not express the 'higher' degree of intensification in which we are interested (see Ito & Tagliamonte 2003:264). We focus here on the most recent incoming intensifier, so. Its use is reported as a proportion out of the 5,718 intensifiable predicate adjectives, thus defined, in the data.
5.7. Summary of Linguistic Features.
Table 3 situates these six linguistic features with respect to their age and level of variability. It provides a summary of the time scale of each feature by indicating its origin in time, its approximate age in years in the early 2000s when the data were collected, and the frequency of each incoming form as represented by its proportion of use in the age group in which it is used most frequently.
The details of the individual datasets upon which our investigation is based are displayed in Table 4. These datasets have considerable coverage across generations of speakers and comprise a comprehensive number of tokens, totaling nearly 21,000 variable contexts.
Given the incrementation model outlined in §3.2, an adolescent peak should appear among females for each of these features. Among males a step-like pattern 'culminating in an upward movement to the youngest group' (Labov 2001:457) is predicted. Labov (2001:458-60) found that these patterns were discernible not only for sound changes that were new and vigorous but also for those at other stages of development, such as middle-range and nearly completed.21 The linguistic features we have targeted for investigation provide a range of profiles against which the incrementation model can be tested. Quotative be like is a recent form and is by far the most vigorous of the six. Discourse marker like has more time depth but has recently accelerated, experiencing a period of rapid change over the past fifteen to twenty years (D'Arcy 2005:225-26, 2007:404-5). Possessive have, deontic have to, and future going to are long-term grammatical changes within morphosyntactic, semantic-syntactic, tense-aspect areas of grammar that have been ongoing for centuries, though as noted above (§5.3), the use of have for stative possession may well be a North American trend. Intensifier so is quite new. Although it likely developed as an intensifying adverb early in the twentieth century, it is currently undergoing renewal in Toronto. In sum, varying types and stages of change are represented in the current materials.
In the results that follow, the predictions of the incrementation model are tested to establish their application with respect to changes beyond phonology. The questions may be asked as follows (see also §4): Is the peak restricted to sound change or can it be found with other types of change as well? How does gender asymmetry play out in nonphonological change? What new insights can a study of the incrementation of other types of change provide?
6. Testing the Incrementation Model.
We first analyze each feature distributionally, determined by the proportion of the incoming form out of the total number of relevant contexts for each age group. Here, however, where our goal is to understand the incrementation of a linguistic change, we need to focus on the statistical validity of the differences in the frequency of forms between adjacent age groups. Consequently, we turned to R, a data-analysis software package that provides a range of statistical and graphical techniques (R Development Core Team 2008). For each feature we conducted two tests: a Spearman rank correlation test to assess the correlation between age and frequency and a nonpaired Wilcoxon test to assess the differences between the crucial age groups. We then subjected the data to a logistic-regression model using Goldvarb (Sankoff et al. 2005). This package was specifically designed for the type of naturalistic data that is the focus of our study (see Paolillo 2002, Sankoff 1985, 1988b, Sankoff & Rousseau 1979). It allows us to control simultaneously for the effects of age and sex as well as for any internal factors that have a significant constraining effect on the use of each incoming form.22
6.1. Distributional Analysis.
Our study had its genesis in the trajectories of change that were revealed by plotting the frequency distributions for each of the incoming variants by speaker age, that is, across apparent time. These results are displayed in Figures 6a and b, categorized according to speaker sex: females in Fig. 6a, males in Fig. 6b. The discourse-pragmatic variables are shown with empty symbols and dark lines; morphosyntactic(-semantic) variables have solid symbols and gray lines.
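The distributional step described here can be illustrated with a minimal sketch. It is not the analysis code used in the study: the token format, the function names `proportions_by_group` and `peak_group`, and the toy counts are all hypothetical, chosen only to show how a proportion per age group and sex is computed and how the apparent-time peak is then located.

```python
from collections import defaultdict

def proportions_by_group(tokens):
    """Proportion of the incoming variant per (age_group, sex) cell.

    tokens: iterable of (age_group, sex, is_incoming) tuples (hypothetical format).
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [incoming tokens, all tokens]
    for age_group, sex, is_incoming in tokens:
        cell = counts[(age_group, sex)]
        cell[0] += int(is_incoming)
        cell[1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}

def peak_group(props, sex, order):
    """Age group (searched in the given order) with the highest proportion for one sex."""
    candidates = [group for group in order if (group, sex) in props]
    return max(candidates, key=lambda group: props[(group, sex)])

# Toy data, invented for illustration only.
toy_tokens = (
    [("9-12", "f", True)] * 3 + [("9-12", "f", False)] * 7
    + [("13-16", "f", True)] * 6 + [("13-16", "f", False)] * 4
    + [("17-29", "f", True)] * 4 + [("17-29", "f", False)] * 6
)
toy_props = proportions_by_group(toy_tokens)
toy_peak = peak_group(toy_props, "f", ["9-12", "13-16", "17-29"])
```

With these invented counts the highest proportion falls in the thirteen-to-sixteen group, which is the shape of trajectory the model predicts for females.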
Among the females, a peak is evident for the majority of the changes. The most striking peak occurs for quotative be like, sharply differentiating the seventeen-to-twenty-nine-year-olds from older age groups, a result that indicates just how recent this form is (see Tagliamonte & D'Arcy 2007b). This peak also marks a clear transition point between the seventeen-to-twenty-nine-year-olds and their younger peers. The steep slopes of the curve on either side of the peak provide a visual display of how quickly this particular change is moving (cf. the 1950 trajectory in Fig. 5). Discourse marker like also has an obvious peak, though in this case it is situated among the thirteen-to-sixteen-year-olds, marking them off from both their immediate elders and juniors. Less overt peaks are visible, in descending order of prominence, for intensifier so, possessive have, and future going to. Just as reported by Labov (2001:456-59), the peaks typically occur among thirteen-to-sixteen-year-olds (like, so, have, going to), while one, be like, crests in the subsequent age group, seventeen-to-twenty-nine-year-olds. Such results provide remarkable parallelism with earlier findings from phonology in exhibiting a peak in apparent time as predicted by the model of incrementation. The only form for which no peak is visible is modal have to. Here it is the youngest speakers who have the most frequent usage.
Among the males the results diverge markedly from expectation. Only one form, modal have to, displays the predicted, steadily advancing pattern for the youngest age group. The other incoming variants peak in apparent time, albeit to differing degrees, and then decline within the youngest cohort. Three of these peaks are situated among thirteen-to-sixteen-year-olds: possessive have, discourse marker like, and intensifier so. Two can be found among seventeen-to-twenty-nine-year-olds: quotative be like and future going to. Although these findings run contrary to the model as presented by Labov (2001), they are not without precedent. Labov himself observed a peaked pattern among male speakers. Since the peaks were restricted to middle-range changes, (uwF), (owF), and (owC), Labov (2001:459-60) suggested that the changes were conceivably receding before reaching completion. No peaks were visible for either the new and vigorous or the nearly completed changes in Philadelphia; for both types the trajectories continued upward to the youngest age group. In the Toronto data, there is no basis for an interpretation of recession of change. We return to this point below.
Viewed together, the results from the distributional analyses reveal that the adolescent years are both decisive and foundational in the acquisition of change in progress, as suggested by Labov (2001:463). They also confirm that late adolescence is a critical watershed in the progression of change. Nonetheless, these findings present only a first step in testing the logistic-incrementation model. Percentages alone cannot provide an evaluation of whether or not the peaks are statistically significant. Are they created by meaningful differences in incrementation or by random fluctuation? To answer this question we now turn to statistical tests that enable us to pursue the relevance of the peaks beyond distributional frequencies.
6.2. Statistical Models.
For each variable, we first produced a scatter plot showing the frequency of use of the incoming variant for each speaker in the sample. Individuals are indicated by their sex, marked as either male (m) or female (f). We then fit a line across the data, tracking the overall distribution of each feature across apparent time, and tested for a correlation between frequency and speaker age. If change is in progress, the Spearman results should indicate a strong correlation between these two external factors. The statistical findings are shown in Table 5; Figures 7a-f display the scatter plots.
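For readers unfamiliar with the statistic, Spearman's rho is simply the Pearson correlation computed on ranks. The sketch below is a plain-Python illustration under stated assumptions: the helper names `average_ranks` and `spearman_rho` and the five speaker values are hypothetical, tied values receive their average rank, and no significance level is computed (the study obtained its results in R).

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Invented speakers: age in years and proportion of the incoming variant.
toy_ages = [10, 15, 20, 40, 60]
toy_freqs = [0.5, 0.6, 0.4, 0.2, 0.1]
toy_rho = spearman_rho(toy_ages, toy_freqs)  # about -0.9
```

Note that the toy trajectory contains an adolescent peak (the fifteen-year-old outranks the ten-year-old), which is why rho falls short of -1 even though frequency declines with age overall.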
Three of the trajectories in these overall findings display the expected peak in apparent time: quotative be like (Fig. 7a), intensifier so (Fig. 7c), and future going to (Fig. 7f). Three do not: discourse marker like (Fig. 7b), possessive have (Fig. 7d), and modal have to (Fig. 7e). Though we might have predicted no peak in the case of have to, since the results by speaker sex reveal continuously rising frequencies across apparent time, a peak was visible for both have and like in the sex-differentiated results of Figs. 6a and 6b. Its failure to appear in the aggregate results in Figs. 7b and 7d does not provide a clear picture of the actual situation and seems to derive from the way in which the data are configured by this particular graphical tool. When speakers below the age of twenty are grouped following the divisions in Labov 2001, there is a peak in the apparent-time trajectory for both possessive have and discourse marker like, though it is not picked up by the linear fit of the data seen here.23 This is important to bear in mind when evaluating the results of the Wilcoxon analyses to be discussed below.
Spearman correlation coefficients, rho in Table 5, vary between 1 and -1. These limiting values are never reached in practice, but the closer the coefficient is to 1 or -1, the stronger the correlation is between the factors being tested. We can make two overarching observations regarding the Spearman results in Table 5. First, there is a negative and significant correlation between age and frequency for every variant investigated here. This means that an inverse relationship holds between frequency and age: as age increases, the form in question is significantly less likely to be used. We can therefore confirm that in each instance we are dealing with change in progress, since these results corroborate a monotonic association of frequency in apparent time (i.e. speaker age). Second, despite the consistency of this trend across forms, there is nonetheless variability in the strength of the correlation. With coefficients approaching -1, the frequency of discourse marker like, quotative be like, and possessive have is highly correlated with speaker age. The correlation is also fairly robust for modal have to. With future going to the correlation is less strong, albeit highly significant. In the case of intensifier so the coefficient runs precipitously close to 0 and the weakness of the correlation is reflected by the significance level, 0.0423, which though still significant is only marginally so. These differences in correlation strength reflect the slopes of change: faster-moving changes will exhibit rho values closer to the limiting values than will slower-moving ones because age differences will be greater in the case of the former than in the latter. Due to the fact that the logistic equation incorporates rate (r in 1), the slope of change has significant ramifications for both the logistic model of incrementation and the predictions it makes for apparent-time data. Two factors influence rate: the duration of the change and its synchronic position along the S-curve.
Thus, even though in each case a significant amount of the variation can be explained by age, this variation is tempered by the stage of the change itself.24
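The dependence of increments on rate and on position along the S-curve can be made concrete with a small numerical sketch. It assumes the textbook logistic function with a ceiling of 1, which may not match the exact parametrization of equation 1; the names `logistic` and `increments`, the time grid, and the two rates are hypothetical.

```python
import math

def logistic(t, rate, midpoint=0.0):
    """Standard logistic curve with ceiling 1 (assumed form, for illustration)."""
    return 1.0 / (1.0 + math.exp(-rate * (t - midpoint)))

def increments(rate, times, midpoint=0.0):
    """Gain in frequency between successive time points along the curve."""
    values = [logistic(t, rate, midpoint) for t in times]
    return [later - earlier for earlier, later in zip(values, values[1:])]

toy_times = list(range(-4, 5))           # arbitrary time units centered on the midpoint
slow_steps = increments(1.0, toy_times)  # slower change
fast_steps = increments(2.0, toy_times)  # faster change
```

In both series the increments are smallest at the two tails and largest on either side of the midpoint, and the faster change shows the larger mid-course increment. This is the sense in which both the rate of a change and its synchronic position on the curve shape the age differences observable in apparent time.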
The next step in this analysis is to establish whether differences in frequency between adjacent age groups are meaningful. The standard statistical test for this type of comparison is the t-test. However, t-tests are most effective in cases where the data is more or less evenly distributed across cells. Sociolinguistic data such as ours is not (see Table 4, the seventeen-to-nineteen-year-olds in particular, and Figs. 6a and 6b). In contrast, Wilcoxon tests are designed to handle data with skewed distributions (for example, individual speakers are taken into account), making the Wilcoxon a better test for the comparisons required here. As such, we used a series of nonpaired Wilcoxons to evaluate the differences between each of the three critical age divisions: nine-to-twelve-year-olds vs. thirteen-to-sixteen-year-olds and thirteen-to-sixteen-year-olds vs. seventeen-to-twenty-nine-year-olds. We also tested seventeen-to-twenty-nine-year-olds vs. thirty-to-thirty-nine-year-olds. These results are reported in Table 6. For the majority of the age comparisons the significance levels are very high, confirming that the overall frequency differences are real. In only two contexts are the results nonsignificant: thirteen-to-sixteen-year-olds vs. seventeen-to-twenty-nine-year-olds and seventeen-to-twenty-nine-year-olds vs. thirty-to-thirty-nine-year-olds for modal have to.
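The nonpaired Wilcoxon test (the rank-sum test, equivalent to the Mann-Whitney U test) compares two groups by ranking all observations together. The sketch below is illustrative only: the function names and per-speaker proportions are hypothetical, and it uses a plain normal approximation with no continuity or tie correction, so its p-values will differ somewhat from those R reports, especially for small samples.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_sum_test(group_a, group_b):
    """Return (U statistic for group_a, z score, two-sided p) by normal approximation."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_stat = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)  # ignores tie correction
    z = (u_stat - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return u_stat, z, p_value

# Invented per-speaker proportions for two adjacent age groups.
toy_older = [0.1, 0.2, 0.3]
toy_younger = [0.4, 0.5, 0.6]
toy_u, toy_z, toy_p = rank_sum_test(toy_older, toy_younger)
```

Because the test operates on ranks of individual speakers rather than on pooled token counts, one prolific speaker cannot dominate the comparison, which is the property that makes it suitable for unevenly distributed data.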
We have so far established that, for each of the six variants included in the analysis, we are dealing with bona fide change in progress as opposed, for example, to age grading (Spearman correlations, Table 5). We have seen that when the overall distribution of forms is viewed across apparent time, for both men and women a peak is visible among either the thirteen-to-sixteen-year-olds or the seventeen-to-twenty-nine-year-olds (Figs. 6a and 6b). We have also established that the distributional differences between the critical adjacent age groups are significant. Thus, for all but modal have to, which is monotonic through to the youngest speakers, the peaks that appear in the distributional trajectories are not due to chance. What remains to be established is gender differentiation. Labov's model is intended to account for the incrementation of female-dominated changes. Before we can be certain that these six features present fitting test cases for the model, we must confirm that women are advancing the respective changes. The data in Figs. 6a and 6b were reconfigured such that for each incoming form, men and women are displayed in the same figure. These line graphs are shown in Figures 8a-8f.
Modeling the data in this way demonstrates two things. First, it provides a graphic display of where peaks do and do not occur, revealing noticeable similarities between males and females with respect to the overall trajectories of change in apparent time. Second, it shows that for all but future temporal going to there is an unmistakable pattern of gender differentiation: these are female-dominated changes.25 Setting aside this last form, we can therefore conclude that each feature meets the criterion of gender asymmetry in favor of women. In the case of going to, the lack of a clear pattern is not necessarily problematic. In fact, it highlights a crucial aspect of gender asymmetry and its role in the course of a change. As discussed in §2, the default option is for women to advance new or incoming linguistic forms and the male/female divide in use develops early in the progression of a change (Labov 1990). Frequency differences are most exaggerated when the rate of change is at its maximum, but as the change advances to completion, whatever state 'completion' may represent (e.g. ousting of older forms via renewal, stable variation, entrenchment of older, nonproductive forms in routines, etc.), gender asymmetry is successively reduced. Previous variationist research on the English system of future temporal reference is silent on whether the shift toward going to is (or was) led by women (e.g. Poplack & Tagliamonte 1999, Tagliamonte 2002a). Since we have no evidence to the contrary, there is no reason to suspect that going to presents a rare countertrend to gender differentiation. We can only speculate that at an earlier stage in the diffusion of going to, when the shift was progressing at a faster rate, gender differentiation was a salient part of the change and women were responsible for pushing the use of this form forward.
What is captured here is a late stage of the change, one in which meaningful male/female differences have leveled (overall distribution: females 46%; males 44%). Use of going to is continuing to increase across apparent time (cf. Table 5), but the coefficient is low, indicating that the correlation is weak (though still highly significant). In other words, its rate has slowed.
In this regard, going to presents an important point of comparison with predicate intensifier so. The correlation coefficient for so in Table 5 is lower than that for going to, nearing zero, but unlike going to, so displays gender differentiation. This contrast relates to the developmental stage of the respective changes: going to is nearing completion and gender differences have reduced to the point of nonsignificance; so is new and gender differences have emerged and are expected to become more marked as the change progresses at increasingly faster rates as it enters the upswing of the S-curve. Thus, predictions regarding gender asymmetry must be interpreted in diachronic context. With two notable exceptions, women can be expected to be differentiated from their male peers throughout most stages in the progress of a change: during the immediate inception of a change before men retreat from the incoming form (cf. early research on quotative be like, Blyth et al. 1990, Dailey-O'Cain 2000, Ferrara & Bell 1995, Tagliamonte & Hudson 1999) and in the very late stage as a change nears completion. With this in mind, we assume here that going to does not present a conflict for Labov's model for the incrementation of female-dominated change. We suggest that it was advanced by women in the past and is now approaching a point of stable variation in the community.
6.3. Multiple-Regression Analysis and Peaks in Apparent Time.
To this point we have employed methods that rely on overall distributions produced from aggregate results. We have not accounted for the behavior of individual speakers, nor have we controlled for the effects of language-internal constraints that may significantly affect variant choice. We now turn to logistic regression, a statistical model that allows us to control simultaneously not just for speaker age and gender, but also for any internal factors that have a significant effect on each of the variant forms in question. Each model was configured to match the best fit of the data as already established in our previous work on each of these features. The difference between our previous analyses and the current investigation is that the age groups have been included as an independent factor group in each multivariate analysis.26 At this point we diverge from the groupings used by Labov (2001) in one crucial respect: rather than amalgamating all speakers between the ages of seventeen and twenty-nine years, we have separated out the seventeen-to-nineteen-year-olds so as to provide a more precise view of the location of peaks that occur outside the thirteen-to-sixteen-year-old cohort. It is these highly constrained logistic-regression results that we use to evaluate the predictions of Labov's logistic-incrementation model in detail.27 The figures that follow report estimated probabilities, that is, the probability that a given variant will be used as predicted by the model we have fit to the data. In the software package used here (Sankoff et al. 2005), these are reported as factor weights, a value between 0 and 1 that indicates the probability of rule application (here, the likelihood that the feature will occur in the collective speech of each age group when other conditioning factors are simultaneously accounted for).
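The relation between factor weights and predicted probabilities in a variable-rule model can be sketched as follows. The sketch assumes the usual logistic formulation, in which the log-odds of the overall input probability and of each applicable factor weight are summed; the function names and the numeric weights are hypothetical and do not come from the models reported here.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Inverse of logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def predicted_probability(input_prob, factor_weights):
    """Combine an input probability with one factor weight per factor group."""
    log_odds = logit(input_prob) + sum(logit(w) for w in factor_weights)
    return inv_logit(log_odds)

# Invented values: an overall tendency of 0.30, an age-group weight of 0.70,
# and a neutral weight (0.50) for a second factor group.
toy_prediction = predicted_probability(0.30, [0.70, 0.50])
```

On this formulation a weight above 0.5 favors the incoming form, a weight below 0.5 disfavors it, and a weight of exactly 0.5 leaves the prediction unchanged. In the toy case the favoring weight of 0.70 exactly offsets the low input of 0.30, so the prediction comes out at one half.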
We begin with quotative be like, a change that is both new and vigorous. These characteristics make it the most likely of the forms included here to capture the predictions of Labov's incrementation model (see Labov 2001:446), assuming, that is, that they are valid outside the phonology. The Spearman test in Table 5 revealed the highest possible level of significance, demonstrating a strong and significant negative correlation between use of be like and age. Peaks were visible in all apparent-time trajectories, whether viewed as an aggregate (Fig. 7a) or separated by speaker gender (Figs. 6a,b and 8a). The Wilcoxon results in Table 6 confirmed that these peaks are meaningful, with significant differences holding between the critical age groups, beginning with the nine-to-twelve-year-olds on the one hand through to the thirty-to-thirty-nine-year-olds on the other. The regression results for be like are shown in Figure 9 with solid lines marked by empty triangles.
A considerable amount of the variation can be explained by age, particularly among speakers aged thirty-nine and under. As with the distributional results, the regression results demonstrate two striking peaks: one among women and one among men. The male peak is not predicted by the model, but given its consistency in these data it must be accounted for. We return to this point below. Note the position of the peak in each trajectory. For the women it is situated among the seventeen-to-nineteen-year-olds. For the men it appears in the next oldest age cohort, the twenty-to-twenty-nine-year-olds, quite late in the age spectrum.28 Observe as well that after both the female peak and the male peak, the slope drops steeply through subsequent age groups down to the forty-to-forty-nine-year-olds. This suggests that we have caught this change at the highest point in the acceleration of the S-curve. In other words, we are likely witnessing 'the point in the logistic curve where the slope reaches the maximum' (Labov 2001:451).
Let us now consider a form with a more protracted trajectory of development. Compared with quotative be like, the change toward discourse like has been proceeding more slowly and in a step-by-step progression that can be tracked through different functional projections in the phrase structure (see D'Arcy 2005, 2008). The CP position is one of the early entry points for this marker, and it is the location where discourse like is most diffused and most frequent. The regression results for like can be seen in Fig. 9 with solid lines marked by empty squares.
In Table 5, the highest correlation coefficient reached is that for discourse marker like, and the negative correlation between its frequency of use and speaker age reached the strongest level of significance possible. The Wilcoxon results in Table 6 revealed statistically significant differences between all adjacent age groups under the age of forty years; in each case the highest possible level of significance was reached. Despite the fact that this change is older than quotative be like and that it is progressing at a slower rate, these results suggest that like is beginning to accelerate. The estimated probabilities from the multiple-regression models add to these observations a confirmation of the peaks in apparent time that were visible in the distributional results (cf. Figs. 6a,b and 8b). As with quotative be like, a peak occurs in the trajectory for the men as well as in that for the women, though the slope is considerably steeper among the women. Again, therefore, the predicted continued upswing among males for a female-dominated change fails to materialize. At the same time, the peaks are situated among thirteen-to-sixteen-year-olds, precisely where the model predicts they should occur if individuals participate in incrementation and vernacular organization ceases at approximately the age of seventeen years.
The results for stative possessive have are shown in Fig. 9 with dashed lines marked by asterisks. In this case the change is fairly advanced, with all age groups displaying rates of use above 65 percent overall. Nonetheless, the Spearman rank correlation test reported a highly significant negative correlation between speaker age and the frequency of have, establishing that the shift continues to progress in apparent time. The detailed approach of the Wilcoxon tests revealed that with respect to the overall distribution of forms, each of the critical age groups is significantly differentiated from the adjacent group(s). When males and females are separated, as in Fig. 9, the estimated probabilities reveal a clear peak for speakers of both sexes. While the drop-off between the two adolescent cohorts is markedly sharp, with thirteen-to-sixteen-year-olds (where the peak occurs) significantly differentiated from the seventeen-to-nineteen-year-olds, that separating the preadolescents from the thirteen-to-sixteen-year-olds is not negligible. Once again, therefore, the current results meet the predictions of the incrementation model with regard to the location of the peak, but one is visible for speakers of both sexes, despite the shift toward stative have being a female-dominated change.
The regression results for modal have to are displayed in Figure 10, represented by a dashed line with the probabilities for each age group marked by a diamond. These results are more challenging to interpret. Once again we have a change that is led by women (Fig. 8e) and that is continuing to advance in the community (Table 5). However, this is the form for which frequency differences between contiguous age groups largely failed to be selected as significant (Table 6). The final upward push from the thirteen-to-sixteen-year-olds to the preadolescents marked the only exception, proving highly significant. In other words, in this case we have a form with no peaks, exhibiting instead a continued upward trajectory irrespective of gender. This form also differs from the others we have examined in that although women are more advanced than are men in the use of have to, the slope of the female curve is less steep than is the male slope. This suggests that the change has begun to slow, since the increments for women are being successively reduced. Finally, the pattern of both trajectories is decidedly step-like rather than linear. This in combination with the finding that they culminate in an upward movement to the youngest group suggests that participation in this change is due to the increment inherited from caretakers and not to the incrementation process of Fig. 5 (Labov 2001:457). In other words, whereas both sexes participate in the previous changes in such a way that follows the predictions for women, in the case of modal have to participation follows the predictions for men (see Labov 2001:456, figure 14.7).
The results for future going to are also somewhat problematic, though for a different reason. These can be seen in Fig. 10, displayed by a solid line marked with solid circles. Here, the estimated probabilities provided by the model reveal that for women in their thirties through to the youngest age group, the slope departs little from a straight line, although a very slight peak occurs among the thirteen-to-sixteen-year-olds. Among men a similar result is found, though the line exhibits a curvature, rising to a maximum among the twenty-to-twenty-nine-year-olds that is largely held constant among seventeen-to-nineteen-year-olds. The estimated probabilities then drop in each of the subsequent age groups. The contrast between the nine-to-twelve-year-old boys and girls, in which the girls pattern much more in concert with their immediate elders but the boys do not, we take to be a residue of the gender-asymmetric nature of the change. Although gender differences have dissipated in these late stages of the change, the results in Fig. 10 support our hypothesis that going to was originally advanced by women. The men remain slightly less advanced and consequently the trajectory is somewhat curved (cf. the slope for 1975 in Fig. 5).
As discussed above (§6.2), going to appears to be a change that is virtually complete in the vernacular speech of Toronto. It has one of the least robust correlation coefficients (Table 5) and outside the two youngest age groups, where the difference must derive in large part from the boys, age differences are only marginally significant (Table 6). These findings, in conjunction with the robust variability exhibited between going to and will (see Table 3), suggest that the contemporary future-temporal-reference system may have reached a point of functional partitioning. Recall that changes nearing the end of their cycle slow down. A concomitant of this deceleration is that the increments of change are successively reduced (Labov 2001:450-51, figure 14.4). The result is a leveled trajectory in which no peak can be ascertained (cf. the slope for 2000 in Fig. 5). This is almost precisely what we see among the women in the case of going to. The change has more or less stabilized, leading to a near flat line in these apparent-time data. In sum, this change is behaving exactly as predicted.
Of the features considered here, predicate intensifier so occurs at the lowest levels of frequency. It has the weakest correlation coefficient and exhibits the lowest level of significance in the negative correlation that holds between age and frequency of use (Table 5). These facts suggest that we have caught this change at an earlier stage of development than the others. Nonetheless, gender asymmetry has emerged, with females well in advance of their male peers (Fig. 8c), and there is a distinct peak in the apparent-time trajectory (Fig. 7a) with the critical contiguous age groups significantly differentiated from one another (Table 6). The regression results for so are in Fig. 9, a solid line demarked by empty circles. A peak is visible for both men and women and is similarly situated among the thirteen-to-sixteen-year-olds. The male trend is different in that the line is more level, but it does rise continuously, albeit in small increments, toward the peak. As with quotative be like, discourse like, and possessive have, these results meet the predictions of the incrementation model with regard to the location of the peak. They also contribute to the emerging picture that the peak is characteristic of both male and female speakers during the advancement of a change.
The model of logistic incrementation is predicted to be most clearly viewed in 'prototypical new and vigorous linguistic change where women are a generation in advance of men' (Labov 2001:446). Labov's (2001) investigation focused on a complex system of phonological changes in Philadelphia. One might argue that morphosyntactic(-semantic) and discourse-pragmatic changes do not progress at the same rates as do phonological ones. Such changes may also be characterized by different patterns of social embedding. We suggest, however, that such differences should augment the utility of the comparison in testing for the model's generalizability, particularly if the model is able to produce a peak in the relevant age cohort(s) regardless of the divergent nature of change. Thus, while 'new and vigorous' sound changes may provide the best test for the logistic-incrementation model, a further check is provided by evidence from other types of change where contrastive types of behavior are predicted (i.e. modified or lack of peaks for changes just entering or drawing to the end of the S-curve).
To put our findings in perspective, we begin by considering the evidence from the discourse-pragmatic data. Where the rate of change has reached its maximum and the frequency of use is high (quotative be like), there are steep peaks in both the male and female trajectories. Where the change is just beginning to gain momentum and the frequency of use is lower (discourse like), there is a notable contrast between the male and the female curves: the peak is sharper among females. Finally, where the change is still relatively incipient and the frequency of use is low (intensifier so), a peak is most visible for the females; the male slope is characterized by much smaller increments. Similarly, the evidence from morphosyntactic(-semantic) changes reveals that the rate of change and the frequency of use have a direct bearing on the manifestation of a peak in apparent time. A change that creeps forward very gradually across the generations and hovers in robust variation (future going to) does not present peaks, presumably because the rate of change is too slow and age differences are minimal. As a result, the curve 'departs little from a straight line' (Labov 2001:454). Yet a change with a near-categorical rate but that is moving steadfastly to completion, such as have to, may still be progressing quickly enough to produce a peak AND to evidence gender asymmetry.
Taken together, the findings presented here graphically demonstrate that the rate of change and the frequency of use are together an important correlate of the adolescent peak, directly affecting whether or not a high point in the progression of change will be visible in apparent time, and further, under what circumstances one may appear. Changes that are vigorous are most likely to percolate through the youngest members of a population in such a way as to mark the transitions from preadolescence to adolescence to young adulthood, and in this last stage, the concomitant stabilization of the grammar. Changes nearing completion or just beginning to escalate may not have attained the requisite value for r in the logistic expression (see 1, §3.2) to manifest an adolescent peak.
Labov's model of incrementation is intended to explain the linear advance of women in a change that is led by women, allowing for both inheritance of the change and its further incrementation to the age of stabilization. Viewed in apparent time, the outcome should be a peak near the age of stabilization. If the same pattern is evident for men, as we have found here, the most straightforward interpretation is that the incrementation process of Fig. 5 is not restricted to women. The results in Figs. 6a,b and Figs. 8-10 advocate, insofar as discourse-pragmatic and morphosyntactic(-semantic) changes are concerned, that men also participate in moving forward the increments. If they did not, then there is no obvious explanation as to why peaks appear in the apparent-time trajectories for men. It is also striking that the peaks are identically situated to those in the trajectories for the women. We suggest, therefore, that the explanation lies in the observation that for some changes men simply lag behind women in their participation in the incrementation process. Their rate of change is slower, resulting in smaller increments of change in comparison to their female peers. In sum, one of the consequences of gender asymmetry is that, other things being equal, men and women of the same age in the same community represent different stages of change: women reflect a more advanced stage, men a less advanced stage. Until the more advanced levels of a change are achieved, when the rate begins to slow, women will participate at a faster rate than their male peers. Indeed, as with most sociolinguistic research (see for example Chambers 2003, Labov 1990), the studies we have conducted consistently call attention to the fact that men lag behind women. These hypotheses require further empirical investigation.
In his discussion of how women come to participate in male-dominated change, Labov notes that '[Women] may have a slower rate of change. Or they may have a lower limit on the amount of change in any given year (the K1 variable in the logistic formula). Or they may begin to imitate the male change at a later age, at 13, 14, or 15' (2001:460-61). Whatever the cause(s), we suggest that the same principle is at work with respect to the role of males in female-dominated change in western urban speech communities (see n. 29). While nonetheless following behind females in both the slope and frequency of use of innovative variants, males are active participants in the incrementation process. The model need only be extended such that, for some changes at least, men may build on the increment inherited from their caregivers. In other words, the male/female contribution to the advancement of linguistic change may be quantitatively, rather than qualitatively, differentiated. As just noted, Labov has hypothesized that in the case of male-dominated change, one way in which women may follow behind men is that women's rate of change is slower. The data presented here intimate that the same holds of men in a female-dominated change. Slower rates of change minimize age differences. Consequently, contrasts in male and female apparent-time trajectories, whereby the peak is steeper among adolescent girls than it is among their male peers, follow rather straightforwardly from the rate of change IF this is faster for females than it is for males. There are at least two ways in which this difference can be realized. On the one hand, once men either retreat from or resist a change after it becomes associated with women, which seems to be the underlying cause of the gender-asymmetric pattern (§2.2), they necessarily lag behind women in the advancement of the change. The implication from the logistic model is that the change is being incremented at a slower rate among men than it is among women, since the increments become successively larger as the rate increases. On the other hand, there is no reason to suspect that the value for r in 1 (§3.2) is identical for both men and women. A change may proceed over one hundred years for women but over a greater period for men. Such a result will automatically follow from the first scenario, not because the rate differs but because men prolong the initial stages of the change when they pull back from innovative forms at the outset. But once men become part of the incrementation process, they may also then proceed more slowly than their female peers had. A confounding effect of this nature would explain why, during the upswing of the S-curve, men can be a full generation behind women in the progress of a change. Such a possibility is beyond the scope of this article, but it remains nonetheless that regardless of causation, men are full participants in incrementation; they simply do so, all other things being equal, at a slower rate than women do.
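The relation between the rate of a change and the period over which it runs can be made explicit with a short worked derivation. It assumes the standard logistic form, which may differ in notation from expression 1 in §3.2; the 10 and 90 percent thresholds used to delimit the change are arbitrary illustrative choices, as are the symbols $p$, $t_0$, and $\Delta t$.

```latex
% Assumed standard logistic curve: p(t) is the proportion of the innovative
% variant at time t, r is the rate, and t_0 is the midpoint of the change.
\begin{align*}
p(t) &= \frac{1}{1 + e^{-r(t - t_0)}}
\quad\Longrightarrow\quad
\ln\frac{p(t)}{1 - p(t)} = r\,(t - t_0) \\
\Delta t_{10 \to 90}
&= \frac{1}{r}\left[\ln\frac{0.9}{0.1} - \ln\frac{0.1}{0.9}\right]
= \frac{2\ln 9}{r} \approx \frac{4.39}{r}
\end{align*}
```

On these assumptions, a change that takes one hundred years to move from 10 to 90 percent corresponds to a rate of roughly 0.044, and the duration scales inversely with the rate: halving r doubles the period.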
Table 7 summarizes the results for the six changes investigated here. All show female-driven developments with the exception of going to, which we infer was led by women in the past. For all but modal have to, each change exhibits a peak in apparent time. Moreover, the peaks occur for both males and females, and remarkably, they are, more often than not, identically situated with respect to speaker age-groupings. The provocative question is, why do phonological changes generally have a peak in apparent time for women only (Labov 2001) while discourse-pragmatic and morphosyntactic(-semantic) changes consistently have peaks for both females and males? Perhaps it is the case that males are more likely to participate in discourse-pragmatic and morphosyntactic(-semantic) change. Perhaps gender asymmetry does not operate in the same way across different levels of grammar. Perhaps it is due to the contrasting type of linguistic change they embody. Perhaps it is the varying nature of their social embedding. One way in which to pursue the answers to these questions is to move deeper into the speech community and attempt to understand the contribution of individuals, as Labov did when he identified the leaders of linguistic change (Labov 2001:Chs. 10, 11, and 12) or as Eckert has done via social networks and microcommunities within the larger whole (Eckert 1988, 2000). Another possibility is to focus in on the nature of language change itself, not simply with respect to its speed or point of change, but also with respect to its origin (inside or outside the community) and its nature and type (evolutive or adaptive (e.g. Andersen 1973), transmitted or diffused (Labov 2007)). The present study is not able to tackle these questions but they signpost potentially incisive directions for future research.
Of course, the phrase 'language change' is misleading because language does not exist separate from its speakers (cf. Hopper & Traugott 2003:40). A shift in focus toward the speaker is an explicit aspect of the incrementation model, since assumptions regarding the social context of language use are embedded in its application within the speech community. However, within the generative framework at least, it is child language acquisition that has been considered responsible for driving language change (see e.g. Andersen 1973, Anttila 1989, Lightfoot 1997, 1999). This model has been implemented as a means of explaining completed changes (Lightfoot 1997, 1999), but as Labov (2007:346, n. 4) points out, 'such a process has not yet been directly observed in the study of changes in progress'. As discussed above (§2), the growing sociolinguistic evidence suggests quite strongly that via transmission, children first acquire the vernacular of their primary caretaker before engaging in vernacular reorganization (e.g. Kerswill 1996b, Kerswill & Williams 2000, Labov 2001). The apparent robustness of Labov's model adds to this the understanding that it is incrementation, driven by social forces and actively engaged in by individual speakers once acquisition is more or less complete, that is largely responsible for the advancement of innovative linguistic forms, phonological, morphological, syntactic, or discourse-pragmatic. Thus, a model of language change that privileges the acquisition process is not sufficient as an explanatory heuristic for either transmission or incrementation. There has been a nascent acknowledgment of this for some time, emerging from a number of linguistic perspectives (e.g. Bybee & Slobin 1982, Eckert 1997, 2000, Labov 1994, Milroy 1992). In matching (and meeting) patterns of change visible in empirical apparent-time data, the incrementation model provides a framework from which to reassess, in conjunction with findings from other subfields within the discipline of linguistics as a whole, models of language change in general.
The logistic-incrementation model provides a revolutionary perspective on language change, presenting a key to understanding the evolution of linguistic systems. The morphosyntactic(-semantic) and discourse-pragmatic changes we have studied here provide further evidence of its legitimacy. These changes progress with a peak during adolescence, thereby confirming that the occurrence of the peak in the apparent-time trajectory is a 'general requirement of change in progress' (Labov 2001:455). We can conclude that the parallels between our findings and those for phonological changes in Philadelphia reinforce the status of this model as 'a reasonable first approximation to the incrementation problem' (Labov 2001:454). Several issues arise, however.
First, to understand the location of the peaks or, conversely, to explain their absence, it appears essential to contextualize a language change in terms of its stage of development. The peak for future going to is negligible, for example, because this change appears to be nearing a point of stabilization and ongoing change is progressing very slowly. As Labov observes (2001:446), slower changes are less likely to reveal an adolescent peak than faster changes.
Second, it may not be necessary to have a gender-asymmetric view of language change based on qualitative differences in order for logistic incrementation to provide a viable model. Males may participate in change similarly to women in that they may also partake in driving the increment forward (cf. Labov 2001:463). The evidence provided here suggests that the essential difference between the sexes may lie in their respective rates of change. Whether this is due to the types of changes we have targeted or to external developments in the social life of this community is beyond the scope of this article. It seems likely though that contextualizing language change within recent sociohistorical shifts is critical. As society itself changes, it may not be surprising to find that the way language change is embedded socially changes as well. For example, the apparent-time trends toward almost total loss of possessive have got and modal have got to in Canadian English that we have discussed here represent trends that are antithetic to developments in the same areas of grammar in other major varieties of English (e.g. British and Antipodean varieties). We believe that the reasons for divergence lie in the varying social embedding of the variants across communities (see Tagliamonte & D'Arcy 2007a, Tagliamonte et al. 2007). The fact that sociolinguistic patterns can differ across time and/or space relates to social structures, external contexts, the nature of the change, and who carries them forward (see Labov 2007). As more transnational collaborative studies of complex speech communities are completed, such issues will come to the forefront of the study of linguistic change. The critical point remains, however, that we cannot confirm the predicted asymmetry in incrementation for male and female speakers. While Labov (2001:459) argues that the logistic model 'reflects the major differences between the patterns of linguistic change of men and women', here we can say only that there is no absolute contrast between male and female with respect to the apparent-time trajectory of change (see Table 7).
Third, regarding the criterion of stabilization, recent evidence suggests that speakers do not completely cease to participate in change. While the progression of linguistic change visibly slows in early adulthood, there are also hints that speakers participate in ongoing developments throughout the life cycle (e.g. Labov 2001:447, Sankoff & Evans Wagner 2005, Tagliamonte & D'Arcy 2007b). Thus, as pointed out by Labov himself (2001:454), the basic model may need adjustment to account for certain types of life-long incrementation. We emphasize, however, that the constancy of the position of a peak in apparent time across studies and among different linguistic features encompassing varying rates of change is particularly telling. It makes a strong case for maintaining the notion of stabilization as a legitimate facet of linguistic change within the speech community.
Finally, we conclude the discussion by reiterating Labov's (1994:49) assertion that the study of language change in apparent time should model a population profile reaching down to the youngest members of the speech community. It is only with the added perspective of preadolescent speakers that the possibility for viewing the adolescent peak is made available to the analyst, thus providing otherwise inaccessible data for better understanding the progression of change. Our results converge in demonstrating that innovating morphosyntactic(-semantic) and discourse-pragmatic variables, just like phonological variables, surge forward to a pinnacle in adolescence as the newest cohort carries a change further than did speakers of the previous generation. This provides strong support for the observation that the peak in apparent time is a general requirement of synchronic change (Labov 2001:455).
Department of Linguistics
University of Toronto
Room 6076, 130 St. George St.
Toronto, Ontario M5S 3H1
Department of Linguistics
University of Canterbury
Private Bag 4800
The first author gratefully acknowledges the support of the Social Sciences and Humanities Research Council of Canada (SSHRC) for grant no. 410-2003-0005 'Linguistic changes in Canada entering the 21st century' and the Research Opportunities Program at the University of Toronto (ROP). We are deeply indebted to Katie Drager, Jen Hay, John Paolillo, Roeland van Hout, and David Sankoff for sharing their statistical expertise. We also thank Kirk Hazen, Brian Joseph, D. Gary Miller, and all the anonymous referees of Language for their careful and insightful comments on earlier versions of this manuscript. Any remaining errors in fact or in interpretation remain our own.
2. The well-known precursor to the notion of vernacular reorganization is Halle's (1962) argument that children do not match their parents' original grammar. Reorganization by children to a simpler model of their parents' language was considered 'imperfect learning' (see also Labov 2007, n. 4).
3. An issue not discussed by Labov (2001) concerns the nature of this vernacular. Recent research by Foulkes, Docherty, and Watt (2005) on child-directed speech (CDS) has found that the frequencies of sociolinguistic variants in the speech of caregivers differed depending on whether the addressee was an adult or a child. In particular, the rates of standard variants increased significantly when mothers were speaking to infants (aged 2;0 to 4;0) as opposed to when they were speaking to other adults. Such results suggest that the original vernacular is not the adult model but rather is one in which the frequencies of forms are shifted. Moreover, the frequency of individual variants in CDS also differed depending on the gender of the child. Thus, in addition to an overall increase in the use of standard variants, speech to girls generally contained more standard variants than speech to boys, where vernacular variants were more abundant (Foulkes et al. 2005:196). Based on such results, Foulkes and colleagues (2005:198) hypothesize that mothers may tune their phonological performance to conform to the child's developing gender identity. If such an interpretation is correct, then gender asymmetry in sociolinguistic usage develops alongside the acquisition of the vernacular as a by-product of the nature of CDS.
5. The  form represents an external form in the community, originating in London.
6. Note that in Docherty et al. 2002, the upper age limit for children was set at 4;0, since beyond this age 'it would be difficult to control for other sources of linguistic input, such as speech from the peer group and younger siblings'. In other words, the study assumed the onset of vernacular reorganization after age four.
7. See Kerswill 1996b for a proposed difficulty hierarchy for the acquisition of second dialect features.
8. The information that appears in this section is largely a review of chapter 14 of Labov 2001 (pp. 446-65). Readers are referred to the primary source for further details and/or clarification.
9. The logistic equation was developed in the nineteenth century by Verhulst (1845) to model population growth. In the initial formulation, r is the Malthusian parameter, referring to the rate of maximum population growth.
10. In some cases there is not an absolute correspondence between the increment in column 3 and the level in column 2 in Table 1 (table 14.1 in Labov 2001), for example, 1.97 + 0.20 ≠ 2.18. This is due to rounding.
11. In addition to internal mechanisms, many systematic factors certainly push adolescents toward differentiation, including evolutionary, social, and psychological ones. Interdisciplinary discussion of such influences would undoubtedly raise important questions for further research.
12. This assumes a 1 : 1 relationship between caretaker age and child age. While obviously a simplification of the actual situation, beginning in adolescence the greatest predictor of vowel quality, for example, is the peer group (e.g. Eckert 1988, 1989). We can therefore hypothesize that differences inherited from caretakers are leveled out during vernacular reorganization.
13. By the time of the second study (Cedergren 1988), (ch) lenition had become the object of social awareness, something that had not been the case at the time of the first study (Cedergren 1973). The association with particular age groups, however, replicated the original pattern, peak and all (Cedergren, p.c. December 23, 2005).
14. While we focus here on active language changes in Toronto English, the greatest part of the grammar is stable.
15. A slightly modified sample design was used for one of the discourse-pragmatic features, like as a discourse marker. Missing are speakers aged nine, thirteen to fourteen, and eighty-eight to ninety-two (for details see D'Arcy 2005:25). However, the sample base is identical to that used for all other variables that form part of the present investigation.
16. The figure of thirty is taken as a reasonable objective for statistical testing, though with morphological and syntactic variables, which occur less frequently than do phonological ones, it is not always attainable. General statistical laws dictate that with fewer than ten tokens there is a high likelihood of random fluctuation, but with numbers greater than ten there is 90 percent conformity with the predicted norm, rising to 100 percent with thirty-five tokens (see Guy 1980:20). As such, if thirty tokens per environment cannot be attained, any number in excess of ten is preferable.
17. Examples from the Toronto English Archive are identified by corpus, followed by the individual speaker's code, sex, and age. The corpora are coded as follows: 2 = ROP 2002, 3 = ROP 2003, 4 = ROP 2004, I = IN-TO-VATION 2003, N = IN-TO-VATION 2004. The ROP corpora provide data from speakers aged nine to nineteen and speakers in the IN-TO-VATION corpora are aged seventeen to ninety-two.
18. Observe as well that the sequence in 5c differs qualitatively from the adjacency pairs in 5a,b, since the question itself asks for further elaboration and exemplification, meanings that are encoded by like as a discourse marker. More generally, the only discourse marker that surfaces in response to direct questions in any substantive way in the Toronto interview materials is well (12%, N = 193) (D'Arcy 2005:80). Such a result is consistent with its function as a response marker (see also Lakoff 1973, Sacks et al. 1974, Schiffrin 1987:103-5).
19. Discourse like occurs in several distinct syntactic structures (see D'Arcy 2005, 2008). The uses that occur in examples 7a and 9b are distinct from that which is discussed in this article (i.e. CP adjunction).
20. Ongoing analysis will address more detailed questions regarding the grammaticalization of the go-future in the Toronto speech community, particularly the development of its contemporary function (see Eckardt 2006, Garrett 2009).
21. For discussion of the categorization of phonological changes, see Labov 1994:63-65.
22. The effects of individual speakers that might be thought to interfere with the statistical validity of this model are handled, as is good practice in sociolinguistic studies generally, by testing of the contributions of each person before grouping them into socially defined categories.
23. The overall distribution of forms among the younger age groups is as follows. For stative possessive have: nine to twelve, 91%; thirteen to sixteen, 94%; and seventeen to twenty-nine, 83%; for discourse marker like: nine to twelve: 17%; thirteen to sixteen, 26%; and seventeen to twenty-nine, 18%. Why the peak should appear for other features and not for these two is unclear, but based on the distributional differences when speakers are clustered together in age groups as was done in Figs. 6a and 6b (as opposed to viewing age as a continuous vector), it is certain that smoothing has occurred in Figs. 7a-f, reducing the prominence of the peak where it does occur.
24. It should be noted, however, that the story is somewhat more complicated. Spearman rank correlations assume that a monotonic relationship holds between the factors being tested, in this case, age and frequency. Since the data are not monotonic (most trajectories are stochastic across apparent time), particularly among the youngest age groups, these results must be interpreted in context. Compare, for example, quotative be like and discourse marker like. The coefficient is greater in the case of the discourse marker than it is in that of the quotative. This must derive in part from the nature of the aggregate data: there is no peak for marker like, resulting in a more nearly monotonic relationship between frequency and age, while a clear peak occurs in the quotative data, resulting in a nonmonotonic relationship between factors. This will influence the strength of the correlation coefficient such that monotonic data will be more accurately represented than nonmonotonic data.
25. Although the distributional margin between male and female speakers is narrow across apparent time for like as a discourse marker, even overlapping at times, the genders are nonetheless significantly differentiated, with like favored among women (D'Arcy 2005:97-98, 2007:396). This gender asymmetry is particularly robust among speakers aged twenty-nine and under, coinciding with the time at which the rate of change increases.
26. For each feature, including future going to, age was selected by the logistic regression as a significant factor conditioning the probability of variant choice.
27. We did not use a chi-square test to assess the difference between males and females or the differences between the age groups because the logistic regression accounts for all significant effects within the data.
28. This result is not due to skewing from any individual speaker. There are eight speakers in the male twenty-to-twenty-nine-year-old cell and a total of 287 quotative tokens. The frequency of be like is 58 percent, the highest distribution among the male age groups. Among seventeen-to-nineteen-year-olds its frequency is 41 percent (N = 345), while among thirteen-to-sixteen-year-olds it rises moderately to 48 percent (N = 306), before dropping to 26 percent among nine-to-twelve-year-olds (N = 57). In Fig. 7a, there is a lone male in his late twenties (not to be confused with the nineteen-year-old who uses be like exclusively) who appears to have a particularly high rate of be like: 79 percent. However, this speaker contributes just 15 percent of the data in this cell (N = 43); there are four other twenty-to-twenty-nine-year-old males who contribute equal or greater proportions of data and two speakers have overall rates of be like of 70 percent. We can safely rule out, therefore, that the peak among the twenty-to-twenty-nine-year-olds is an anomaly.
29. There are, of course, other types of speech communities in which it is males who are the implementers of change, for example, the famous case of Kupwar (Gumperz & Wilson 1971); however, this is a case of contact-induced change, what Labov (2007:346-47) considers diffusion rather than transmission.
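The point made in note 24, that a peak in the youngest age groups weakens a rank correlation between age and frequency, can be illustrated with a minimal sketch. The age-group midpoints and percentages below are invented for illustration and are not taken from the Toronto data; the function names are likewise hypothetical, and the coefficient is computed from first principles as a Pearson correlation over ranks.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Invented data: midpoints of eight age groups and percentage use of a variant.
AGES = [10, 15, 18, 25, 35, 45, 55, 65]
STEADY = [60, 55, 50, 40, 30, 20, 10, 5]   # frequency falls with every older group
PEAKED = [25, 60, 55, 40, 30, 20, 10, 5]   # youngest group sits below the peak

if __name__ == "__main__":
    print(f"steady trajectory: {spearman(AGES, STEADY):.3f}")
    print(f"peaked trajectory: {spearman(AGES, PEAKED):.3f}")
```

With these invented figures the steadily falling series yields a coefficient of -1, while the series with a preadolescent dip before the peak yields roughly -0.76, even though the two differ in a single age group.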