On stochastic grammar*
Brady Clark

1. Introduction

Newmeyer (2003) has argued against models of mental grammars that incorporate probabilistic information (henceforth, stochastic grammar). In the discussion that follows, I review N’s arguments against stochastic grammar and attempt to show that they do not stand up to scrutiny.

More generally, what follows is a defense of the inherent variability tradition of modeling linguistic variation.1 The inherent variability tradition includes the variable rules approach (see Labov 1969 and references in Paolillo 2002), classification and regression tree analysis (Ernestus & Baayen 2003), analogical modeling of language (Skousen 1989), generalized linear models (see references in Manning 2003), various versions of optimality theory (e.g. stochastic optimality theory (Boersma 1998, Clark 2004), partial ordering (Anttila 1997), floating constraints (Nagy & Reynolds 1997)), extensions of head-driven phrase structure grammar (Bender 2001), and extensions of the principles and parameters framework (Yang 2003). A guiding assumption of work in this tradition is that mental grammar accommodates and generates variation, and includes a quantitative, noncategorical, and nondeterministic component (Weinreich et al. 1968, Bender 2001).

Before I turn to N’s arguments, I must mention some terminological assumptions. Throughout, when I use the term mental grammar, I mean a linguistic system existing in the mind of an individual speaker. When I use the term grammar model, I mean a theory of the mental grammar of an individual.

2. Methodological issues

N’s (2003:696) first argument against stochastic grammar is methodological in nature. Noting that the corpora from which advocates of stochastic grammar have typically drawn probabilities encompass data from a wide range of speakers (e.g. the Switchboard Corpus (Dick & Elman 2001), the New York Times, etc.), N asks ‘how could usage facts from a speech community to which one does not belong have any relevance whatsoever to the nature of one’s grammar’ (Newmeyer 2003:696).

2.1. On the use of corpora

This is a fair question, but its pointedness is weakened when one considers what linguists use corpora for. In practice, large corpora are drawn upon to develop descriptions of the language use of a wide range of speakers, as in statistical natural language processing (Manning & Schütze 1999:7). The central question in this type of approach is whether the corpora that are available are representative. This question is addressed by using statistics to handle finite samples of potentially infinite sets of data (Brew & Moens 2000).

Is it reasonable to use frequency asymmetries in corpora to justify theories of individual mental grammars? As a matter of convenience and intended applications (e.g. multi-user speech-to-text systems), work in statistical natural language processing typically does not separate data from different individuals. Consequently, descriptions of language use drawn from multi-individual corpora may seem irrelevant to grammar models. However, if we make the idealization that speakers share the same mental grammar, we can use frequency data from multi-individual corpora as a model of individual variation. Without this idealization, we could not begin to understand what different speakers have in common, for example, within a given speech community. Further, it is an accidental fact, not an essential one, that present-day corpora are not tailored to single individuals, in the sense that they contain all the external linguistic evidence a speaker might encounter or be exposed to in the course of learning. Present-day corpora serve as an adequate surrogate for individually tailored longitudinal collections of that sort. These corpora may not be a perfect substitute, but they are the best we can manage currently.2

One possible objection is that by mixing data from different individuals together in a large data set, evidence relevant to the investigation of the mental grammar of particular individuals is potentially obscured (Mohanan 2003, Newmeyer 2003). For example, in a situation where some individual mental grammars have pattern X (e.g. verb-object) and other individual mental grammars have pattern Y (e.g. object-verb), combining data from these two types of linguistic systems results in apparently random free variation (Mohanan 2003).3
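The pooling objection can be made concrete with a small sketch. The speaker labels, token counts, and the helper `rate` below are hypothetical and purely illustrative; they are not data from any corpus discussed here. The sketch assumes four speakers who are each fully categorical (always verb-object or always object-verb) and shows that the pooled rate looks like variable usage even though no individual varies.

```python
from collections import Counter

# Hypothetical toy data (an assumption for illustration only):
# each speaker is categorical, producing only VO or only OV tokens.
speakers = {
    "s1": ["VO"] * 50,
    "s2": ["VO"] * 50,
    "s3": ["OV"] * 50,
    "s4": ["OV"] * 50,
}


def rate(tokens, pattern="VO"):
    """Return the proportion of tokens that show the given pattern."""
    return Counter(tokens)[pattern] / len(tokens)


# Pool every token into one data set, as a multi-speaker corpus does.
pooled = [tok for toks in speakers.values() for tok in toks]
pooled_rate = rate(pooled)

# Compute the same rate separately for each speaker.
per_speaker = {name: rate(toks) for name, toks in speakers.items()}

print("pooled VO rate:", pooled_rate)
print("per-speaker VO rates:", per_speaker)
```

In this toy case the pooled rate is 0.5 while every per-speaker rate is either 0.0 or 1.0, which is the situation the objection describes. Keeping speaker identity attached to each token is what makes the two cases distinguishable.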

It is often methodologically necessary to look at groups of speakers.4 In practice, both corpus studies...
