Supplementary Information – Morphological convergence as on-line lexical analogy

Péter Rácz, Clay Beckner, Jen Hay, and Janet B. Pierrehumbert

April 29, 2020

Supplementary materials for ‘Morphological convergence as on-line lexical analogy’, by Péter Rácz, Clay Beckner, Jennifer B. Hay, and Janet B. Pierrehumbert. Language 96(4).735–70, 2020.

In this supplementary information, we detail (1) the structure of our nonce verb stimuli, (2) the setup of the Generalized Context Model (GCM), (3) the setup of the Minimal Generalization Learner (MGL), and (4) how these models compare. We illustrate the models using fits on our baseline data. We also discuss model selection in analysing our regression data (5), how the two learning models compare on our baseline data (6), and how they compare on our ESP post-test data (7). Sections (2) and (3) also appear in the main text of Rácz et al. (2020), as Appendices A and B. However, this material is included here so that this Supplement can serve as a standalone document for readers who are interested in the technical details of our study. Additional material is reproduced from the main text, revisiting specific regression model summaries and the additional methods we used to investigate the contributions of the GCM and the MGL. The source file for this SI, along with the data and the code for the paper, is hosted at https://github.com/petyaracz/RaczHayPierrehumbert2019.

1 Regular and irregular verbs in English

Four irregular verb classes were defined for our stimuli, based on the vowel alternation and affixation processes that apply to the stem:

• DROVE ([aI]/[i] → [oU])
• SANG ([I] → [æ])
• KEPT ([i] → [E]Ct)
• BURNT ([3]/[E]/[I] → [3]/[E]/[I]Ct)

In the baseline experiment, participants are also tested on monosyllabic verbs that do not change in the irregular past tense (e.g. cut, hit). These forms are strong outliers and are not reported in the data. The stimuli for the ESP experiment, 156 nonce verbs across four classes, are shown in Table 1 in the main text.

Our formal criteria are based on the behavior of existing verbs in English and their categorization by Bybee & Slobin (1982), Moder (1992), and Albright & Hayes (2003). We made some adjustments to their categories. For instance, our DROVE class is a generalized version of Moder’s class 4: it is described using the [aI] → [oU] alternation but also includes weave ([i] → [oU]). (In contrast, Albright & Hayes restrict this class to the [aI] → [oU] alternation.) The verb classes could be defined in a number of ways and still remain largely consistent with, for instance, Moder’s work. We trialed a number of small changes to the verb class definitions and found that they had no major effect on our overall results.

2 Implementation of the Generalized Context Model

2.1 Outline

Our implementation of the GCM evaluates the competition between two categories, regular and irregular, for each nonce verb base form. The framework of Nosofsky (1990) is adapted to morphophonology by using a segmental similarity calculation based on natural classes (Frisch et al., 2004). The same treatment of segmental similarity is used in the GCM implementations of Albright & Hayes (2003) and Dawdy-Hesterberg & Pierrehumbert (2014). We build on Dawdy-Hesterberg & Pierrehumbert (2014) in that we define our categories based on formal similarity. The implementation of the GCM is laid out in models/gcm/gcm.html.
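A minimal sketch of these two ingredients is given below. It is for illustration only and is not the implementation in models/gcm/gcm.html: the function and parameter names are illustrative, the whole-word similarity (which in practice is built up by aligning the segment strings of the two forms) is assumed to be supplied as word_similarity, and model parameters such as sensitivity weights are omitted.

def segment_similarity(classes_a, classes_b):
    """Natural-class-based similarity of two segments, in the spirit of
    Frisch et al. (2004): shared natural classes divided by shared plus
    non-shared natural classes. Each argument is a set of class labels."""
    shared = len(classes_a & classes_b)
    non_shared = len(classes_a ^ classes_b)
    return shared / (shared + non_shared)

def gcm_probabilities(nonce_form, exemplars, word_similarity):
    """Similarity-weighted competition between the two exemplar sets.
    `exemplars` maps 'regular' and 'irregular' to lists of existing verb
    types; `word_similarity` is a whole-word similarity assumed to be
    built on segment_similarity (details omitted here)."""
    scores = {
        category: sum(word_similarity(nonce_form, ex) for ex in forms)
        for category, forms in exemplars.items()
    }
    total = sum(scores.values())
    return {category: score / total for category, score in scores.items()}

In this sketch, the predicted probability of an irregular response for a nonce form is simply its summed similarity to the irregular exemplars divided by its summed similarity to both exemplar sets.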
2.2 Training data

Participants are presented with a sequence of nonce verb base forms and have to pick either a regular or an irregular past tense form for each. The irregular past tense form is pre-determined by the class of the stem, so that, for a given verb, participants can only choose between the regular past tense form and the irregular past tense form we assigned to that verb. (For instance, for splive, a verb in the DROVE class, they can choose either splived or splove, but not splift or sploven, etc.) For a given class (such as the DROVE verbs), the GCM has a choice between two sets of verb types. The irregular set consists of verb types in Celex that form their past tense according to the pattern captured by the class (such as an {[aI],[i]} → [oU]...
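The sketch below illustrates this two-way choice with orthographic forms only; the actual stimuli and model inputs are phonological transcriptions, and both the SANG mapping and the nonce stem spling in the comments are hypothetical additions for illustration.

# Toy illustration of the forced choice each nonce verb presents: the regular
# past versus the single irregular past assigned by the verb's class.
CLASS_VOWEL_CHANGE = {
    "DROVE": ("i", "o"),   # splive -> splove (cf. drive -> drove)
    "SANG": ("i", "a"),    # spling -> splang (cf. sing -> sang); illustrative
}

def candidate_pasts(stem, verb_class):
    """Return the (regular, irregular) past tense pair offered for a stem."""
    regular = stem + "d" if stem.endswith("e") else stem + "ed"
    before, after = CLASS_VOWEL_CHANGE[verb_class]
    irregular = stem.replace(before, after, 1)
    return regular, irregular

# candidate_pasts("splive", "DROVE") returns ("splived", "splove"); forms such
# as splift or sploven are never offered.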
