Code-switching and predictability of meaning in discourse


What motivates a fluent bilingual speaker to switch languages within a single utterance? We propose a novel discourse-functional motivation: less predictable, high information-content meanings are encoded in one language, and more predictable, lower information-content meanings are encoded in another language. Switches to a speaker’s less frequently used, and hence more salient, language offer a distinct encoding that highlights information-rich material that comprehenders should attend to especially carefully. Using a corpus of natural Czech-English bilingual discourse, we test this hypothesis with mixed-effects logistic regression against an extensive set of control factors drawn from sociolinguistic, psycholinguistic, and discourse-functional lines of research, in the first such quantitative multifactorial investigation of code-switching in discourse. Quantifying the predictability of meanings in conversation with a Shannon guessing game, we find that words with difficult-to-guess meanings are indeed more likely to be code-switch sites, and that predictability is in fact one of the most explanatory factors for the occurrence of code-switching in our data. We argue that choice of language thus serves as a formal marker of information content in discourse, alongside familiar means such as prosody and syntax. We further argue for the utility of rigorous, multifactorial approaches to sociolinguistic speaker-choice phenomena in natural conversation.
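
To make the predictability measure concrete, the following minimal sketch shows one way responses from a Shannon guessing game could be converted into an information-content estimate in bits. It is an illustration under stated assumptions, not the scoring procedure used in this study: the function name `guessing_game_surprisal`, the exact-match criterion for a correct guess, and the additive smoothing constant are all hypothetical choices made for the example.

```python
import math


def guessing_game_surprisal(guesses, target, smoothing=0.5):
    """Estimate the information content (in bits) of `target`.

    `guesses` holds one guess per participant for the same context.
    Assumption: a guess counts as correct only on a case-insensitive
    exact match. Assumption: additive smoothing keeps the estimate
    finite when nobody guesses correctly.
    """
    if not guesses:
        raise ValueError("at least one guess is required")
    wanted = target.strip().lower()
    correct = sum(1 for g in guesses if g.strip().lower() == wanted)
    # Smoothed proportion of participants who guessed the meaning.
    p = (correct + smoothing) / (len(guesses) + 2 * smoothing)
    # Surprisal: rarely guessed meanings carry more information.
    return -math.log2(p)


# Hypothetical responses for two contexts (invented for illustration).
easy = guessing_game_surprisal(["dog", "dog", "dog", "cat"], "dog")
hard = guessing_game_surprisal(["dog", "cat", "bird", "fish"], "castle")
print(f"easy: {easy:.2f} bits, hard: {hard:.2f} bits")
```

Under the hypothesis proposed here, a word like the second one, whose meaning few participants recover from context, would be the more likely code-switch site.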