A computational theory of writing systems, by Richard Sproat
Linguists generally regard written language as little more than a crude approximation of spoken language. In this book Sproat counters this trend by developing a formal theory of writing systems, focusing on a computational theory of text-to-speech synthesis (TTS) systems.
Ch. 1 (1–33) outlines technical issues pertaining to TTS. All TTS systems include two main functions: conversion of written text into linguistic representations and conversion of linguistic representations into speech. S’s discussion focuses mostly on the formal issues concerning the first of these stages, based on his ongoing work at AT&T Labs. He introduces some critical constructs of his theory of writing systems. One is the orthographically relevant level (ORL). S characterizes the ORL as a specifically linguistic level of representation and couples it with the constraint that orthographic depth be consistent across the vocabulary of a language. A further theoretical construct, mapping, mediates between linguistic representations and orthographic presentations, with the constraint that the mapping be a regular relation. These theoretical constructs and constraints are critical to S’s argument: they present writing systems as constraint-based systems, establishing their legitimacy as subjects of serious linguistic inquiry.
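The regularity constraint can be made concrete with a toy example. A regular relation is one that a finite-state transducer can compute, and it may pair a single linguistic representation with more than one spelling. The sketch below is only an illustration under stated assumptions: the function `transduce`, the table `TOY_ARCS`, and the symbols in it are hypothetical and are not taken from S's book or from any TTS system.

```python
from typing import Dict, List, Set, Tuple

# (state, input symbol) -> list of (output string, next state)
Arcs = Dict[Tuple[int, str], List[Tuple[str, int]]]


def transduce(arcs: Arcs, finals: Set[int], symbols: List[str], start: int = 0) -> List[str]:
    """Return every output string the transducer relates to `symbols`."""
    paths = [(start, "")]
    for sym in symbols:
        # Follow every arc that accepts `sym` from every live path.
        paths = [
            (nxt, out + emitted)
            for state, out in paths
            for emitted, nxt in arcs.get((state, sym), [])
        ]
    # Keep only the outputs of paths that end in a final state.
    return sorted({out for state, out in paths if state in finals})


# Hypothetical toy relation: the sound "k" may be spelled "c" or "k".
TOY_ARCS: Arcs = {
    (0, "k"): [("c", 0), ("k", 0)],
    (0, "a"): [("a", 0)],
    (0, "t"): [("t", 0)],
}

print(transduce(TOY_ARCS, {0}, ["k", "a", "t"]))
```

Because the table lists two arcs for "k", the result is a set of candidate spellings rather than a single string, which is the sense in which the mapping is a relation and not a function.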
Ch. 2 (34–66) provides a formal account of orthographic objects, including the introduction of another theoretical construct, the small linguistic unit (SLU). A planar (two-dimensional) finite-state model serves as a computational device in the representation of the SLU. This model defines a rich set of catenation operations that are particularly suitable for representing nonalphabetic scripts. To illustrate the validity of his planar grammar formalism, S provides an extensive analysis of the internal structure of Chinese characters.
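As a rough illustration of what two-dimensional catenation buys over plain left-to-right concatenation, the sketch below builds Chinese characters from components using two operators. It is a simplification under stated assumptions: the class `Cat`, the operator names `beside` and `above`, and the helper functions are hypothetical and do not reproduce S's formalism or notation.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Cat:
    """A planar catenation of two glyphs (hypothetical representation)."""
    op: str                 # "beside" (left-to-right) or "above" (top-to-bottom)
    first: "Glyph"
    second: "Glyph"


Glyph = Union[str, Cat]


def components(g: Glyph) -> List[str]:
    """List the atomic components of a glyph in catenation order."""
    if isinstance(g, str):
        return [g]
    return components(g.first) + components(g.second)


def describe(g: Glyph) -> str:
    """Render the catenation structure as a bracketed string."""
    if isinstance(g, str):
        return g
    symbol = {"beside": "|", "above": "/"}[g.op]
    return f"({describe(g.first)} {symbol} {describe(g.second)})"


# 好 is 女 beside 子; 花 is 艹 above 化, and 化 is 亻 beside 匕.
hao = Cat("beside", "女", "子")
hua = Cat("above", "艹", Cat("beside", "亻", "匕"))

print(describe(hao))
print(describe(hua))
```

The nesting in `hua` shows why a single concatenation operator is not enough: the same components arranged with different operators would denote a different layout.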
All writing systems restrict the direction of the script at the macroscopic levels of the line, page, and document. Deviation from this restriction is allowed only at the microlevel of script-internal catenation, that is, at the level of the SLU. Universally, the orthographic syllable is the unmarked SLU, as seen in Linear B, Devanagari, Han’gul, and Chinese. The largest SLU is found in Maya writing, where it may extend from the unmarked syllabic level to that of the word or small phrase.
Ch. 3 (67–130) discusses the issue of orthographic depth. In linguistic terms, orthographic depth may be characterized as the tension between faithfulness to phonology and faithfulness to morphology. To verify the ORL and its associated constraint, S meticulously contrasts Russian and Belarusian orthographies, showing that they differ mostly in the depth of the level of representation. Across its entire vocabulary, Russian orthography is much deeper than that of Belarusian.
The notion of deep orthography is further extended to English, where S uncontroversially concludes that the evidence for a deep morphological ORL in English is equivocal. S does not, however, align himself with advocates of whole-word reading, who naively assume that English spelling is logographic in nature, based on a loose definition of logography. Nor does he agree with Noam Chomsky and Morris Halle that English has a near-optimal spelling system (Sound pattern of English, New York: Harper and Row, 1968), a view that S believes is based merely on personal taste.
Ch. 4 (131–62) reviews seminal work on writing systems, most of which employs arboreal classifications. S then postulates an alternative, nonarboreal arrangement that places writing systems in a two-dimensional space, according to the type of phonographic element encoded (e.g. consonantal, alphabetic, or syllabic) and the amount of associated logography. For instance, in Chinese, the syllable is the phonographic type, and it is accompanied by copious logography, although not as much as in Japanese. In this framework, Chinese and Japanese are essentially phonographic writing systems with additional logographic information encoded.
Ch. 5 (163–84) discusses psycholinguistic aspects of writing systems. As is common with psycholinguistic approaches, S’s computational model rests on two fundamental principles: the dual route hypothesis and architectural uniformity. In S’s computational model, the dual route consists of mapping to the ORL and a set of spelling rules that serve...