
REVIEW

Frequency Analysis of English Usage: Lexicon and Grammar. Compiled by W. Nelson Francis and Henry Kučera, with the assistance of Andrew W. Mackie. Boston: Houghton Mifflin Company, 1982. viii + 561 pp. $40.00.

Word Frequencies in British and American English. Compiled by Knud Hofland and Stig Johansson. Bergen: The Norwegian Computing Centre for the Humanities, 1982. vi + 547 pp. with three microfiche. $45.00 (250 Nkr).

Richard W. Bailey

Word-frequency studies have been produced for a variety of purposes: to assist authors of children's readers and second-language texts, to accompany concordances and word indices on the presumption that they will enhance stylistic investigations, and to enlighten lexicologists and lexicographers. Before the routine use of computers for such purposes, these studies were error-prone and labor-intensive. Often based on relatively small or unrepresentative samples, they were of limited value for scholars interested in comprehensive descriptions of the use of words in a language as a whole. While the counts made by Thorndike & Lorge and West were commonly and usefully employed before the 1960s, the "Brown Corpus" (so named after Brown University in Providence, Rhode Island, where W. Nelson Francis and Henry Kučera are faculty members) introduced a new era of study. Thanks to financial support from the U.S. Office of Education and subsequent funding by the National Science Foundation, Francis and Kučera were able to select and inventory a balanced and representative sample of printed American English published during 1961. The Corpus contains 1,014,232 textual words from 500 samples apportioned to represent fifteen genres ranging from newspaper prose to belles lettres to learned and scientific writings. In 1967, the first results were published in Computational Analysis of Present-Day American English, a volume containing word lists in rank and alphabetical order and including a variety of interpretative tables and lists to show (among other things) the words that are distributed across all genres and those that are significantly more common in one genre or in a cluster of related genres in the Corpus.

At the end of the 1960s, applications of the Brown findings to lexicography were not immediately apparent. One commercial dictionary firm vaunted its use of the new computer-made resource in its advertising, but large as it may seem to potential buyers of word books, the million-word corpus was mainly valuable in refining understanding of quite frequent words. It offered little to lexicographers wishing to prepare or overhaul a list of entry words, nor does the distribution of words in its genres much change the application of field labels commonly found in desk or general-purpose dictionaries.

If the Brown Corpus was not an immediate boon to lexicography in general, it did inspire a variety of useful studies. A typical application resulted in the revision of words included in the scientific vocabulary of bilingual dictionaries. Studies of the English modal verbs were improved by the evidence of actual use provided in the Corpus. Statistical studies of vocabulary distributions were also enhanced by the findings that Francis and Kučera presented. For the most part, however, such investigations were within the realm of inquiry envisioned by Francis, Kučera, and the advisory committee they formed to assist them in designing the Corpus and the kinds of lists and tables to be derived from it.

Seeing the potential of the Brown Corpus for studies of American English, scholars in Great Britain proposed assembling a parallel corpus of British English. This work was begun in 1970 at the University of Lancaster and continued there until 1976 under the direction of Geoffrey N. Leech.
The project was then transferred to the joint management of the University of Oslo and the Norwegian Computing Centre for the Humanities at Bergen. Thus this corpus is now known as the "Lancaster-Oslo/Bergen Corpus" or LOB. Certain adaptations had to be made in the Brown design, but the basic representation of fifteen genres in 500 two-thousand-word samples was maintained, and all the samples were drawn from British English printed in 1961. (The "Adventure and Western Fiction" genre, for instance, was adjusted to include a greater proportion of adventure stories since westerns are less commonly published in Britain...
