CHAPTER 14 CORPORA AND TRANSLATION
RODA P. ROBERTS AND JACQUELINE BOSSÉ-ANDRIEU
INTRODUCTION
Human translation has become increasingly computerized over the past thirty to thirty-five years. Beginning with the word processors and the term banks of the 1970s, translators have turned to computers for assistance with at least three different tasks: documentary and terminological research, translating itself, and physical production of the translation. As of the 1990s, electronic corpora have played a major role in the first of these tasks: translation-related research.
What exactly constitutes a corpus? The term corpus can in fact be interpreted in a narrow or broad sense. According to Antoinette Renouf (1987, 1), it is “a collection of texts of the written or spoken word, which is stored and processed on computer for purposes of linguistic research.” In other words, she limits the term to machine-readable texts that are selected and compiled for a specific purpose. However, Geoffrey Leech and Steven Fligelstone (1992, 115) broaden the definition of corpus when they state that it consists of “bodies of natural language material (whole texts, samples from texts, or sometimes just unconnected sentences), which are stored in machine-readable form.” This definition is wide enough to cover all computerized material, including what is found on the Internet.
In this paper, we will consider corpora in both senses, distinguishing between the two by designating corpus in the narrow sense of “linguistic corpus” or LC and corpus in the broad sense of “general corpus” or GC. We will begin by discussing certain types of corpora that are particularly useful for translation. We will then focus on the use of corpora in translation research. We will first categorize translation problems and then use a sample text to identify translation problems, and finally we will show how corpora can be used to find possible solutions.
TYPES OF CORPORA
Since not all types of corpora are equally useful for translation, it is important to begin by identifying those that are particularly so. However, just as there is no complete agreement on the definition of corpus, so too there is some disagreement on the different types of corpora that can constitute LC or be found in GC. Consensus is lacking not only on the designations of the corpus types but also on their definitions and their number.¹ Indeed, rather different classifications of corpora have been suggested by John Sinclair (1982), Douglas Biber (1994), Mona Baker (1995), Graeme Kennedy (1998), and others. Our intention here is not to try to resolve inconsistencies in the different classifications and definitions provided. Rather, we will identify three generic types of corpora that translators may find particularly interesting, using a suitable (but not necessarily generally accepted) term, and define them. These three generic types are determined on the basis of two criteria: the number of languages contained in a corpus, on the one hand, and the degree and type of “similarity” between sets of texts contained in a corpus, on the other. They are presented below.
Unilingual corpora contain texts in one language only. Thus, the British National Corpus contains a number of texts in English.
Bilingual/multilingual translation corpora² contain texts in two languages (“bilingual”) or more than two languages (“multilingual”), with original texts in one language and translations of those originals in the other language or languages. The pairs or sets thus created contain texts that are equivalent in content and style although different in language. An example of a bilingual translation corpus is the Hansard, which contains translated English/French texts from the journal of debates in the Canadian House of Commons: the speeches have been translated by very seasoned translators, mainly from English into French but also some from French into English.
Bilingual/multilingual comparable corpora³ contain original texts in two or more languages that are similar in content, style, and function and can therefore be “compared,” but they are not source texts and translations as is the case with translation corpora. An example of a bilingual comparable corpus is Textum (set up specifically for the needs of the Canadian Bilingual Dictionary Project): it contains, for instance, journalistic texts in English and French, published in newspapers of similar standing, written during the same period, and covering more or less the same events. Each set of language texts can also be used as a unilingual corpus. All three...
