- The handbook of computational linguistics and natural language processing
The research field that brings together computers and language cannot decide whether it is a branch of linguistics. Those within the field who think it is, or should be, a branch think of it as computational linguistics (CL); those who eschew linguistics think of it as natural language processing (NLP). The dichotomy often reflects individual research goals. Those who pursue computational models of language in order to better understand the nature of language itself are obviously on the linguistic side. Those whose goal is to develop computer systems that carry out some useful task with linguistic input or output tend to be on the other side. The processing in natural language processing is based largely on statistical and machine-learning methods, so in this view the fact that the data are natural language and that the processing might use some information about syntax and semantics along the way does not make NLP a branch of linguistics any more than the use of arithmetic makes bookkeeping a branch of mathematics.
The editors of this new handbook, Clark, Fox, and Lappin (CFL), fall on the linguistic side of the divide. Their own research backgrounds are in linguistics and computational modeling of language. And their handbook is a member of the large and well-known series ‘Blackwell handbooks in linguistics’. Nonetheless, they work both sides of the street, titling it The handbook of computational linguistics and natural language processing (hereafter HCLNLP). This contrasts with the one-sided titles of the recently published Handbook of natural language processing (HNLP) (Indurkhya & Damerau 2010), which is firmly in the no-stinkin’-linguists-here tradition, and the earlier Oxford handbook of computational linguistics (OHCL) (Mitkov 2003), which despite its title has strong coverage of NLP as well as CL.1 But title notwithstanding, it is clear that CFL’s sympathies and interests are on the CL side. Their handbook has a strong emphasis on theory and methods for computational models of language, and far less on practical applications or NLP. Even the five-chapter section entitled ‘Applications’ is not so much about actual applications as about methods and techniques that underlie applications. A reader who wants to learn about NLP applications such as sentiment analysis or biomedical text mining will need to turn instead to the HNLP, which has a chapter on each of them; and a reader interested in text summarization or computer-assisted language learning must turn to the OHCL. By contrast, the more theoretically or linguistically oriented reader who wants to understand computational models of language learning and grammar induction will find them only in the HCLNLP, well explained in a chapter by Alexander Clark and Shalom Lappin.
These three handbooks are thus complementary; despite the implication of comprehensiveness in their titles and their overlap in some core topics, the editorial choices in each have resulted in much less redundancy between the books than might have been expected, and the HCLNLP offers the most detailed coverage of the theoretical and linguistic side.
The chapter authors of HCLNLP are well-chosen experts on their topics, including Martha Palmer and Nianwen Xue on the linguistic annotation of electronic text, Matthew W. Crocker on computational psycholinguistics, Ralph Grishman on information extraction from text, and Ehud Reiter on natural language generation. All told, the volume contains twenty-two chapters in 741 pages (including references), an average of about thirty-three pages each—similar to HNLP’s twenty-six chapters in 666 (more densely packed) pages and contrasting with OHCL’s thirty-eight rather shorter chapters in 716 pages (similar in size to those of HCLNLP). Many chapters of HCLNLP give longer and more detailed treatments than their counterparts in the other handbooks. Some topics, especially some machine-learning methods, get a very deep and theoretical treatment; entire chapters are devoted to maximum-entropy...