Reviewed by:
  • Building a Validity Argument for the Test of English as a Foreign Language
  • Khaled Barkaoui
Chapelle, Carol A., Mary K. Enright, & Joan M. Jamieson (Eds.). (2008). Building a Validity Argument for the Test of English as a Foreign Language. New York: Routledge. Pp. 370, US$46.95.

In 2005, the Educational Testing Service launched the Internet-based Test of English as a Foreign Language (TOEFL-iBT), a new version that claims to be based on a communicative view of language proficiency and to meet high standards of educational measurement. Prior to the launch of this version, numerous studies were undertaken to develop and evaluate it. Building a Validity Argument for the Test of English as a Foreign Language synthesizes the findings of these studies into a single narrative, which is the validity argument for TOEFL score interpretation and use. By providing an account of the rationale, development, and validation research for the new TOEFL, this volume makes several significant and unique contributions.

First, the volume illustrates how theoretical and empirical evidence can be synthesized into a coherent validity argument for the interpretation and use of test scores. The literature has been dominated by an 'accumulation-of-evidence approach' to validation (p. 320) that emphasizes the importance of gathering multiple types of validity evidence but does not provide guidance as to how much evidence is enough or how to organize and synthesize such evidence, once gathered, into a coherent validity argument. Building on the work of Kane (e.g., 2004) and others, this volume describes and illustrates an approach to synthesizing multiple types of evidence into an integrated validity argument for the proposed interpretation and use of test scores.

Chapter 1 discusses the components and structure of the validity argument, while chapter 9 applies this approach to build a coherent validity argument for TOEFL score interpretation and use at the time when the new test was beginning to be used. The ultimate conclusion of the TOEFL validity argument is that 'TOEFL scores are valid for making decisions about the test takers' language readiness for academic study at English-medium universities' (p. 320). This conclusion rests on a chain of six sequential inferences, each based on one warrant that is, in turn, based on one or more assumptions. In the process of developing the new TOEFL, each warrant and assumption was treated both as a basis for developing the test and as a hypothesis to be tested empirically. Chapters 2–8 report the numerous studies undertaken to test these hypotheses and produce support for the new TOEFL validity argument.

The second unique contribution of this volume is that it illustrates the process of test development and how to integrate test development and validation. Test-development processes are not usually addressed adequately in the language testing literature; most published validation research focuses on the analysis of data from tests already in use. Chapters 2–8 describe the processes of developing and researching the new TOEFL, the issues and challenges that the test developers encountered, the conceptual and empirical analyses conducted to address these challenges, and the issues that remain to be resolved. The volume thus provides a case study of both test development and how to integrate test development and validation.

Finally, this volume describes the process of combining construct theory (second/foreign language proficiency) and measurement theory (validity argument) in the TOEFL development process. Developing a test that embodies communicative views of language proficiency and meets high standards of educational measurement is a challenging task, particularly in the absence of a widely accepted theory of L2 proficiency and one agreed-upon approach to test design to serve as a basis for the new TOEFL score's interpretation and use. Chapelle et al. illustrate the instrumental role played by the argument-based approach to validation in operationally resolving tension between theoretical perspectives in applied linguistics and educational measurement.

The volume has a few limitations. First, it gives a neat retrospective account of what was a complex and iterative process; consequently, the linear structure of the TOEFL validity argument presented here fails to reflect the dynamic, non-linear process that went into its construction. Second, the...
