
Reviewed by R. A. Brown
Validation in language assessment. Ed. by Antony John Kunnan. Mahwah, NJ: Lawrence Erlbaum, 1998. Pp. 220.

Traditional paradigmatic definitions of validity tend to assume that reality exists objectively and independently of attempts to measure it. This trenchant collection of eleven papers from the Seventeenth Language Testing Research Colloquium (Long Beach, CA, 1995) asks whether that assumption is in step with current intellectual and political trends and whether alternatives are needed.

The first of the volume’s two major sections concerns test development and test-taking processes. Dorry M. Kenyon looks at the hierarchy of performance difficulty that underlies tests constructed on the ACTFL Speaking Proficiency Guidelines. Kenyon discovers that students taking the test agree with the test designers as to which tasks are more difficult than others. John Read presents a test designed to measure ‘depth’ of vocabulary knowledge rather than ‘breadth’ (how many meanings for a word the student knows rather than how many single meaning-word form pairs she can recognize), based on semantic associations rather than definitions. One of his more interesting results is that good language users tend to be good guessers. Ruth Fortus, Rikki Coriat, and Susan Fund attempt to determine what factors affect the difficulty of multiple-choice questions on the English subtest of Israel’s Inter-University Psychometric Entrance Test. They found that ‘level of vocabulary and level of grammatical complexity. . . . [had] the greatest effect on item difficulty’ (61): questions were harder to answer correctly to the degree that they contained difficult words and difficult sentence constructions.

The second section is concerned with test-taker characteristics and feedback. Gillian Wigglesworth found that FL students performed better when they had a chance to think about what they were going to say before saying it, but ‘only in tasks for which cognitive capacity is reached’ (106). James A. Purpura discusses the development and validation of a questionnaire designed to measure the learning strategies used by L1 and L2 students. Factor analysis identified three types of strategy (comprehension, storage, and retrieval), each subsuming more specific learning tactics. Caroline Clapham’s research shows that ‘the comparative importance of background knowledge and level of language ability in reading comprehension depends on the specificity of the reading passages’ (142). April Ginther and Joseph Stevens found that ‘differences in exposure to the target language influence the development of proficiency in important ways’ (188): students who had lived in Spanish-speaking countries tended to speak Spanish better than those who had not. However, because superiority in speaking was relatively unrelated to listening and reading comprehension, vocabulary, and recognition of grammatical structures, the authors conclude that ‘standardized tests may not reflect the same information for all students’ (190).

Annie Brown and Noriko Iwashita investigate the role of language background in the validation of a computer-adaptive test, comparing the performances of Australian, Chinese, and Korean students of Japanese. They found that ‘language distance’ can influence test results: Korean, for example, is extremely similar to Japanese in a number of ways, and Korean students with an equal number of exposure hours learn Japanese more efficiently than do Chinese and Australian students. Item difficulty, then, is not necessarily constant across groups. Kathryn Hill compared the reactions of 94 test-takers (representing eighteen European and fifteen Asian languages) to taped and live interview formats on the Australian access test of vocational English. Across subgroups (language background and gender) there was a moderate to strong preference for the live interview on all four counts, although speakers from Asian language backgrounds felt more nervous in the face-to-face interview, and females considered it more difficult. Despite their preference for the live format, subjects expressed the belief that both formats accurately measured their English abilities. Bonny Norton and Pippa Stein question the meaning and validity of an English proficiency test based on a reading selection describing police responses to a monkey attack...
