
Brookings Papers on Education Policy 2002 (2002) 315-323




Comment by Daniel Koretz

[Building a High-Quality Assessment and Accountability Program:
The Philadelphia Example]

Andrew Porter and Mitchell Chester point out a number of positive aspects of Philadelphia's assessment and accountability program and make their case for them. A couple of aspects of the paper are particularly noteworthy. The authors' attention to decision consistency is laudable, but as Tom Kane suggested during his presentation at the Brookings conference, they probably have it wrong, and the decision inconsistency is probably much worse than a model based simply on sampling of kids suggests. But at least they are attending to the problem. Their use of normative data, which is a passing point in the paper, is an important improvement. Trying to base an accountability system solely on a priori standards, without reference to normative data, is asking for trouble. The authors found that it was not defensible to use only a priori standards and did make reference to normative data.

Despite these comments, I found the paper disturbing because it gives an unwarrantedly positive view of what has been done in Philadelphia, and I will criticize the paper on four grounds. First, Porter and Chester provide a series of assertions as if they were imperatives. These may or may not be reasonable assertions, but they are nonetheless assertions based in large part on, at best, informed judgment. They say that educators have to implement accountability in specific ways. The word "must" appears repeatedly. I do not think those claims are warranted. Second, they ignore most of the relevant research, which gives me at least a basis for being skeptical about some of the positive results that they cite. Third, the paper provides a cursory and misleading approach to the core problem of the validation of gains. Fourth, the paper therefore provides an overly sanguine view of the impact of the program.

I will take each of these in turn.

Porter and Chester say that one must test in all grades and one must have symmetric accountability; that is, there must be high stakes for both kids and teachers. Perhaps this is true. But, until recently, most systems, including, for instance, educational systems in Japan, did not have these features. So perhaps it is not absolutely necessary to have symmetric accountability and testing in every grade. The evidentiary basis for saying how good or bad it is to do so is thin, because it is a new idea that has not been tried very often.

More important, the paper ignores relevant research, and I think this is a critical failure. The only way educators are going to avoid making the same mistakes over and over again is if knowledge about educational reforms is allowed to accumulate. Moreover, earlier research is what sets prior beliefs about what to worry about in looking at new research, and Porter and Chester did not take prior research into account in that way.

What are some findings of research to date about test-based accountability? The effects on practice are mixed. For a good overview of this, I would recommend the work of Brian Stecher.27 It is not uncommon to find positive effects; for instance, an increase in writing. But it is also common to find undesirable effects. Narrowing of instruction is a common negative effect. Research has shown that people also take other kinds of shortcuts to raise test scores, and they are inclined to take them if the goals are unreasonably high, which I suspect they were for low-achieving schools in Philadelphia. And most important, the available research, while not copious, fairly consistently shows inflation of scores--not modest inflation of scores, but egregious inflation of scores.

What do I mean by inflation of scores? I will use research carried out in Kentucky to illustrate this. The system in Philadelphia bears an uncanny resemblance to the system in Kentucky; both were designed by David Hornbeck. While there are some important differences, such as no stakes for individual kids in Kentucky, a large part...
