The Manhattan Institute Center for Civic Innovation has just released the Working Paper Apples to Apples: An Evaluation of Charter Schools Serving General Student Populations, by Jay P. Greene, Greg Forster, and Marcus A. Winters (July, 2003). In the report the authors confuse the ability of schools to improve themselves with their ability to improve their students, as this Web contribution will explain. It is an elementary error that completely invalidates the report.
I note that in February, 2003, the same authors produced a report, Testing High Stakes Tests: Can We Believe the Results of Accountability Tests?. I wrote a Web review of that report in which I explained that the authors confused the predictive power of a high stakes test with its validity as a measure of student learning. That too was an elementary error that completely invalidated the report. (In both cases the report's conclusions are plausible, but that is beside the point.)
The present Apples to Apples report sets out to compare the performance of charter schools with that of public schools serving similar populations. In order to compare similar schools, the report focusses on charter schools that serve a general student population, and the control group of public schools is formed by taking for each charter school the nearest public school that also serves a general population.
The measure of performance is based on whatever standard statewide tests are in place, but Greene et al. don't use the test scores directly. Instead, they use the differences in scores between the years 2001 and 2002. At first glance it looks as if they are doing value-added assessment, which would make sense, but they are not. So before proceeding I remind the reader of the concept of value-added assessment. See, for example:
Value-added assessment employs, ideally, performance data on individual pupils over multiple years, and looks at improvements over time. It is a way to factor out the effects of different student backgrounds, because these are, one assumes, reflected in their initial test performance. If one doesn't have data on individual pupils then one can use data on grades within a school. In that case the incremental performance that one cares for is that between a certain grade in one year and the next higher grade the next year, on the assumption that this involves approximately the same student population.
Greene et al. could certainly have used such grade-to-grade value-added assessment in their work. However, they did something different. The raw data from which they start are the scores by grade in the years 2001 and 2002 for all the schools in their sample, generally for math, reading, and language (the tests vary from state to state). The scores may be scale scores or percentile ranks; this also varies from state to state. Greene et al. then evaluate the incremental performance between a certain grade in one year and that same grade the next year. They sum or average these data over all grades, separately for the charter schools and the public schools and separately for each jurisdiction and each tested subject, and finally they report the difference in the increment between the charter schools and the public schools, converted to standard deviation units.
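To make the distinction concrete, here is a minimal sketch of the two calculations for a single school. The scores, the function names, and the assumption that scale scores are comparable across grades are all mine, made up for illustration; nothing here is taken from the report's data.

```python
# Hypothetical mean scale scores for one school, keyed by (year, grade).
# These numbers are invented for illustration only.
scores = {
    (2001, 3): 200.0, (2001, 4): 210.0, (2001, 5): 220.0,
    (2002, 3): 203.0, (2002, 4): 212.0, (2002, 5): 224.0,
}


def cohort_gains(scores, year0, year1):
    """Grade-to-grade value added: grade g in year0 versus grade g+1 in
    year1, which follows approximately the same pupils."""
    return {
        g: scores[(year1, g + 1)] - scores[(year0, g)]
        for (y, g) in scores
        if y == year0 and (year1, g + 1) in scores
    }


def same_grade_changes(scores, year0, year1):
    """The comparison described above: grade g in year0 versus the same
    grade g in year1, which involves different pupils."""
    return {
        g: scores[(year1, g)] - scores[(year0, g)]
        for (y, g) in scores
        if y == year0 and (year1, g) in scores
    }


print(cohort_gains(scores, 2001, 2002))        # follows cohorts
print(same_grade_changes(scores, 2001, 2002))  # follows grade slots
```

The first function tracks what happened to a group of pupils; the second tracks how the school's results in a fixed grade moved from one year to the next.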
The results of this exercise are provided in Table II in the Appendix to the report, which shows a small (in fact, very small) advantage for charter schools on this measure of differential improvement. In the executive summary, Greene et al. express their observations as follows:
Measuring test score improvements in eleven states over a one-year period, this study finds that charter schools serving the general student population outperformed nearby regular public schools on math tests by 0.08 standard deviations, equivalent to a benefit of 3 percentile points for a student starting at the 50th percentile. These charter schools also outperformed nearby regular public schools on reading tests by 0.04 standard deviations, equal to a benefit of 2 percentile points for a student starting at the 50th percentile.
And so, the authors completely confuse a measure of the improvement of schools with a measure of the improvement of student performance. Charter schools could be performing wonderfully or they could be performing dismally relative to public schools in improving student performance, and neither would be seen in the whole-school, year-to-year test score improvements that are the basis of this report. A school that adds the same gain to every cohort every year, whether that gain is large or small, shows no change at all in a given grade from one year to the next. The difference would be seen, of course, in traditional value-added assessment at the pupil or grade level. It might arguably be seen in the data for 2001 or 2002 separately, or in the data for 2001 and 2002 summed or averaged. But in the difference between 2001 and 2002? No way.
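The point can be checked with a toy calculation. The sketch below assumes two hypothetical schools in a steady state: pupils enter grade 3 at the same score in both, and each school adds a fixed number of points per grade, the same in 2001 as in 2002. The entry score, the gains, and the function names are illustrative assumptions, not figures from the report.

```python
def grade_means(entry_score, gain_per_grade, grades=(3, 4, 5)):
    """Mean score by grade for a school whose pupils enter the first
    listed grade at entry_score and gain gain_per_grade each year."""
    first = grades[0]
    return {g: entry_score + gain_per_grade * (g - first) for g in grades}


def summarize(entry_score, gain_per_grade):
    """Return (mean same-grade change, mean cohort gain) from 2001 to 2002
    for a school that performs identically in both years."""
    y2001 = grade_means(entry_score, gain_per_grade)
    y2002 = grade_means(entry_score, gain_per_grade)  # steady state
    same_grade = [y2002[g] - y2001[g] for g in y2001]
    cohort = [y2002[g + 1] - y2001[g] for g in y2001 if g + 1 in y2002]
    return sum(same_grade) / len(same_grade), sum(cohort) / len(cohort)


strong = summarize(50.0, 10.0)  # school adding 10 points per grade
weak = summarize(50.0, 2.0)     # school adding 2 points per grade
print(strong, weak)
```

Under these assumptions the same-grade change is zero for both schools, while the cohort gain is 10 points for one and 2 for the other. The year-to-year difference cannot tell them apart.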
The views and opinions expressed in this page are strictly those of the page author. The contents of this page have not been reviewed or approved by New York University.