The OECD Programme for International Student Assessment (PISA) was developed by an international team of people prominent in educational research and reform. The Dutch Freudenthal Institute played a major part in developing the mathematical literacy component of the PISA 2000 test, and its Director, Professor Jan de Lange, was head of the mathematics experts group. (Other members: Raimondo Bolletta, Sean Close, Maria Luisa Moreno, Mogens Niss, Kyungmee Park, Thomas A. Romberg, and Peter Schueller.) PISA appears to have been heavily influenced by philosophies of authentic assessment and realistic mathematics education (RME), and offers a valuable perspective on the philosophy and politics of this international mathematics education community. Although PISA assessed math, science, and language skills, this contribution looks only at the mathematics assessment.

This is an html version of a plain text email message from July 11, 2002, to friends and colleagues. Please see the Education Page of Bas Braams - Links, Articles, Essays, and Opinions on K-12 Education - for related matter.

[Addendum: Results from the second instance of PISA, PISA 2003, are now available as well, and I have begun to review sample and released questions from that test. See here for reviews of Science Unit 1, Science Unit 2, and Science Unit 3 from the 2003 PISA Framework.]

Dear Colleagues,

The results from PISA 2000 (the OECD Programme for International Student Assessment) were released in December, 2001, and were much in the news that month, as I observed in the United States and also in Dutch and German news sources. The reporting was especially intense and agonized in Germany, which appeared near the bottom of the international ranking. (I lived in Munich for three years in the 1980s and visit the country regularly for work, hence my interest in the situation there.)

Besides participating in the international test Germany also carried out a detailed analysis of results on a much larger sample of students for inter-State comparison. The results of that national assessment were released just last month (June, 2002), and PISA was again all over the German media. It was found that within Germany the two southern states Bavaria and Baden-Wuerttemberg did relatively well on PISA, and it was not overlooked in the commentary that these states are the more traditional ones, with politics dominated by the CDU/CSU Christian Democrats. Reporting spilled over into the Netherlands as well, and the Dutch were pleased to remind themselves that their pupils had done relatively well on the exam --- forgetting that because of sampling problems the Netherlands does not even appear in the official results.

Absent from all the reporting is a careful look at the content of the PISA exam. It seems useful to visit that issue here, with focus on the mathematical component. I believe it will be seen that the PISA exam is highly unsuitable as a test of mathematics education or as a guide to improving math education, and that the international comparisons are gratuitously vulnerable to accidental variations and, let us say, subconscious manipulation.

The international PISA web site is http://www.pisa.oecd.org/. One finds there a description of the testing philosophy and sample questions. The PISA test is designed for 15-year-olds and covers three domains: reading literacy, mathematical literacy, and scientific literacy.

I quoted a page or so of PISA philosophy in a contribution in January, 2002: see here. Key words and phrases: dynamic lifelong learning, real-life situations, students' beliefs, self-regulated learning. "Mathematics literacy is an individual's capacity to identify and understand the role that mathematics plays in the world, to make well-founded mathematical judgements and to engage in mathematics, in ways that meet the needs of that individual's current and future life as a constructive, concerned and reflective citizen." You get the drift, but see my just-cited page for more excerpts or see the PISA web site for the whole thing.

For security reasons the actual PISA test is not published, but supposedly representative sample problems and scoring guidelines are provided at http://www1.oecd.org/publications/e-book/9600051E.PDF.

I will focus here on questions 5 and 6 from the mathematical literacy sample. The questions, described as belonging to the "Big Idea" of space and shape, refer to a drawing of three shapes in the plane. The shapes are labeled (A), (B), and (C). Figure B is close to a circle. Figures A and C look like squid or like an ink blot: they each have a very jagged edge with lots of indents, and their overall size (diameter) makes it appear that either one might just fit inside or on the circle, figure B. I don't think I'm giving much away if I say that figures A and C have very obviously smaller area and larger perimeter than figure B. Here are the questions and scoring guidelines.

Question 5: Shapes

Which of the figures has the largest area? Give explanations to your answer.

Scoring guideline for Question 5

Score 1: Answers which indicate shape B, supported with plausible reasoning, for example:

"B. It doesn't have indents in it which decreases the area. A and C have gaps."

"B, because it's a full circle, and the others are like circles with bits taken out."

Score 0: Answers which indicate shape B, without plausible support.

A few further sample answers are provided:

"B, because it has no open areas", accompanied by a sketch that may be interpreted to show the action of removing bits of the circle figure. (Score 1)

"B, Because it has the largest surface area." (Score 0)

"The Circle. It's pretty obvious." (Score 0)

Question 6: Areas

Describe a method for estimating the area of figure C.

Scoring guideline for Question 6

Score 1: Answers which indicate any reasonable method, such as:

"Draw a grid of squares over the shape and count the squares that are more than half filled by the shape."

"Cut the arms off the shape and rearrange the pieces so that they fill a square then measure the side of the square."

"Build a 3D model based on the shape and fill it with water. Measure the amount of water used and the depth of the water in the model. Derive the area from the information."

Score 0: Other incorrect or incomplete answers. For example:

"The student suggests to find the area of the circle and subtract the area of the cut out pieces. However, the student does not mention about HOW to find out the area of the cut out pieces."
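The first Score-1 method above --- overlay a grid of squares and count those more than half filled --- is easy to make precise. As an illustration only, here is a minimal Python sketch of that idea; it assumes the shape is given as a membership test (a function saying whether a point lies inside), and it uses a disk as a stand-in shape, since figure C itself is of course not available in coordinates.

```python
def grid_area(inside, xmin, xmax, ymin, ymax, n=200):
    """Estimate the area of a shape by the 'grid of squares' method:
    overlay an n-by-n grid on the bounding box and count the cells
    whose center lies inside the shape (a proxy for the guideline's
    'more than half filled'), then multiply by the cell area."""
    dx = (xmax - xmin) / n
    dy = (ymax - ymin) / n
    count = 0
    for i in range(n):
        for j in range(n):
            # Test the center of grid cell (i, j).
            x = xmin + (i + 0.5) * dx
            y = ymin + (j + 0.5) * dy
            if inside(x, y):
                count += 1
    return count * dx * dy

# Stand-in shape: the unit disk (area pi), in place of figure C.
unit_disk = lambda x, y: x * x + y * y <= 1.0
est = grid_area(unit_disk, -1.0, 1.0, -1.0, 1.0)
```

The estimate converges to the true area as the grid is refined; for the unit disk it approaches pi. For a jagged shape like figure C the error is governed by the length of the boundary, which is precisely why the method is a sensible answer to the question.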

PISA was developed, as noted above, by an international team of people prominent in educational research and reform. The test was taken by 265,000 students from 32 countries. The Dutch Freudenthal Institute played a major part in developing the mathematical literacy component of the test, and its Director, Professor Jan de Lange, was head of the mathematics experts group, which also included Professors Thomas Romberg of Madison, WI, and Mogens Niss of Roskilde, Denmark. PISA is not some random researcher's free-wheeling exploration of a new mode of assessment.

The quoted pair of questions illustrates well, I think, some of the degeneracies of present mathematics education research and of the mathematics education reform trends of the past 10-20 years. Of course we read these questions in the context of the whole PISA assessment.

We observe first of all that this pair of questions requires very little mathematical training, and seems of questionable value for the aim of PISA, which is to compare the outcomes of different educational systems. But I will not dwell on that issue.

Let's study what the scoring guideline for question 5 says about the developers' philosophy towards education. I think that a well-educated 15-year-old should have a hard time on that question. It is obvious that figure B has the largest area, but how is our pupil to explain that answer?

This hypothetical pupil can't very well say that figure B is largest because it is round whereas A and C have indents --- that is not a valid reason. Our pupil also can't be happy to say (as I almost did in describing the drawing) that "the three figures have about the same overall size but A and C have indents", because "size" is an ambiguous concept here. Our pupil will not be happy at all to say that A and C would each fit inside B, because just by eye it isn't really clear that A and C strictly fit inside B --- in fact, they probably don't. It looks as if each might stick out a bit, though only a tiny bit, while much larger pieces are removed by the indentations. Our pupil will struggle to find the right precise language, just as I struggled to describe the problem to you without a drawing, and our pupil might just despair and say that "B" is obviously largest.

On the other hand, a pupil, even a good pupil, who has been educated in the world of fuzzy reform mathematics as espoused by the members of the PISA mathematics expert group has it easy. This pupil has seen these kinds of linguistically challenging questions before and knows that the fuzzy school does not care for precision. This pupil just answers: "B, because it is a full circle and the others are like circles with bits taken out", and gets full credit. We may guess, based on the scoring guidelines, that even the answer "B, because A and C have gaps" can get full credit.

These sample questions and their scoring guidelines also indicate something unpleasant about the developers' philosophy towards research and assessment.

The questions clearly have an enormous range of possible answers, much wider than is typical for a constructed response question on a mathematics exam, impossible to capture in any clear set of scoring guidelines, and certainly nowhere near captured in the given guidelines. This pair of questions shares that property with several others in the published sample. I would think that such wide open questions are completely inappropriate in a large-scale mathematics assessment, and I find them beyond inappropriate when the assessment covers 32 countries and 20+ different languages.

The developers of PISA will claim, no doubt, that everything has been done to ensure that the scoring would take place in a uniform manner across all participating countries, but in view of the nature of the published sample questions such a claim must appear disingenuous. At least in the mathematical literacy component PISA appears to have been designed so that no matter what controls are imposed later, the staff in the individual participating countries would have a certain amount of leeway for their scoring procedures --- a leeway that can be used "subconsciously" and without leaving a trace of impropriety. I think it plausible that the country rankings reflect to some extent the level of satisfaction of the local PISA staff with their country's recent educational policies.

For the recent German inter-State comparison this last point should not be an issue. One may trust that the scoring was done centrally with properly randomized assignment of papers to scorers. The issue remains what the outcome of this test might imply for educational policy.

Moving away from the pair of sample questions discussed above and looking at the entire set of sample questions on the PISA site, one is still struck by the low level of mathematical knowledge and formal schooling that is called for in the test. The German concern over their pupils' performance is appropriate, and contrasts favourably with the Dutch misguided satisfaction over their own results. But I find it remarkable, and not at all promising for educational policy, that countries are placing such emphasis on this particular test. For a glimpse at the kind of test for 15-year-olds that could provide a good guide for educational policy in mathematics, have a look at the 9th grade placement test for the Singapore New Elementary Mathematics textbook series, at http://singaporemath.com/placement.htm.

Bas Braams

--

Bastiaan J. Braams
- braams@math.nyu.edu

Courant Institute, New York University