Friday, 8 December 2017

Some thoughts on the improvement in 'Reading' in the international PIRLS tests

1. The PIRLS 'Reading' test was a test in retrieval, inference and interpretation. In England, we usually lump all this together and call it 'comprehension'. By making this the 'Reading' test, PIRLS acknowledged that the word 'reading' means 'reading with understanding'. There is a debate to be had as to whether that particular test (or any other test) genuinely finds out whether children are or are not understanding what they're reading, but let's leave that for another day.

2. The test was taken in 2016 by children who were 9 or 10 years old (what we call Year 5 in England). They had received several months - probably at least a year - of instruction in systematic, synthetic phonics (SSP). This is a method of teaching reading through the 'alphabetic code'. That is, it isolates the sounds English speakers make when speaking; isolates the letters and combinations of letters English speakers use to make words; matches the sounds to the letters according to orthodox spellings; and uses a variety of strategies to show how letters 'blend' to make words. However, due to the irregularity of English spelling, some words are taught according to the principle of 'look-and-say'. Some schemes call these 'tricky' words, others call them 'red' words. In other words, this teaching method is not 100% 'phonics'. A small part of it is 'look-and-say'. The process by which we do phonic reading is what most people call 'decoding'.

At the end of Year 1, when most children are 6 years old (though some will still be 5), the children do a 'phonics screening check', in which they say out loud words on a list. Most of these are real words; some are made-up words. The argument for doing this is that children are showing that they have grasped the 'alphabetic code' and are not 'guessing' parts or the whole of words according to, say, what the whole word looks like or what letters it starts with. The reason why it's a list, and not a series of sentences or a story or any piece of continuous writing, is that this would bring 'meaning' into the reading process, and children might then be, they say, 'guessing' the next words or phrases.

3. Nick Gibb, the Schools Minister for England, has claimed that there is one reason for the improved result shown by English children in the PIRLS test: the stricter demand that all schools teach one particular kind of phonics (there are several phonics methods).

4. For Nick Gibb to claim this, it has to be shown that this one change - the stricter demand - caused the improvement in the result. I'm not sure that Nick Gibb has understood the rules of 'cause and effect' in scientific experiment. The rules involve such things as:
a) whatever is designated as a single cause must be the only thing to change in the process leading to the effect. In science, if we say that a candle heated some water (cause and effect), we have to be sure that the water wasn't near a radiator which came on while we applied the candle. What we do is 'hold all the variables constant, while varying the one factor we are testing'.

The time lapse between these children doing SSPhonics and taking the PIRLS test was about five years. Any scientist would therefore ask: were the variables held constant? Indeed, any teacher might ask of the curriculum and their 'intervention': did we do anything different between teaching the children SSPhonics and Year 5? (Different, that is, from the previous five-year period.)

To my mind, several things happened in that time. For example, Nick Gibb and others are very keen to say that 'schools improved'. They draw on data taken from Ofsted to say that schools are getting better and that more and more schools are 'outstanding'. Clearly, this judgement did not just involve 'standards of reading'; these were school-wide judgements. Can Nick Gibb or anyone else show that this general improvement in schools was NOT a contributory factor in the PIRLS sample of children improving their Reading scores?

Another factor I saw happening was that alongside the introduction of SSPhonics there was a request (demand?) that schools provide a rich diet of rhymes, stories and poems. Purely anecdotally, I saw that in action several times (indeed, my school visits are seen by some schools as fulfilling that specific requirement). I also had a strange discussion with a passionate exponent of SSPhonics (a headteacher) who explained to me that he had brought in an hour-long session every morning with Reception, Year 1 and Year 2, at which the teachers did 'rhymes, stories and poems' with the children. He also told me that this 'had nothing to do with reading'. I said that I thought it did, because it enabled the children to 'read' as opposed to 'decode'. It enabled the children to read with understanding, and not just say out loud words on a list.

Unmentioned in the debate around these results is any account of whether schools have or have not instituted such 'rigorous' (!) hour-long sessions at which children are free to listen to and interpret rhymes, stories and poems.

This brings us to a problem I predicted would happen the moment the phonics screening check was brought in. I called it 'phonics creep': the process by which people like Nick Gibb describe the reading of 6-year-olds as being 'fluent' or 'improving'. The only way he can make such claims is by using the improving results of the phonics screening check. I repeat: this check is not a test of 'reading'. It's a test of 'decoding'. It is designed precisely and particularly (and quite cleverly) to eliminate 'meaning' (ie 'comprehension'). If someone like Nick Gibb describes children's performance on this test as 'reading', he is either ignorant or deceitful. As we all know, it is quite possible to 'decode' without being able to understand what one is reading.

This leads us to another problem in the matter of 'cause and effect'. Having made a claim that 'a' caused 'b', it's generally incumbent on the person making the claim to explain how or why. We will have to wait, I guess, for Nick Gibb to explain this. He has one problem. SSPhonics is a method of teaching the alphabetic code - how letters correspond to sounds (how 'graphemes' correspond to 'phonemes'). Of itself, it cannot teach children how to comprehend or interpret. That has to come from other processes. So, even if Nick Gibb can explain how SSPhonics instruction helped the children decode the words in the PIRLS test, he still has to show us what enabled them to interpret a) the questions themselves (that is, the wording of the questions) and b) the passages that the children were asked about.

Given that both these processes (decoding and interpreting) were being tested at the same time, it's quite possible that in the five years between doing SSPhonics and taking this test, the children did some important things that helped them interpret better.

What might they be?

I've already mentioned one: the possible improvement in the provision of rhymes, stories and poems with Reception, Year 1 and Year 2. The removal of the National Literacy Strategy in 2009 may also have helped teachers to dump the over-use of 'extracts' and return to whole-book teaching. Significant? Possibly.

Here's another: everyone knows the 'familiarity with the test' effect. As teachers and pupils become familiar with a test, it's been observed that scores rise. It's not hard to see why. Teachers see patterns in the wording and methods of the tests and pass these on to the children. They may well coach the children in this, particularly the low-performing children, who often (in all our experience) find the wording of questions in test conditions difficult. I was in a school recently where the headteacher told me that her KS2 Reading SATs scores were just off 'special measures' levels (for those outside England, that means so low that the school was in danger of being placed under new management). She implemented a draconian teach-to-the-test regime, with regular 'mock' testing using past papers, which the teachers got the children to do under the same conditions as the SATs exam itself. She told me she raised the children's scores to 90% at the top level.

Why not? As we all know, doing tests is not some kind of 'pure' assessment of ability or aptitude, but a matter - to some degree - of 'getting' what it is that examiners are asking and knowing the right formulas for answering. Whether this is 'education' is another matter. Whether this method imparts the right or the most useful knowledge and skills is another matter. Whether this really assesses children's ability to do all sorts of other socially useful and desirable things is another matter. And indeed, whether this obsession with this kind of testing squeezes out of the curriculum a raft of useful and desirable activities is another matter - for the time being!

In the meantime, we know that plenty of schools are coaching the children to do their Key Stage 2 SATs by teaching them how the questions work. 

I ask, therefore: why wouldn't an improvement in this PIRLS test be at least partly down to the incredibly hard work that teachers do coaching children in how to do such tests?

5. The other crucial aspect of 'cause and effect' that Nick Gibb doesn't seem to have taken on board is again a common issue for scientists. Before diving in to say 'a' causes 'b', they check to see whether comparable results are, or can be, caused by another factor - as, say, might have occurred in another experiment - or (common in medicine) through the 'placebo effect', where some patients are given a 'blank' pill while other patients are given the drug. If the improvement caused by the drug is not significantly better than the improvement caused by the 'blank' pill, it's not the drug that's causing the improvement.

Analogous things happened around the PIRLS test which will have to be teased out. It's becoming clear that improvements in scores occurred in some other countries which used different methods of teaching reading from those used in England. They may well have incorporated some kind of phonics (hurrah for that, say I) but may well not have used SSP.

So Nick Gibb has a problem there too. He'll have to explain (I'm sure he will) how Ireland, Northern Ireland and some other countries improved their PIRLS scores without doing exactly the same intervention that he is claiming 'caused' the improvement in English children's scores.

Or, indeed, how in a previous era the National Literacy Strategy appeared to have caused an improvement in scores.

6. I guess much of this will play out over the next few months, with full statements from, say, NATE, UKLA and others when they've had a chance to check the details. However, Nick Gibb jumped in and overruled an earlier statement from the DfE which warned against being too 'hasty' in saying that it was the introduction of SSPhonics which 'caused' the improved scores.

7. As a PS: I would have hoped that the media could have 'got it' that moving up or down a table doesn't of itself show that performance has improved! Arsenal finished 5th last season. If they finish 4th this season, they may have improved. They may not have improved. One cause of the change in place might be that Spurs play worse this season than last.

As it happens, on this occasion the claim is being made that English children not only went up the table (ie improved in relation to others) but also that their scores improved in relation to themselves.

I just hope, therefore, that PIRLS can confirm that this obeys another rule of scientific testing: that they are comparing 'like with like'. That is, you can't say something 'improves' unless you're comparing the same kind of test on the same kind of sample. Another detail it would be good to get confirmation on.

8. A further observation - slightly tongue in cheek. In the period covered by this test, several writers have sold many millions of copies of their books: Jacqueline Wilson, Julia Donaldson (with Axel Scheffler in particular), David Walliams, Anthony Horowitz, J.K. Rowling, Nick Sharratt and, of course, Roald Dahl with Quentin Blake.

The sales and borrowings from libraries of these books in England have been staggering, and their readers must have included some (many?) of the children in the lower percentiles of the PIRLS test. (Note: the sales and borrowings are staggering even if, overall, it can be claimed that 'reading for pleasure' has declined.)

Given that the PIRLS test was a comprehension test, and given that comprehension is hugely aided by reading for pleasure, I will make the claim that the reading of these books is also a contributory factor - until such time as someone can prove to me that I'm wrong!