This post is a slight diversion from the core business of my blog, but bear with me because some of the themes will resonate with issues that I have been writing about over the past couple of years.
Recently, a local school, Durham Free School, made national headlines after an excoriating Ofsted report. The Ofsted inspection was one of a number of inspections called at short notice on faith schools in the region, and there seemed to have been a particular focus on determining whether or not the school taught “British values”. The inspectors commented that the school was “…failing to prepare students for life in modern Britain. Some students hold discriminatory views of other people who have different faiths, values or beliefs from themselves”. If true, this would be damning but, as my job for the past 20 years has involved developing objective measures to determine the success of policy (albeit for the environment rather than education), I was curious to see just how Ofsted inspectors arrived at this conclusion.
The report itself is not very illuminating on methods: the inspectors “…spoke to students in lessons, at break and during lunchtimes. They also spoke formally to two groups of students on the first day of the inspection.” What, in particular, I wondered, did they ask before arriving at their conclusion about these “discriminatory values”? I’ve read the whole report, I’ve searched the Ofsted website and I’ve looked at Ofsted’s publication Inspecting Schools: A Handbook for Inspectors. The Handbook explains that inspectors should consider how well management and leadership ensure that the curriculum “… promotes tolerance and respect for people of all faiths, genders, ages, disability and sexual orientation…” but there is nothing that explains how such an evaluation should be performed. The inspectors, I conclude, simply reported their opinion based on the conversations they had with this small sample of pupils.
Let’s look at this process from a statistical perspective: the inspectors’ opinion is, in effect, a test of the hypothesis that “students hold discriminatory views”, which could be re-cast as a null hypothesis: “students do not hold discriminatory views”. The inspectors reach their opinion via the conversations mentioned above (no mention of whether there was a set form of questions, whether students were interviewed as a group or individually, or whether closed or open questions were used). The outcome is stated in the report in absolute terms but, in reality, it is a probabilistic judgement based on the outcomes of the interviews. And, because they only interviewed a sample of students, there will be uncertainties associated with this outcome. The inspectors might have reached the wrong conclusion. In statistical terms, this means that they rejected the null hypothesis based on their sample when, in fact, most of the students hold non-discriminatory views – a “Type 1 error”.
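To make the Type 1 risk concrete, here is a back-of-the-envelope sketch. All the numbers are my own assumptions for illustration – the report gives no figures for how many students were spoken to, nor for how reliably an interview answer can be coded. Suppose the null hypothesis is true (no student actually holds discriminatory views), but each conversation carries some small chance of an answer being misread as discriminatory:

```python
# Hypothetical illustration of a Type 1 error (rejecting a true null hypothesis).
# All numbers are assumptions for illustration, not figures from the Ofsted report.

n_interviews = 20    # assumed number of students the inspectors spoke to
p_misread = 0.10     # assumed chance that any one answer is wrongly coded
                     # as evidence of discriminatory views

# Probability that at least one of the n interviews is misread, which would
# lead the inspectors to reject the null hypothesis in error.
p_type1 = 1 - (1 - p_misread) ** n_interviews
print(f"P(at least one false 'discriminatory' answer) = {p_type1:.3f}")
# → roughly 0.878
```

Under these (assumed) conditions, the inspectors would have a very high chance of hearing at least one answer that sounds discriminatory even in a school where no student holds such views – which is why the absence of any stated methodology matters.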
That the inspectors concluded that “some students” held these views is, perhaps, worrying in itself. We could argue, in support of the inspectors, that there should be zero tolerance of discrimination of any kind. Yet this then raises the question of whether the limited sampling programme deployed by the inspectors is sufficiently sensitive to detect discrimination in every school where it occurs (i.e. to retain the null hypothesis when it should have been rejected – a “Type 2 error”). On the other hand, if Ofsted published detailed guidelines on how such evaluations were to be performed (guidance on sample size, types of questions and so on), and the inspectors at Durham Free School had given more details of the sample size on which they based their judgements, then perhaps we would be in a better position to evaluate the credibility of their judgements. The reality, I suspect, is that the risk of a wrong outcome will be high because the sample size was small. The two inspectors had just two days to evaluate all aspects of the teaching and governance of the school. Some topics that deserved detailed scrutiny were, inevitably, evaluated in a superficial manner as a result.
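The Type 2 risk can be sketched in the same hypothetical way. Again, the numbers are my assumptions, not anything in the report: suppose a small fraction of students genuinely do hold discriminatory views, and ask how likely a small interview sample is to miss them entirely:

```python
# Hypothetical illustration of a Type 2 error (failing to reject a false null).
# All numbers are assumptions for illustration, not figures from the Ofsted report.

n_interviews = 20   # assumed number of students interviewed
p_true = 0.05       # assumed true proportion of students holding such views

# Probability that the sample happens to contain no such student, so the
# null hypothesis ("students do not hold discriminatory views") is
# wrongly retained.
p_type2 = (1 - p_true) ** n_interviews
print(f"P(sample contains no such student) = {p_type2:.3f}")
# → roughly 0.358
```

In other words, even with these generous assumptions, an inspection of this size would miss real discrimination in roughly a third of schools where it occurs – the sensitivity problem that a zero-tolerance policy quietly depends on.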
The core of the problem was summarised neatly in an editorial in The Independent today, which places the blame squarely on Michael Gove, the previous Education Secretary, for putting “British values” onto the list of criteria that Ofsted were required to inspect. The problem, The Independent comments, “is that no one can say exactly what it means, which gives inspectors enormous leeway to decide whether or not a school is teaching the said value correctly.” Political ideology, at some point, has to be translated into practical action, and the success or otherwise of policy depends on being able to make judgements consistently across the entire country. If Ofsted are unable to convert Michael Gove’s rhetoric into transparent and fair measures, then they should resist being drawn into an arena where objectivity comes second to political grandstanding.