The Stern review

Last week saw the publication of the Stern review into research assessments in the UK.

The only report I saw in the media was on the BBC, here. This suggested that the report said that universities should put more effort into impact. So when my colleague James Mitchell kindly circulated a copy, I approached it in a somewhat prejudiced frame of mind.

I turned first to the appendix, which gave a potted history of the REF and its predecessors, the RSE and RAE, about which I know a little bit. I found some depressing distortions. Here are some, with my glosses.

  • “By the time of RAE 2001, the exercise had become the principal means of assurance of the quality of research.” (p.41) Some context missing here? I do not know anyone other than the funding councils and newspaper league tables who used it in this way.
  • “It was originally intended that the appropriate weighting for impact in the REF should be 25%, but this was discounted in the first REF exercise to 20%, as an acknowledgment of the views of stakeholders that this was a developmental assessment process.” (p.43) Actually, academics roundly criticised and rejected impact as part of the assessment, and as a sop to this very strong feeling the weighting of impact was slightly reduced.
  • “These changes meant that REF2014 took thorough account of the ‘environment’ for both research and impact.” How? “Environment: data on research doctoral degrees awarded, the amounts and sources of external research income and research income-in-kind.” (p.44) In other words, no concern for the actual environment, but the entry of metrics by stealth. More on this later.
  • “Specific changes were introduced [for 2014] that were intended to reduce the burden of REF. However, these were not entirely successful. The costs involved in undertaking the REF, both for institutions and for HEFCE/HE funding bodies, were estimated at £246m for UK HE sector, considerably more than estimates for the 2008 framework which cost around £66 million.” How are these costs measured? In 1996 HEFCE announced proudly that the cost was only £3 million; but they took account neither of the costs to the universities of preparing the submissions nor the time spent by panel members reading the papers. (And a cheap shot: £246m would fund a very great amount of research, half a dozen projects in each university.)

But the rest of the report had some sense in it. I will restrict my comments here to the conclusions. Of course you should not assume that reading what I say is any substitute for reading the real thing. The document is here, and I quote parts of it verbatim.

The review is intended to deal with

  • problems of cost, demotivation, and stress associated with the selectivity of staff submitted;
  • strengthening the focus on the contributions of Units of Assessment and universities as a whole, thus fostering greater cohesiveness and collaboration and allowing greater emphasis on a body of work from a unit or institution rather than narrowly on individuals;
  • widening and deepening the notion of impact to include influence on public engagement, culture and teaching as well as policy and applications more generally;
  • reducing the overall cost of the work involved in assessment, costs that fall in large measure on universities and research institutions;
  • supporting excellence wherever it is found;
  • tackling the under-representation of interdisciplinary research in the REF;
  • providing for a wider and more productive use of the data and insights from the assessment exercise for both the institutions and the UK as a whole.

Some laudable aims here, some perhaps less so; some supported better than others by the content of the document.

I won’t discuss all the recommendations in detail. Specific proposals involve including all research active staff and submitting outputs “at the level of UoA”. It is not entirely clear what this means. How is the problem handled that a joint paper with authors at different institutions can be submitted by both, but a joint paper with authors at the same institution cannot be submitted twice? A quick scan through the document gave no answer to this question.

It is also recommended that outputs should not be “portable”, in an effort to stem the “transfer market” for top performers leading up to the REF census date. The authors admit that this will discourage researcher mobility, but offer no concrete remedy.

It is recommended that impact case studies should be submitted at institutional level. It is very difficult to see the rationale for this. Impact often derives from a collaboration between individual researchers at different institutions. (Another recommendation, allowing impact case studies to depend on a body of work and a broad range of outputs, is much more sensible.)

It is also suggested that environment statements should be moved to institutional level. As I already noted, these are mostly based on metrics which do not necessarily describe the research environment of departments, so this just moves from one kind of nonsense to another. In fact it is worse than that. These figures can be changed significantly at the stroke of an administrator’s pen. See Ken Brown’s report on how PhD awards to almost all mathematics departments have been savagely reduced because of a small change in EPSRC policy. Is it sensible for REF inputs to be tied to this?

Now some more general comments.

The review recognises that the REF in its present form tends to push research in safe directions which will guarantee the production of four good papers in the prescribed period; people eschew the risky but potentially more important topics, especially at the start of their careers when they should be most adventurous. But I found no recommendations for dealing with this problem.

The review found that the REF discourages interdisciplinary research, not because it is judged unfairly by the panels, but because researchers perceive that it might be. This is a trifle arrogant: researchers are not stupid, and their perceptions are based on experience. In this context, an issue that concerns me (and many of the people who will read this) is the production of software; this is interdisciplinary in the sense that software production is computer science, yet software is an essential component of research in almost every academic discipline. My own perception is that the REF and its predecessors have never dealt fairly with outputs in the form of software.

On impact, the report says “Studies have demonstrated how the new impact element of the REF has contributed to an evolving culture of wider engagement, thereby enhancing delivery of the benefits arising from research, as captured through the impact case studies.” I am profoundly unconvinced. The words “headless” and “chickens” spring to mind when thinking about the reaction of colleagues to the production of impact case studies. This is mainly due to the extremely narrow definition of impact which is adopted, and the extremely strict rules applying to claiming it. Someone who introduces a new design for clinical trials which will make them yield more information from a given investment with less risk to participants cannot claim an impact case, because the pharmaceutical companies (who would have to certify that they have changed their practice to use the new design) are extremely reluctant to admit that anything was less than perfect with their old procedures. But at least the review acknowledges the problem of the narrow definition of impact, without actually proposing a better one.

And here is a scary statement: “Many, but not all, universities state that they use the REF intensively to manage research performance.” Allowing managers to direct our research is surely a recipe for disaster; can’t they see that?

Overall, my view is that a research assessment involving trusted academics judging papers in their fields can be a very good thing. It should give us the opportunity to showcase our best research and its impact. But the absurdly restrictive rules for impact case studies make it instead seem punitive.

The overall message of the Stern review is that the REF is here for the long haul, and impact within it likewise; but there is some recognition of flaws in the present system, and some chance of making changes if we all push hard.

What happens next? Stern proposes that the vague principles enunciated in his review should be turned into concrete and specific proposals, which can be put out for consultation, by the end of the year. Can they really do it so fast, without making a terrible botch of the whole thing? Wait and see …

About Peter Cameron

I count all the things that need to be counted.
This entry was posted in maybe politics.
