The antidote to cranial rectal insertion in the academic world

Let’s use IQ tests in REF

There are 10 types of people. Those who understand binary and those that don’t.

It’s an old one, but it’s worth retelling. 🙂

There are also two types of academic: those who like metrics and those who don’t. The former group thinks academic metrics are important quality indicators that allow objective comparison between individuals and groups. This group also likes to emphasise the importance of “excellence”, yet is rarely able to define it when asked. When they are able to define it, it is usually with reference back to the metrics. They are not afraid of “impact” in REF, because they simply see it as another metric.

The second group reject all academic-related metrics as a matter of principle, and consider quality to be something that is beyond definition and certainly un-quantifiable. They positively object to the use of platitudinal statements such as “excellence” or “world leading”. They typically reject the impact agenda wholeheartedly, perhaps for fear that it exposes their lack of capacity to explain their own value in straightforward terms (they do have value, just a lack of ability, or willingness, to say what it is).

Being conflict-averse, I see merit in both sides[1].

I believe that metrics are valuable indicators when well-chosen and properly understood – they can monitor progress, highlight problems and support decision making. But when badly-chosen and used unwisely, they can become dangerous distractions and tools of managerial torture.

Metrics are a hot topic for discussion in UK university coffee rooms just now, because of the impending REF (Research Excellence Framework), but they are also a day-to-day factor in academic management. Journal impact factors, grant success rates and income attribution are all examples of common metrics that are, from my experience at least, generally poorly understood and badly used.

And then there is the h-index, which is especially interesting as it is to be considered in some REF panels. The h-index seems like such a good idea. A simple, single metric that captures what would otherwise be something rather complex. But that’s exactly the problem. It’s merely an index that captures a particular trend in citation rates. It doesn’t capture quality, nor does it capture creativity, or inventiveness, or independent thinking – all qualities I think most academics would consider important. A simple h-index doesn’t even normalise for number of authors[2].
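To make concrete just how simple the h-index really is, here is a minimal sketch of its calculation (the citation counts are made-up example data, not drawn from any real author). The h-index is the largest h such that an author has h papers with at least h citations each — nothing about quality, creativity or author contribution enters the calculation at all.

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts:
    the largest h such that h papers each have at least h citations."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank  # this paper still "supports" an h of this size
        else:
            break  # counts are sorted, so no later paper can either
    return h

papers = [25, 8, 5, 3, 3, 1, 0]  # hypothetical citation counts
print(h_index(papers))  # 3: three papers each have at least 3 citations
```

Note that the highly-cited first paper (25 citations) contributes no more to the index than the third (5 citations) — the index tracks only the shape of the citation distribution, which is precisely the limitation described above.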

So, for the benefit of the pro-metric academics, and to emphasise my point, let me propose an additional metric to be used in REF, staff appointments and promotions (the first always influences the other two, after all). Why not use an index that quantifies intelligence, is well correlated with academic achievement and has a substantially longer evaluation history than the h-index? It’s called Intelligence Quotient, or IQ.

Yes, that’s what I propose: simply use IQ scores of academic staff as a means of distributing national Higher Education resources. The funding then goes to the HE Institutions with the cleverest staff. Fair, transparent, and appropriate for institutes of learning, don’t you think? We could also use it as a metric for appointing and promoting staff.

You may think the idea absurd, but how much more absurd is it than using h-indices (or impact factors, or grant success rates, etc, etc)? IQ scores are not correlated with creativity or problem-solving skills, but neither is the h-index[3]. And IQ tests don’t test for emotional intelligence – but when has that ever been an issue in academia anyway? Universities are meant to be institutes of academic cleverness, and IQ tests do test for that.

Metrics are useful tools. I wouldn’t drive without a dashboard. I wouldn’t use a cash machine without being able to check my balance. I calculate my average grant success rate because it helps me plan ahead. I regularly check to see how many visitors I have to this blog.

The problem with metrics arises when people attach value to the metrics in themselves. When we mistake the indicators for a direct measure of an elusive property, rather than treating them as heuristics to help us monitor our own progress, we risk losing sight of the important things – the things we are really trying to achieve.

[1] You will call me “diplomatic” or “a fence sitter” depending on whether you share my aversion to conflict or not.

[2] I saw a paper once with so many authors that the average contribution of each was about 12 words. I’m sure they were all vital contributors, of course.

[3] To be fair, I’ve not seen any studies on the correlation, so it is more correct to say that there is “no evidence that h-index is correlated with creativity”.


8 comments on “Let’s use IQ tests in REF”

  1. Philip Moriarty
    September 9, 2014

    Great post, Iain – it certainly raised a chuckle!

    But IQ tests don’t test for academic cleverness. They test for the ability to do IQ tests. Big difference.

    And metrics are not useful tools in an academic (or, more broadly, education) context. They’re worse than useless because they distort behaviour – see Goodhart’s law.

    For example, Ofsted’s abuse of statistics in generating “metrics” for schools is highly irresponsible.

    • fortiain
      September 11, 2014

      Indeed, I hope it is clear that I am not really proposing IQ tests!
      To address your points, though….
      Any test is only a measure of the ability to do that test, isn’t it? IQ is certainly controversial, but within a westernised academic environment it is a good predictor of academic success for students (possibly because academic assessment tests for similar kinds of things as standard IQ tests do).

      It is strange that you say metrics are not useful tools in an education context. Don’t you assign a grade to your students’ work? That’s a metric. Don’t you give them a final degree classification? That’s another metric. I totally agree (as I say in the blog post) that we should never believe completely in these metrics, because any test, any exam, or any assignment merely tests for the ability to do those things. The important thing is that you measure what you value, not value what you measure (someone other than me came up with that one). Metrics are valuable tools, and should be used as with any tool… with great care.

      Goodhart’s “law” (I’m a physicist, so I immediately cringe at the lazy use of the word “law”) is about setting targets based on metrics. I should write another blog post on my dislike of targets. But the idea that metrics modify behaviour… that is kind of the point, isn’t it?

      So, I stand by my blog post — “that metrics are valuable indicators when well-chosen and properly understood – they can monitor progress, highlight problems and support decision making. But when badly-chosen and used unwisely, they can become dangerous distractions and tools of managerial torture.”

      It is not the metrics that are at fault.

  2. Philip Moriarty
    September 11, 2014

    No, of course I didn’t think you really meant that IQ tests should be used!!

    But I still don’t agree. The problem *is* with the (ab)use of metrics. What happens to those metrics? Well, generally they’re used to generate league tables. Then a ranking in a league table is seen as *the* measure of quality of an institute, department, school, faculty, university, etc. (And universities will cherry-pick to find the “best” league table for them. That particular year.)

    **All of the context, which we both agree is essential in order to interpret the rankings (and the metrics on which they’re based) correctly, goes out the window.**
    (Look at the statistical abuse to which Ofsted subjects our schools.)

    All of the sophisticated contextual information is lost and we end up with a single number. This is plugged mindlessly into a league table.

    I sat through an exceptionally worrying talk last week where it was argued that average NSS scores, REF rankings, and A-level tariffs could be brought together into one “uber”-metric in order to rank universities.

    Are you really suggesting that this is a good thing? That this type of “uber”-metric is a worthwhile and quantitatively valid way of ranking universities?

    ” But the idea that metrics modify behaviour… that is kind of the point, isn’t it?”

    No, that’s not the point Goodhart was making at all. (I’m a physicist as well, but I don’t have a problem at all with someone using the term “law” outside of physics. Most people are capable of grasping the importance of context. For example, when I say I went to work this morning, I don’t mean that I went to the integral of the dot product of the force acting upon me and my displacement vector. 😉 )

    The point Goodhart was raising is simply that when a metric becomes a target (as it invariably will), it’s no longer a good measure.


    P.S. A famous physicist once said (perhaps apocryphally) that not everything that counts can be counted, and that not everything that can be counted, counts.

  3. Philip Moriarty
    September 11, 2014

    Oh, and this: “Any test is only a measure of an ability to do a test, isn’t it? ”

    I’ll say it again. *Context*. The point is that IQ is lazily taken to represent a measure of someone’s intelligence. So all of the context is stripped away and idiotic conclusions like “I’ve got an IQ of 130 so therefore I’m smarter than you, with your measly 125” can be reached. (…and let’s not start dissecting the statistical significance of IQ measures. What’s the effective error bar on those scores?)

  4. Philip Moriarty
    September 11, 2014

    I’m entirely with you on the inherent stochasticity in the grant selection process, however! See the “simple experiment” I suggest to the erstwhile head of EPSRC.

  5. Philip Moriarty
    September 11, 2014

    One last thing, before I *really* out-stay my welcome, and while I munch on my lunchtime sandwich…

    As regards academic cleverness, exams, and intelligence – some of the very best PhD students I have supervised were not at the top of their undergrad class by quite some stretch. Indeed, a couple of them scraped in at the 2.1/2.2 borderline. (And we could have a separate discussion on whether reducing three or four years’ work down to a 1st/2.1/2.2 etc. classification is an ideal situation….)

    I may be a dyed-in-the-wool physicist, with all the reductionist tendencies that entails, but even physicists accept that intelligence, like so many other things in life (fortunately), can’t be quantified on the basis of a single number.

    • fortiain
      September 11, 2014

      I reckon we are broadly saying the same thing, but from different starting points.
      To explain what I mean, I could simply turn the quotation around: “There are some things that count that can be counted, and some things that don’t count, even though they can be counted.” As a general rule, in my experience, metrics tend to be used poorly when they are used to compare individuals/groups/institutions (cf interspecific competition in ecology), and are better used when applied to an individual compared with themselves over time (cf intraspecific). The Wii is a fine example of this: it emphasises the progress of the individual, rather than comparing individuals (which is why, IMHO, it is great for families with mixed ages and abilities).

      And all that other stuff you said, I agree with. 🙂
      Unfortunately, when it comes to many things such as the allocation of resources and REF, the alternative is “judgement”, and that is also flawed, so it’s a case of choosing which flawed system you like best.



This entry was posted on December 11, 2012 in Scholarship.