In an earlier post I criticized a qualitative research paper on the grounds that it uses unnecessarily convoluted language, that some of its conclusions are blindingly obvious, that the sample on which the research is based is vague, that there is no real information about whether the conclusions always apply or only sometimes (and if so how often), and that there is no satisfactory audit trail linking the conclusions to the data on which they are supposedly based. Indeed, the conclusions are so vague that it is difficult to see what they are. This post concerns a statistical research paper, and many of my conclusions are very similar, but with a few important differences. I think both papers convey almost nothing of interest to any reasonable person.
I wanted to choose a typical quantitative management research paper, look at it in detail, and decide what was good and bad about it and how it could be improved. The paper I chose was Is high employee turnover really harmful? An empirical test using company records, by Glebbeek and Bax, published in the top-ranking Academy of Management Journal. It seemed only moderately complicated in terms of the statistics used, it is clearly written, and the topic is easy for the uninitiated, like me, to appreciate.
The paper is about the relationship between employee turnover and the financial performance of organizations. The authors “tested the hypothesis that employee turnover and firm performance have an inverted U-shaped relationship: overly high or low turnover is harmful” and concluded that the hypothesis seemed to be true “but the inverted U-shape was not observed with certainty”.
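In essence (and, as far as I can tell, this is the approach the authors take), a hypothesis like this is tested by fitting a regression containing both a linear and a squared turnover term, something like:

performance = b0 + b1 × turnover + b2 × turnover² + control variables

An inverted U-shape then corresponds to the coefficient b2 being negative, with the peak falling inside the observed range of turnover.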
The first thing to note is that they used a single case study: 110 branches of just one employment agency in the Netherlands. In effect, a sample of one organization! They do admit that this is a problem, but doesn’t it undermine the whole project? The pattern in web businesses in California, or coffee bars in Portsmouth, would doubtless be very different.
The second, trivial, objection is that the only reason they didn’t observe the inverted U-shape with certainty is that they didn’t look. There is no graph in the article (strange, given that the article is about the shape of a graph): the graph they should have shown (on page 3 of this article) does show an inverted U-shape, although not a very convincing one because of the lack of offices with low levels of staff turnover. This is the shape they observed and, unless they are lying about their data, it was observed with certainty. Their admission of uncertainty is unnecessary because it is wrong!
What they meant to say is that extrapolating the conclusion to the rest of the population cannot be done with certainty. But what is the rest of the population? The employment agency only has 110 branches, and using the data to extrapolate conclusions to coffee bars in Portsmouth in 2012 is obviously silly. What they must mean is similar branches in similar organizations – but this is inevitably a little hazy.
This brings us to the language used in the article. There are no problems with the text, which is written in good, clear English (perhaps because the authors aren’t English). The difficulty is the statistical jargon – see Tables 2 and 3, which summarize the analysis of the data. For example, the lack of certainty in the extrapolation of the results is measured by two “p values”. I think it is impossible to explain these in simple terms, and they also don’t provide a clear answer – hence the rather vague “not observed with certainty”. I have reanalyzed their data with another method (Bootstrapping confidence levels ...) which does yield a clear answer – namely that there is a 65% probability that the hypothesis applies to the wider population. This makes it quite clear just how inconclusive the result is.
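For anyone curious, here is a minimal sketch of the kind of bootstrap calculation I have in mind. The data below is invented purely for illustration (the numbers and variable names are mine, not the paper’s): the idea is simply to resample the 110 branches with replacement, refit a quadratic each time, and count the proportion of resamples in which an inverted U-shape appears.

```python
import numpy as np

# Invented illustrative data: turnover (%) and performance for 110 branches.
rng = np.random.default_rng(0)
turnover = rng.uniform(5, 40, size=110)
performance = -0.02 * (turnover - 20) ** 2 + rng.normal(0, 5, size=110)

def inverted_u(x, y):
    """Fit y = a*x^2 + b*x + c and report whether the fit is an inverted U:
    a negative squared term with the peak inside the observed range of x."""
    a, b, _ = np.polyfit(x, y, 2)
    return a < 0 and x.min() < -b / (2 * a) < x.max()

# Bootstrap: resample the branches with replacement and refit each time.
n_boot = 10_000
hits = 0
for _ in range(n_boot):
    idx = rng.integers(0, len(turnover), size=len(turnover))
    hits += inverted_u(turnover[idx], performance[idx])

print(f"Proportion of resamples showing an inverted U: {hits / n_boot:.2f}")
```

The resulting proportion is directly interpretable – roughly, how confident the data lets us be that the inverted U-shape is real – which is exactly the kind of figure the paper’s p values fail to give.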
The standard approach to analysis adopted by the authors uses jargon and methods which are difficult to decipher, and fails to provide the answers you want. Why don’t statisticians use simpler concepts which answer the questions we actually need answered? I wish I knew!
There
are also difficulties with the idea of hypothesis testing here. Let’s take
another example for illustration: academic statisticians in a university. The
hypothesis says that a very high turnover of statisticians is likely to be
harmful to the university’s business. Research contracts won’t be secured, and
students will be unhappy if the academics leave after only a few months. Let’s
now imagine the opposite scenario where statisticians never leave. The students
get the same lecture that’s been served up for the last 30 years, and the lack
of fresh input means that the same tired ideas get used in research projects –
which again is likely to lead to poor performance. There is almost inevitably an optimum level of turnover which allows new ideas to filter in but gives sufficient continuity to keep the system working well. In short, if we plotted a graph of performance against statistician turnover it would be low for very low and for very high turnover; in other words, it would be an inverted U-shape. But this is almost bound to be true. Research is not needed, any more than research is needed to demonstrate that there is an optimum level of food intake: starvation and obesity are both harmful.
The
hypothesis is too obviously true to be worth researching.
Despite this, it might be worth looking at, say, the optimum level of staff turnover in different industries. This might be 50% (per year) for academic statisticians, to keep new ideas and contacts flowing in, but perhaps 10% for a coffee bar, to help establish rapport with the clientele. Who knows? Research is needed. Quantitative research in management and similar disciplines is often strangely non-numerical, in that the focus is on hypotheses – is this true or false? – instead of on potentially more interesting questions like how much difference does this make?
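Continuing the illustrative sketch above (again with invented data, and reusing the variables defined there), that more interesting “how much?” question could be answered by estimating where the optimum lies, with a bootstrap interval to show how precisely it is pinned down:

```python
# Continues the sketch above: reuses np, rng, n_boot, turnover, performance.
# Instead of a yes/no hypothesis test, estimate where the fitted peak lies.
peaks = []
for _ in range(n_boot):
    idx = rng.integers(0, len(turnover), size=len(turnover))
    a, b, _ = np.polyfit(turnover[idx], performance[idx], 2)
    if a < 0:                       # only resamples with an actual peak
        peaks.append(-b / (2 * a))  # vertex of the fitted quadratic

lo, hi = np.percentile(peaks, [2.5, 97.5])
print(f"Estimated optimum turnover: {np.median(peaks):.1f}% per year")
print(f"95% bootstrap interval: {lo:.1f}% to {hi:.1f}%")
```

An answer of this form – an estimate with a range, in percent per year – is something a manager could actually use, in a way that “the hypothesis was (or was not) supported” is not.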
But … in this instance, would this really be worthwhile? There are many other variables impacting on performance (as the data in the original article show), and it is difficult to imagine a sensible manager saying “we need to get rid of so-and-so because our turnover rates are a bit down”. The detail of the individual cases would be a far more sensible guide than any slight tendency for performance and turnover to be related by an inverted U-shaped pattern. So is any version of this research really of any use, except to statisticians at a loose end?
I am an academic statistician in a university in the sense that a lot of the work that comes my way concerns statistics (although I cannot claim any expertise in the hard core of the discipline), and I’ve been doing it for more than 20 years. This is definitely too long. Glebbeek and Bax’s argument might be rather silly, but their implicit conclusion that every occupation has an optimum timespan, and that I am past my best-before date, is right. After Christmas I am moving on to pastures new, for everyone’s sake.
To return to the comparison with the qualitative research paper, my conclusions are similar in many ways. The statistical jargon is unnecessarily convoluted, the hypothesis is obviously true and so not worth testing, the sample is not vague but it is very limited, and it is difficult to discern any useful conclusions. On the positive side, it is very obvious how the analysis was conducted; it’s just that I don’t think it was worth conducting.
This,
of course, is a sample of just one quantitative research article. Obviously I
can’t confidently generalize to all other quantitative research articles. But I
am fairly sure that very similar points apply to a substantial number of them.
There is a more detailed version of this argument in my article
Making statistical methods in management more
useful ....