There are a lot of research papers out there - some good, some bad, some intelligible, some unintelligible, some disseminated quickly, some slowly. Now, with COVID-19, it is more important than ever that we get to know whether a research paper should be trusted, that as many as possible of the relevant people should be able to understand it as quickly as possible, and that the research methods employed are not unduly restrictive. The phrases in italics refer to key problems with the system for creating and disseminating knowledge. The purpose of this note is to explain how a few publications of mine are relevant to tackling these problems.
Just after deciding to write this note, I received a
"nature briefing" email from the journal Nature summarising some of
their latest highlights. One thing that caught my eye was a paper entitled
"Suppression of COVID-19 outbreak in the municipality of Vo’, Italy"
(E. Lavezzo
et al. Preprint at medRxiv
http://doi.org/ggsmcj). According to the
Nature summary:
"On 21 February, the town of Vo’ reported Italy’s first
COVID-19 death, leading authorities to ban movement in the town and end public
services and commercial activities there for two weeks. Andrea Crisanti at
Imperial College London and his colleagues swabbed almost every resident of Vo’
for viral RNA at the beginning and end of the lockdown. ... The team found that
some 43% of the people infected with SARS-CoV-2 in the town reported no fever
or other symptoms. The researchers observed no statistically significant
difference in potential infectiousness between those who reported symptoms and
those who did not."
The paper in question was posted on
https://www.medrxiv.org/ which describes
itself as a preprint server for health sciences. There is a warning at the top
of each page that the paper is "not certified by peer review". This seems to
imply that once it has been "certified" - by a couple of "peers" chosen by the
editor of the journal - its contents must be right, trustworthy or whatever. This
assumption is implicit in the journalistic practice of giving the name of the
journal a research paper appears in, with the implication "it must be right, it's
published in this prestigious journal". The paper was posted on 18
April 2020. I am writing this two weeks later and there has been no update: surely,
given the urgency of the problem, the paper should have been peer reviewed
already, and a link to the updated version posted so that readers are aware of
it?
In practice, peer review,
and the assumptions behind it, are far too crude. At the frontiers of knowledge
science is an uncertain business, and the idea that the standards required of
new research are sufficiently clear-cut for a certificate, issued by an editor aided
by a couple of anonymous peer reviewers,
to "certify" the validity of the research is at best
wildly unrealistic. In the real world both trustworthy and untrustworthy
research is published, both in peer-reviewed journals and un-peer-reviewed on
preprint servers and other websites. And the timing issue is important - peer
review usually takes months and may take years (my record from submission to
publication is over four years).
A better system is desperately
needed: one which would produce a critique, and suggest improvements, from as many
relevant perspectives as possible, as quickly and transparently as possible. The
idea that peers are the only relevant judges of quality is very strange. Ideally
we would want review by superiors, but as these obviously don't exist at the frontiers
of knowledge, surely the views of "peers" should be supplemented by those of "non-peers"
such as experts in neighbouring or other relevant disciplines (statistics, for example) and
members of the potential audience? The
conventional relaxed timetable for the reviewing process is also odd and
unnecessary. This paper should surely be vetted as quickly as possible so that
people know how much trust they should place in its conclusions. A few years
ago I made some suggestions for tackling these issues - see The journal of everything (Times Higher Education,
2010; there is a similar article
here) and
Journals,
repositories, peer review, non-peer review, and the future of scholarly
communication (arxiv.org, 2013). (Confession: the arxiv.org paper
was not peer-reviewed, but an
earlier,
less interesting, paper was peer reviewed;
the Times Higher Education article was
reviewed by the editors.)
It is also important that research papers should be as
clear and easy to understand as possible, particularly when, as in the case of
the COVID-19 paper, they are relevant to a range of experts with different
specialisms - e.g. public-health specialists, immunologists, supply-chain experts,
ethicists, anthropologists and historians, as well as medics, according to
this
article. I read the abstract to see how much sense it made to me as someone
relatively ignorant of all these areas. There was some biological and
epidemiological jargon, which was probably inevitable given the topic. I did not
know what a "serial interval" is, but a quick check on Wikipedia
solved this problem. There were also some genetic terms and phrases I did not
understand, but their role seemed sufficiently clear from the context. As far
as I could judge - and as a complete novice this is not very far - these
aspects of the abstract were as simple as is consistent with conveying the
meaning.
However, I think the statistical approach could be improved
from the point of view of providing an analysis which is as clear and useful as
possible for as wide an audience as possible. The authors "found no
statistically significant difference in the viral load ... of symptomatic versus
asymptomatic infections (p-values 0.6 and 0.2 for E and RdRp genes,
respectively, Exact Wilcoxon-Mann-Whitney test)." This follows a
conventional statistical approach, but it is one that is deeply flawed. We
aren't told a vital bit of information - how big is the difference? Is it big
enough to matter? The phrase "statistically significant" seems to imply that a
difference matters, but that is not what it actually means. And without a training
in statistics the p-values are meaningless.
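To make this concrete, here is a minimal sketch in Python (using numpy and scipy, and entirely invented viral-load figures rather than the Vo' data, which I haven't got) of the conventional test the authors report, alongside the effect-size information that a p-value on its own leaves out:

```python
# Illustrative only: the numbers below are invented placeholders, not the Vo' data.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
symptomatic = rng.lognormal(mean=5.0, sigma=1.0, size=40)    # hypothetical viral loads
asymptomatic = rng.lognormal(mean=4.8, sigma=1.0, size=30)   # hypothetical viral loads

# The conventional report: a Wilcoxon-Mann-Whitney test and its p-value.
u, p = mannwhitneyu(symptomatic, asymptomatic, alternative="two-sided")
print(f"two-sided p-value: {p:.2f}")

# The vital information the p-value omits: how big is the difference?
print(f"difference in median viral load: {np.median(symptomatic) - np.median(asymptomatic):.1f}")

# A scale-free effect size derived from U: roughly, how much more often a
# symptomatic case has a higher load than an asymptomatic one.
rank_biserial = 2 * u / (len(symptomatic) * len(asymptomatic)) - 1
print(f"rank-biserial effect size: {rank_biserial:.2f}")
```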
In my view a
far better way to analyse and present these results would be in terms of
confidence levels. So we might conclude that for the E gene, if we extrapolate
the results beyond the sample studied, we can be 80% confident that symptomatic
infections have a higher viral load than asymptomatic ones. Or perhaps 60%
confident that there is a big difference (more than a specified threshold). I
made these figures up because I haven't got the data, but the principle of
giving a straightforward confidence level should be clear. Then it should be
obvious that 80% confidence means there is 20% confidence that it is
not true. There are more details of this
approach in
Simple methods for estimating confidence levels, or
tentative probabilities, for hypotheses instead of p values.
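Again purely as an illustration, and with the same sort of invented data, a confidence level of this kind can be estimated directly - here with a simple bootstrap, which is my own choice of method rather than the one set out in the paper linked above - including the "big difference" version with a threshold:

```python
# Illustrative only: invented data and an arbitrary threshold, not the Vo' results.
import numpy as np

rng = np.random.default_rng(1)
symptomatic = rng.lognormal(mean=5.0, sigma=1.0, size=40)    # hypothetical viral loads
asymptomatic = rng.lognormal(mean=4.8, sigma=1.0, size=30)   # hypothetical viral loads
threshold = 20.0   # hypothetical size of a difference "big enough to matter"

# Bootstrap: resample each group many times and see how often the symptomatic
# median exceeds the asymptomatic median (and exceeds it by the threshold).
n_boot = 10_000
diffs = np.empty(n_boot)
for i in range(n_boot):
    s = rng.choice(symptomatic, size=len(symptomatic), replace=True)
    a = rng.choice(asymptomatic, size=len(asymptomatic), replace=True)
    diffs[i] = np.median(s) - np.median(a)

print(f"confidence that symptomatic viral load is higher: {np.mean(diffs > 0):.0%}")
print(f"confidence that the difference exceeds the threshold: {np.mean(diffs > threshold):.0%}")
```

The output is a single percentage that a reader without statistical training can take at face value, which is the point of the approach.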
This
illustrates another theme that I think is very important: keeping academic
ideas as simple as possible. There is, I think, a tendency for academics to
keep things complicated to prove how clever, and essential, they are. The
danger then is that each specialism retreats into its own silo making
communication with other specialisms and the wider public more and more
difficult. I think we should resist this tendency and strive to make things as
simple as possible. But not, of course, as I think Einstein said, simpler. I
have written a few articles on this theme, for example:
I'll make
it simple (Times Higher Education, 2002, also
here),
Maths
should not be hard: the case for making academic knowledge more palatable
(Higher Education Review, 2002) and
Simplifying
academic knowledge to make cognition more efficient: opportunities, benefits
and barriers.
Finally there is the problem that some scientists only accept evidence from statistical studies based on largish samples. Anything less is mere anecdote and not to be taken seriously. But even a single anecdote can prove that something is possible - perhaps that COVID-19 can be caught twice. Anecdotes are potentially important, as indeed is fiction (think of Einstein's thought experiments, or Schrödinger's cat), as I argue in
Are ‘Qualitative’ and ‘Quantitative’ Useful Terms for Describing Research? and
Anecdote, fiction and statistics. There is also an article in the
Times Higher Education which cites several other academic articles and comes to similar conclusions.