There are a lot of research papers out there - some good, some bad, some intelligible, some unintelligible, some disseminated quickly, some slowly. Now, with covid-19, it is more important than ever that we know whether a research paper should be trusted, that as many as possible of the relevant people can understand it as quickly as possible, and that the research methods employed are not unduly restrictive. The phrases in italics refer to key problems with the system for creating and disseminating knowledge. The purpose of this note is to explain how a few publications of mine are relevant to tackling these problems.
Just after deciding to write this note, I received a "nature briefing" email from the journal Nature summarising some of their latest highlights. One thing that caught my eye was a paper entitled "Suppression of COVID-19 outbreak in the municipality of Vo’, Italy" (E. Lavezzo et al. Preprint at medRxiv http://doi.org/ggsmcj). According to the Nature summary: "On 21 February, the town of Vo’ reported Italy’s first COVID-19 death, leading authorities to ban movement in the town and end public services and commercial activities there for two weeks. Andrea Crisanti at Imperial College London and his colleagues swabbed almost every resident of Vo’ for viral RNA at the beginning and end of the lockdown. ... The team found that some 43% of the people infected with SARS-CoV-2 in the town reported no fever or other symptoms. The researchers observed no statistically significant difference in potential infectiousness between those who reported symptoms and those who did not."
The paper in question was posted on https://www.medrxiv.org/ which describes itself as a preprint server for health sciences. There is a warning at the top of each page that the paper is "not certified by peer review". This seems to imply that once it has been "certified" - by a couple of "peers" chosen by the editor of the journal - then its contents must be right, trustworthy or whatever. This assumption is implicit in the journalistic practice of giving the name of the journal a research paper appears in, with the implied "must be right, it's published in this prestigious journal". The paper was posted on 18 April 2020. I am writing this two weeks later and there has been no update: surely, given the urgency of the problem, the paper should have been peer reviewed already, and a link to the update posted so that readers are aware of the latest version?
In practice, peer review, and the assumptions behind it, are far too crude. At the frontiers of knowledge, science is an uncertain business, and the idea that the standards required of new research are sufficiently clear-cut for a certificate, issued by an editor aided by a couple of anonymous peer reviewers, to "certify" the validity of the research is at best wildly unrealistic. In the real world both trustworthy and untrustworthy research is published both in peer reviewed journals and un-peer-reviewed on preprint servers and other websites. And the timing issue is important - peer review usually takes months and may take years (my record from submission to publication is over four years).
A better system is desperately needed which would produce a critique, and suggest improvements, from as many relevant perspectives as possible, as quickly and transparently as possible. The idea that peers are the only relevant judges of quality is very strange. Ideally we would want review by superiors, but as these obviously don't exist at the frontiers of knowledge, surely the views of "peers" should be supplemented by "non-peers" like experts in neighbouring or other relevant disciplines (like statistics) and members of the potential audience? The conventional relaxed timetable for the reviewing process is also odd and unnecessary. This paper should surely be vetted as quickly as possible so that people know how much trust they should place in its conclusions. A few years ago I made some suggestions for tackling these issues - see The journal of everything (Times Higher Education, 2010; there is a similar article here) and Journals, repositories, peer review, non-peer review, and the future of scholarly communication (arxiv.org, 2013). (Confession: the arxiv.org paper was not peer-reviewed, but an earlier, less interesting, paper was peer reviewed; the Times Higher Education article was reviewed by the editors.)
It is also important that research papers should be as clear and easy to understand as possible, particularly when, as in the case of the covid-19 paper, it is relevant to a range of experts with different specialisms - e.g. public-health, immunologists, supply chain experts, ethicists, anthropologists, historians, as well as medics according to this article. I read the abstract to see how much sense it made to me as someone relatively ignorant of all these areas. There was some biological and epidemiological jargon which was probably inevitable given the topic. I did not know what a "serial interval" is, but a quick check on Wikipedia solved this problem. There were also some genetic terms and phrases I did not understand, but their role seemed sufficiently clear from the context. As far as I could judge - and as a complete novice this is not very far - these aspects of the abstract were as simple as is consistent with conveying the meaning.
However, I think the statistical approach could be improved from the point of view of providing an analysis which is as clear and useful as possible for as wide an audience as possible. The authors "found no statistically significant difference in the viral load ... of symptomatic versus asymptomatic infections (p-values 0.6 and 0.2 for E and RdRp genes, respectively, Exact Wilcoxon-Mann-Whitney test)." This follows a conventional statistical approach, but it is one that is deeply flawed. We aren't told a vital bit of information - how big is the difference? Is it big enough to matter? The fact that a result is statistically significant seems to imply that it matters, but this is not what the term actually means. And to anyone without training in statistics the p values are meaningless.
In my view a far better way to analyse and present these results would be in terms of confidence levels. So we might conclude that for the E gene, if we extrapolate the results beyond the sample studied, we can be 80% confident that symptomatic infections have a higher viral load than asymptomatic ones. Or perhaps 60% confident that there is a big difference (more than a specified threshold). I made these figures up because I haven't got the data, but the principle of giving a straightforward confidence level should be clear. Then it should be obvious that 80% confidence means there is a 20% confidence it's not true. There are more details of this approach in Simple methods for estimating confidence levels, or tentative probabilities, for hypotheses instead of p values.
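To make the idea concrete, here is a minimal sketch in Python of one way such a confidence level could be estimated, using bootstrap resampling. It is an illustration under stated assumptions, not necessarily the method described in the paper just cited: the viral load numbers are invented, the function name confidence_higher and the threshold of 1.0 are hypothetical, and the median is simply one reasonable choice of summary. The estimate is the proportion of resamples in which the symptomatic group's median exceeds the asymptomatic group's median by more than the threshold.

```python
import random
from statistics import median


def confidence_higher(group_a, group_b, threshold=0.0, resamples=10_000, seed=0):
    """Estimate a confidence level that group_a is higher than group_b.

    Returns the proportion of bootstrap resamples in which
    median(group_a) - median(group_b) is greater than threshold.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits = 0
    for _ in range(resamples):
        # Resample each group with replacement, keeping its original size.
        a = rng.choices(group_a, k=len(group_a))
        b = rng.choices(group_b, k=len(group_b))
        if median(a) - median(b) > threshold:
            hits += 1
    return hits / resamples


# Invented viral load figures (arbitrary units), purely for illustration.
symptomatic = [5.1, 6.3, 4.8, 7.0, 5.9, 6.6, 4.2, 5.5]
asymptomatic = [4.9, 5.2, 6.1, 4.0, 5.8, 4.4, 5.0, 6.4]

any_difference = confidence_higher(symptomatic, asymptomatic)
big_difference = confidence_higher(symptomatic, asymptomatic, threshold=1.0)

print(f"Confidence that symptomatic load is higher: {any_difference:.0%}")
print(f"Confidence that it is higher by more than 1.0: {big_difference:.0%}")
print(f"Confidence that it is not higher: {1 - any_difference:.0%}")
```

The output is a pair of plain percentages of the kind described above, one for "any difference" and one for "a big difference", and the last line shows the complementary confidence that the statement is not true.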
This illustrates another theme that I think is very important: keeping academic ideas as simple as possible. There is, I think, a tendency for academics to keep things complicated to prove how clever, and essential, they are. The danger then is that each specialism retreats into its own silo making communication with other specialisms and the wider public more and more difficult. I think we should resist this tendency and strive to make things as simple as possible. But not, of course, as I think Einstein said, simpler. I have written a few articles on this theme, for example: I'll make it simple (Times Higher Education, 2002, also here), Maths should not be hard: the case for making academic knowledge more palatable (Higher Education Review, 2002) and Simplifying academic knowledge to make cognition more efficient: opportunities, benefits and barriers.
Finally, there is the problem that some scientists only accept evidence from statistical studies based on largish samples. Anything less is mere anecdote and not to be taken seriously. But even a single anecdote can prove that something is possible - perhaps that covid can be caught twice. Anecdotes are potentially important, as indeed is fiction (think of Einstein's thought experiments, or Schrödinger's cat), as I argue in Are ‘Qualitative’ and ‘Quantitative’ Useful Terms for Describing Research? and Anecdote, fiction and statistics. There is also an article in the Times Higher Education, which cites several other academic articles, and comes to similar conclusions.