Risks, R numbers and raw data: how to interpret coronavirus statistics

We’re finally over the first peak of the epidemic, but the numbers relating to the virus keep on spreading. Sometimes, however, things get lost in translation from the spreadsheet to the article, broadcast or tweet.

Excess deaths

This is the “gold standard” for comparisons between countries, as it doesn’t depend on the vagaries of what counts as a “Covid death”. But excess to what?

Excess deaths are not counted – they are a constructed number based on assumptions, as the short sketch after this list illustrates. For example, in England and Wales, the number of excess deaths linked to the epidemic, up to 19 June, could be any of the following:

• 59,259, comparing registrations to the five-year average, starting from 14 March.

• 54,365, comparing registrations to the five-year average, starting at the beginning of 2020.

• 38,485, comparing occurrences, rather than registrations, to the five-year average, starting from 4 January.
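The dependence on assumptions is easy to make concrete: excess deaths are simply observed deaths minus a chosen baseline, so moving the start date or changing the baseline changes the total. Here is a minimal Python sketch of that arithmetic; the weekly counts are invented placeholders, not real ONS figures.

```python
# Minimal sketch: "excess deaths" = observed deaths minus a chosen baseline.
# The weekly counts below are invented placeholders, not real ONS data.

weekly_deaths_2020 = [11000, 12500, 16000, 18500, 14000]  # hypothetical registrations
five_year_average  = [10900, 11000, 11100, 11050, 10950]  # hypothetical baseline

def excess_deaths(observed, baseline):
    """Sum of weekly differences between observed deaths and the baseline."""
    return sum(o - b for o, b in zip(observed, baseline))

# The same data give different totals depending on where the comparison starts:
print(excess_deaths(weekly_deaths_2020, five_year_average))          # from week 1
print(excess_deaths(weekly_deaths_2020[2:], five_year_average[2:]))  # from week 3
```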

Last week, it was reported that there were no excess deaths when compared with the five-year average for this time of year. But are we back to normal? The graph shows that there was still a large deficit in non-Covid deaths in hospital; these deaths were instead happening at home.

[Graph: deaths in England and Wales, March to June]

Small numbers

As infections have dropped, the numbers have got smaller. It would be churlish to complain about this, but it does mean that there is far more uncertainty about what is happening. For example, the Office for National Statistics has just reported that the number of people in England testing positive was previously decreasing but has now levelled off. There was some additional modelling, but the raw data comprised “swab tests collected from 23,203 participants, of which 12 individuals tested positive for Covid-19”. This small number of positive tests means there is great uncertainty about current infection levels.
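A rough sense of how wide that uncertainty is can be had by putting a standard Wilson 95% interval around 12 positives out of 23,203 swabs. This is only a back-of-the-envelope check, not the ONS’s own modelling.

```python
import math

# Back-of-the-envelope Wilson 95% interval for the proportion testing
# positive, using the counts quoted above (not the ONS's actual model).
positives, n = 12, 23_203
z = 1.96                     # approximate 95% normal quantile
p_hat = positives / n

denom  = 1 + z**2 / n
centre = (p_hat + z**2 / (2 * n)) / denom
half   = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom

print(f"estimate {p_hat:.3%}, 95% interval {centre - half:.3%} to {centre + half:.3%}")
# ≈ 0.052%, with an interval of roughly 0.03% to 0.09% – the upper end is
# about three times the lower, so the infection level is poorly pinned down.
```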

The R number

There has been a fixation on estimating R, the average number of people that someone infects, but national and even regional averages are becoming less relevant as less virus is circulating and local conditions dominate. After all, R for smallpox is about 3.5 to 6, but we don’t care about this as (fortunately) there is no smallpox around.

The published caveats around R are excellent, explaining that there is both uncertainty about the average figures and regional variability. But that does not stop people asking whether an interval of 0.8 to 1.1 in London is really different from one of 0.7 to 1.0 in the south-west. Answer: not much. The crucial measure now is the number of infections in the community. The best analogy is with the pollen count. Once you know your personal vulnerability, you also need good data about local conditions.

The ‘risk of dying from Covid’

The phrase “the risk of dying from Covid” should be avoided as it is so ambiguous: does it mean the risk if you get Covid, or does it mean the risk of a non-infected person both catching and dying from Covid?
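The gap between the two readings is just the multiplication rule for probabilities: the risk of catching and then dying from the disease is the risk of catching it multiplied by the risk of dying once infected. A toy Python illustration, with entirely made-up numbers, makes this explicit.

```python
# Two readings of "the risk of dying from Covid" – toy numbers only.
p_catch           = 0.05   # hypothetical chance of catching the virus
p_die_if_infected = 0.01   # hypothetical chance of dying once infected

# Reading 1: the risk if you get Covid.
print(f"{p_die_if_infected:.2%}")            # 1.00%

# Reading 2: the risk of both catching it and then dying from it.
print(f"{p_catch * p_die_if_infected:.3%}")  # 0.050% – 20 times smaller
```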

The Office for National Statistics reported that Black, Asian and minority ethnic (BAME) groups were about twice as likely, after adjusting for some contextual factors, to die from Covid, meaning that BAME groups had a higher risk of both getting the disease and then dying from it. An unknown part of this excess risk could come from an increased risk of catching the virus, perhaps through coming into contact with more people in their daily lives.

But the BBC News at Ten on 7 May reported that BAME individuals were “90% more likely to die, if they became seriously ill with Covid-19”, which is not what was being claimed and could be very misleading.

Test accuracy

Anything to do with the accuracy of diagnostic tests is notoriously badly reported. The Daily Mail had a story about an antigen test with 70% “sensitivity”, a piece of useful jargon which means that it will be negative in 30% of patients with Covid. The headline was “30% of negative tests are wrong”, which is itself wrong – the true figure would be about 2%.

The explanation requires care. Assume that around 6% of those tested are infected, and that the test gives a positive result in only 1% of people who do not have the virus (ie a “specificity” of 99%). Then, among 10,000 people tested, 180 of the 600 infected will wrongly test negative, alongside 9,306 correct negatives from the 9,400 uninfected: so the actual proportion of negative tests that are wrong is 180 out of 9,486, or about 2%. Not 30%.
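For readers who want to check the arithmetic themselves, here is the same calculation as a short Python sketch, using the assumptions stated above (6% prevalence, 70% sensitivity, 99% specificity).

```python
# Proportion of negative tests that are wrong, under the stated assumptions.
prevalence  = 0.06   # share of those tested who are infected
sensitivity = 0.70   # P(test positive | infected)
specificity = 0.99   # P(test negative | not infected)

n = 10_000                                       # notional testing population
infected   = n * prevalence                      # 600 people
uninfected = n - infected                        # 9,400 people

false_negatives = infected * (1 - sensitivity)   # 180 infected test negative
true_negatives  = uninfected * specificity       # 9,306 uninfected test negative

wrong = false_negatives / (false_negatives + true_negatives)
print(f"{wrong:.1%} of negative tests are wrong")  # ≈ 1.9%, not 30%
```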

David Spiegelhalter is chair of the Winton Centre for Risk and Evidence Communication at Cambridge University. He is the author of The Art of Statistics