COVID-19

If you are using these graphs or the historical data I've made available (e.g. for analysis, planning, making reports, personal usage, curiosity, etc.), I would greatly appreciate it if you could drop me a quick email at s.murdoch@ucl.ac.uk so I can better understand what this service is used for.

Between October 2020 and May 2022, UCL published statistics of COVID-19 cases for UCL staff and students. This page shows data for the full period, so as to better understand any trends present.

Before 2022-03-02, data was published on weekdays, with cases from Saturday to Monday being merged into the results reported on Tuesday. Daily cases reported on Tuesday are shown as being shared equally over Saturday, Sunday, and Monday to avoid a misleading weekly peak. Weekly and total cases don't need such smoothing, so data for Saturday and Sunday are simply omitted. From 2022-03-02 onwards, data was published weekly, so daily cases are calculated by sharing the weekly case count over the applicable seven days. Similarly, cases reported over vacations are smoothed over the applicable period. See the official UCL website for important caveats on how this data should be interpreted. Analysis code and historical data are on GitHub.
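To illustrate the smoothing described above, here is a minimal Python sketch. It is not the analysis code from GitHub: the function name `spread_evenly`, the case counts, and the choice to treat the "applicable" days as the run of days ending on a given date are all assumptions made for the example.

```python
from datetime import date, timedelta


def spread_evenly(count, last_day, days):
    """Share `count` equally over the `days` days ending on `last_day`.

    Hypothetical helper: assumes the applicable period is a run of
    consecutive days that ends on `last_day`.
    """
    share = count / days
    return {last_day - timedelta(days=offset): share for offset in range(days)}


# Hypothetical figure: 12 cases reported on Tuesday 2020-11-10,
# covering the preceding Saturday, Sunday, and Monday.
tuesday = date(2020, 11, 10)
weekend = spread_evenly(12, tuesday - timedelta(days=1), 3)

# Hypothetical weekly figure: 21 cases shared over seven days.
weekly = spread_evenly(21, date(2022, 3, 8), 7)
```

Here `weekend` assigns 4 cases to each of 7, 8, and 9 November, so the Tuesday peak disappears while the total is preserved. The same helper could be applied to a vacation period by passing its length in days.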

Weekly case statistics

Before 2021-01-05 this graph is based on a rolling 7-day total of the daily reported cases, rather than the reported weekly statistics. See the discussion below for the reason for this change.

Daily case statistics

 

Total cases since start of Term 1

* On 26 October 2020, an additional 89 cases were added to the total that were not previously included in daily or weekly statistics.

Discussion

The above charts report the data as published by UCL. The only adjustments I have made are to smooth the data covering the weekend and Monday over the three days in question, so that a misleading peak on Tuesday is prevented, and to re-calculate the weekly statistics (see below for why). I have avoided drawing any conclusions or predictions from this data, though I welcome those more qualified to do so to use the data as they wish. However, there are some aspects of the data which might not be obvious. In this section, I will therefore discuss some of these findings in the hope that it will assist those interpreting the data. Please do get in contact if you have any comments or questions.

Definition of on-campus cases

The definition of an on-campus case for the purposes of these statistics is “those who have visited UCL buildings, spending 15 minutes or more, in the two days before starting experiencing symptoms or requesting a test.” This is very similar to the criteria used by the UK Test and Trace system for identifying cases of potential onward transmission (forward-tracing). However, this definition of “on-campus” isn’t suitable for identifying the source of an infection (backwards-tracing), because it generally takes four to five days from exposure to the start of symptoms. For this reason, the on/off-campus definition is not a reliable indication of whether an infection occurred on-campus, nor particularly helpful for identifying clusters of infection.

Relationship between daily, weekly and total figures

The published case statistics include daily, weekly and total (since the start of term 1) figures. These are obviously related, e.g. subtracting consecutive totals should give the daily number of cases. As of 14 November, this is indeed the case, other than the additional 89 cases published on 26 October.
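The consistency check between totals and daily figures can be sketched as follows. This is an illustration only: the function names and all the figures are hypothetical, chosen so that the second example contains a one-off jump of 89 like the one mentioned above.

```python
def daily_from_totals(totals):
    """Return the differences between consecutive cumulative totals."""
    return [later - earlier for earlier, later in zip(totals, totals[1:])]


def mismatches(totals, reported_daily):
    """Return (index, implied, reported) wherever the two figures disagree."""
    implied = daily_from_totals(totals)
    return [
        (i, imp, rep)
        for i, (imp, rep) in enumerate(zip(implied, reported_daily))
        if imp != rep
    ]


# Hypothetical cumulative totals on five consecutive reporting days.
totals = [100, 104, 109, 109, 115]
daily = daily_from_totals(totals)  # [4, 5, 0, 6]

# Hypothetical series where 89 extra cases enter the total on one day
# without appearing in the reported daily figure of 5.
jump = mismatches([100, 104, 198, 200], [4, 5, 2])  # [(1, 94, 5)]
```

A one-off addition shows up as a single day where the implied figure exceeds the reported one by the size of the addition (here 94 - 5 = 89).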

A rolling 7-day total of the daily number of cases should also give the weekly totals. However, in the published data, this is not the case. In the chart below, the reported weekly totals for student cases are shown, along with a rolling 7-day total of the daily number of cases. As can be seen, the reported on-campus weekly totals tend to be larger than the rolling 7-day total. Off-campus student cases also exhibit a similar discrepancy, but there the calculated total is larger than the reported total.
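A rolling 7-day total of this kind can be computed as in the sketch below. The function name, the window handling (no value until seven days are available), and the figures are assumptions for illustration, not the published data or the code behind the charts.

```python
def rolling_total(daily, window=7):
    """Rolling sum of `daily`; entry i covers the `window` days ending at i.

    Entries before a full window is available are None.
    """
    out = []
    running = 0
    for i, value in enumerate(daily):
        running += value
        if i >= window:
            # Drop the day that has just fallen out of the window.
            running -= daily[i - window]
        out.append(running if i >= window - 1 else None)
    return out


# Hypothetical daily case counts for nine consecutive days.
daily = [1, 2, 3, 4, 5, 6, 7, 8, 9]
calculated = rolling_total(daily)  # [None] * 6 + [28, 35, 42]

# Hypothetical reported weekly totals for the last three days.
reported = [30, 35, 45]
discrepancy = [r - c for r, c in zip(reported, calculated[6:])]  # [2, 0, 3]
```

A positive discrepancy means the reported weekly total is larger than the calculated one, as described for on-campus student cases. A negative value would correspond to the off-campus pattern.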

The reason for this discrepancy is that 79 cases were incorrectly carried forward in the 7-day totals because they had incomplete information. As of 4 November, this issue has been resolved. The issue also did not affect the daily or overall case totals. To more accurately represent the statistics before the correction, the weekly graph shown above (before 2021-01-05) is calculated from the daily statistics, rather than using the reported weekly statistics.

Uncertainty in the true number of cases

The reported statistics only cover cases where UCL has been made aware of a positive COVID-19 test result. For this to have happened, someone would need to have symptoms, try to get tested and succeed, and report the result to UCL. If any of these steps doesn’t occur, a positive case would not be present in the published statistics.

UCL does not carry out testing of staff or students without symptoms. The University of Cambridge does and, as of 1 November 2020, about a third of positive cases were identified through this asymptomatic testing programme. It’s not clear how many of these cases would have become symptomatic, and it’s not clear how Cambridge’s example generalises, but it does seem likely that some asymptomatic cases are being missed in the UCL statistics.

Not all individuals with symptoms will choose to get tested, and some who try will not be able to get a test, although UCL does offer its own testing programme for eligible staff and students. Therefore some symptomatic cases of COVID-19 will not be identified through testing.

Even those individuals who receive a positive test result may not all report their result to UCL. On 27 October there was a one-off addition of 89 cases to the total, which were mainly students who received positive test results, did not notify UCL, but were identified because they used UCL’s testing service. I’ve also heard anecdotes of students not wanting to report cases to UCL. It’s not clear how many other instances there are of positive results not being reported.

For all these reasons, it’s clear that the statistics are an underestimate of the true number of cases, but it’s not clear (to me) by how much. Do get in touch if you have any good ideas on how to estimate the level of uncertainty.