XiScience Prevalence Model

Calibrated Incidence Data

The graph above shows our in-house SARS-CoV-2 prevalence estimate in blue, overlaid with prevalence estimates from the Infection Survey for the UK (grey) and symptomatic COVID prevalence estimates from the COVID Symptom Study (red).

Our in-house model is derived from daily swab test data (by date of swab test), taken from the UK government website (see: government data). This data is calibrated against the daily number of swab tests performed, which gives a percentage positivity.
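As an illustration of the positivity step, here is a minimal sketch assuming percentage positivity is simply positive tests divided by tests performed, per day. The function name and the sample figures are hypothetical and are not taken from the datahub's own code or data.

```python
def percent_positivity(positives, tests):
    """Return daily percentage positivity; None where no tests were recorded."""
    result = []
    for pos, n in zip(positives, tests):
        # Guard against days with no reported tests (division by zero).
        result.append(100.0 * pos / n if n > 0 else None)
    return result


# Hypothetical daily figures, for illustration only.
daily_positives = [1200, 1500, 900]
daily_tests = [60000, 50000, 0]
positivity = percent_positivity(daily_positives, daily_tests)
```

Days with no recorded tests are returned as None rather than zero, so that a reporting gap is not mistaken for zero positivity.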

This positivity data is then scaled to fit hospital admissions, and adjusted for pillar variance (see explanation). The resulting dataset is then converted into an incidence estimate, which is displayed on the datahub front page as ‘NEW CASES - UK Positive Tests’.
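The text does not say how the scaling to hospital admissions is carried out. One simple possibility, shown here purely as an assumption, is a single least-squares scale factor between the two series. The function name and numbers are hypothetical, and the pillar variance adjustment is not covered.

```python
def least_squares_scale(series, target):
    """Scale factor k that minimises sum((k * series - target) ** 2)."""
    num = sum(s * t for s, t in zip(series, target))
    den = sum(s * s for s in series)
    if den == 0:
        raise ValueError("series is all zeros; scale factor is undefined")
    return num / den


# Hypothetical aligned daily series, for illustration only.
positivity = [2.0, 3.0, 4.0]
admissions = [1000.0, 1500.0, 2000.0]
k = least_squares_scale(positivity, admissions)
scaled = [k * p for p in positivity]
```

A real fit would probably also need to allow for the delay between a positive test and a hospital admission, which this sketch ignores.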

Prevalence from Incidence

Using the methodology of Varsavsky et al., this incidence data is converted into an estimate of actively infectious SARS-CoV-2 cases by applying a gamma-distribution-based COVID recovery model, and fitting the resulting dataset to the Infection Survey data for the UK.
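The general idea of turning incidence into prevalence with a recovery model can be sketched as follows: each day's new cases are weighted by the probability that they have not yet recovered, and the weighted cases are summed. This is a minimal sketch, not the model used here. It assumes a gamma distribution with an integer shape parameter (so the survival function has a closed form), and the shape and scale values are illustrative placeholders rather than the parameters from Varsavsky et al. The final fit to the Infection Survey data is omitted.

```python
import math


def erlang_survival(days, shape, scale):
    """P(recovery time > days) for a gamma distribution with integer shape."""
    x = days / scale
    return math.exp(-x) * sum(x ** n / math.factorial(n) for n in range(shape))


def prevalence_from_incidence(incidence, shape=3, scale=4.0):
    """Active cases per day, given new cases per day.

    shape and scale are illustrative assumptions, not fitted values.
    """
    prevalence = []
    for t in range(len(incidence)):
        # Sum earlier cohorts, each weighted by its chance of still being infected.
        active = sum(
            incidence[s] * erlang_survival(t - s, shape, scale)
            for s in range(t + 1)
        )
        prevalence.append(active)
    return prevalence


# A single cohort of 100 new cases on day 0, followed by no new cases.
example = prevalence_from_incidence([100, 0, 0, 0])
```

In this example the active case count starts at 100 on day 0 and declines each day as the cohort recovers.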

This gives a dataset that, as the graph above shows, fits well (as of the date of writing, 12th Feb 21) with the magnitude and time trends of the Infection Survey data totalled for the UK. It also tracks the changes over time in the Symptom Study dataset well, though at a different magnitude.

It is important to note that the Symptom Study only detects symptomatic COVID, whereas both the Infection Survey and UK swab testing detect symptomatic and asymptomatic SARS-CoV-2 infection alike.

Why Create another Model?

The importance of having this in-house active case estimator is, firstly, that it is derived from a different data source from both the Infection Survey and the Symptom Study. It therefore complements these datasets, giving us another measure of the current status of the coronavirus epidemic in the UK.

Secondly, though it is a gold standard for active case estimation, the UK Infection Survey reports only once a week and requires manual extraction of the data, followed by processing and upload. Furthermore, on the date the Infection Survey data is published it is already a week old, as it is based on data up to a week before the report date. This makes it difficult to maintain a contemporary, day-by-day view of the status of the coronavirus epidemic nationally. Our in-house model solves this, as it updates live whenever new swab test data is published on the government website.


All data presented in the charts and analysis on this website comes exclusively from external sources. The data is downloaded and updated live using APIs as you peruse this site. This datahub is intended as a convenient portal to access the data and see a variety of different analytical representations. It is not intended as a primary source, and as such we make no warranties or claims as to the authenticity or accuracy of the data presented here; you should access the source data directly for critical decision making.