COVID-19

This page is a compact summary of our COVID-19 work, with all the relevant links to download our COVID-19 datasets.

I just want the data!

Our work

Our World in Data (OWID) has collected COVID-19 data from various domains since the pandemic started. We believe that to make progress against the COVID-19 pandemic, we need to understand how it is developing, and that requires reliable and timely data. We have therefore focused on bringing together the research and statistics on the outbreak.

Legacy data work

We started working on COVID-19 data in early 2020, developing and implementing several data pipelines to process and publish the data. All this work has been live and shared with the public via our GitHub repository https://github.com/owid/covid-19-data and our old COVID documentation. We have complemented our data work with extensive research articles, which we have shared on our topic page.

Publications

Hasell, J., Mathieu, E., Beltekian, D. et al. A cross-country database of COVID-19 testing. Sci Data 7, 345 (2020). https://doi.org/10.1038/s41597-020-00688-8

Mathieu, E., Ritchie, H., Ortiz-Ospina, E. et al. A global database of COVID-19 vaccinations. Nat Hum Behav 5, 947–953 (2021). https://doi.org/10.1038/s41562-021-01122-8

Herre, B., Rodés-Guirao, L., Mathieu, E. et al. Best practices for government agencies to publish data: lessons from COVID-19. Lancet Public Health 9(6), e407–e410 (2024). https://doi.org/10.1016/S2468-2667(24)00073-2

Transition to ETL

We started working on COVID-19 before our ETL system existed. In mid-2024, we decided to migrate all our COVID-19 data work into ETL and make the data available from our catalog.

Download data

Our compact COVID-19 dataset is a compilation of the most relevant COVID-19 indicators we have collected in the last few years. It consolidates indicators from various datasets into a single file. It comes with metadata, which explains all the indicators in detail. In the past, this dataset was generated and shared in our GitHub repository.

Download our compact dataset (CSV) | Download metadata

In addition to our compact dataset, we provide individual datasets with all our COVID-19 indicators. These files are direct exports from our ETL.

Dataset                            Data      Metadata
Cases and Deaths                   download  download
Excess Mortality                   download  download
Excess Mortality (The Economist)   download  download
Hospitalizations                   download  download
Vaccinations                       download  download
Vaccinations (by age)              download  download
Vaccinations (by manufacturer)     download  download
Vaccinations (US)                  download  download
Testing                            download  download
Reproduction rate                  download  download
Google mobility                    download  download
Government response policy         download  download
Attitudes (YouGov)                 download  download
Donations (COVAX)                  download  download

All our COVID-19 data pipelines are specified in our DAG.

Data providers

The data produced by third parties and made available by Our World in Data is subject to the license terms from the original third-party authors. We will always indicate the data source in our database, and you should always check the license of any such third-party data before use.

Learn more about the licensing in the metadata files.

Understanding our metadata

Our metadata contains all the relevant information about an indicator, including licenses, descriptions, and units. We use this metadata to build the charts on our site.

Learn more in our metadata reference.

Access the data with our catalog

Our catalog library is in alpha.

Install our catalog package

pip install owid-catalog

Usage and preview

Each table in our catalog is identified by a URI. For COVID-19 data, URIs follow this pattern:

data://garden/covid/latest/{DATASET_NAME}/{TABLE_NAME}

Where:

  • DATASET_NAME is the name of the dataset (e.g. cases_deaths)
  • TABLE_NAME is the name of the table (e.g. cases_deaths)

→ Learn more about our URIs

Notes:

  • A dataset can be a collection of tables (equivalent to DataFrames). For instance, several files (or DataFrames) might be in our 'Vaccination' dataset (e.g., global data, US data, etc.).
  • Our excess mortality dataset is currently under the namespace excess_mortality, i.e. with URIs data://garden/excess_mortality/latest/{DATASET_NAME}/{TABLE_NAME}.
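The URI pattern and the namespace exception noted above can be sketched as a small helper. This is only an illustration: `build_uri` and its parameters are hypothetical names and are not part of the owid-catalog package.

```python
def build_uri(
    dataset_name: str,
    table_name: str,
    namespace: str = "covid",
    scheme: bool = True,
) -> str:
    """Assemble a URI of the form data://garden/{namespace}/latest/{dataset}/{table}.

    Hypothetical helper for illustration; it only formats a string and does not
    check that the dataset or table exists in the catalog.
    """
    path = f"garden/{namespace}/latest/{dataset_name}/{table_name}"
    # The "data://" prefix is optional here because some listings omit it.
    return f"data://{path}" if scheme else path


# Default namespace ("covid")
print(build_uri("cases_deaths", "cases_deaths"))
# Excess mortality lives under its own namespace
print(build_uri("excess_mortality", "excess_mortality", namespace="excess_mortality"))
```

Keeping the namespace as a parameter makes the excess mortality exception explicit instead of hard-coding a second pattern.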

Check all our COVID data

Run:

from owid import catalog

# Preview the list of available datasets (each row = one dataset)
catalog.find(namespace="covid")

# Load any dataset by picking a row of the table returned above
tb = catalog.find(namespace="covid").iloc[3].load()

Load data

Use a URI from the table below [1].

Data category                      URI
Cases and deaths                   garden/covid/latest/cases_deaths/cases_deaths
Excess Mortality                   garden/excess_mortality/latest/excess_mortality/excess_mortality
Excess Mortality (The Economist)   garden/excess_mortality/latest/excess_mortality_economist/excess_mortality_economist
Hospitalizations                   garden/covid/latest/hospital/hospital
Google Mobility                    garden/covid/latest/google_mobility/google_mobility
Policy Response (OxCGRT)           garden/covid/latest/oxcgrt_policy/oxcgrt_policy
Indicator decoupling               garden/covid/latest/decoupling/decoupling
YouGov                             garden/covid/latest/yougov/yougov
YouGov (Composite)                 garden/covid/latest/yougov/yougov_composite
Vaccinations (US)                  garden/covid/latest/vaccinations_us/vaccinations_us
Testing                            garden/covid/latest/testing/testing
Sequencing / Variants              garden/covid/latest/sequence/sequence
Sweden confirmed deaths            garden/covid/latest/sweden_covid/sweden_covid
UK COVID Data                      garden/covid/latest/uk_covid/uk_covid

and run the following code:

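from owid import catalog

# Connect to the remote catalog
rc = catalog.RemoteCatalog()
# Replace "..." with a URI from the table above
uri = "..."
# Load the table stored at that URI
df = rc[uri]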
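As a rough sketch, the lookup can be wrapped in a helper that maps a short key to one of the URIs from the table above. The names `COVID_URIS` and `load_table` are hypothetical and only for illustration; the loading step itself reuses the `RemoteCatalog` call shown above and needs network access and the owid-catalog package.

```python
# A few URIs copied from the table above; extend as needed.
COVID_URIS = {
    "cases_deaths": "garden/covid/latest/cases_deaths/cases_deaths",
    "testing": "garden/covid/latest/testing/testing",
    "excess_mortality": "garden/excess_mortality/latest/excess_mortality/excess_mortality",
}


def load_table(key: str):
    """Load one table from the remote catalog by its short key (hypothetical helper)."""
    if key not in COVID_URIS:
        raise KeyError(f"Unknown key {key!r}; choose one of {sorted(COVID_URIS)}")
    # Imported lazily so the URI lookup works even without owid-catalog installed.
    from owid import catalog

    rc = catalog.RemoteCatalog()
    return rc[COVID_URIS[key]]


# Example (requires network access):
# tb = load_table("cases_deaths")
# print(tb.head())
```

Validating the key before touching the network gives a clearer error than a failed remote lookup.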

Access metadata

The df object is not a plain pandas DataFrame but an owid.catalog.Table, which behaves like a DataFrame and also carries metadata. You can access the metadata like this:

# Table metadata
df.metadata
# Column (or indicator) metadata
df[column_name].metadata

→ Learn more about our metadata


  1. More items will be added to this table shortly.