cowidev.megafile.steps

cowidev.megafile.steps.cgrt

merge

cowidev.megafile.steps.cgrt.clean_cgrt(url, columns_rename, country_mapping)
cowidev.megafile.steps.cgrt.get_cgrt(bsg_latest: str, bsg_diff_latest: str, country_mapping: str)

Downloads the latest OxCGRT dataset from BSG’s GitHub repository. Remaps BSG country names to OWID country names.

Returns:

cgrt {dataframe}
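
The remapping step can be pictured with a minimal pandas sketch. It assumes the mapping is a two-column table. The column names `bsg_name` and `owid_name`, the helper `remap_country_names`, the sample rows, and the fallback to the original name for unmapped countries are all illustrative assumptions, not the actual cowidev implementation.

```python
import pandas as pd


def remap_country_names(df: pd.DataFrame, mapping: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: replace source country names with OWID names."""
    # A left join keeps every row of the source data, mapped or not.
    merged = df.merge(mapping, on="bsg_name", how="left")
    # Assumption: keep the original name when the mapping has no entry for it.
    merged["location"] = merged["owid_name"].fillna(merged["bsg_name"])
    return merged.drop(columns=["bsg_name", "owid_name"])


# Illustrative mapping; the real one is read from the `country_mapping` file.
mapping = pd.DataFrame(
    {
        "bsg_name": ["Slovak Republic", "Kyrgyz Republic"],
        "owid_name": ["Slovakia", "Kyrgyzstan"],
    }
)

# Made-up sample rows standing in for the downloaded dataset.
cgrt_raw = pd.DataFrame(
    {
        "bsg_name": ["Slovak Republic", "France"],
        "stringency_index": [50.0, 60.0],
    }
)

cgrt = remap_country_names(cgrt_raw, mapping)
```

A left join is used here so that a missing mapping entry never silently drops rows; a stricter pipeline might raise an error on unmapped names instead.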

cowidev.megafile.steps.core

cowidev.megafile.steps.core.get_base_dataset(logger)

Gets OWID datasets from: JHU, reproduction rate, hospitalizations, testing, vaccinations, CGRT.

cowidev.megafile.steps.hosp

merge

cowidev.megafile.steps.hosp.get_hosp(data_file: str)

cowidev.megafile.steps.jhu

cowidev.megafile.steps.jhu.add_cumulative_deaths_last12m(df: DataFrame) -> DataFrame
cowidev.megafile.steps.jhu.get_jhu(jhu_dir: str)

Reads each COVID-19 JHU dataset located in /public/data/jhu/. Melts each dataframe to vertical format (1 row per country and date). Merges all JHU dataframes into one with outer joins.

Returns:

jhu {dataframe}
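
The melt-then-merge pattern described above can be sketched as follows. This is a minimal illustration with made-up numbers. The helper `melt_wide`, the column names `date`, `location`, `new_cases` and `new_deaths`, and the assumption that each input file has one column per country are hypothetical and may differ from the real JHU files.

```python
from functools import reduce

import pandas as pd


def melt_wide(df: pd.DataFrame, value_name: str) -> pd.DataFrame:
    """Hypothetical helper: turn one-column-per-country data into long format."""
    return df.melt(id_vars="date", var_name="location", value_name=value_name)


# Made-up wide tables: one row per date, one column per country.
cases = pd.DataFrame(
    {"date": ["2021-01-01", "2021-01-02"], "France": [10, 12], "Spain": [7, 9]}
)
deaths = pd.DataFrame({"date": ["2021-01-01"], "France": [1]})

long_frames = [melt_wide(cases, "new_cases"), melt_wide(deaths, "new_deaths")]

# Outer joins keep every (date, location) pair seen in any input;
# metrics missing for a pair are left as NaN.
jhu = reduce(
    lambda left, right: left.merge(right, on=["date", "location"], how="outer"),
    long_frames,
)
```

With outer joins, a country or date present in only one file still appears in the result, which is why the merged table can contain missing values.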

cowidev.megafile.steps.macro

cowidev.megafile.steps.macro.add_macro_variables(complete_dataset: DataFrame, macro_variables: dict, data_dir: str)

Appends a list of ‘macro’ (not directly COVID-related) variables to the dataset. The data is denormalized, i.e. each yearly value (for example GDP per capita) is added to each row of the complete dataset. This is meant to facilitate the use of our dataset by non-experts.
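
To make the denormalization concrete, here is a minimal sketch in pandas. It assumes each macro variable is available as a table with one value per location. The helper `add_macro`, the column names, and the numbers are made up for illustration; the real function reads its inputs from the files named in `macro_variables` under `data_dir`.

```python
import pandas as pd


def add_macro(complete: pd.DataFrame, macro_frames: dict) -> pd.DataFrame:
    """Hypothetical helper: repeat each per-location value on every daily row."""
    for name, macro in macro_frames.items():
        # Left join on location only, so the single yearly value is copied
        # onto every date for that location.
        complete = complete.merge(macro[["location", name]], on="location", how="left")
    return complete


# Made-up daily data: two rows for France, one for Spain.
complete = pd.DataFrame(
    {
        "location": ["France", "France", "Spain"],
        "date": ["2021-01-01", "2021-01-02", "2021-01-01"],
        "new_cases": [10, 12, 7],
    }
)

# Made-up macro table with one value per location.
macro_frames = {
    "gdp_per_capita": pd.DataFrame(
        {"location": ["France", "Spain"], "gdp_per_capita": [100.0, 200.0]}
    )
}

denormalized = add_macro(complete, macro_frames)
```

The result has the same number of rows as the daily data, with the macro value duplicated across dates. That redundancy is the trade-off for letting users work from a single flat file.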

cowidev.megafile.steps.reprod

merge

cowidev.megafile.steps.reprod.get_reprod(file_url: str, country_mapping: str)

cowidev.megafile.steps.test

cowidev.megafile.steps.test.get_testing()

Reads the main COVID-19 testing dataset located in /public/data/testing/. Rearranges the Entity column to separate location from testing units. Checks for duplicated location/date pairs, as there can be more than one time series per country.

Returns:

testing {dataframe}
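
The Entity split and the duplicate check can be sketched like this. The sketch assumes Entity values look like "France - tests performed". That separator, the column names `Date`, `location` and `tests_units`, and the sample rows are assumptions for illustration, not taken from the actual file.

```python
import pandas as pd

# Made-up rows: France has two time series (different units) on the same date.
testing = pd.DataFrame(
    {
        "Entity": [
            "France - tests performed",
            "France - people tested",
            "Spain - tests performed",
        ],
        "Date": ["2021-01-01", "2021-01-01", "2021-01-01"],
        "Cumulative total": [100, 80, 50],
    }
)

# Split "location - units" on the first separator only.
testing[["location", "tests_units"]] = testing["Entity"].str.split(
    " - ", n=1, expand=True
)

# Flag every row whose (location, date) pair occurs more than once.
is_duplicated = testing.duplicated(subset=["location", "Date"], keep=False)
duplicates = testing[is_duplicated]
```

Here the two France rows are flagged, since they share a location and date but report different units. How such conflicts are resolved (for example, which series is kept) is not covered by this sketch.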

cowidev.megafile.steps.variants

cowidev.megafile.steps.variants.get_variants(cases_file: str, variants_file: str) -> DataFrame

Fetches the processed data from CoVariants.org and merges it with biweekly cases from JHU.

cowidev.megafile.steps.variants.read(path, **kwargs)

cowidev.megafile.steps.vax

cowidev.megafile.steps.vax._add_rolling(df: DataFrame) -> DataFrame
cowidev.megafile.steps.vax.add_rolling_vaccinations(df: DataFrame) -> DataFrame
cowidev.megafile.steps.vax.get_vax(data_file)

cowidev.megafile.steps.xm

cowidev.megafile.steps.xm._add_last12m_to_metric(df: DataFrame, column_metric: str, column_location: str, scaling: int, scaling_slug: str) -> DataFrame
cowidev.megafile.steps.xm.add_excess_mortality(df: DataFrame, wmd_hmd_file: str, economist_file: str) -> DataFrame
cowidev.megafile.steps.add_cumulative_deaths_last12m(df: DataFrame) -> DataFrame
cowidev.megafile.steps.add_excess_mortality(df: DataFrame, wmd_hmd_file: str, economist_file: str) -> DataFrame
cowidev.megafile.steps.add_macro_variables(complete_dataset: DataFrame, macro_variables: dict, data_dir: str)

Appends a list of ‘macro’ (not directly COVID-related) variables to the dataset. The data is denormalized, i.e. each yearly value (for example GDP per capita) is added to each row of the complete dataset. This is meant to facilitate the use of our dataset by non-experts.

cowidev.megafile.steps.add_rolling_vaccinations(df: DataFrame) -> DataFrame
cowidev.megafile.steps.get_base_dataset(logger)

Gets OWID datasets from: JHU, reproduction rate, hospitalizations, testing, vaccinations, CGRT.