cowidev.megafile.steps#

cowidev.megafile.steps.cgrt#

merge

cowidev.megafile.steps.cgrt.clean_cgrt(url, columns_rename, country_mapping)[source]#

cowidev.megafile.steps.cgrt.get_cgrt(bsg_latest: str, bsg_diff_latest: str, country_mapping: str)[source]#

Downloads the latest OxCGRT dataset from BSG’s GitHub repository. Remaps BSG country names to OWID country names.

Returns: cgrt {dataframe}
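The remapping step can be sketched with a lookup table and a left join. This is a minimal illustration, not the code behind `get_cgrt`: the helper `remap_countries` and the column names `country_name`, `owid_name`, `location` and `stringency_index` are assumptions made for the example.

```python
import pandas as pd


def remap_countries(df: pd.DataFrame, mapping: pd.DataFrame) -> pd.DataFrame:
    """Replace source country names with OWID names (hypothetical helper)."""
    merged = df.merge(mapping, on="country_name", how="left")
    # Fail loudly on names missing from the mapping instead of dropping them.
    missing = merged.loc[merged["owid_name"].isna(), "country_name"].unique()
    if len(missing) > 0:
        raise ValueError(f"Unmapped countries: {sorted(missing)}")
    return merged.drop(columns="country_name").rename(columns={"owid_name": "location"})


# Toy inputs; the column layout is assumed, not taken from the OxCGRT files.
raw = pd.DataFrame(
    {
        "country_name": ["Kyrgyz Republic", "Slovak Republic"],
        "stringency_index": [50.0, 40.0],
    }
)
mapping = pd.DataFrame(
    {
        "country_name": ["Kyrgyz Republic", "Slovak Republic"],
        "owid_name": ["Kyrgyzstan", "Slovakia"],
    }
)
cgrt = remap_countries(raw, mapping)
```

Raising on unmapped names is a design choice of this sketch; it keeps a new or renamed country from silently disappearing from the output.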

cowidev.megafile.steps.core#

cowidev.megafile.steps.core.get_base_dataset(logger)[source]#: Get OWID datasets from: JHU, reproduction rate, hospitalizations, testing, vaccinations, CGRT.

cowidev.megafile.steps.hosp#

merge

cowidev.megafile.steps.hosp.get_hosp(data_file: str)[source]#

cowidev.megafile.steps.jhu#

cowidev.megafile.steps.jhu.add_cumulative_deaths_last12m(df: DataFrame) → DataFrame[source]#

cowidev.megafile.steps.jhu.get_jhu(jhu_dir: str)[source]#

Reads each COVID-19 JHU dataset located in /public/data/jhu/. Melts the dataframe to vertical format (1 row per country and date). Merges all JHU dataframes into one with outer joins.

Returns: jhu {dataframe}
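The melt-then-merge pattern described above can be sketched as follows. It assumes wide inputs with one `date` column and one column per country; the helper `melt_wide` and the metric names `new_cases` and `new_deaths` are illustrative, and the real files in /public/data/jhu/ may be laid out differently.

```python
from functools import reduce

import pandas as pd


def melt_wide(df: pd.DataFrame, value_name: str) -> pd.DataFrame:
    """Turn a wide frame (one column per country) into one row per country and date."""
    return df.melt(id_vars="date", var_name="location", value_name=value_name)


# Toy wide-format inputs (assumed layout).
cases_wide = pd.DataFrame(
    {"date": ["2021-01-01", "2021-01-02"], "France": [10, 12], "Spain": [7, 9]}
)
deaths_wide = pd.DataFrame({"date": ["2021-01-01"], "France": [1], "Italy": [2]})

frames = [melt_wide(cases_wide, "new_cases"), melt_wide(deaths_wide, "new_deaths")]

# Outer joins keep every (date, location) pair that appears in any input.
jhu = reduce(
    lambda left, right: left.merge(right, on=["date", "location"], how="outer"),
    frames,
)
jhu = jhu.sort_values(["location", "date"]).reset_index(drop=True)
```

With outer joins, a country present in only one file (Italy here) still gets a row, with missing values in the metrics it lacks.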

cowidev.megafile.steps.macro#

cowidev.megafile.steps.macro.add_macro_variables(complete_dataset: DataFrame, macro_variables: dict, data_dir: str)[source]#: Appends a list of ‘macro’ (not directly COVID-related) variables to the dataset. The data is denormalized, i.e. each yearly value (for example GDP per capita) is added to each row of the complete dataset. This is meant to facilitate the use of our dataset by non-experts.
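Denormalization here amounts to a many-to-one left join on the country column. The sketch below uses made-up column names (`location`, `gdp_per_capita`) and illustrative values; it does not reproduce how `add_macro_variables` reads the `macro_variables` dict or the files under `data_dir`.

```python
import pandas as pd

# Daily COVID rows: several rows per country (toy data).
complete = pd.DataFrame(
    {
        "location": ["France", "France", "Spain"],
        "date": ["2021-01-01", "2021-01-02", "2021-01-01"],
        "new_cases": [10, 12, 7],
    }
)

# One yearly value per country (illustrative numbers, not real statistics).
gdp = pd.DataFrame(
    {"location": ["France", "Spain"], "gdp_per_capita": [38000.0, 34000.0]}
)

# validate="many_to_one" raises if the macro table has duplicate countries,
# which would otherwise multiply rows in the result.
denormalized = complete.merge(gdp, on="location", how="left", validate="many_to_one")
```

Every daily row for a country now carries the same macro value, so the result can be used without any further joins.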

cowidev.megafile.steps.reprod#

merge

cowidev.megafile.steps.reprod.get_reprod(file_url: str, country_mapping: str)[source]#

cowidev.megafile.steps.test#

cowidev.megafile.steps.test.get_testing()[source]#

Reads the main COVID-19 testing dataset located in /public/data/testing/. Rearranges the Entity column to separate location from testing units. Checks for duplicated location/date couples, as we can have more than 1 time series per country.

Returns: testing {dataframe}
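A rough sketch of the Entity split and the duplicate check, assuming Entity values of the form `"<location> - <units>"`. That format and the column names `Date`, `total_tests`, `location` and `tests_units` are assumptions for the example, not something this page states.

```python
import pandas as pd

# Toy data: France has two series (two unit types) on the same date.
testing = pd.DataFrame(
    {
        "Entity": [
            "France - people tested",
            "France - tests performed",
            "Spain - tests performed",
        ],
        "Date": ["2021-01-01", "2021-01-01", "2021-01-01"],
        "total_tests": [100, 150, 80],
    }
)

# Split on the first " - " only, so units containing a dash stay intact.
parts = testing["Entity"].str.split(" - ", n=1, expand=True)
testing["location"] = parts[0]
testing["tests_units"] = parts[1]

# keep=False flags every row of a duplicated (location, Date) pair.
duplicated = testing[testing.duplicated(subset=["location", "Date"], keep=False)]
```

How the duplicates are then resolved (for example by keeping one series per country) is left open in this sketch.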

cowidev.megafile.steps.variants#

cowidev.megafile.steps.variants.get_variants(cases_file: str, variants_file: str) → DataFrame[source]#: Fetches the processed data from CoVariants.org and merges it with biweekly cases from JHU.

cowidev.megafile.steps.variants.read(path, **kwargs)[source]#

cowidev.megafile.steps.vax#

cowidev.megafile.steps.vax._add_rolling(df: DataFrame) → DataFrame[source]#

cowidev.megafile.steps.vax.add_rolling_vaccinations(df: DataFrame) → DataFrame[source]#

cowidev.megafile.steps.vax.get_vax(data_file)[source]#

cowidev.megafile.steps.xm#

cowidev.megafile.steps.xm._add_last12m_to_metric(df: DataFrame, column_metric: str, column_location: str, scaling: int, scaling_slug: str) → DataFrame[source]#

cowidev.megafile.steps.xm.add_excess_mortality(df: DataFrame, wmd_hmd_file: str, economist_file: str) → DataFrame[source]#

cowidev.megafile.steps.add_cumulative_deaths_last12m(df: DataFrame) → DataFrame[source]#

cowidev.megafile.steps.add_excess_mortality(df: DataFrame, wmd_hmd_file: str, economist_file: str) → DataFrame[source]#

cowidev.megafile.steps.add_macro_variables(complete_dataset: DataFrame, macro_variables: dict, data_dir: str)[source]#: Appends a list of ‘macro’ (not directly COVID-related) variables to the dataset. The data is denormalized, i.e. each yearly value (for example GDP per capita) is added to each row of the complete dataset. This is meant to facilitate the use of our dataset by non-experts.

cowidev.megafile.steps.add_rolling_vaccinations(df: DataFrame) → DataFrame[source]#

cowidev.megafile.steps.get_base_dataset(logger)[source]#: Get OWID datasets from: JHU, reproduction rate, hospitalizations, testing, vaccinations, CGRT.