Regions data
Description of the dataset
This dataset contains useful information about the countries and regions of OWID datasets.
There are no `snapshot` or `meadow` steps for this data, only a `garden` step.
We could consider creating a `grapher` step to feed the charts on the [region definitions page](https://ourworldindata.org/world-region-map-definitions).
All tables in the dataset are indexed by `code`, which is defined as the ISO alpha-3 code, when it exists, and otherwise as a custom OWID code. For example, since Europe doesn't have an ISO code, it has the code `OWID_EUR`. There are no specific rules for how to define these custom codes, but as a guideline:
- The code following `OWID_` should not exist already as an ISO code. This rule has not been followed in the past, and we have `OWID_NAM` for `North America` (while `NAM` is also the code of `Namibia`). This can lead to confusion, and hence we should try to apply this rule in the future.
- If the region is a sub-region of a country, append its code after the country's code. For example, the code of `Madrid` (which is a region in Spain) could be `OWID_ESP_MAD`, given that `MAD` does not exist (`Madagascar`'s code is `MDG`).
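The guideline above can be sketched as a small check. This is only an illustration: the helper `check_custom_code` and the tiny `ISO_ALPHA3` set are hypothetical and not part of `etl`, and treating every part before the last as a parent country code is an assumption of this sketch.

```python
# Tiny illustrative subset of ISO alpha-3 codes (an assumption for this sketch;
# a real check would need the full ISO 3166-1 alpha-3 list).
ISO_ALPHA3 = {"ESP", "FRA", "MDG", "NAM"}

PREFIX = "OWID_"


def check_custom_code(code, iso_codes=ISO_ALPHA3):
    """Return a list of guideline violations for a custom code (empty if none)."""
    if not code.startswith(PREFIX):
        return [f"{code!r} does not start with {PREFIX!r}"]
    parts = code[len(PREFIX):].split("_")
    problems = []
    # For sub-regions, each part before the last should be an existing country code.
    for parent in parts[:-1]:
        if parent not in iso_codes:
            problems.append(f"{parent!r} is not a known ISO alpha-3 code")
    # The last part should not collide with an existing ISO code.
    if parts[-1] in iso_codes:
        problems.append(f"{parts[-1]!r} already exists as an ISO alpha-3 code")
    return problems


if __name__ == "__main__":
    for candidate in ["OWID_EUR", "OWID_ESP_MAD", "OWID_NAM"]:
        print(candidate, check_custom_code(candidate))
```

With this subset, `OWID_EUR` and `OWID_ESP_MAD` pass, while `OWID_NAM` is flagged because `NAM` is Namibia's code.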
Tables contained in the `regions` dataset:
- `aliases`: Region aliases (i.e. variants of the region name). Columns:
  - `alias`: Alternative name for a region. For example, for `United States`, there is a row for the alias `US` and another row for the alias `USA`.
- `definitions`: Region definitions. Columns:
  - `name`: Name of the region (that will be shown in most charts).
  - `short_name`: Short version of the name of the region (that will be shown in specific charts that have limited space).
  - `region_type`: Region type. Currently, the options are:
    - `country`: Country (e.g. 'France'). The official status of a region may be unclear in some cases, but we tend to include as many countries as possible.
    - `continent`: Inhabited continent (namely 'Africa', 'Asia', 'Europe', 'North America', 'Oceania', and 'South America').
    - `aggregate`: Region that is not a country and includes other countries (e.g. 'Channel Islands', 'European Union (27)', 'Melanesia', 'Polynesia', 'World').
    - `other`: Regions that may not be considered countries by certain data providers, or that have a custom definition (like 'Serbia excluding Kosovo') and that are not aggregates of other countries.
  - `is_historical`: True if the region does not exist anymore, and False otherwise.
  - `defined_by`: Institution that contained the region in a dataset. For example, if a region `North America (BP)` is added to the `regions` dataset, `defined_by` would be `bp` (the namespace that dataset belongs to).
- `legacy_codes`: Legacy codes. Columns:
  - `cow_code`: Correlates of War numeric code.
  - `cow_letter`: Correlates of War letter code.
  - `imf_code`: International Monetary Fund code.
  - `iso_alpha2`: 2-letter International Organization for Standardization alpha-2 code.
  - `iso_alpha3`: 3-letter International Organization for Standardization alpha-3 code.
  - `kansas_code`: TODO: Describe this code.
  - `legacy_country_id`: TODO: Describe this code.
  - `legacy_entity_id`: TODO: Describe this code.
  - `marc_code`: MARC (Machine Readable Cataloging) code.
  - `ncd_code`: TODO: Describe this code.
  - `penn_code`: Country code for the Penn World Tables.
  - `unctad_code`: UNCTAD (United Nations Conference on Trade and Development) code.
  - `wikidata_code`: Wikidata code. To create the URL of the wikidata page of the region, append the wikidata code to: http://www.wikidata.org/entity/
- `members`: Region members (roughly, sub-regions that would need to be added up when aggregating data for the region). Columns:
  - `member`: Region member. For example, region `Africa` contains one row for each region in Africa (including historical regions).
- `related`: Other possible region members to be aware of. This includes regions with an unclear official status that may be members of another region according to some data providers, but not according to others. Columns:
  - `member`: Related region (e.g. an overseas territory).
- `transitions`: Historical transitions between regions. Columns:
  - `end_year`: Last year the historical region existed.
  - `successor`: Country that existed (from `end_year` on) in the same geographical space as the historical region.
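As a rough illustration of how these tables could be used together, here is a minimal sketch with plain dictionaries. The rows are hand-written and deliberately incomplete, and the helper names (`resolve_alias`, `aggregate`, `wikidata_url`) are hypothetical; none of this is the actual `etl` API.

```python
WIKIDATA_ENTITY_URL = "http://www.wikidata.org/entity/"

# alias -> code (mimics the aliases table)
ALIASES = {"US": "USA", "USA": "USA"}
# region code -> member codes (mimics the members table; incomplete on purpose)
MEMBERS = {"OWID_NAM": ["USA", "CAN", "MEX"]}
# code -> wikidata_code (mimics the legacy_codes table)
WIKIDATA_CODES = {"USA": "Q30"}


def resolve_alias(name, aliases=ALIASES):
    """Return the region code for an alias, or None if the alias is unknown."""
    return aliases.get(name)


def aggregate(region_code, values, members=MEMBERS):
    """Sum the values of a region's members, skipping members without data."""
    return sum(values[m] for m in members.get(region_code, []) if m in values)


def wikidata_url(code, wikidata_codes=WIKIDATA_CODES):
    """Build the Wikidata page URL by appending the code to the entity prefix."""
    return WIKIDATA_ENTITY_URL + wikidata_codes[code]


if __name__ == "__main__":
    print(resolve_alias("US"))
    print(aggregate("OWID_NAM", {"USA": 10, "CAN": 3}))
    print(wikidata_url("USA"))
```

Skipping members without data is a design choice of this sketch; a real aggregation may instead need to flag missing members.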
How to make changes to the dataset
New aliases and short names can be added to the dataset without creating a new dataset version. For that, we can use the `harmonize` tool in `etl`.
TODO: Close issue: https://github.com/owid/etl/issues/845
For any other type of change to the dataset:
- If the change does not affect existing datasets, it can be done without creating a new dataset version. For example, adding a new region for a particular institution (e.g. `North America (BP)`) does not affect any other existing dataset.
- If the change does affect existing datasets, then a new version needs to be created. For example, if sub-regions of a country are added, and they are also added as the members of a continent, this could affect existing datasets (that happened to have data for those sub-regions). However, if it's clear that the changes do not affect existing countries, then there is no need to update the dataset version.