Title: | CEPII's GeoDist datasets in R |
---|---|
Description: | Provides data on countries and their main city or agglomeration and the different distance measures and dummy variables indicating whether two countries are contiguous, share a common language or a colonial relationship. The reference article for these datasets is Mayer and Zignago (2011). |
Authors: | Mauricio Vargas [aut, cre] , Centre d'études prospectives et d'informations internationales (CEPII) [dtc] |
Maintainer: | Mauricio Vargas <[email protected]> |
License: | CC0 |
Version: | 0.1 |
Built: | 2024-11-11 06:16:09 UTC |
Source: | https://github.com/pachadotdev/cepiigeodist |
Provides different distance measures and dummy variables indicating whether two countries are contiguous, share a common language or have a colonial relationship. There are two kinds of distance measures: simple distances, for which only one city per country is needed to calculate international distances; and weighted distances, which require data on the principal cities of each country. The simple distances are calculated with the great circle formula, using the latitude and longitude of each country's most important city (in terms of population) or of its official capital. These two variables (dist and distcap) incorporate internal distances based on the areas provided in the 'geo_cepii' dataset. The two weighted distance measures use city-level data to assess the geographic distribution of population inside each nation. The idea is to calculate the distance between two countries from the bilateral distances between their largest cities, with each inter-city distance weighted by the share of the city in its country's overall population. The distance formula used is a generalized mean of city-to-city bilateral distances developed by Head and Mayer (2002), which takes the arithmetic mean and the harmonic mean as special cases.
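The great circle calculation mentioned above can be sketched as follows. This is a minimal illustration using the haversine form of the formula, not necessarily the exact variant CEPII used. The function name `great_circle_km` and the mean Earth radius of 6371 km are assumptions made for this example.

```r
# Great-circle distance in km between two points, using the haversine formula.
# Inputs are decimal degrees. radius_km = 6371 is an assumed mean Earth radius.
great_circle_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  phi1 <- lat1 * to_rad
  phi2 <- lat2 * to_rad
  dphi <- (lat2 - lat1) * to_rad
  dlambda <- (lon2 - lon1) * to_rad
  a <- sin(dphi / 2)^2 + cos(phi1) * cos(phi2) * sin(dlambda / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# A quarter of the equator: two points on latitude 0, 90 degrees of longitude apart
quarter_equator <- great_circle_km(0, 0, 0, 90)
```

Applied to the `lat` and `lon` columns of 'geo_cepii', a function like this should give values of the same order as `dist` or `distcap`, although small differences are to be expected if the original computation used another radius or formula.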
A data frame with 50176 observations on the following 14 variables.
iso_o
Country of origin as ISO codes in three characters.
iso_d
Country of destination as ISO codes in three characters.
contig
Variable coded as 1 when the two countries are next to each other and 0 otherwise.
comlang_off
Variable coded as 1 when the two countries share the same official language.
comlang_ethno
Variable coded as 1 when the two countries have at least 9% of their population speaking the same language.
colony
Variable coded as 1 when the country in 'iso_o' was ever a colony of the country in 'iso_d'.
comcol
Variable coded as 1 when the two countries have had a common colonizer after 1945.
curcol
Variable coded as 1 when the country in 'iso_o' is a colony of the country in 'iso_d'.
col45
Variable coded as 1 when the country in 'iso_o' was a colony of the country in 'iso_d' after 1945.
smctry
Variable coded as 1 when the two countries were or are the same country.
dist
Simple distance (most populated cities, km)
distcap
Simple distance between capitals (km)
distw
Weighted distance (population-weighted, km) with theta = 1, where theta measures the sensitivity of trade flows to the bilateral distance d_kl between cities k and l.
distwces
Weighted distance (population-weighted, km) with theta = -1.
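The weighted distances `distw` and `distwces` can be illustrated with a small sketch of the Head and Mayer (2002) generalized mean, d_ij = (sum over k, l of w_k * w_l * d_kl^theta)^(1/theta), where w_k and w_l are population shares of cities k and l. The function name `weighted_distance`, the toy distance matrix and the populations are all hypothetical. The weights here are normalized over the cities supplied, which is an assumption of this example.

```r
# Generalized mean of city-to-city distances (sketch, toy inputs only).
# d_kl:  matrix of distances, rows = cities of country i, cols = cities of country j
# pop_i: populations of the cities of country i (same order as rows)
# pop_j: populations of the cities of country j (same order as columns)
weighted_distance <- function(d_kl, pop_i, pop_j, theta = 1) {
  stopifnot(nrow(d_kl) == length(pop_i), ncol(d_kl) == length(pop_j))
  # weight for each city pair = product of the two population shares
  w <- outer(pop_i / sum(pop_i), pop_j / sum(pop_j))
  sum(w * d_kl^theta)^(1 / theta)
}

# Hypothetical example: two cities per country, equal populations
d_toy <- matrix(c(100, 200, 300, 400), nrow = 2)
arithmetic <- weighted_distance(d_toy, c(1, 1), c(1, 1), theta = 1)   # like distw
harmonic   <- weighted_distance(d_toy, c(1, 1), c(1, 1), theta = -1)  # like distwces
```

With theta = 1 the result is the weighted arithmetic mean of the inter-city distances, and with theta = -1 it is the weighted harmonic mean, which gives more weight to short distances.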
http://www.cepii.fr/CEPII/en/bdd_modele/download.asp?id=6
Mayer, T. & Zignago, S. (2011). Notes on CEPII's distances measures: the GeoDist Database. CEPII Working Paper 2011-25.
Head, K. & Mayer, T. (2002). Illusory Border Effects: Distance Mismeasurement Inflates Estimates of Home Bias in Trade. CEPII Working Paper 2002-01.
# filter countries that share borders
dist_cepii[dist_cepii$contig == 1, ]
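The row count of 50176 equals 224 squared, which is consistent with the table holding one row per ordered pair of countries. Under that assumption, a contiguity filter returns each border twice, once in each direction. The sketch below uses a toy table with placeholder codes (`AAA`, `BBB`, `CCC`) and made-up values, not real data, to show one way to keep a single row per unordered pair.

```r
# Toy stand-in for dist_cepii: placeholder ISO codes, made-up contiguity flags
pairs_toy <- data.frame(
  iso_o  = c("AAA", "BBB", "AAA", "CCC", "BBB", "CCC"),
  iso_d  = c("BBB", "AAA", "CCC", "AAA", "CCC", "BBB"),
  contig = c(1, 1, 0, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Same filter as in the example above: both directions of each border are kept
borders <- pairs_toy[pairs_toy$contig == 1, ]

# Keep each border once by requiring the origin code to sort before the destination
borders_once <- borders[borders$iso_o < borders$iso_d, ]
```

The same two steps should carry over to `dist_cepii` itself, since it uses the same column names.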
Provides three identification codes for each country according to the ISO classification, along with the country's area in square kilometers, which is used in particular to calculate its internal distance. Variables indicating whether the country is landlocked and which continent it is part of are also included.
A data frame with 238 observations on the following 34 variables.
iso2
ISO codes in two characters.
iso3
ISO codes in three characters.
cnum
ISO numeric codes in three digits.
country
Name of country in English.
pays
Name of country in French.
area
Country's area in km².
dis_int
Internal distance of country i, d_ii = 0.67 * sqrt(area / pi), an often used measure of the average distance between producers and consumers in a country. See Head and Mayer (2002) for more on this topic.
landlocked
Dummy variable set equal to 1 for landlocked countries.
continent
Continent to which the country belongs.
city_en
Names of capitals or main cities of the country in English.
city_fr
Names of capitals or main cities of the country in French.
lat
Latitude of the city.
lon
Longitude of the city.
cap
Variable equal to 1 if the city is the capital of the country, 0 if the city is the most populated city (maincity equal to 1) but not the capital, and 2, in cases of two capitals, if the city is the most populated but is the "second" capital or the previous capital.
maincity
Variable coded as 1 when the city is the most populated of the country and as 2 otherwise.
citynum
Number of cities for each country used to calculate the weighted distances described in Mayer and Zignago, 2011.
langoff_1
Official or national languages and languages spoken by at least 20% of the population of the country (and spoken in another country of the world), following the same logic as the "open-circuit languages" in Mélitz (2002).
langoff_2
Same as langoff_1.
langoff_3
Same as langoff_1.
lang20_1
Languages (mother tongue, lingua francas or second languages) spoken by at least 20% of the population of the country.
lang20_2
Same as lang20_1.
lang20_3
Same as lang20_1.
lang20_4
Same as lang20_1.
lang9_1
Languages (mother tongue, lingua francas or second languages) spoken by between 9% and 20% of the population of the country.
lang9_2
Same as lang9_1.
lang9_3
Same as lang9_1.
lang9_4
Same as lang9_1.
colonizer1
Colonizers of the country for a relatively long period of time and with a substantial participation in the governance of the colonized country.
colonizer2
Same as colonizer1.
colonizer3
Same as colonizer1.
colonizer4
Same as colonizer1.
short_colonizer1
Colonizers of the country for a relatively short period of time or with only low involvement in the governance of the colonized country.
short_colonizer2
Same as short_colonizer1.
short_colonizer3
Same as short_colonizer1.
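The internal distance formula given for `dis_int` can be written as a one-line function. This is a small sketch. The function name `internal_distance_km` and the example areas are hypothetical, chosen only to show the calculation.

```r
# Internal distance d_ii = 0.67 * sqrt(area / pi), with area in square kilometers
internal_distance_km <- function(area_km2) {
  0.67 * sqrt(area_km2 / pi)
}

# Hypothetical areas in square kilometers, for illustration only
example_areas <- c(1000, 50000, 2000000)
example_dii <- internal_distance_km(example_areas)
```

Because the formula depends only on area, a country four times as large gets an internal distance exactly twice as long.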
http://www.cepii.fr/CEPII/en/bdd_modele/download.asp?id=6
Mayer, T. & Zignago, S. (2011). Notes on CEPII's distances measures: the GeoDist Database. CEPII Working Paper 2011-25.
Head, K. & Mayer, T. (2002). Illusory Border Effects: Distance Mismeasurement Inflates Estimates of Home Bias in Trade. CEPII Working Paper 2002-01.
# filter to avoid multiple records for the same country
geo_cepii[geo_cepii$cap == 1 & geo_cepii$maincity == 1, ]
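Going by the definitions of `cap` and `maincity` above, a country whose capital is not its most populated city has no row with both values equal to 1, so the filter in the example may drop such countries altogether. The sketch below uses a toy table with placeholder codes and made-up values, not real data, to compare that filter with keeping the capital row only, and then attaches a country attribute to a pair table shaped like 'dist_cepii'. The column name `landlocked_o` is an assumption of this example.

```r
# Toy stand-in for geo_cepii: country BBB has a capital that is not its largest city
geo_toy <- data.frame(
  iso3       = c("AAA", "BBB", "BBB", "CCC"),
  city_en    = c("City A", "City B1", "City B2", "City C"),
  cap        = c(1, 1, 0, 1),
  maincity   = c(1, 2, 1, 1),
  landlocked = c(0, 1, 1, 0),
  stringsAsFactors = FALSE
)

# Filter from the example above: BBB disappears in this toy table
both <- geo_toy[geo_toy$cap == 1 & geo_toy$maincity == 1, ]

# Alternative: keep the capital row, which retains one row per country here
capitals <- geo_toy[geo_toy$cap == 1, ]

# Toy pair table with the same key columns as dist_cepii
pairs_toy <- data.frame(
  iso_o = c("AAA", "BBB"),
  iso_d = c("BBB", "CCC"),
  stringsAsFactors = FALSE
)

# Attach the origin country's landlocked flag to each pair
merged <- merge(pairs_toy, capitals[, c("iso3", "landlocked")],
                by.x = "iso_o", by.y = "iso3", all.x = TRUE)
names(merged)[names(merged) == "landlocked"] <- "landlocked_o"
```

Which filter is appropriate depends on the analysis. It is worth checking the number of distinct `iso3` values before and after filtering the real dataset.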