Introduction to geoheatmap

The geoheatmap R package aims to provide an easy way to build cartogram heatmaps for regions worldwide.

For the graph aesthetics, we took inspiration from the maps of the statebins package and extended its functionality by allowing user-defined grids, which means that the cartogram heatmaps can also represent territories outside the US. Though any well-defined grid will technically work, grids from geofacet are an excellent starting point.

Installation and Setup

The geoheatmap package imports grids from the geofacet package and can also produce interactive (hover) graphs, a feature built on the plotly package.

install.packages("geoheatmap")
library(geoheatmap)
library(geofacet)
library(plotly)
library(viridisLite)

The viridisLite package is loaded here to provide colour-blind-friendly palettes for the examples below.

Usage

This vignette demonstrates the package on continuous and on discrete data. It also shows how to specify which grid column holds the region names and how to produce an interactive (hover) plot instead of a static one.

Dataset for examples

The data used for the following examples is available in the geoheatmap package under the name internet, and is originally from the World Bank Group (2024), retrieved from https://data.worldbank.org/indicator/IT.NET.USER.ZS.

data(internet, package = "geoheatmap")
head(internet)
#> # A tibble: 6 × 3
#>   country      year users
#>   <chr>       <dbl> <dbl>
#> 1 Aruba        1990     0
#> 2 Afghanistan  1990     0
#> 3 Angola       1990     0
#> 4 Albania      1990     0
#> 5 Andorra      1990     0
#> 6 Arab World   1990     0

The internet dataset lists countries and regional aggregates worldwide (country) for each year from 1990 to 2016 (year). Rows are ordered by year and, judging by the output above, by country code rather than by name within a year (Aruba precedes Afghanistan). The column users gives the percentage of individuals in a given country who used the internet in some capacity in the previous 3 months.

For the examples in this vignette, we use only the data from one year, specifically from 2015.

internet_2015 <- subset(internet, year == 2015)

Continuous scale examples

In the first example, a cartogram heatmap is shown for continuous data. The graph shows internet usage across Europe with a gradient color scale and the europe_countries_grid1 grid from the geofacet package. Though most countries are well above the halfway mark of population internet use, the highest percentages are found in Northwestern European countries.

geoheatmap(facet_data = internet_2015, grid_data = europe_countries_grid1,
           facet_col = "country", value_col = "users",
           low = "#56B1F7", high = "#132B43") +
  labs(title = "2015 Internet Usage in Europe")

The gradient scale is the default for continuous data, but you are of course not limited to this default. You can change the fill scale via the ggplot2_scale_function argument, and add additional arguments to be passed down directly inside the function.

For example, if a particular midpoint is of interest, a diverging color scale can also be applied to continuous data. To do so, pass scale_fill_gradient2 to the ggplot2_scale_function argument and add the parameters this scale needs (low, mid, high, midpoint).

geoheatmap(facet_data = internet_2015, 
           grid_data = europe_countries_grid1,
           facet_col = "country", 
           value_col = "users",
           name = "Internet users: divergent",
           ggplot2_scale_function = scale_fill_gradient2, 
           low =  viridis(10)[1], 
           mid = "white", 
           high = viridis(10)[8], 
           midpoint = 75,  
           round = TRUE) + 
  labs(title = "2015 Internet Usage in Europe")

Note that this example additionally sets round = TRUE, which draws the tiles with rounded corners.
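To see what the diverging-scale parameters do on their own, here is a minimal sketch that applies scale_fill_gradient2 to a plain ggplot2 tile plot, without geoheatmap. The data frame, its 2 x 3 layout, and the two hex colours are assumptions made up for illustration; only the scale arguments (low, mid, high, midpoint) mirror the example above.

```r
library(ggplot2)

# Hypothetical values laid out on a 2 x 3 grid (not taken from the internet data).
tiles <- data.frame(
  row   = c(1, 1, 1, 2, 2, 2),
  col   = c(1, 2, 3, 1, 2, 3),
  users = c(40, 60, 75, 80, 90, 98)
)

p <- ggplot(tiles, aes(x = col, y = row, fill = users)) +
  geom_tile(colour = "white") +
  # Values below the midpoint shade towards `low`, values above towards `high`;
  # a value equal to the midpoint takes the `mid` colour.
  scale_fill_gradient2(low = "#440154", mid = "white", high = "#6DCD59",
                       midpoint = 75, name = "Internet users") +
  scale_y_reverse() +  # put row 1 at the top, as in a grid layout
  theme_void()

p
```

Moving midpoint changes which values appear neutral, so it is worth choosing a threshold that is meaningful for the data rather than the centre of the range by default.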

Discrete scale examples

To show discrete data, users can either ask ggplot2 to bin the data or bin the data themselves.

In this first example, ggplot2 bins the data via the scale_fill_binned function. This time we focus on Africa by using the grid africa_countries_grid1 from the geofacet package. The resulting graph shows how internet use differed across African countries in 2015, possibly tied to the economic development of a given country.

geoheatmap(facet_data = internet_2015, grid_data = africa_countries_grid1,
           facet_col = "country", value_col = "users",
           name = "Internet users: binned",
           ggplot2_scale_function = scale_fill_binned,
           type = "viridis") +
  labs(title = "Internet Usage in Africa")

Another option is to discretize our data ourselves, e.g. by specifying our own breaks, and to then plot this data as is done in the next graph.

# cut() builds right-closed intervals: (-Inf, 25], (25, 50], (50, Inf)
internet_2015$users_bin <- cut(internet_2015$users,
                               breaks = c(-Inf, 25, 50, Inf),
                               labels = c("0-25", "26-50", "51 and up"))
geoheatmap(facet_data = internet_2015, grid_data = africa_countries_grid1,
           facet_col = "country", value_col = "users_bin",
           name = "Internet users: binned",
           ggplot2_scale_function = scale_fill_brewer,
           type = "seq", palette = "Greens", na.value = "grey50") +
  labs(title = "Internet Usage in Africa")

With the manual breaks we can, for example, put more focus on countries that surpassed the halfway mark (50%) of population internet usage: Morocco, Mauritius, Seychelles and South Africa.
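Because the bin labels are written by hand, it helps to check how values near the break points are classified. The sketch below reuses the breaks and labels from the example above on a few made-up test values (an assumption for illustration, not rows from the internet data).

```r
# Same breaks and labels as in the binning example.
breaks <- c(-Inf, 25, 50, Inf)
labels <- c("0-25", "26-50", "51 and up")

# Hypothetical values sitting on and around the break points.
users <- c(0, 25, 25.5, 50, 50.1, NA)

# cut() uses right-closed intervals by default, so 25 belongs to the first
# bin and 50 to the second; missing values stay missing.
bins <- cut(users, breaks = breaks, labels = labels)

data.frame(users, bin = bins)
```

Note that a value such as 25.5 lands in the bin labelled "26-50", since the labels are only names for the intervals (25, 50] and (50, Inf). If that matters for your audience, labels such as ">25-50" may describe the bins more precisely.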

Grid language options

Sometimes, grids carry local as well as English location names, with the English names used by default. If you would like to use the local version (e.g. because your data frame uses native names), you can name the grid column to match on via the merge_col argument.

As an example, let’s look at the grid de_states_grid1, which contains the states of Germany. In this grid, both name and name_de are available, and the two columns differ for half of the states. Via the merge_col argument, users can define which grid column is matched against their data.

de_states_grid1
#>    row col code                   name                name_de
#> 1    1   2   SH     Schleswig-Holstein     Schleswig-Holstein
#> 2    1   3   HH                Hamburg                Hamburg
#> 3    1   4   MV Mecklenburg-Vorpommern Mecklenburg-Vorpommern
#> 4    2   2   HB                 Bremen                 Bremen
#> 5    2   3   NI           Lower Saxony          Niedersachsen
#> 6    2   4   BE                 Berlin                 Berlin
#> 7    2   5   BB            Brandenburg            Brandenburg
#> 8    3   2   NW North Rhine-Westphalia    Nordrhein-Westfalen
#> 9    3   4   ST          Saxony-Anhalt         Sachsen-Anhalt
#> 10   4   1   SL               Saarland               Saarland
#> 11   4   2   RP   Rhineland-Palatinate        Rheinland-Pfalz
#> 12   4   3   HE                  Hesse                 Hessen
#> 13   4   4   TH              Thuringia              Thüringen
#> 14   4   5   SN                 Saxony                Sachsen
#> 15   5   3   BW      Baden-Württemberg      Baden-Württemberg
#> 16   5   4   BY                Bavaria                 Bayern

For illustration purposes, let’s make up a dataset that works with state names native to a German speaker and plot this data as a cartogram heatmap.

# Dummy data frame with German states and number of football teams
football_teams <- data.frame(state = c("Baden-Württemberg", "Bayern",
                                       "Berlin", "Brandenburg",
                                       "Bremen", "Hamburg",
                                       "Hessen", "Mecklenburg-Vorpommern",
                                       "Niedersachsen", "Nordrhein-Westfalen",
                                       "Rheinland-Pfalz", "Saarland",
                                       "Sachsen", "Sachsen-Anhalt",
                                       "Schleswig-Holstein", "Thüringen"),
                             teams = c(18, 22, 8, 6, 4, 5, 14, 3,
                                       12, 28, 10, 3, 9, 5, 7, 4)
                             )

geoheatmap(facet_data = football_teams,
           grid_data = de_states_grid1,
           facet_col = "state", value_col = "teams", merge_col = "name_de",
           name = "No. of teams",
           low = "lightblue", high = plasma(2)[1],
           round = TRUE) +
  labs(title = "Football teams in German states")

By specifying merge_col = "name_de", the geoheatmap() function merges the data set with the matching grid column before producing the plot. Though the data are purely fictional, the plot shows Nordrhein-Westfalen leading in number of football teams, something the authors suspect to be true regardless, as it is Germany’s most populous state.
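Before plotting, it can be useful to check which grid column your region names actually match. The following base R sketch illustrates the idea on a three-row excerpt of the grid shown above; it is not the package's internal code, and the object names (grid, teams, unmatched_en, unmatched_de) are assumptions chosen for this illustration.

```r
# Three rows copied from the de_states_grid1 printout above.
grid <- data.frame(
  code    = c("NI", "HE", "BY"),
  name    = c("Lower Saxony", "Hesse", "Bavaria"),
  name_de = c("Niedersachsen", "Hessen", "Bayern")
)

# Data keyed by the German state names.
teams <- data.frame(
  state = c("Niedersachsen", "Hessen", "Bayern"),
  teams = c(12, 14, 22)
)

# Names in the data that have no counterpart in each grid column.
unmatched_en <- setdiff(teams$state, grid$name)     # all three fail to match
unmatched_de <- setdiff(teams$state, grid$name_de)  # nothing left unmatched

# Joining on the German column keeps every grid cell and fills in the values.
merged <- merge(grid, teams, by.x = "name_de", by.y = "state", all.x = TRUE)
merged
```

An empty result from setdiff() against the chosen column suggests that every row of the data will find its tile; any names it returns are candidates for recoding before the plot is drawn.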

Interactive plots

You also have the option to make any given plot created with geoheatmap() interactive with plotly directly in the function call, by specifying hover = TRUE.

geoheatmap(facet_data = football_teams,
           grid_data = de_states_grid1,
           facet_col = "state", value_col = "teams", merge_col = "name_de",
           name = "No. of teams",
           low = "lightblue", high = plasma(2)[1],
           hover = TRUE)

Note that this hovering option is only available for cartogram heatmaps with un-rounded tiles. Combining round = TRUE with hover = TRUE does not work (yet): plotly has no implementation for the rounded-tile geom and issues the warning shown below.

geoheatmap(facet_data = football_teams,
           grid_data = de_states_grid1,
           facet_col = "state", value_col = "teams", merge_col = "name_de",
           name = "No. of teams",
           low = "lightblue", high = plasma(2)[1],
           round = TRUE,
           hover = TRUE)
#> Warning in geom2trace.default(dots[[1L]][[1L]], dots[[2L]][[1L]], dots[[3L]][[1L]]): geom_GeomRtile() has yet to be implemented in plotly.
#>   If you'd like to see this geom implemented,
#>   Please open an issue with your example code at
#>   https://github.com/ropensci/plotly/issues
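The warning above is raised by plotly's ggplot2 conversion layer, which handles standard geoms such as geom_tile() but not the rounded-tile geom. As a rough sketch of that conversion outside geoheatmap, the code below builds an ordinary tile plot and passes it to plotly::ggplotly(). The tile positions are made up for illustration, and whether hover = TRUE performs exactly this step internally is an assumption.

```r
library(ggplot2)
library(plotly)

# Three states with values from the dummy data above; positions are hypothetical.
tiles <- data.frame(
  row   = c(1, 2, 3),
  col   = c(1, 1, 2),
  name  = c("Hamburg", "Niedersachsen", "Bayern"),
  teams = c(5, 12, 22)
)

p <- ggplot(tiles, aes(x = col, y = row, fill = teams, text = name)) +
  geom_tile(colour = "white") +  # square tiles, which plotly can convert
  scale_y_reverse() +
  theme_void()

# Convert the static plot; the tooltip shows the state name and the fill value.
interactive_plot <- ggplotly(p, tooltip = c("text", "fill"))
interactive_plot
```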

Grids

In the above examples, we already saw three grids, but many more grids are available in the geofacet package. Additionally, users can specify their own grids.

List available grids

To get the list of names of available grids in the geofacet package so far, call:

geofacet::get_grid_names()
#> Note: More grids are available by name as listed here: https://raw.githubusercontent.com/hafen/grid-designer/master/grid_list.json
#>   [1] "us_state_grid1"                           
#>   [2] "us_state_grid2"                           
#>   [3] "eu_grid1"                                 
#>   [4] "aus_grid1"                                
#>   [5] "sa_prov_grid1"                            
#>   [6] "gb_london_boroughs_grid"                  
#>   [7] "nhs_scot_grid"                            
#>   [8] "india_grid1"                              
#>   [9] "india_grid2"                              
#>  [10] "argentina_grid1"                          
#>  [11] "br_states_grid1"                          
#>  [12] "sea_grid1"                                
#>  [13] "mys_grid1"                                
#>  [14] "fr_regions_grid1"                         
#>  [15] "de_states_grid1"                          
#>  [16] "us_or_counties_grid1"                     
#>  [17] "us_wa_counties_grid1"                     
#>  [18] "us_in_counties_grid1"                     
#>  [19] "us_in_central_counties_grid1"             
#>  [20] "se_counties_grid1"                        
#>  [21] "sf_bay_area_counties_grid1"               
#>  [22] "ua_region_grid1"                          
#>  [23] "mx_state_grid1"                           
#>  [24] "mx_state_grid2"                           
#>  [25] "scotland_local_authority_grid1"           
#>  [26] "us_state_without_DC_grid1"                
#>  [27] "italy_grid1"                              
#>  [28] "italy_grid2"                              
#>  [29] "be_province_grid1"                        
#>  [30] "us_state_grid3"                           
#>  [31] "jp_prefs_grid1"                           
#>  [32] "ng_state_grid1"                           
#>  [33] "bd_upazila_grid1"                         
#>  [34] "spain_prov_grid1"                         
#>  [35] "ch_cantons_grid1"                         
#>  [36] "ch_cantons_grid2"                         
#>  [37] "china_prov_grid1"                         
#>  [38] "world_86countries_grid"                   
#>  [39] "se_counties_grid2"                        
#>  [40] "uk_regions1"                              
#>  [41] "us_state_contiguous_grid1"                
#>  [42] "sk_province_grid1"                        
#>  [43] "ch_aargau_districts_grid1"                
#>  [44] "jo_gov_grid1"                             
#>  [45] "spain_ccaa_grid1"                         
#>  [46] "spain_prov_grid2"                         
#>  [47] "world_countries_grid1"                    
#>  [48] "br_states_grid2"                          
#>  [49] "china_city_grid1"                         
#>  [50] "kr_seoul_district_grid1"                  
#>  [51] "nz_regions_grid1"                         
#>  [52] "sl_regions_grid1"                         
#>  [53] "us_census_div_grid1"                      
#>  [54] "ar_tucuman_province_grid1"                
#>  [55] "us_nh_counties_grid1"                     
#>  [56] "china_prov_grid2"                         
#>  [57] "pl_voivodeships_grid1"                    
#>  [58] "us_ia_counties_grid1"                     
#>  [59] "us_id_counties_grid1"                     
#>  [60] "ar_cordoba_dep_grid1"                     
#>  [61] "us_fl_counties_grid1"                     
#>  [62] "ar_buenosaires_communes_grid1"            
#>  [63] "nz_regions_grid2"                         
#>  [64] "oecd_grid1"                               
#>  [65] "ec_prov_grid1"                            
#>  [66] "nl_prov_grid1"                            
#>  [67] "ca_prov_grid1"                            
#>  [68] "us_nc_counties_grid1"                     
#>  [69] "mx_ciudad_prov_grid1"                     
#>  [70] "bg_prov_grid1"                            
#>  [71] "us_hhs_regions_grid1"                     
#>  [72] "tw_counties_grid1"                        
#>  [73] "tw_counties_grid2"                        
#>  [74] "af_prov_grid1"                            
#>  [75] "us_mi_counties_grid1"                     
#>  [76] "pe_prov_grid1"                            
#>  [77] "sa_prov_grid2"                            
#>  [78] "mx_state_grid3"                           
#>  [79] "cn_bj_districts_grid1"                    
#>  [80] "us_va_counties_grid1"                     
#>  [81] "us_mo_counties_grid1"                     
#>  [82] "cl_santiago_prov_grid1"                   
#>  [83] "us_tx_capcog_counties_grid1"              
#>  [84] "sg_planning_area_grid1"                   
#>  [85] "in_state_ut_grid1"                        
#>  [86] "cn_fujian_prov_grid1"                     
#>  [87] "ca_quebec_electoral_districts_grid1"      
#>  [88] "nl_prov_grid2"                            
#>  [89] "cn_bj_districts_grid2"                    
#>  [90] "ar_santiago_del_estero_prov_grid1"        
#>  [91] "ar_formosa_prov_grid1"                    
#>  [92] "ar_chaco_prov_grid1"                      
#>  [93] "ar_catamarca_prov_grid1"                  
#>  [94] "ar_jujuy_prov_grid1"                      
#>  [95] "ar_neuquen_prov_grid1"                    
#>  [96] "ar_san_luis_prov_grid1"                   
#>  [97] "ar_san_juan_prov_grid1"                   
#>  [98] "ar_santa_fe_prov_grid1"                   
#>  [99] "ar_la_rioja_prov_grid1"                   
#> [100] "ar_mendoza_prov_grid1"                    
#> [101] "ar_salta_prov_grid1"                      
#> [102] "ar_rio_negro_prov_grid1"                  
#> [103] "uy_departamentos_grid1"                   
#> [104] "ar_buenos_aires_prov_electoral_dist_grid1"
#> [105] "europe_countries_grid1"                   
#> [106] "argentina_grid2"                          
#> [107] "us_state_without_DC_grid2"                
#> [108] "jp_prefs_grid2"                           
#> [109] "na_regions_grid1"                         
#> [110] "mm_state_grid1"                           
#> [111] "us_state_with_DC_PR_grid1"                
#> [112] "fr_departements_grid1"                    
#> [113] "ar_salta_prov_grid2"                      
#> [114] "ie_counties_grid1"                        
#> [115] "sg_regions_grid1"                         
#> [116] "us_ny_counties_grid1"                     
#> [117] "ru_federal_subjects_grid1"                
#> [118] "us_ca_counties_grid1"                     
#> [119] "lk_districts_grid1"                       
#> [120] "us_state_without_DC_grid3"                
#> [121] "co_cali_subdivisions_grid1"               
#> [122] "us_in_northern_counties_grid1"            
#> [123] "italy_grid3"                              
#> [124] "us_state_with_DC_PR_grid2"                
#> [125] "us_state_grid7"                           
#> [126] "sg_planning_area_grid2"                   
#> [127] "ch_cantons_fl_grid1"                      
#> [128] "europe_countries_grid2"                   
#> [129] "us_states_territories_grid1"              
#> [130] "us_tn_counties_grid1"                     
#> [131] "us_il_chicago_community_areas_grid1"      
#> [132] "us_state_with_DC_PR_grid3"                
#> [133] "in_state_ut_grid2"                        
#> [134] "at_states_grid1"                          
#> [135] "us_pa_counties_grid1"                     
#> [136] "us_oh_counties_grid1"                     
#> [137] "fr_departements_grid2"                    
#> [138] "us_wi_counties_grid1"                     
#> [139] "africa_countries_grid1"                   
#> [140] "no_counties_grid1"                        
#> [141] "tr_provinces_grid1"

This list keeps growing because the authors of geofacet let users submit their own grids. The next section points to instructions for creating and submitting one.
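With well over a hundred entries, the listing is easier to use when filtered by a pattern. The sketch below does this with base R's grep() on a handful of names copied from the output above, so that it runs without geofacet installed; in a live session you would presumably replace the hard-coded vector with the result of geofacet::get_grid_names().

```r
# A few names copied from the listing above (stand-in for the full vector).
grid_names <- c("us_state_grid1", "eu_grid1", "de_states_grid1",
                "europe_countries_grid1", "europe_countries_grid2",
                "africa_countries_grid1", "world_countries_grid1")

# Keep only the grids whose name contains "countries".
country_grids <- grep("countries", grid_names, value = TRUE)
country_grids

# Anchored patterns work too, e.g. grids whose name starts with "europe_".
europe_grids <- grep("^europe_", grid_names, value = TRUE)
europe_grids
```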

Creating your own grid

For detailed instructions on creating a custom grid, see the “Creating your own grid” section in the geofacet vignette. You can find it at: https://cran.r-project.org/package=geofacet/vignettes/geofacet.html
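As a rough idea of what such a grid looks like, the sketch below builds a tiny one by hand as a data frame with the same columns as the de_states_grid1 printout earlier in this vignette (row, col, code, name). The three regions are invented for illustration, and the sanity checks are a suggestion rather than a requirement documented by either package.

```r
# Hypothetical grid for three made-up regions.
my_grid <- data.frame(
  row  = c(1, 1, 2),
  col  = c(1, 2, 1),
  code = c("N", "E", "S"),
  name = c("Northland", "Eastland", "Southland")
)

# Basic sanity checks: the expected columns exist, no two regions share a
# cell, and codes and names are unique.
stopifnot(
  all(c("row", "col", "code", "name") %in% names(my_grid)),
  anyDuplicated(my_grid[, c("row", "col")]) == 0,
  anyDuplicated(my_grid$code) == 0,
  anyDuplicated(my_grid$name) == 0
)

my_grid
```

Given the statement in the introduction that any well-defined grid will technically work, a data frame like this could then be supplied as grid_data, with the name column matched against your own data.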

Theme

The default theme is theme_void(), as cartograms do not require axes or similar elements. It can be either overridden or extended with further theme settings, depending on the purpose of the plot.
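Since the earlier examples add labs() to the result of geoheatmap() with +, theme settings can presumably be added the same way. The sketch below reuses the fictional football data and extends the default theme with two standard ggplot2 settings; the specific choices (legend at the bottom, bold title, "darkblue" as the high colour) are assumptions picked for illustration.

```r
library(geoheatmap)
library(geofacet)
library(ggplot2)

# Fictional data from the grid language example above.
football_teams <- data.frame(
  state = c("Baden-Württemberg", "Bayern", "Berlin", "Brandenburg",
            "Bremen", "Hamburg", "Hessen", "Mecklenburg-Vorpommern",
            "Niedersachsen", "Nordrhein-Westfalen", "Rheinland-Pfalz",
            "Saarland", "Sachsen", "Sachsen-Anhalt",
            "Schleswig-Holstein", "Thüringen"),
  teams = c(18, 22, 8, 6, 4, 5, 14, 3, 12, 28, 10, 3, 9, 5, 7, 4)
)

p <- geoheatmap(facet_data = football_teams,
                grid_data = de_states_grid1,
                facet_col = "state", value_col = "teams",
                merge_col = "name_de",
                name = "No. of teams",
                low = "lightblue", high = "darkblue") +
  labs(title = "Football teams in German states") +
  # Extend the default theme: move the legend and embolden the title.
  theme(legend.position = "bottom",
        plot.title = element_text(face = "bold"))

p
```

Adding a complete theme such as theme_minimal() instead of theme() would replace the default rather than extend it, which also brings back axes and grid lines.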