Simple R map tutorial with ggplot

Author

Jacob Høigilt

Published

May 11, 2023

What this tutorial covers

I will go through two ways of creating a map from one of several freely available spatial datasets. The first map shows how you can create a world map and then zoom in on a specific region using latitude and longitude coordinates. The second shows how to create a map consisting only of selected countries from the dataset. Along the way, I will briefly explain what the various functions do and provide links to web resources where you can learn more about them.

Required R packages

You need the following packages:

library(dplyr)
library(ggplot2)
library(sf)
library(giscoR)
library(ggrepel)

The dplyr package provides a handy set of tools for manipulating datasets. It has functions to rename variables, duplicate variables, filter rows, and so on. It is a staple of many R scripts. Read more about dplyr here.
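To make the three operations just mentioned concrete, here is a minimal sketch on a toy data frame. The data frame `toy`, its column names and its values are invented for illustration and have nothing to do with the map data used later.

```r
library(dplyr)

# Toy data frame with made-up values, for illustration only
toy <- data.frame(
  NAME = c("A", "B", "C"),
  value = c(10, 25, 40)
)

result <- toy %>%
  rename(name = NAME) %>%          # change a variable name
  mutate(value_copy = value) %>%   # duplicate a variable
  filter(value > 20)               # keep only rows that meet a condition

print(result)
```

The result keeps the rows for B and C, with the renamed column `name` and the extra column `value_copy`.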

ggplot2 is the workhorse in this map script. It is the package that lets us draw the spatial data in our dataset as a map. ggplot2 is also widely used for data visualization in general, such as bar charts and line graphs. Read about it here.

sf stands for simple features. Simple features is a formal standard (adopted by both the Open Geospatial Consortium and ISO) that describes how the spatial geometry of real-world objects, such as countries with borders, is represented in computers. The sf package lets R read, interpret and process simple features data. The details are more complicated than I can cover in this tutorial, but there’s a very pedagogical introduction here.
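To give a feel for what simple features data looks like in R, the following sketch builds a tiny sf object by hand. The coordinates and the names `pt`, `sq`, `geoms` and `toy_sf` are arbitrary choices made for illustration; real datasets such as the one used below arrive ready-made.

```r
library(sf)

# A single point given as longitude/latitude (coordinates chosen arbitrarily)
pt <- st_point(c(10, 60))

# A square polygon; the ring must end on the point where it started
sq <- st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0))))

# Collect the geometries in a geometry column with a coordinate reference system
# (EPSG code 4326 is ordinary longitude/latitude)
geoms <- st_sfc(pt, sq, crs = 4326)

# An sf object is a data frame with such a geometry column attached
toy_sf <- st_sf(name = c("a point", "a square"), geometry = geoms)

print(toy_sf)
st_geometry_type(toy_sf)
```

Each row pairs ordinary attributes (here only `name`) with one geometry, which is the same layout you will find in the country dataset.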

giscoR is a package that liaises between R and the EU’s “Geographic Information System of the COmmission” (GISCO). This body “is responsible for meeting the European Commission’s geographical information needs at 3 levels: the European Union, its member countries, and its regions”. But it also produces spatial datasets containing mapping data for all the world’s countries, which is what we are going to use here.

ggrepel is a package that adds text and label layers to ggplot2 which push overlapping labels away from each other and from the data points they annotate. We use it here because we don’t want the country names on our map to overlap each other.
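As a rough illustration of the repelling behaviour, the sketch below labels three made-up points that sit very close together. The data frame `pts` and its labels are invented; the only ggrepel function used is `geom_text_repel()`.

```r
library(ggplot2)
library(ggrepel)

# Made-up points placed close together, so plain text labels would collide
pts <- data.frame(
  x = c(1, 1.05, 1.1),
  y = c(1, 1.02, 0.98),
  label = c("Label one", "Label two", "Label three")
)

p <- ggplot(pts, aes(x = x, y = y, label = label)) +
  geom_point() +
  geom_text_repel()   # used in place of geom_text(); pushes the labels apart

p
```

Swapping `geom_text_repel()` for `geom_text()` in the same plot shows the difference: the labels are then drawn directly on top of the points and of each other.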

OK, now that we know what the packages do we’re ready to start using them and write some code.

Creating two necessary dataframes and plotting a simple world map

First, we get the data from GISCO with the gisco_get_countries function. We store the result in a dataframe we call ‘world’. Then we create a second R object called ‘world_points’. The function st_centroid calculates the center point of each country contained in the dataframe ‘world’. We need these points later, when we want to place each country’s name on the map.

world <- gisco_get_countries()
world_points <- st_centroid(world)
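To see what st_centroid does without downloading anything, here is a minimal sketch on a hand-made square. The objects `square`, `toy` and `toy_center` are hypothetical and not part of the tutorial's data. When called on an sf object, sf may warn that it assumes attributes are constant over geometries; for placing labels that warning is harmless.

```r
library(sf)

# A 2 x 2 square with no coordinate reference system, so plain planar geometry applies
square <- st_sfc(st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0)))))
toy <- st_sf(name = "square", geometry = square)

# st_centroid() keeps the attribute columns and replaces each polygon by its center point
toy_center <- st_centroid(toy)

st_coordinates(toy_center)   # X = 1, Y = 1: the middle of the square
```

The same thing happens to ‘world’: every country polygon is replaced by a single point, while columns such as the country name are carried along.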

If you have a look at the dataframe ‘world’ (use the R command View(world), with a capital V) you will see that it consists of 257 observations or rows, each corresponding to a country or territory, and six variables, one of which is the geometry column holding the coordinates that describe the borders. The exact counts may differ depending on the version of the GISCO data you download. With this dataset we can very easily create a nice world map:

ggplot(data = world) +
  geom_sf(fill = "antiquewhite")