Turning your small-multiple charts into maps using R
The geofacet R package provides a way to flexibly visualize data for different geographical regions by providing a ggplot2 faceting function, facet_geo(), which works just like ggplot2’s built-in faceting, except that the resulting arrangement of panels follows a grid that mimics the original geographic shape as closely as possible.
The idea is similar to that of statebins, with these additional benefits:
- Each cell in the grid can present any amount of data instead of a single value
- Each cell in the grid can be composed of any kind of plot conceivable with ggplot2
- Grids representing any geographic topology (via a set of built-in or user-defined grids) can be used
The merits of this visualization approach are not discussed at length here but will be addressed in a blog post soon (link will be provided when posted), along with a history of this approach being used in the past.
Installation and Setup
install.packages("geofacet")
# or from github:
# remotes::install_github("hafen/geofacet")
library(geofacet)
library(ggplot2)
Usage
The main function in this package is facet_geo(), and its use can be thought of as equivalent to ggplot2’s facet_wrap() except for the output it creates. If you know how to use ggplot2, then you know how to use this package.
Let’s consider an example based on this article, which uses emoji / Chernoff faces to show various quality-of-life metrics for US states.
This data is available in the geofacet package under the name state_ranks.
head(state_ranks)
#>   state   name   variable rank
#> 1    AK Alaska  education   28
#> 2    AK Alaska employment   50
#> 3    AK Alaska     health   25
#> 4    AK Alaska     wealth    5
#> 5    AK Alaska      sleep   27
#> 6    AK Alaska    insured   50
A state with a rank of 1 is doing the best in the category and a rank of 51 is the worst (Washington DC is included).
Let’s use geofacet to create a bar chart of the state rankings. To do so, we create a ggplot2 plot using geom_col() to make a bar chart of rank for each variable. Then, instead of using facet_wrap() to facet the plot by state, we use facet_geo():
ggplot(state_ranks, aes(variable, rank, fill = variable)) +
  geom_col() +
  coord_flip() +
  theme_bw() +
  facet_geo(~ state)
While this plot may not be as fun as the Chernoff faces, geofacet lets us use a much more powerful visual encoding (length of bars) that helps the viewer grasp what is going on in the data far more effectively. For example, states with very low rank numbers (that is, strong standings) across most variables (HI, VT, CO, MN) stand out, as do geographical trends such as the southern states consistently showing up at the bottom of the rankings. Why don’t people sleep in Hawaii?
This plot helps illustrate a couple of advantages this approach has over traditional geographical visualization approaches such as choropleth plots:
- We can plot multiple values per geographical entity
- We can use more effective visual encoding schemes (color, which is used in choropleth-type maps, is one of the least effective ways to visually encode information)
Note that other than the arrangement of the facets, every other aspect of this plot behaves in the way you would expect a ggplot2 plot to behave (such as themes, flipping coordinates, etc.).
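As a rough illustration of that point, the sketch below builds the plot once and then applies either faceting function to it. The object names base_plot, p_wrap, and p_geo are introduced here for illustration only; the data and the plotting calls are the ones used in the example above.

```r
library(ggplot2)
library(geofacet)

# Everything except the faceting step is shared between the two plots.
base_plot <- ggplot(state_ranks, aes(variable, rank, fill = variable)) +
  geom_col() +
  coord_flip() +
  theme_bw()

# Standard ggplot2 faceting: panels are laid out alphabetically in a rectangle.
p_wrap <- base_plot + facet_wrap(~ state)

# geofacet faceting: same formula, but panels follow the default US state grid.
p_geo <- base_plot + facet_geo(~ state)
```

Printing p_wrap and p_geo side by side should make it easy to see that only the panel arrangement differs.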
Geofacet Options
There are a few options in facet_geo() worth discussing:
- With the grid argument, we can provide either a string specifying a built-in named grid to use, or we can provide our own grid as a data frame.
- With the label argument, we can specify which grid variable we want to use to label the facets.
For example, another built-in grid in the package is called “us_state_grid2”.
head(us_state_grid2)
#>   row col code       name
#> 1   6   7   AL    Alabama
#> 2   1   1   AK     Alaska
#> 3   6   2   AZ    Arizona
#> 4   6   5   AR   Arkansas
#> 5   6   1   CA California
#> 6   5   3   CO   Colorado
Let’s use this grid to plot the seasonally adjusted US unemployment rate over time, using the state names as the facet labels:
ggplot(state_unemp, aes(year, rate)) +
  geom_line() +
  facet_geo(~ state, grid = "us_state_grid2", label = "name") +
  scale_x_continuous(labels = function(x) paste0("'", substr(x, 3, 4))) +
  labs(title = "Seasonally Adjusted US Unemployment Rate 2000-2016",
       caption = "Data Source: bls.gov",
       x = "Year",
       y = "Unemployment Rate (%)") +
  theme(strip.text.x = element_text(size = 6))
With this we can see how the unemployment rate varies per state and how some of the patterns are spatially similar.
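The labels function passed to scale_x_continuous() in the code above may look cryptic, so here it is pulled out on its own. The name short_year is introduced here purely for illustration. It keeps the third and fourth characters of each year and prefixes an apostrophe, which keeps the axis labels short enough to fit in small panels.

```r
# Abbreviate four-digit years: substr() coerces the numbers to character first.
short_year <- function(x) paste0("'", substr(x, 3, 4))

short_year(c(2000, 2008, 2016))
#> [1] "'00" "'08" "'16"
```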
Other Grids
Specifying a grid is as easy as creating a data frame with columns containing the names and commonly-used codes for the geographical entities, as well as a row and col variable specifying where the entity belongs on the grid.
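As a minimal sketch of what such a data frame might look like, the example below lays out four made-up regions in a 2 x 2 arrangement. The region codes and names are hypothetical and exist only to show the expected columns.

```r
# Hypothetical 2 x 2 layout; the codes and names are illustrative, not real entities.
my_small_grid <- data.frame(
  row  = c(1, 1, 2, 2),
  col  = c(1, 2, 1, 2),
  code = c("NW", "NE", "SW", "SE"),
  name = c("Northwest", "Northeast", "Southwest", "Southeast"),
  stringsAsFactors = FALSE
)

print(my_small_grid)
```

With geofacet loaded, grid_preview(my_small_grid) should draw this layout, and passing grid = my_small_grid to facet_geo() should arrange the panels accordingly, provided the faceting variable in your data matches the values in the code or name column.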
Finding grids and adding your own
The list of names of available grids can be found here. In addition, creating your own grid is as easy as specifying a data frame with columns row and col containing unique pairs of positive integers indicating grid locations, plus columns beginning with name and code. You may want to provide different options for names, such as names in different languages, or different kinds of country codes, etc.
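The requirements above can be checked before handing a grid to the package. The helper below, check_grid(), is a hypothetical function written for this post and is not part of geofacet; it uses only base R and returns a character vector describing any problems it finds.

```r
# Hypothetical helper (not a geofacet function): list rule violations in a grid.
check_grid <- function(grid) {
  if (!all(c("row", "col") %in% names(grid))) {
    return("grid must have 'row' and 'col' columns")
  }
  problems <- character(0)
  cells <- c(grid$row, grid$col)
  # Positions must be whole numbers starting at 1.
  if (!is.numeric(cells) || any(is.na(cells)) ||
      any(cells < 1) || any(cells != round(cells))) {
    problems <- c(problems, "'row' and 'col' must be positive integers")
  }
  # No two entities may share a cell.
  if (any(duplicated(grid[, c("row", "col")]))) {
    problems <- c(problems, "each (row, col) pair must be unique")
  }
  if (!any(startsWith(names(grid), "name"))) {
    problems <- c(problems, "need at least one column beginning with 'name'")
  }
  if (!any(startsWith(names(grid), "code"))) {
    problems <- c(problems, "need at least one column beginning with 'code'")
  }
  problems
}

good <- data.frame(row = c(1, 1), col = c(1, 2),
                   code = c("A", "B"), name = c("Alpha", "Beta"))
bad <- good
bad$col[2] <- 1  # both entities now sit in cell (1, 1)

check_grid(good)  # no problems reported
check_grid(bad)   # reports the duplicated cell
```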
One way to create a grid is to take an existing one and modify it. For example, suppose we don’t like where Wisconsin is located. We can simply change its location and preview the resulting grid with grid_preview().
my_grid <- us_state_grid1
my_grid$col[my_grid$code == "WI"] <- 7
grid_preview(my_grid)
A much more fun way to design a grid is with a JavaScript app, “Grid Designer”. You can launch this application starting from scratch by visiting this link or from R by calling:
grid_design()
This will open up a web application with an empty grid and instructions on how to fill it out. Basically you just need to paste in CSV content about the geographic entities (the row and col columns are not required at this point). For example, you might go to Wikipedia to get a list of the names of counties in the state of Washington and enter that list into the app. Then a grid of squares with these column attributes will be populated and you can interactively drag the squares around to get the grid you want. You can also add a link to a reference map to help you as you arrange the tiles.
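If the list of entities already lives in R, one way to produce CSV text for pasting is with write.csv(). This is only a sketch: the column names name and code follow the grid convention described earlier and are an assumption about what is convenient to paste into the app, and the two-letter codes are made up for illustration.

```r
# A few Washington counties; the codes are hypothetical placeholders.
counties <- data.frame(
  name = c("King", "Pierce", "Snohomish", "Spokane"),
  code = c("KI", "PI", "SN", "SP"),
  stringsAsFactors = FALSE
)

# With no file argument, write.csv() prints the CSV text to the console,
# ready to be copied into the designer.
write.csv(counties, row.names = FALSE, quote = FALSE)
#> name,code
#> King,KI
#> Pierce,PI
#> Snohomish,SN
#> Spokane,SP
```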
Another way to use the designer is to populate it with an existing grid you want to modify. For example, if I want to modify us_state_grid1, I can call:
grid_design(data = us_state_grid1, img = "https://bit.ly/us-grid")
The app will look like this:
If you want to visit the app and edit this example live in a dedicated window, click here.
Submitting a grid
One of the most important features of this package is that it encourages users to create and share their own grids and makes doing so easy. Creating a grid is usually very subjective and difficult to automate, so we want this package to be a resource that makes it easy to crowdsource the creation of useful grids.
There are two ways to share a grid. If you created a grid my_grid in R, you can run:
grid_submit(my_grid, name = "my_grid1", desc = "An awesome grid...")
This will open up a GitHub issue with a template for you to fill out. You can look at closed issues for examples of other grid submissions.
The other way to submit a grid is to use the grid designer app and when you are done, click the “Submit Grid to GitHub” button, where in a similar fashion a GitHub issue will be opened.
Note that both of these approaches require you to have a GitHub account.
More examples of GeoFacet in action
A few ways to look at the 2016 election results
ggplot(election, aes("", pct, fill = candidate)) +
  geom_col(alpha = 0.8, width = 1) +
  scale_fill_manual(values = c("#4e79a7", "#e15759", "#59a14f")) +
  facet_geo(~ state, grid = "us_state_grid2") +
  scale_y_continuous(expand = c(0, 0)) +
  labs(title = "2016 Election Results",
       caption = "Data Source: 2016 National Popular Vote Tracker",
       x = NULL,
       y = "Percentage of Voters") +
  theme(axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        strip.text.x = element_text(size = 6))

ggplot(election, aes(candidate, pct, fill = candidate)) +
  geom_col() +
  scale_fill_manual(values = c("#4e79a7", "#e15759", "#59a14f")) +
  facet_geo(~ state, grid = "us_state_grid2") +
  theme_bw() +
  coord_flip() +
  labs(title = "2016 Election Results",
       caption = "Data Source: 2016 National Popular Vote Tracker",
       x = NULL,
       y = "Percentage of Voters") +
  theme(strip.text.x = element_text(size = 6))

ggplot(election, aes(candidate, votes / 1000000, fill = candidate)) +
  geom_col() +
  scale_fill_manual(values = c("#4e79a7", "#e15759", "#59a14f")) +
  facet_geo(~ state, grid = "us_state_grid2") +
  coord_flip() +
  labs(title = "2016 Election Results",
       caption = "Data Source: 2016 National Popular Vote Tracker",
       x = NULL,
       y = "Votes (millions)") +
  theme(strip.text.x = element_text(size = 6))
A longer version of this post originally appeared in “Introduction to Geofacet.”
- Turning your small-multiple charts into maps using R - October 1, 2024