How to geocode a csv of addresses in R
Sometimes all you want are some lat, long coordinates to map your data. The following tutorial shows you how to batch geocode a csv of addresses in R using the ggmap package, which asks Google to geocode the addresses using its API. FYI, the limit is 2,500 queries per day.
Download the ggmap package in RStudio
We’ll need ggmap, a spatial visualization package, to geocode the csv. To install it in RStudio, open a new R script in “File” > “New File” > “R Script.” Type install.packages("ggmap") on line 1 of the top-left pane, making sure the quotes are straight rather than curly. Click “Run” or hit Shift-Command-Return. You should see the package downloading and installing in the console pane.
Prepare your csv of addresses
We’ll be using a csv with the names, addresses and other information for 114 craft breweries in Massachusetts. Make sure to note the name – “addresses” – of the column listing the addresses, because the script refers to the column by that name. You’ll get the best results if town and state are listed in that column, too.
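If your spreadsheet keeps street, town and state in separate columns, a small sketch like the one below could combine them into a single addresses column before geocoding. The column names street, town and state, the sample rows and the file name are all made up for illustration, not taken from the brewery csv.

```r
# Hypothetical sample data: street, town and state in separate columns.
breweries <- data.frame(
  name   = c("Example Brewing Co.", "Sample Ales"),
  street = c("1 Main St", "22 Elm St"),
  town   = c("Springfield", "Worcester"),
  state  = c("MA", "MA"),
  stringsAsFactors = FALSE
)

# Build one full address string per row, e.g. "1 Main St, Springfield, MA".
breweries$addresses <- paste(breweries$street, breweries$town,
                             breweries$state, sep = ", ")

# Save the result as a csv (written to a temporary folder in this sketch).
csv_path <- file.path(tempdir(), "breweries.csv")
write.csv(breweries, csv_path, row.names = FALSE)
```

Keeping the full address in one column means the script only has to pass a single value per row to the geocoder.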
Copy over the R script
In RStudio, open a new R script in “File” > “New File” > “R Script.” Scroll down and copy the R script from the Gist or click here to see it on GitHub. Keep scrolling for an explanation of what each line does.
Understanding the R script
First, we load the ggmap library.
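One thing to be aware of: recent versions of ggmap expect a Google API key to be registered before geocode() will talk to Google. The sketch below shows one way that might look. The environment variable name GOOGLE_MAPS_API_KEY is an assumption chosen for this example, not something ggmap requires, and the snippet only registers a key if ggmap is installed and a key is present.

```r
# Assumption: the key is stored in an environment variable named
# GOOGLE_MAPS_API_KEY (a hypothetical name; use whatever you prefer).
api_key <- Sys.getenv("GOOGLE_MAPS_API_KEY")

if (requireNamespace("ggmap", quietly = TRUE) && nzchar(api_key)) {
  library(ggmap)
  # Make the key available to later geocode() calls in this session.
  register_google(key = api_key)
} else {
  message("ggmap is not installed or no API key is set; ",
          "geocode() calls to Google will not work.")
}
```

Reading the key from an environment variable keeps it out of the script itself, which helps if you share the script later.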
Next, we’ll add a line with file.choose that opens a file selection box and allows you to search through your computer to find the csv.
fileToLoad <- file.choose(new = TRUE)
Next, R will read the csv and store it as the variable origAddress.
origAddress <- read.csv(fileToLoad, stringsAsFactors = FALSE)
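Before geocoding, it can help to confirm that the loaded data really has the column the script expects. The following is a minimal sketch of such a check; the data frame is a made-up stand-in for what read.csv would return, so it runs without a file.

```r
# Stand-in for the data frame returned by read.csv(); the rows are made up.
origAddress <- data.frame(
  name = c("Example Brewing Co.", "Sample Ales"),
  addresses = c("1 Main St, Springfield, MA", "22 Elm St, Worcester, MA"),
  stringsAsFactors = FALSE
)

# Stop early with a readable message if the expected column is missing.
if (!"addresses" %in% names(origAddress)) {
  stop("No column named 'addresses'; found: ",
       paste(names(origAddress), collapse = ", "))
}

# With stringsAsFactors = FALSE the column is stored as plain text
# rather than as a factor.
is.character(origAddress$addresses)
```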
Next, the script initializes a data frame to store the geocoded addresses. The loop, which starts with for(i in 1:nrow(origAddress)), then works through the csv one row at a time. For more on loops in R, go here.
Now, look through the whole script again.
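Important: Notice the $addresses[i] within ggmap’s geocode function. It has to match the name of the address column in your csv, so change it if your column is called something else.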
geocode(origAddress$addresses[i], output = "latlona", source = "google")
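To see how the loop fits around that call, here is a rough sketch that runs offline. It swaps the real geocode call for fake_geocode, a hypothetical stand-in written for this example that returns dummy values in lon, lat and address columns, which is the shape I would expect from geocode with output = "latlona". The sample rows are made up, and this is not necessarily line-for-line what the Gist does.

```r
# Stand-in for the csv loaded earlier; the rows are made up.
origAddress <- data.frame(
  addresses = c("1 Main St, Springfield, MA", "22 Elm St, Worcester, MA"),
  stringsAsFactors = FALSE
)

# Hypothetical offline stand-in for ggmap's geocode(); it returns one row
# with lon, lat and address columns filled with dummy values.
fake_geocode <- function(location) {
  data.frame(lon = -71.0, lat = 42.0, address = tolower(location),
             stringsAsFactors = FALSE)
}

# Create the new columns up front so each row can be filled in turn.
origAddress$lon <- NA_real_
origAddress$lat <- NA_real_
origAddress$geoAddress <- NA_character_

for (i in seq_len(nrow(origAddress))) {
  # In the real script this call would be:
  # geocode(origAddress$addresses[i], output = "latlona", source = "google")
  result <- fake_geocode(origAddress$addresses[i])
  origAddress$lon[i] <- as.numeric(result$lon)
  origAddress$lat[i] <- as.numeric(result$lat)
  origAddress$geoAddress[i] <- as.character(result$address)
}
```

Using seq_len(nrow(origAddress)) instead of 1:nrow(origAddress) is a small design choice: it simply skips the loop if the csv happens to have no rows.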
After the loop runs, you’ll want to write all the data to a csv named geocoded.csv.
write.csv(origAddress, "geocoded.csv", row.names=FALSE)
Run the script
Click “Run” or hit Shift-Command-Return. You should see each row being passed to Google’s API in the console pane. For 114 rows, it took around 30 seconds.
Check the geocoded csv
Your new geocoded spreadsheet should contain three appended columns – lon, lat, and geoAddress. Sweet.
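It is worth checking whether any rows came back without coordinates, since an address Google cannot match will typically leave missing values behind. A minimal sketch of that check, using a made-up stand-in for the contents of geocoded.csv:

```r
# Stand-in for read.csv("geocoded.csv"); rows and coordinates are made up.
geocoded <- data.frame(
  addresses = c("1 Main St, Springfield, MA", "not a real address"),
  lon = c(-71.0, NA),
  lat = c(42.0, NA),
  geoAddress = c("1 main st, springfield, ma", NA),
  stringsAsFactors = FALSE
)

# Rows with a missing coordinate were not geocoded successfully.
failed <- geocoded[is.na(geocoded$lon) | is.na(geocoded$lat), ]

nrow(failed)      # how many rows need another look
failed$addresses  # which addresses to fix by hand and re-run
```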
Now, to map the data
We’ll use Carto.com and upload geocoded.csv in the Connect dataset section. The data – with the lat, lon recognized by Carto and stored as the_geom – should look like this:
Next, click Create Map on the bottom-right. Voilà: