How to get Twitter data with rtweet in R
There’s a new process for getting data from Twitter with rtweet, so I put this document together as a guide for users looking for help in setting up the API, creating the tokens, and getting the data. For an introductory tutorial to R, click here, and be sure to browse all our Data Journalism in R section.
Load the packages
library(rtweet) library(httr) library(tidyverse)
Apply for an API
The major change is a new application process for getting your Twitter API. This involves answering some basic questions about how you’ll be using the data, and how you’ll be interacting with likes, retweets, followers, etc. These are the topics I needed to address.
1. The core use case, intent, or business purpose for your use of the Twitter APIs
2. If you intend to analyze Tweets, Twitter users, or their content, share details about the analyses you plan to conduct and the methods or techniques
3. If your use involves Tweeting, Retweeting, or liking content, share how you will interact with Twitter users or their content
4. If you’ll display Twitter content off of Twitter, explain how and where Tweets and Twitter content will be displayed to users of your product or service, including whether Tweets and Twitter content will be displayed at row level or aggregated
After you submit your application, the review is pretty quick (24 hours in my case. An actual human likely reviews the applications, because I was sent an additional email requesting that I expand on a particular item.)
Set up your twitter app
After getting approved, you’ll see the following in your Twitter developer dashboard:
Fill out the name and description fields, and the website url is a twitter account.
The callback url I used is from the
rtweet info page.
Assign your twitter app name to
## app name from api set-up appname <- "newsandnumbers-twitter-app"
Now create the Consumer API
key and the Consumer API
secret key from the Keys and Tokens tab on the Twitter development dashboard.
key <- "jibberish" secret <- "jibberishjibberishjibberishjibberishjibberish"
Authenticate via web browser (interactive)
Now you can create a
# create token named "twitter_token" twitter_token <- rtweet::create_token(app = appname, consumer_key = key, consumer_secret = secret)
This will take you into the browser and you’ll see this
Click Authorize app and close the browser tab.
twitter_token in your home directory
twitter_token somewhere safe, like your home directory.
## path of home directory home_directory <- path.expand("~") ## combine with name for token file_name <- file.path(home_directory, "twitter_token.rds") ## save token to home directory saveRDS(twitter_token, file = file_name)
You can also add the
twitter_token to your
## assuming you followed the procodures to create "file_name" ## from the previous code chunk, then the code below should ## create and save your environment variable. cat(paste0("TWITTER_PAT=", file_name), file = file.path(home_directory, ".Renviron"), append = TRUE)
Collect some twitter data
Now to test this new API, we’ll use the examples from the vignette
rtweet: Collecting Twitter Data
## search for 18000 tweets using the rstats hashtag rt <- rtweet::search_tweets( "#rstats", n = 18000, include_rts = FALSE )
## preview tweets data rt %>% dplyr::glimpse(78)
## Observations: 3,270 ## Variables: 88 ## $ user_id <chr> "1270025862", "1270025862", "1270025862",... ## $ status_id <chr> "1062835275360059393", "10621355682204631... ## $ created_at <dttm> 2018-11-14 22:30:50, 2018-11-13 00:10:27... ## $ screen_name <chr> "DataScientistsF", "DataScientistsF", "Da... ## $ text <chr> "A Mathematician’s Perspective on Topolog... ## $ source <chr> "IFTTT", "IFTTT", "IFTTT", "IFTTT", "IFTT...
Preview some users data
users_data(rt) %>% dplyr::glimpse(78)
## Observations: 3,270 ## Variables: 20 ## $ user_id <chr> "1270025862", "1270025862", "1270025862", ... ## $ screen_name <chr> "DataScientistsF", "DataScientistsF", "Dat... ## $ name <chr> "Data Scientists", "Data Scientists", "Dat... ## $ location <chr> "Paris, France", "Paris, France", "Paris, ... ## $ description <chr> "#BigData #DataScience #Analytics #Statist... ## $ url <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA... ## $ protected <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, ...
## plot time series (if ggplot2 is installed) ts_plot(rt)
There you have it! All set up and good to go!
- Update: How to geocode a CSV of addresses in R - June 13, 2020
- A roundup of coronavirus dashboards, datasets and resources - April 17, 2020
- Diagnosing the accuracy of your linear regression in R - November 27, 2019