How to get Twitter data with rtweet in R

Data Journalism in R, How to
Share on FacebookShare on Google+Tweet about this on TwitterPin on PinterestShare on LinkedInEmail this to someone

There’s a new process for getting data from Twitter with rtweet, so I put this document together as a guide for users looking for help in setting up the API, creating the tokens, and getting the data. For an introductory tutorial to R, click here, and be sure to browse all our Data Journalism in R section.

Load the packages

library(rtweet)
library(httr)
library(tidyverse)

Apply for an API

The major change is a new application process for getting your Twitter API. This involves answering some basic questions about how you’ll be using the data, and how you’ll be interacting with likes, retweets, followers, etc. These are the topics I needed to address.

1. The core use case, intent, or business purpose for your use of the Twitter APIs
2. If you intend to analyze Tweets, Twitter users, or their content, share details about the analyses you plan to conduct and the methods or techniques
3. If your use involves Tweeting, Retweeting, or liking content, share how you will interact with Twitter users or their content
4. If you’ll display Twitter content off of Twitter, explain how and where Tweets and Twitter content will be displayed to users of your product or service, including whether Tweets and Twitter content will be displayed at row level or aggregated

After you submit your application, the review is pretty quick (24 hours in my case. An actual human likely reviews the applications, because I was sent an additional email requesting that I expand on a particular item.)

Set up your twitter app

After getting approved, you’ll see the following in your Twitter developer dashboard:

Fill out the name and description fields, and the website url is a twitter account.

Callback url

The callback url I used is from the rtweet info page.

Create the appname

Assign your twitter app name to appname.

## app name from api set-up
appname <- "newsandnumbers-twitter-app"

Now create the Consumer API key and the Consumer API secret key from the Keys and Tokens tab on the Twitter development dashboard.

key <- "jibberish"

secret <- "jibberishjibberishjibberishjibberishjibberish"


Authenticate via web browser (interactive)

Now you can create a twitter_token.

# create token named "twitter_token"
twitter_token <- rtweet::create_token(app = appname,
                                    consumer_key = key,
                                    consumer_secret = secret)

This will take you into the browser and you’ll see this

Click Authorize app and close the browser tab.

Save twitter_token in your home directory

Put your twitter_token somewhere safe, like your home directory.

## path of home directory 
home_directory <- path.expand("~")
## combine with name for token
file_name <- file.path(home_directory,
                       "twitter_token.rds")
## save token to home directory
saveRDS(twitter_token, file = file_name)

You can also add the twitter_token to your ".Renviron" variable.

## assuming you followed the procodures to create "file_name"
## from the previous code chunk, then the code below should
## create and save your environment variable.
cat(paste0("TWITTER_PAT=", file_name),
    file = file.path(home_directory, ".Renviron"),
    append = TRUE)

Collect some twitter data

Now to test this new API, we’ll use the examples from the vignette rtweet: Collecting Twitter Data

## search for 18000 tweets using the rstats hashtag
rt <- rtweet::search_tweets(
  "#rstats", n = 18000, include_rts = FALSE
)
## preview tweets data
rt %>% dplyr::glimpse(78)
## Observations: 3,270
## Variables: 88
## $ user_id                 <chr> "1270025862", "1270025862", "1270025862",...
## $ status_id               <chr> "1062835275360059393", "10621355682204631...
## $ created_at              <dttm> 2018-11-14 22:30:50, 2018-11-13 00:10:27...
## $ screen_name             <chr> "DataScientistsF", "DataScientistsF", "Da...
## $ text                    <chr> "A Mathematician’s Perspective on Topolog...
## $ source                  <chr> "IFTTT", "IFTTT", "IFTTT", "IFTTT", "IFTT...


Preview some users data

users_data(rt) %>% dplyr::glimpse(78)
## Observations: 3,270
## Variables: 20
## $ user_id                <chr> "1270025862", "1270025862", "1270025862", ...
## $ screen_name            <chr> "DataScientistsF", "DataScientistsF", "Dat...
## $ name                   <chr> "Data Scientists", "Data Scientists", "Dat...
## $ location               <chr> "Paris, France", "Paris, France", "Paris, ...
## $ description            <chr> "#BigData #DataScience #Analytics #Statist...
## $ url                    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ protected              <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, ...


## plot time series (if ggplot2 is installed)
ts_plot(rt)

There you have it! All set up and good to go!

Martin Frigaard is a graduate student at UCSF. Find him on Twitter.

Leave a Reply