Data Journalism in R Tutorials

Sentiment analysis of (you guessed it!) Donald Trump’s tweets

In my last post, I demonstrated how to use the twitteR package to collect data from Twitter in R. I have since learned that while twitteR still works, it is in a state of “leisurely deprecation,” according to its creator, Jeff Gentry. It is being phased out in favor of a newer R package called rtweet, created by Michael W. Kearney. In the few weeks that I've been testing it, I've found rtweet a lot easier to use, and it avoids some of the setup that twitteR requires, particularly creating an application on Twitter's developer website. So, briefly, before we get to the main event, I want to walk you through getting rtweet set up.

As I said, it's pretty easy to get up and running. Start by installing and then loading the package:

install.packages("rtweet")
library(rtweet)

If you followed my previous guide, you probably opted to use a local .httr-oauth file when prompted while authenticating with twitteR. If this is the case, you may need to delete that file manually (unlink() removes it from the working directory) to prevent it from interfering with rtweet:

unlink(".httr-oauth")

And that's it…well, kind of. The first time you make an API request, a browser window will open, and you'll be prompted to enter your Twitter credentials. Let's use the get_timelines() function to request 3,200 of Trump's most recent tweets (the most the Twitter timeline API will return for a single account); if this is your first try with rtweet, the browser window should appear:

trumpTBL <- get_timelines(c("realDonaldTrump"), n = 3200)

Something cool you'll notice here is that rather than storing the tweets in a list that you then need to convert to a tbl, this function returns a tbl directly, which is nice. After this runs, you should see a tbl called trumpTBL in your environment. Give it a click and you'll see 42 columns' worth of data for each tweet. You can also use get_timelines() to get the timelines for multiple accounts by passing several account names, separated by commas, inside c().

For our purposes, we don't need all the information about the tweets, and in my experience, working with such a large tbl can take a lot of time. So, let's slim this down a bit by selecting only a few relevant variables: the tweet's ID, text, and date (select() and the %>% pipe come from the dplyr package, so we load that first):

library(dplyr)  # provides select() and the %>% pipe

trumpTBL_Slim <- trumpTBL %>%
  select(status_id, text, created_at)

When Does Trump Tweet?

Now let's see when Trump's been tweeting (I should note that the code below is adapted from David Robinson's excellent analysis of Trump's tweets by device, as first posted here):

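library(dplyr)      # count(), mutate(), %>%
library(lubridate)  # hour(), with_tz()
library(ggplot2)    # ggplot(), geom_line(), labs()
library(scales)     # percent_format()

# Share of tweets posted in each hour of the day, shifted to EST
trumpTBL_Slim %>%
  count(hour = hour(with_tz(created_at, "EST"))) %>%
  mutate(percent = n / sum(n)) %>%
  ggplot(aes(hour, percent, color = hour)) +
  geom_line() +
  scale_y_continuous(labels = percent_format()) +
  labs(x = "Hour of day (EST)",
       y = "% of tweets",
       color = "")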

[Figure: line chart of the percentage of tweets by hour of day (EST)]

As we've all come to know too well, Trump tweets most early in the morning and then rounds out the day with some late evening tweeting, presumably while watching cable news and drinking Diet Coke.
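To make the counting step in the pipeline above more concrete, here is a minimal base R sketch of the same idea on four made-up timestamps. The timestamps are invented for illustration, and the sketch assumes, as I understand rtweet to do, that created_at is stored in UTC.

```r
# Toy timestamps in UTC (invented for illustration, not real tweets)
created_at <- as.POSIXct(
  c("2017-12-12 11:05:00", "2017-12-12 11:40:00",
    "2017-12-12 13:15:00", "2017-12-13 02:30:00"),
  tz = "UTC"
)

# Shift to EST (a fixed UTC-5 offset with no daylight saving) and keep the hour
hour_est <- as.POSIXlt(created_at, tz = "EST")$hour

# Share of tweets falling in each hour of the day
hour_share <- prop.table(table(hour_est))
print(hour_share)
```

Note that "EST" never switches to daylight time, so summer tweets will appear an hour earlier than they did on an East Coast clock. Swapping in a zone name such as "America/New_York" is one way to account for that, at the cost of the axis no longer being strictly EST.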

Sentiment Analysis

But what's the mood of his tweets? To get at this question, we can employ sentiment analysis. Now, there are a lot of ways to computationally gauge sentiment, but here I'm going to walk through one, using Saif Mohammad’s NRC Emotion Lexicon and Matthew Jockers' (hotly debated) syuzhet package. If you're not up on what was perhaps one of the most controversial digital humanities debates of the last several years, don't worry about it. We're going to skip the part where Jockers extracts plotlines, since that doesn't really apply to tweets.

The NRC Emotion Lexicon is a list of words associated with eight emotions: anger, fear, anticipation, trust, surprise, sadness, joy, and disgust. The syuzhet package's get_nrc_sentiment() function “returns a data frame in which each row represents a sentence from the original file. The columns include one for each emotion type as well as the positive or negative sentiment valence,” according to Jockers, except that in our case each row is a tweet. First, we need to install and load syuzhet:

install.packages("syuzhet")
library(syuzhet)
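Conceptually, lexicon-based scoring boils down to counting how many words in a text appear on each emotion's word list. The sketch below illustrates that idea in base R with a tiny made-up lexicon. Both toy_lexicon and score_text() are hypothetical names invented for this example, the word list is not the real NRC lexicon, and syuzhet's own tokenizing and matching may well differ in the details.

```r
# Hypothetical mini-lexicon for illustration only: NOT the real NRC word list
toy_lexicon <- data.frame(
  word    = c("angry", "disgraceful", "fear", "great", "honor", "threaten"),
  emotion = c("anger", "anger", "fear", "joy", "joy", "fear"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count lexicon hits per emotion in a single text
score_text <- function(text, lexicon) {
  # Lowercase, then split on anything that is not a letter or apostrophe
  tokens <- strsplit(tolower(text), "[^a-z']+")[[1]]
  # Look up every token; a word may map to more than one emotion
  matched <- unlist(lapply(tokens, function(tok) lexicon$emotion[lexicon$word == tok]))
  # Tabulate hits, keeping emotions with zero matches
  table(factor(matched, levels = sort(unique(lexicon$emotion))))
}

toy_scores <- score_text(
  "A disgraceful verdict! No wonder people are so angry.",
  toy_lexicon
)
print(toy_scores)
```

Because the score is a plain word count, longer tweets have more chances to rack up hits, and negation ("not angry") is ignored. That is worth keeping in mind when reading the results further down.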

Next, we extract the text column from the trumpTBL_Slim dataframe we created above:

tweet_text <- trumpTBL_Slim$text
head(tweet_text)
## [1] "Lightweight Senator Kirsten Gillibrand, a total flunky for Chuck Schumer and someone who would come to my office “begging” for campaign contributions not so long ago (and would do anything for them), is now in the ring fighting against Trump. Very disloyal to Bill &amp; Crooked-USED!"
## [2] "Despite thousands of hours wasted and many millions of dollars spent, the Democrats have been unable to show any collusion with Russia - so now they are moving on to the false accusations and fabricated stories of women who I don’t know and/or have never met. FAKE NEWS!"              
## [3] "Another false story, this time in the Failing @nytimes, that I watch 4-8 hours of television a day - Wrong!  Also, I seldom, if ever, watch CNN or MSNBC, both of which I consider Fake News. I never watch Don Lemon, who I once called the “dumbest man on television!” Bad Reporting."    
## [4] "Very little discussion of all the purposely false and defamatory stories put out this week by the Fake News Media. They are out of control - correct reporting means nothing to them. Major lies written, then forced to be withdrawn after they are exposed...a stain on America!"          
## [5] "Getting closer and closer on the Tax Cut Bill. Shaping up even better than projected. House and Senate working very hard and smart. End result will be not only important, but SPECIAL!"                                                                                                     
## [6] "Things are going really well for our economy, a subject the Fake News spends as little time as possible discussing! Stock Market hit another RECORD HIGH, unemployment is now at a 17 year low and companies are coming back into the USA. Really good news, and much more to come!"

Then, we extract the NRC sentiment scores for each tweet:

nrc_data <- get_nrc_sentiment(tweet_text)
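If this call stops with an error like invalid input ... in 'utf8towcs' (a reader reports exactly that in the comments), the usual culprit is a tweet containing emoji or other characters that tolower() can't handle on your system. One common workaround, sketched here on made-up strings rather than the real data, is to convert the text to plain ASCII with base R's iconv() before scoring it. The variable names raw_text and clean_text are just for this example.

```r
# Toy stand-in for tweet_text: the second element contains flag emoji
raw_text <- c(
  "Jobs are roaring back!",
  "RT #MakeAmericaGreatAgain \U0001f1fa\U0001f1f8"
)

# Convert to ASCII; sub = "" asks iconv() to drop bytes it cannot convert
clean_text <- iconv(raw_text, from = "UTF-8", to = "ASCII", sub = "")

# tolower() is the call that failed in the reported error
print(tolower(clean_text))
```

With the real data, that would mean running iconv() on tweet_text and passing the result to get_nrc_sentiment(). The trade-off is that this also strips legitimate non-ASCII characters such as curly apostrophes, which can change how a few words are matched, so treat it as a blunt instrument rather than a precise fix.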

We can take a peek at the tweets that scored high (here I'm saying a count greater than or equal to 3 is high) for each emotion by subsetting them from the larger set, like so:

Angry Tweets

angry_items <- which(nrc_data$anger >= 3)
tweet_text[angry_items] %>% head()
## [1] "LAST thing the Make America Great Again Agenda needs is a Liberal Democrat in Senate where we have so little margin for victory already. The Pelosi/Schumer Puppet Jones would vote against us 100% of the time. He’s bad on Crime, Life, Border, Vets, Guns &amp; Military. VOTE ROY MOORE!"  
## [2] "Democrats refusal to give even one vote for massive Tax Cuts is why we need Republican Roy Moore to win in Alabama. We need his vote on stopping crime, illegal immigration, Border Wall, Military, Pro Life, V.A., Judges 2nd Amendment and more. No to Jones, a Pelosi/Schumer Puppet!"      
## [3] "People who lost money when the Stock Market went down 350 points based on the False and Dishonest reporting of Brian Ross of @ABC News (he has been suspended), should consider hiring a lawyer and suing ABC for the damages this bad reporting has caused - many millions of dollars!"       
## [4] "After years of Comey, with the phony and dishonest Clinton investigation (and more), running the FBI, its reputation is in Tatters - worst in History! But fear not, we will bring it back to greatness."                                                                                      
## [5] "A disgraceful verdict in the Kate Steinle case! No wonder the people of our Country are so angry with Illegal Immigration."                                                                                                                                                                    
## [6] "After North Korea missile launch, it's more important than ever to fund our gov't &amp; military! Dems shouldn't hold troop funding hostage for amnesty &amp; illegal immigration. I ran on stopping illegal immigration and won big. They can't now threaten a shutdown to get their demands."

Anticipation Tweets

anticipation_items <- which(nrc_data$anticipation >= 3)
tweet_text[anticipation_items] %>% head()
## [1] "Another false story, this time in the Failing @nytimes, that I watch 4-8 hours of television a day - Wrong!  Also, I seldom, if ever, watch CNN or MSNBC, both of which I consider Fake News. I never watch Don Lemon, who I once called the “dumbest man on television!” Bad Reporting."    
## [2] "Things are going really well for our economy, a subject the Fake News spends as little time as possible discussing! Stock Market hit another RECORD HIGH, unemployment is now at a 17 year low and companies are coming back into the USA. Really good news, and much more to come!"         
## [3] "LAST thing the Make America Great Again Agenda needs is a Liberal Democrat in Senate where we have so little margin for victory already. The Pelosi/Schumer Puppet Jones would vote against us 100% of the time. He’s bad on Crime, Life, Border, Vets, Guns &amp; Military. VOTE ROY MOORE!"
## [4] "Big crowd expected today in Pensacola, Florida, for a Make America Great Again speech. We have done so much in so short a period of time...and yet are planning to do so much more! See you there!"                                                                                          
## [5] "A must watch: Legal Scholar Alan Dershowitz was just on @foxandfriends talking of what is going on with respect to the greatest Witch Hunt in U.S. political history. Enjoy!"                                                                                                                
## [6] "The Christmas Story begins 2,000 years ago with a mother, a father, their baby son and the most extraordinary gift of all—the gift of God’s love for all of humanity.\n\nWhatever our beliefs, we know that the birth of Jesus Christ and the story of his life... https://t.co/P94C3LjWlx"

Disgust Tweets

disgust_items <- which(nrc_data$disgust >= 3)
tweet_text[disgust_items] %>% head()
## [1] "Fake News CNN made a vicious and purposeful mistake yesterday. They were caught red handed, just like lonely Brian Ross at ABC News (who should be immediately fired for his “mistake”). Watch to see if @CNN fires those responsible, or was it just gross incompetence?"              
## [2] "People who lost money when the Stock Market went down 350 points based on the False and Dishonest reporting of Brian Ross of @ABC News (he has been suspended), should consider hiring a lawyer and suing ABC for the damages this bad reporting has caused - many millions of dollars!"
## [3] "A disgraceful verdict in the Kate Steinle case! No wonder the people of our Country are so angry with Illegal Immigration."                                                                                                                                                             
## [4] "We should have a contest as to which of the Networks, plus CNN and not including Fox, is the most dishonest, corrupt and/or distorted in its political coverage of your favorite President (me). They are all bad. Winner to receive the FAKE NEWS TROPHY!"                             
## [5] "Horrible and cowardly terrorist attack on innocent and defenseless worshipers in Egypt. The world cannot tolerate terrorism, we must defeat them militarily and discredit the extremist ideology that forms the basis of their existence!"                                              
## [6] "...LaVar, you could have spent the next 5 to 10 years during Thanksgiving with your son in China, but no NBA contract to support you. But remember LaVar, shoplifting is NOT a little thing. It’s a really big deal, especially in China. Ungrateful fool!"

Fear Tweets

fear_items <- which(nrc_data$fear >= 3)
tweet_text[fear_items] %>% head()
## [1] "Another false story, this time in the Failing @nytimes, that I watch 4-8 hours of television a day - Wrong!  Also, I seldom, if ever, watch CNN or MSNBC, both of which I consider Fake News. I never watch Don Lemon, who I once called the “dumbest man on television!” Bad Reporting."      
## [2] "Putting Pelosi/Schumer Liberal Puppet Jones into office in Alabama would hurt our great Republican Agenda of low on taxes, tough on crime, strong on military and borders...&amp; so much more. Look at your 401-k’s since Election. Highest Stock Market EVER! Jobs are roaring back!"        
## [3] "I had to fire General Flynn because he lied to the Vice President and the FBI. He has pled guilty to those lies. It is a shame because his actions during the transition were lawful. There was nothing to hide!"                                                                              
## [4] "The Kate Steinle killer came back and back over the weakly protected Obama border, always committing crimes and being violent, and yet this info was not used in court. His exoneration is a complete travesty of justice. BUILD THE WALL!"                                                    
## [5] "A disgraceful verdict in the Kate Steinle case! No wonder the people of our Country are so angry with Illegal Immigration."                                                                                                                                                                    
## [6] "After North Korea missile launch, it's more important than ever to fund our gov't &amp; military! Dems shouldn't hold troop funding hostage for amnesty &amp; illegal immigration. I ran on stopping illegal immigration and won big. They can't now threaten a shutdown to get their demands."

Joy Tweets

joy_items <- which(nrc_data$joy >= 3)
tweet_text[joy_items] %>% head()
## [1] "It was my great honor to celebrate the opening of two extraordinary museums-the Mississippi State History Museum &amp; the Mississippi Civil Rights Museum. We pay solemn tribute to our heroes of the past &amp; dedicate ourselves to building a future of freedom, equality, justice &amp; peace. https://t.co/5AkgVpV8aa"
## [2] "The Christmas Story begins 2,000 years ago with a mother, a father, their baby son and the most extraordinary gift of all—the gift of God’s love for all of humanity.\n\nWhatever our beliefs, we know that the birth of Jesus Christ and the story of his life... https://t.co/P94C3LjWlx"                                  
## [3] "This week, the Senate can join the House &amp; take a strong stand for the Middle Class families who are the backbone of America. Together, we will give the American people a big, beautiful Christmas present-a massive tax cut that lets Americans keep more of their HARD-EARNED MONEY! https://t.co/9jddEW2Oo5"         
## [4] "Departing @JBA_NAFW for St. Charles, Missouri to help push our plan for HISTORIC TAX CUTS across the finish line.\n\nA successful vote in the Senate this week will bring us one giant step closer to delivering an incredible victory for the American people!\nhttps://t.co/jR1DEUnm2h https://t.co/XF9sRwdV8u"            
## [5] "RT @FLOTUS: The decorations are up! @WhiteHouse is ready to celebrate! Wishing you a Merry Christmas &amp; joyous holiday season! https://t.co/…"                                                                                                                                                                            
## [6] "We should have a contest as to which of the Networks, plus CNN and not including Fox, is the most dishonest, corrupt and/or distorted in its political coverage of your favorite President (me). They are all bad. Winner to receive the FAKE NEWS TROPHY!"

Sad Tweets

sadness_items <- which(nrc_data$sadness >= 3)
tweet_text[sadness_items] %>% head()
## [1] "LAST thing the Make America Great Again Agenda needs is a Liberal Democrat in Senate where we have so little margin for victory already. The Pelosi/Schumer Puppet Jones would vote against us 100% of the time. He’s bad on Crime, Life, Border, Vets, Guns &amp; Military. VOTE ROY MOORE!"
## [2] "Democrats refusal to give even one vote for massive Tax Cuts is why we need Republican Roy Moore to win in Alabama. We need his vote on stopping crime, illegal immigration, Border Wall, Military, Pro Life, V.A., Judges 2nd Amendment and more. No to Jones, a Pelosi/Schumer Puppet!"    
## [3] "People who lost money when the Stock Market went down 350 points based on the False and Dishonest reporting of Brian Ross of @ABC News (he has been suspended), should consider hiring a lawyer and suing ABC for the damages this bad reporting has caused - many millions of dollars!"     
## [4] "Economy growing! Excluding hurricane effects, CEA estimates that real GDP growth would have been 3.9% in Q3.\n\nStock market at a new high, unemployment at a low. We are winning and TAX CUTS will shift our economy into high gear! https://t.co/HrKIF72VqE"                               
## [5] "Horrible and cowardly terrorist attack on innocent and defenseless worshipers in Egypt. The world cannot tolerate terrorism, we must defeat them militarily and discredit the extremist ideology that forms the basis of their existence!"                                                   
## [6] "ObamaCare premiums are going up, up, up, just as I have been predicting for two years. ObamaCare is OWNED by the Democrats, and it is a disaster. But do not worry. Even though the Dems want to Obstruct, we will Repeal &amp; Replace right after Tax Cuts!"

Surprise Tweets

surprise_items <- which(nrc_data$surprise >= 3)
tweet_text[surprise_items] %>% head()
## [1] "Yesterday, I was thrilled to be with so many WONDERFUL friends, in Utah’s MAGNIFICENT Capitol.\n\nIt was my honor to sign two Presidential Proclamations that will modify the national monuments designations of both Bears Ears and Grand Staircase Escalante...\nhttps://t.co/jiTHcPovCi https://t.co/dIpdAUVoRB"
## [2] "Good luck to @Joy_Villa on her decision to enter the wonderful world of politics. She has many fans!"                                                                                                                                                                                                              
## [3] "Wacky Congresswoman Wilson is the gift that keeps on giving for the Republican Party, a disaster for Dems. You watch her in action &amp; vote R!"                                                                                                                                                                  
## [4] "Money pouring into Insurance Companies profits, under the guise of ObamaCare, is over. They have made a fortune.\nDems must get smart &amp; deal!"                                                                                                                                                                 
## [5] "Can't believe I finally got a good story in the @washingtonpost. It discusses the enthusiasm of \"Trump\" voters through campaign...."                                                                                                                                                                             
## [6] "I hope Republican Senators will vote for Graham-Cassidy and fulfill their promise to Repeal &amp; Replace ObamaCare. Money direct to States!"

Trust Tweets

trust_items <- which(nrc_data$trust >= 3)
tweet_text[trust_items] %>% head()
## [1] "It was my great honor to celebrate the opening of two extraordinary museums-the Mississippi State History Museum &amp; the Mississippi Civil Rights Museum. We pay solemn tribute to our heroes of the past &amp; dedicate ourselves to building a future of freedom, equality, justice &amp; peace. https://t.co/5AkgVpV8aa"
## [2] "A big contingent of very enthusiastic Roy Moore fans at the rally last night. We can’t have a Pelosi/Schumer Liberal Democrat, Jones, in that important Alabama Senate seat. Need your vote to Make America Great Again! Jones will always vote against what we must do for our Country."                                    
## [3] "We believe that every American should stand for the National Anthem, and we proudly pledge allegiance to one NATION UNDER GOD! https://t.co/r2ITtWwfVs"                                                                                                                                                                      
## [4] "LAST thing the Make America Great Again Agenda needs is a Liberal Democrat in Senate where we have so little margin for victory already. The Pelosi/Schumer Puppet Jones would vote against us 100% of the time. He’s bad on Crime, Life, Border, Vets, Guns &amp; Military. VOTE ROY MOORE!"                                
## [5] "Today, our entire nation pauses to REMEMBER PEARL HARBOR—and the brave warriors who on that day stood tall and fought for America. \n\nGod Bless our HEROES who wear the uniform, and God Bless the United States of America. #PearlHarborRemembranceDay\U0001f1fa\U0001f1f8 https://t.co/qGhlsPlxtH"                        
## [6] "Yesterday, I was thrilled to be with so many WONDERFUL friends, in Utah’s MAGNIFICENT Capitol.\n\nIt was my honor to sign two Presidential Proclamations that will modify the national monuments designations of both Bears Ears and Grand Staircase Escalante...\nhttps://t.co/jiTHcPovCi https://t.co/dIpdAUVoRB"
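The eight subsetting snippets above all follow the same pattern, so they can be collapsed into a single lapply() call over the emotion columns. Here is a minimal sketch on toy data. The names toy_text, toy_nrc, and high_scoring, and the three-column layout, are invented for illustration.

```r
# Toy stand-ins for tweet_text and nrc_data (values invented)
toy_text <- c("tweet A", "tweet B", "tweet C", "tweet D")
toy_nrc <- data.frame(
  anger = c(3, 0, 4, 1),
  joy   = c(0, 5, 0, 3),
  trust = c(1, 3, 0, 2)
)

threshold <- 3

# For each emotion column, keep the texts whose score meets the threshold
high_scoring <- lapply(toy_nrc, function(scores) {
  toy_text[which(scores >= threshold)]
})
print(high_scoring)
```

Applied to the real objects, the equivalent would be something like lapply(nrc_data[, 1:8], function(scores) head(tweet_text[which(scores >= 3)])), which returns a named list with one element per emotion.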

Overall Emotions

Finally, let's get a sense of the overall emotions in Trump's tweets by plotting each emotion's share of the total emotion score across all the tweets:

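# Columns 1-8 of nrc_data hold the eight emotions. prop.table() divides
# every count by the grand total, so the bars are proportions that sum to 1.
barplot(
  sort(colSums(prop.table(nrc_data[, 1:8]))),
  horiz = TRUE,
  cex.names = 0.7,
  las = 1,
  col = heat.colors(8),
  main = "Emotions in Trump's tweets",
  xlab = "Proportion"
)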

[Figure: horizontal bar chart titled "Emotions in Trump's tweets", one bar per emotion]
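If the colSums(prop.table(...)) step in the plotting code feels opaque, this small sketch on a made-up three-emotion table shows what it computes. The toy_nrc values are invented for illustration.

```r
# Toy emotion counts for three tweets (values invented)
toy_nrc <- data.frame(
  anger = c(3, 0, 1),
  joy   = c(0, 2, 1),
  trust = c(1, 2, 0)
)

# Divide every cell by the grand total, then add up each column
emotion_share <- colSums(prop.table(toy_nrc))

# The same numbers, computed more directly
direct_share <- colSums(toy_nrc) / sum(toy_nrc)

print(sort(emotion_share))
```

In other words, each bar is that emotion's slice of all emotion-word hits in the data set, not the fraction of tweets expressing the emotion.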

And there you have it. Is this what you would have expected, or not? The temptation to draw inferences and theorize about this is great, but that's not my job here, so I'll leave that to you. Thanks for following along, and stay tuned for more from “Data Journalism in R.”

Jonathan D. Fitzgerald

3 thoughts on “Sentiment analysis of (you guessed it!) Donald Trump’s tweets”

  1. Its very interesting post and I have replicated the previous one and this one. But I am getting following error for nrc
    nrc_data <- get_nrc_sentiment(tweet_text)
    Error in tolower(char_v) :
    invalid input 'RT @EricTrump: #MakeAmericaGreatAgain 🇺🇸🇺🇸🇺🇸 https://t.co/b81NtQGhHa' in 'utf8towcs'
    Can you please help me in this regard

    1. Hi Zahid, thanks for following along, and sorry for the delay in responding. I’m afraid what you have there is a problem caused by unrecognized characters in that particular tweet. My suggestion would be to manually remove it from the data. That should solve the problem.
