Data Journalism in R Tutorials

How to Analyze Bluesky Posts and Trends with R

If the only things you’re doing on Bluesky are scrolling, liking and posting, then you are still riding a bike with training wheels. Hear me out.

There are several simple and free tools out there that let you take advantage of Bluesky’s secret weapon: its open-source skeleton.

A how-to

A few firehose ideas:

  • Pull thousands of posts to look for trends
  • See how key phrases change over time
  • See articles others are recommending

First, you’re probably thinking: why spend time doing this? We have this embedded idea that truly great content will always rise to the top. For a detailed discussion on why not, see these. But the short answer is: quality is part of the equation but not the whole story.

Wheel graphic showing content at the core but also user connections, functional connections, and product connections.
From Bharat Anand’s book “The Content Trap.”

You can’t download all two billion posts, but you can download a whole lot more than you ever could scrolling. And you can use that data to look for patterns. For example, what if we took a look at 100,000 posts from the last day?

library(bskyr)

# Search all posts ("*"), newest first, starting from a given time
latest_posts <- bs_search_posts(
  "*",
  sort = 'latest',
  since = '2025-10-10T00:00:00.000Z', # or whatever start time you want; hardcoded here in the exact ISO 8601 (UTC) format
  until = NULL,      # the filters below are left unset
  mentions = NULL,
  author = NULL,
  lang = NULL,
  domain = NULL,
  url = NULL,
  tag = NULL,
  cursor = NULL,
  limit = 100000,    # You will want to limit this!
  user = "walinchus.bsky.social",  # swap in your own handle
  pass = Sys.getenv("BSKY_PASS"),  # app password read from an environment variable
  auth = bs_auth("walinchus.bsky.social", Sys.getenv("BSKY_PASS")),
  clean = TRUE
)
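If you would rather not hardcode the start time, here is a minimal sketch of one way to build a timestamp in that same format with base R. The helper name `since_utc` and its arguments are my own, purely for illustration; it is not part of bskyr.

```r
# Hypothetical helper (not from bskyr): returns a UTC timestamp such as
# "2025-10-10T00:00:00.000Z" for a point `hours_back` hours before `now`.
since_utc <- function(hours_back = 24, now = Sys.time()) {
  start <- now - hours_back * 3600                    # POSIXct arithmetic is in seconds
  format(start, "%Y-%m-%dT%H:%M:%OS3Z", tz = "UTC")   # %OS3 = seconds with 3 decimals
}

# A timestamp for 24 hours ago, ready to pass as `since`
since_utc()
```

You could then write `since = since_utc()` in the search call instead of the fixed date.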

Analysis idea number one: what comes up most often? Here, we take out the most common words (stop words such as prepositions) and look only at the frequency of the remaining words.

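library(dplyr)
library(tidytext)

# tidytext's built-in stop_words list is English, so add Japanese and Portuguese lists
stop_words_ja <- tibble(word = stopwords::stopwords("ja", source = "marimo"))
stop_words_pt <- tibble(word = stopwords::stopwords("pt", source = "snowball"))
all_stop_words <- bind_rows(stop_words, stop_words_ja, stop_words_pt)

# latest_posts is the search result from above; this assumes it has a `text` column
post_words <- latest_posts %>% 
  unnest_tokens(word, text) %>%            # one row per word
  anti_join(all_stop_words, by = "word")   # drop the stop words

# The 100 most frequent remaining words
post_words %>% 
  count(word, sort = TRUE) %>% 
  slice_head(n = 100)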

The first rows of the resulting table:

word      n
<chr>     <int>
people    297
trump     258
time      215
2025      181
prize     178
10        176
day       164
peace     163
love      153
nobel     148

Idea number two: look at which words were important in each hour. To do this, we are going to employ a neat little trick called TF-IDF (term frequency-inverse document frequency). Check out Julia Silge’s work here for a full explanation, but essentially it asks: which words are unusually common in one chunk compared with all the other chunks?

It is really fascinating to see the effects of time zones, and when certain news broke!
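Here is a minimal sketch of the hourly TF-IDF idea using tidytext's `bind_tf_idf()`, where each hour is treated as one "document." The toy data and the column names `created_at` and `text` are assumptions for illustration; your downloaded posts may store the timestamp and text under different names.

```r
library(dplyr)
library(tidytext)

# Toy stand-in for downloaded posts; `created_at` and `text` are assumed column names
toy_posts <- tibble(
  created_at = c("2025-10-10T06:05:00.000Z", "2025-10-10T06:40:00.000Z",
                 "2025-10-10T09:15:00.000Z", "2025-10-10T09:50:00.000Z"),
  text = c("good morning coffee time", "coffee before the morning commute",
           "nobel peace prize announced", "the peace prize news is everywhere")
)

hourly_tf_idf <- toy_posts %>%
  mutate(hour = substr(created_at, 12, 13)) %>%  # characters 12-13 hold the UTC hour
  unnest_tokens(word, text) %>%                  # one row per word
  count(hour, word, name = "n") %>%              # word counts within each hour
  bind_tf_idf(word, hour, n) %>%                 # adds tf, idf and tf_idf columns
  arrange(desc(tf_idf))

hourly_tf_idf
```

Words that show up in every hour (like "the" in this toy example) get a TF-IDF of zero, while words concentrated in a single hour float to the top. That is why stop words mostly take care of themselves with this approach.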

Chart showing the TF-IDF of words in certain hours

Another idea: let’s say you want to see not just what people like, but what others are recommending to their friends. For example, this post doesn’t have many likes, but already has a lot of reposts. And so neat! I did not know this about hippos.
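One rough way to surface posts like that is to rank by reposts relative to likes. This is only a sketch on made-up data: the column names `like_count` and `repost_count` and the ratio itself are my assumptions, so check what your downloaded data actually contains.

```r
library(dplyr)

# Made-up engagement numbers; column names are assumed for illustration
toy_engagement <- tibble(
  uri          = c("post_a", "post_b", "post_c"),
  like_count   = c(500, 12, 40),
  repost_count = c(20, 30, 4)
)

most_shared <- toy_engagement %>%
  mutate(reposts_per_like = repost_count / (like_count + 1)) %>%  # +1 avoids dividing by zero
  arrange(desc(reposts_per_like))                                 # most "recommended" first

most_shared
```

On real data you would probably also want to filter out posts with only a handful of interactions, since tiny counts make the ratio jumpy.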


This is just scratching the surface. There’s a lot of really granular data here. You can have fun with this! HMU with more ideas!

Example of data from the Bluesky firehose

The code for this can be found here.

This mostly uses the bskyr #rstats package developed by Christopher Kenny. Check out his vignettes here.

This also looks cool: on his GitHub, David Hood shows how you can build a big selection of follower/following connections starting from a curated list of related people.

Disclaimers! I am not employed by Bluesky, nor do I know anyone who is. This is not an endorsement and I have not extensively tested how well this works. Please respect the rate limits; this prevents overwhelming the system. And big thanks to Mike Stucka (@mikestucka.bsky.social) and all the folks at IRE who encouraged me to test this out. I was very against signing up for yet another social media site until I saw what it could do!

Lucia Walinchus