
Analyzing gender differences in music themes and lyrics

Music is a form of self-expression, allowing for a range of human emotions and creativity that can unite people of diverse backgrounds. So it may come as a surprise to some that there are widespread gender differences in the music industry.

According to a diversity in music report released by the University of Southern California in January 2020, only about 21% of artists are women, as are 12% of songwriters and about 2% of producers. These statistics influence the way music is written, produced and released, regardless of the singer’s gender.

Unfortunately, there is not enough data collected about people identifying as trans and non-binary in the music industry.

As someone who is an enthusiastic music listener and who grew up with a musical background, understanding the inequalities in the sector is important to me. This passion led me to pick this topic for my Media Innovation master’s thesis at Northeastern University Journalism School. The overall inequality and lack of representation of women and non-binary individuals inspired me to study and analyze how this impacts the music industry and how music is written and produced.

Data Collection

The corpus included 64 pop songs from the Billboard Hot 100. Thirty of those songs were by male artists, 30 by female artists and four by non-binary artists or artists who use they/them pronouns in addition to she/her or he/him pronouns. From the “Billboard ‘The Hot 100’ Songs” dataset on Kaggle, which includes songs dating back to 1958, I picked relationship “breakup” songs from the past three decades.

To collect the number of pronouns in each song, I manually picked breakup songs from Billboard and collected the lyrics into a spreadsheet. Once those lyrics were saved, I imported the files into AntConc and searched through the breakup song lyrics. In AntConc, I used the advanced search option to search for the personal pronouns “I”, “you”, “me”, “he”, “she”, “we” and “us”. I saved the lyrics that included those words and cleaned the data in a spreadsheet. Data cleanup included cleaning up contractions and correcting misspelled words.
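The pronoun search described above can also be approximated in a few lines of code. The following is a minimal Python sketch, not the AntConc workflow I used: the function name count_pronouns and the sample lyric are made up for illustration, and it assumes contractions are handled in a separate cleanup step.

```python
import re
from collections import Counter

# The seven personal pronouns searched for in the article.
PRONOUNS = ("i", "you", "me", "he", "she", "we", "us")


def count_pronouns(lyrics: str) -> Counter:
    """Count whole-word, case-insensitive pronoun matches in one song's lyrics.

    Hypothetical helper for illustration; not part of AntConc.
    """
    # Tokens are runs of letters and apostrophes, so "I'm" stays a single
    # token and is NOT counted as "I" unless contractions are expanded first.
    tokens = re.findall(r"[a-z']+", lyrics.lower())
    counts = Counter(t for t in tokens if t in PRONOUNS)
    # Report every pronoun, including those that never appear.
    return Counter({p: counts.get(p, 0) for p in PRONOUNS})


# Made-up lyric line, for illustration only.
sample = "You left me and I know we were never us"
print(count_pronouns(sample))
```

Matching on whole tokens rather than substrings keeps “he” from being counted inside words such as “the” or “she”.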

The artists’, songwriters’ and producers’ gender identities were collected from their social media pages, Wikipedia pages, and other online articles and news sources. Lyrics came from lyric sites such as Genius and AZLyrics, and the songs were chosen from the Billboard Hot 100 charts. Each song I picked had lyrics and themes that referred in some way to a past relationship or an ex.

Data Analysis

To analyze gendered pronouns and themes in the songs, I collected online lyric data and entered it into AntConc and Voyant Tools, which allowed for lyrical analysis such as finding word combinations and counting pronoun usage. To visualize the pronouns, I used R, adapting Julia Silge’s R code from “Gender Roles with Text Mining and N-grams”. I also used Flourish to make a few other graphics.

After the words were collected and saved, I imported the data into R and, using the code from Julia Silge’s GitHub, created a few data visualizations of the pronouns that highlight which words came after them. I also imported the dataset into Flourish and created visualizations based on the number of times each pronoun appears.

Breakdown of common words used by female and male singers to describe their ex or how they feel after a breakup. Image credits: Samuel Chuan
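The idea behind “which words came after the pronouns” is a bigram count: look at every pair of adjacent words and keep the pairs that start with a pronoun. Below is a rough Python sketch of that idea. It is not Julia Silge’s R code; the function name words_after_pronouns and the sample lyric are assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

PRONOUNS = {"i", "you", "me", "he", "she", "we", "us"}


def words_after_pronouns(lyrics: str) -> dict:
    """Map each pronoun to a Counter of the words that directly follow it.

    Hypothetical helper for illustration only.
    """
    tokens = re.findall(r"[a-z']+", lyrics.lower())
    following = defaultdict(Counter)
    # Walk over adjacent token pairs (bigrams) and keep those led by a pronoun.
    # Note: pairs are formed across line breaks and punctuation as well.
    for first, second in zip(tokens, tokens[1:]):
        if first in PRONOUNS:
            following[first][second] += 1
    return dict(following)


# Made-up lyric line, for illustration only.
sample = "I miss you and I hate that you left, I miss us"
for pronoun, counts in words_after_pronouns(sample).items():
    print(pronoun, counts.most_common(2))
```

Running this separately on the lyrics of each artist group would give the raw counts behind a comparison chart like the one above.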

In addition to the word count, I used R to count the number of times each pronoun appeared in a song. This allowed me to see which pronoun was used the most by singers of each gender identity and whether there were any similarities or differences between the groups. Overall, the pronoun count was consistent across the groups, regardless of gender identity.
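Because the groups are different sizes (30, 30 and four songs), comparing raw totals can mislead; comparing each pronoun’s share of a group’s total is one way around that. Here is a small Python sketch of that comparison, assuming per-song counts are already available. The function name pronoun_shares and the numbers are placeholders, not my data or my R code.

```python
from collections import Counter


def pronoun_shares(songs):
    """Return, per artist group, each pronoun's share of that group's total.

    `songs` is a list of (group, Counter) pairs, one pair per song.
    Hypothetical helper for illustration only.
    """
    totals = {}
    for group, counts in songs:
        # Add this song's counts to the running total for its group.
        totals.setdefault(group, Counter()).update(counts)
    shares = {}
    for group, counts in totals.items():
        n = sum(counts.values())
        # Guard against a group whose songs contain no pronouns at all.
        shares[group] = {p: c / n for p, c in counts.items()} if n else {}
    return shares


# Placeholder counts, not the article's data.
songs = [
    ("female", Counter({"i": 6, "you": 4})),
    ("female", Counter({"i": 2, "you": 8})),
    ("male", Counter({"i": 3, "you": 1})),
]
print(pronoun_shares(songs))
```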

Theme and Lyrics Analysis

One of the analyses I did was to study the structure of each song, in which I found three major lyrical themes: lyrics about the singers themselves and how they feel, lyrics about their ex-partners, and lyrics about the relationship. This analysis allowed me to better understand the major themes in these breakup songs and whether there were any major differences among singers.

Once I got the breakdown of the song structure, I was able to analyze the major themes of the songs and what type of language the singers used to describe the relationship. 

When looking at the songs, there was a clear difference between how male singers and female singers talked about their exes. While male singers tended to use more negative words and curse words to describe their exes, female singers used positive and negative words equally.

After collecting and analyzing this data to account for the themes of the songs, I manually went through the lyrics and sorted the major themes into four categories: anger, self-improvement, sadness and partying/drinking. Self-improvement themes skewed toward female singers, while partying skewed toward male singers. The themes of anger and sadness appeared equally in songs regardless of gender identity; the main difference was in the language and words that were used.

Lastly, I looked at the Billboard Top 10, where the majority of the songs were by male artists. By looking at the Billboard Top 10 for the past month, I saw how gender representation plays a role in all aspects of music and at different levels. Regardless of the week, male artists dominated the charts, while female artists struggled to place more than five artists on the list. Unfortunately, there were no non-binary artists on the Billboard Top 100 chart throughout the semester.

After finishing this project, it was easy for me to see how gender inequality can affect how listeners express themselves and see themselves represented in music and in pop culture in general. Overall, gender representation is an issue in pop culture, and this might not change until we have more women and LGBTQ+ folks in power and in media.

Hopefully, the music industry will improve and become more inclusive of folks of different backgrounds. Until then, it’s important that we change ourselves and understand how pop culture influences our self-expression and views.
