The data for this project comes from a number of independent sources that were merged into a final dataset.
We worked with the sources below and created the visuals using a variety of programs and IDEs, including RStudio, Visual Studio Code, Jupyter Notebooks, Excel, and Adobe Premiere Pro:

| Source | How it was used |
| --- | --- |
| Billboard Year-End Hot 100 | Getting the data for the top songs through Wikipedia, and then importing into Excel. |
| LyricsGenius | Finding lyrics to each of the songs through this Python client for the Genius API. |
| NLTK | Matching SentimentIntensityAnalyzer scores to each song. |
| Kaggle | Matching song features defined by the Spotify API to each song. |
| Final Dataset | Merging the two previous datasets with a left join on the sentiment scores. |

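
For readers unfamiliar with left joins, the final merge step can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's actual code: it assumes the sentiment-scored songs are the left table and the Spotify features the right table, and that rows are matched on song title and artist. The column names (`title`, `artist`, `compound`, `danceability`), the helper `left_join`, and the sample rows are all hypothetical placeholders.

```python
# Hypothetical placeholder rows; these are NOT values from the project's dataset.
sentiment_rows = [
    {"title": "Song A", "artist": "Artist 1", "compound": 0.8},
    {"title": "Song B", "artist": "Artist 2", "compound": -0.3},
]
feature_rows = [
    {"title": "Song A", "artist": "Artist 1", "danceability": 0.7},
]


def left_join(left, right, keys):
    """Keep every row of `left`; attach the matching `right` columns,
    or None for those columns when `right` has no matching row."""
    # Index the right table by its key tuple for constant-time lookups.
    # If the right table repeats a key, the last row with that key wins.
    index = {tuple(row[k] for k in keys): row for row in right}
    # Columns contributed by the right table (everything except the keys).
    extra_cols = [c for c in (right[0] if right else {}) if c not in keys]

    joined = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        merged = dict(row)
        for col in extra_cols:
            merged[col] = match[col] if match else None
        joined.append(merged)
    return joined


# Assumed join keys: song title and artist.
final_rows = left_join(sentiment_rows, feature_rows, keys=("title", "artist"))
```

With a left join, every song that has a sentiment score stays in the result even when no Spotify features were found for it; those feature columns are simply left empty. With pandas, the same idea is usually expressed as `DataFrame.merge(..., how="left")`.
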
Above all, the people behind the programs are:
Special thanks to Cornell Data Journal for the platform.
For more details on the backend of this project, view our GitHub repository.