https://github.com/BaileeRice/Module20
Testing machine learning models to see if they can accurately determine the genre of songs
excel google-slides kmeans random-forest sql tableau
- Host: GitHub
- URL: https://github.com/BaileeRice/Module20
- Owner: BaileeRice
- Created: 2022-10-20T01:09:09.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2022-11-21T15:25:50.000Z (over 2 years ago)
- Last Synced: 2023-05-30T00:20:42.446Z (almost 2 years ago)
- Topics: excel, google-slides, kmeans, random-forest, sql, tableau
- Language: Jupyter Notebook
- Homepage:
- Size: 13.2 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
# Module20
## Segment #3

For this project, we wanted to see how effective a machine learning model would be at accurately predicting the genre of a song. We obtained our dataset from Kaggle (https://www.kaggle.com/datasets/thedevastator/popularity-of-spotify-top-tracks-by-genre?select=rap_playlist_tracks.csv), where it was one of the most popular datasets available, and decided to focus on four specific genres: Rock, R&B, Country, and Rap. These genres were chosen solely because they contained the most data collected by Spotify, as shown in the graph below.
The dataset includes many files with different data about each genre. We used only the playlist track data and focused on 11 specific attributes, as shown in the image below. We chose these attributes because we felt they would increase the accuracy of the model's predictions.

Using the K-means clustering method, we created an unsupervised machine learning model. The image below shows a snippet of the code we used for the model. The number of clusters corresponds to the four genres we focus on in this project.
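
For readers without access to the image, here is a minimal sketch of what a four-cluster K-means setup can look like. It is not the project's actual code: it assumes scikit-learn, uses randomly generated data in place of the playlist track data, and adds a scaling step that the README does not mention.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the playlist track data (assumption):
# 400 tracks, each described by 11 numeric attributes.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 11))

# Scale the attributes so that no single one dominates the
# Euclidean distances that K-means relies on (assumed step).
X_scaled = StandardScaler().fit_transform(X)

# One cluster per genre of interest: Rock, R&B, Country, Rap.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X_scaled)

# Number of tracks assigned to each of the four clusters.
print(np.bincount(cluster_labels, minlength=4))
```

Note that K-means only assigns cluster numbers. It has no notion of which cluster corresponds to which genre, so any mapping from cluster to genre has to be worked out afterwards.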

Although the unsupervised model was successful at showing patterns within the data as they relate to genre, it was not the best at accurately predicting genre from the 11 attributes in the table above. The image below shows one of the patterns found using this model. The Tableau links below let you manipulate the attributes and explore the patterns.
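
One way to quantify how well clusters line up with genres is to count genres per cluster and compute the cluster purity. The sketch below is illustrative only: purity is a metric chosen for this example, not one the project reports, and the cluster assignments and genre labels are made-up toy values.

```python
from collections import Counter

# Hypothetical toy data: the cluster each track landed in and its true genre.
clusters = [0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 3]
genres = ["Rock", "Rock", "Country", "Rap", "Rap", "R&B",
          "R&B", "R&B", "Country", "Country", "Rock", "Rap"]

# Count how often each (cluster, genre) pair occurs.
counts = Counter(zip(clusters, genres))

# Purity: the fraction of tracks whose genre matches the
# majority genre of the cluster they were assigned to.
majority_total = 0
for c in sorted(set(clusters)):
    per_genre = {g: n for (cl, g), n in counts.items() if cl == c}
    majority_total += max(per_genre.values())

purity = majority_total / len(clusters)
print(f"purity = {purity:.2f}")  # prints "purity = 0.67"
```

A purity well below 1.0 means the clusters mix several genres, which is consistent with clustering revealing patterns without reliably predicting genre.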

We then created a supervised version of the machine learning model and, upon testing, found that it had a balanced accuracy score of 72.6%. We also found that speechiness, danceability, and acousticness were the three biggest contributors to accurate genre prediction, as shown in the image below.
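
The sketch below shows how a balanced accuracy score and a ranking of feature contributions can be obtained. It assumes a scikit-learn random forest (suggested by the repository's `random-forest` topic, but not confirmed in this README) and uses synthetic data, so it will not reproduce the 72.6% figure or the attribute ranking reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in (assumption): 11 numeric attributes, 4 classes (genres).
X, y = make_classification(n_samples=800, n_features=11, n_informative=6,
                           n_redundant=0, n_classes=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Assumed model type; the hyperparameters here are illustrative.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Balanced accuracy is the mean of the per-class recalls, so each
# genre counts equally even if the classes differ in size.
score = balanced_accuracy_score(y_test, model.predict(X_test))
print(f"balanced accuracy: {score:.3f}")

# Impurity-based importances sum to 1; larger means a bigger contribution.
importances = model.feature_importances_
top3 = importances.argsort()[::-1][:3]
print("indices of the three most important attributes:", top3)
```

With the real data, the column indices in `top3` would map back to attribute names such as speechiness, danceability, and acousticness.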

Slides: https://docs.google.com/presentation/d/1y_DgnlQ9wYASojwS9idZCyP-hs8Qp631jknh44w6OYo/edit?usp=sharing
_______________________________________
# Tableau
Baseline Dashboard - https://public.tableau.com/app/profile/melanie.taylor6095/viz/SpotifyMLClassifications/Baseline_1?publish=yes
Track Breakdown Classification - https://public.tableau.com/app/profile/melanie.taylor6095/viz/SpotifyMLClassifications/SongClassificationBreakdown?publish=yes
_________________________________________
Baseline Attributes Story - https://public.tableau.com/app/profile/melanie.taylor6095/viz/SpotifyMLClassifications/BaselineAttributes?publish=yes
Results Story - https://public.tableau.com/app/profile/melanie.taylor6095/viz/SpotifyMLClassifications/Results?publish=yes