{"id":20757288,"url":"https://github.com/piebro/spotify-statistics","last_synced_at":"2025-04-29T10:36:55.525Z","repository":{"id":200539733,"uuid":"705745566","full_name":"piebro/spotify-statistics","owner":"piebro","description":"A website to generate detailed statistics from your entire spotify streaming history.","archived":false,"fork":false,"pushed_at":"2025-02-07T08:47:38.000Z","size":524,"stargazers_count":9,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-30T12:11:23.295Z","etag":null,"topics":["data-visualization","pyodide","spotify","statistics"],"latest_commit_sha":null,"homepage":"https://piebro.github.io/spotify-statistics","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/piebro.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-16T15:59:29.000Z","updated_at":"2025-03-26T11:56:11.000Z","dependencies_parsed_at":"2025-01-23T08:24:29.907Z","dependency_job_id":"95d88abe-74a3-4485-a384-3b2a58603693","html_url":"https://github.com/piebro/spotify-statistics","commit_stats":null,"previous_names":["piebro/spotify-statistics"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/piebro%2Fspotify-statistics","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/piebro%2Fspotify-statistics/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/piebro%2Fspotify-statistics/releases","manifests_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/piebro%2Fspotify-statistics/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/piebro","download_url":"https://codeload.github.com/piebro/spotify-statistics/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251484774,"owners_count":21596809,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-visualization","pyodide","spotify","statistics"],"created_at":"2024-11-17T09:41:19.516Z","updated_at":"2025-04-29T10:36:55.503Z","avatar_url":"https://github.com/piebro.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Personal Spotify Statistics\n\nI like statistics and listening to music, and I also enjoy the yearly Spotify Wrapped. After watching one at the end of each year, I am always interested in more in-depth statistics over multiple years. That's why I created a [website](https://piebro.github.io/spotify-statistics) and a Python script to generate personal Spotify usage statistics.\n\n\n![Average hours played per year and month](assets/avg_hours_played_per_year_month.png)\n\nThis works well because, a few years ago, the EU passed the [GDPR Act](https://www.wired.co.uk/article/what-is-gdpr-uk-eu-legislation-compliance-summary-fines-2018), which enables EU citizens to access all personal data a company has stored about you. Spotify stores a log of your listening history, including partial listens. This data log is a treasure if you are interested in your own listening behavior. 
You can request your `Extended streaming history` data at https://www.spotify.com/us/account/privacy/.\n\nThe website automatically generates many different statistics from your listening history data. The data is only processed in your browser and never leaves your computer.\n\nMore advanced and custom statistics can be generated using the Python scripts. The data is enriched with Wikidata and the Spotify API, for example genre of artists, album publishing year or artist popularity.\n\n![Percentage of rock, hiphop, pop songs played over time](assets/percentage_of_rock_hiphop_pop_songs_played_over_time.png)\n\n## Creating custom statistics\n\n### Setup Access to the Spotify API\n\n1. Go to `https://developer.spotify.com/dashboard/applications`\n2. Click \"Create an app\"\n3. Enter a name and description\n4. Click \"Create\"\n5. Click \"Edit settings\"\n6. Click \"Add new redirect URI\" and enter `http://localhost:8888/callback`\n7. Click \"Save\"\n8. Copy the client ID and client secret\n9. Create a `.env` file with the following variables:\n    ```bash\n    SPOTIFY_CLIENT_ID=\u003cspotify-client-id\u003e\n    SPOTIFY_CLIENT_SECRET=\u003cspotify-client-secret\u003e\n    ```\n10. Run `uv run src/add_refeshtoken_to_env.py` to add the Spotify refresh token to the `.env` file\n\n### Create the database\n\n```bash\nuv run src/create_db.py \"Path-To-Spotify-Extended-Streaming-History-Folder\"\nuv run src/enrich_with_internet_data.py\n```\n\n### Create the statistics\n\nThe easiest way to get started is to use the `getting-started.ipynb` notebook.\nThe documentation of the database and a lot of query examples are in `db_documentation.md`.\n\nIt also works quite well to generate new queries using Chat-Bots.\nYou can paste the `db_documentation.md` file and ask the bot to generate a query for a specific question or idea.\n\n## Some thoughts about the data\n\nI downloaded my own data several times over the past few months without any issues. 
However, the last time I downloaded it, there was some missing data for the year 2017. For this time period, the reason_end data field consistently showed none instead of the actual reason why the song ended. If you notice anything unusual in your statistics, it might be because the Spotify data export was inaccurate, and I would recommend re-downloading your data.\n\nThe data is divided into multiple JSON files, each approximately 10.5MB in size, containing the streaming history. This JSON is comprised of a lengthy list of objects, each representing a listening log. Additionally, there's a PDF file that offers explanations for each data field in various languages. Some preprocessing is performed to work with the data more easily.\n\nI initially used the `ts` (timestamp) as my standard time reference, but I noticed that there were instances where multiple songs were logged at the exact same time. I suspect this might have occurred due to a lack of internet connection at those moments. That's why I looked at the `offline_timestamp`. The time in this field is saved as a [Unix timestamp](https://www.unixtimestamp.com/). However, some entries in this field don't make sense (they were less than 100 and would be from the 1970s). To address this, I utilize the `offline_timestamp` if it seems plausible; otherwise, I revert to using the normal `ts` for the timestamp.\n\n## Similar Projects\n\n- [Awesome Spotify Stats](https://github.com/rimsiw/awesome-spotify-stats/blob/main/README.md)\n- [Analyzing Spotify stream history](https://ericchiang.github.io/post/spotify/)\n- [Should have been listening to Phoebe Bridgers](https://www.darrenshaw.org/blog/2023/01/05/should-have-been-listening-to-phoebe-bridgers.html)\n- [Your Spotify](https://github.com/Yooooomi/your_spotify)\n\n## Contributing\n\nContributions to this project are welcome. 
Feel free to report bugs, suggest ideas, or open pull requests.\n\n## Developing\n\n### Update the website data\n\n[uv](https://docs.astral.sh/uv/getting-started/installation/) is used in the project to run Python.\n\n```bash\n# Use the script to update the data in the folder website/assets\nuv run website/data_crunching.py \"Path-To-Spotify-Extended-Streaming-History-Folder\"\n\n# Run a simple Python server to view your stats in the browser\nuv run -m http.server\n\n# Open http://0.0.0.0:8000/ in your browser\n```\n\n### Formatting and linting\n\nThe project uses [Ruff](https://github.com/astral-sh/ruff) to format and lint the Python code.\n\n```bash\nuv run ruff check src/*.py --fix\nuv run ruff format src/*.py\n```\n\n[Prettier](https://prettier.io/playground/) is used for formatting the `website/index.js` file with a `print-width` of 120, a `tab-width` of 4, and single quotes. Additionally, I used [Stylelint](https://stylelint.io/demo/) for linting the `website/index.css` file.\n\n\n## Website Statistics\n\nThere is lightweight tracking for the website using Plausible. Anyone interested can view these statistics at https://plausible.io/piebro.github.io%2Fspotify-statistics. Note that only visitors without an ad blocker are counted, so these statistics underestimate the actual number of visitors. I would assume that a significant number of people visiting the site, including myself, have an ad blocker enabled.\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpiebro%2Fspotify-statistics","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpiebro%2Fspotify-statistics","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpiebro%2Fspotify-statistics/lists"}