{"id":22624674,"url":"https://github.com/barrettotte/anilist-ml","last_synced_at":"2026-05-15T20:31:33.524Z","repository":{"id":114088894,"uuid":"537955467","full_name":"barrettotte/anilist-ml","owner":"barrettotte","description":"Training a binary classifier model to predict if I would recommend an anime using my Anilist user data.","archived":false,"fork":false,"pushed_at":"2022-10-05T18:25:26.000Z","size":26714,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-07-16T22:18:59.496Z","etag":null,"topics":["anilist","binary-classification","data-visualization","machine-learning","scikit-learn"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/barrettotte.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-17T23:40:07.000Z","updated_at":"2022-10-06T15:13:15.000Z","dependencies_parsed_at":null,"dependency_job_id":"68cb97c7-0996-4a3f-a778-13a7cd09c6c3","html_url":"https://github.com/barrettotte/anilist-ml","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/barrettotte/anilist-ml","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/barrettotte%2Fanilist-ml","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/barrettotte%2Fanilist-ml/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/barrettotte%2Fanilist-ml/releases","manifests_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/barrettotte%2Fanilist-ml/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/barrettotte","download_url":"https://codeload.github.com/barrettotte/anilist-ml/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/barrettotte%2Fanilist-ml/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33078898,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-15T20:25:35.270Z","status":"ssl_error","status_checked_at":"2026-05-15T20:25:34.732Z","response_time":103,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anilist","binary-classification","data-visualization","machine-learning","scikit-learn"],"created_at":"2024-12-09T00:17:20.509Z","updated_at":"2026-05-15T20:31:33.508Z","avatar_url":"https://github.com/barrettotte.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# anilist-ml\n\nLearning the basics of machine learning with Anilist data\n\n## Summary\n\nI originally set out to make a model trained on my Anilist data that was able\nto predict the score (1-10) that I would probably give an anime.\n\nAt first I tried to make a regression model. 
I realized that since I was only rating 1-10,\nI would probably get better accuracy using a classifier model. Classifier models were definitely more accurate.\nBut the best I was able to pull off was a random forest classifier with an accuracy of ~38%.\n\nI think my data is subpar since I only have ~500 data points and generally skewed data.\nI tend to give an anime a score of 7 more than anything else and also tend to not be very critical.\nSo, entries with a score below 5 are very rare and I think it makes training more difficult.\n\nAfter messing around for a while I decided to switch to binary classification and change my target.\nWould I recommend or not recommend an anime?\nWith this change, my final result was a random forest classifier with an accuracy of ~84-85%.\n\nSomeone knowledgeable in machine learning could probably point out what I was doing wrong immediately.\nBut oh well, this was my first ML project and I have a lot to learn.\nThis wasn't exactly a win, but hopefully the next ML project goes better.\n\n## Results\n\nThe final model is located in [model.ipynb](model.ipynb).\n\n```txt\nLog loss = 4.934196584711905\nROC AUC Score = 0.8524492234169653\n\nClassification Report:\n              precision    recall  f1-score   support\n\n       False   0.826087  0.612903  0.703704        31\n        True   0.865169  0.950617  0.905882        81\n\n    accuracy                       0.857143       112\n   macro avg   0.845628  0.781760  0.804793       112\nweighted avg   0.854351  0.857143  0.849922       112\n```\n\nROC Plot\n\n![roc.png](roc.png)\n\n### Notebooks\n\n1. [fetch.ipynb](fetch.ipynb) - Fetches user and anime data using Anilist's GraphQL API.\n2. [clean.ipynb](clean.ipynb) - Cleans fetched data to prepare for data visualization and model training.\n3. [explore.ipynb](explore.ipynb) - Some data visualizations and general exploration of data.\n4. [model-select-reg.ipynb](model-select-reg.ipynb) - Testing out different regression models.\n5. 
[model-select-cls.ipynb](model-select-cls.ipynb) - Testing out different classifier models.\n6. [model.ipynb](model-final.ipynb) - Final model training, verification, and evaluation.\n\n## Data\n\n- fetched\n  - [data/anime-YYYYMMDD-raw.csv](data/anime-20220927-raw.csv) - raw Anilist anime data\n  - [data/user-YYYYMMDD-raw.csv](data/user-20220927-raw.csv) - raw Anilist user data\n- cleaned/enriched\n  - [data/anime-YYYYMMDD-clean.csv](data/anime-20220927-clean.csv) - cleaned anime data; useless columns dropped and missing data filled\n  - [data/user-YYYYMMDD-clean.csv](data/user-20220927-clean.csv) - cleaned user data; useless columns dropped\n  - [data/user-YYYYMMDD-enriched.csv](data/user-20220927-enriched.csv) - user data joined with anime data\n- regression\n  - [data/user-YYYYMMDD-reg-train.csv](data/user-20220927-reg-train.csv) - train data for regression models\n  - [data/user-YYYYMMDD-reg-valid.csv](data/user-20220927-reg-valid.csv) - validation data for regression models\n  - [data/user-YYYYMMDD-reg-test.csv](data/user-20220927-reg-test.csv) - test data for regression models\n- classification\n  - [data/user-YYYYMMDD-cls-train.csv](data/user-20220927-cls-train.csv) - train data for classifier models\n  - [data/user-YYYYMMDD-cls-valid.csv](data/user-20220927-cls-valid.csv) - validation data for classifier models\n  - [data/user-YYYYMMDD-cls-test.csv](data/user-20220927-cls-test.csv) - test data for classifier models\n\nI also made the Anilist anime data (as of 09/27/2022) available on \n[Kaggle](https://www.kaggle.com/datasets/barrettotte/anilistanimedata).\n\n## References\n\n- [Anilist Interactive GraphQL Tool](https://anilist.co/graphiql)\n- [Anilist GraphQL Documentation Explorer](https://anilist.github.io/ApiV2-GraphQL-Docs/)\n- [Coursera Machine Learning Specialization (Andrew Ng)](https://www.coursera.org/specializations/machine-learning-introduction)\n- [Hands-On Machine Learning with Scikit-Learn, Keras \u0026 TensorFlow. 
Geron](https://www.amazon.com/Hands-Machine-Learning-Scikit-Learn-TensorFlow/dp/1492032646)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbarrettotte%2Fanilist-ml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbarrettotte%2Fanilist-ml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbarrettotte%2Fanilist-ml/lists"}