{"id":19829609,"url":"https://github.com/debakarr/myanimelist-data-set-creator","last_synced_at":"2025-05-01T14:33:37.452Z","repository":{"id":28683582,"uuid":"117518013","full_name":"debakarr/myanimelist-data-set-creator","owner":"debakarr","description":"Collection of some simple python scripts to create https://myanimelist.net/ anime and user data set.","archived":false,"fork":false,"pushed_at":"2022-12-08T00:58:06.000Z","size":22334,"stargazers_count":42,"open_issues_count":14,"forks_count":6,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-04-06T14:51:15.527Z","etag":null,"topics":["anime","api","database","dataset","manga","myanimelist","scraping"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/debakarr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-01-15T08:27:49.000Z","updated_at":"2023-09-11T18:57:55.000Z","dependencies_parsed_at":"2022-07-27T16:33:40.125Z","dependency_job_id":null,"html_url":"https://github.com/debakarr/myanimelist-data-set-creator","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/debakarr%2Fmyanimelist-data-set-creator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/debakarr%2Fmyanimelist-data-set-creator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/debakarr%2Fmyanimelist-data-set-creator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/debakarr%2Fmyanimelist-data-set-creator/manifest
s","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/debakarr","download_url":"https://codeload.github.com/debakarr/myanimelist-data-set-creator/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251890349,"owners_count":21660503,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anime","api","database","dataset","manga","myanimelist","scraping"],"created_at":"2024-11-12T11:19:21.935Z","updated_at":"2025-05-01T14:33:36.074Z","avatar_url":"https://github.com/debakarr.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# myanimelist-data-set-creator\nA collection of simple Python scripts for creating anime and user datasets from https://myanimelist.net/.\n\n## This project is no longer maintained. Bulk requests are not encouraged, so the project has been discontinued.\n\n***\n\n## [Myanimelist Anime Dataset up to May 7, 2018](https://raw.githubusercontent.com/Dibakarroy1997/myanimelist-data-set-creator/master/Anime%20Dataset%20Generator%20Script/Dataset/myAnimeListDataset%20%5B07-05-2018%5D.csv) [This may take some time to load]\n\n### [For the latest dataset click here](https://docs.google.com/spreadsheets/d/1brguO5nGfXS-Fr1Xcf3pqPTQoBUPGLTYM_EMAA9yJFw/edit?usp=sharing) [Constantly updating]\n\n***\n\n**NOTE**: This page contains many GIFs, so it may take a long time to load. 
Please be patient.\n\n***\n\n# How to use Anime Dataset Generator\n\nThis script can be used to download an anime dataset from [**Myanimelist**](https://myanimelist.net/) using an unofficial MyAnimeList REST API, [**Jikan**](https://jikan.me/docs).\n\n#### Column metadata:\n\n* animeID: ID of the anime as it appears in the anime URL [https://myanimelist.net/anime/\u003cspan style=\"color:red\"\u003e**1**\u003c/span\u003e](https://myanimelist.net/anime/1)\n* name: title of the anime\n* premiered: when the anime premiered; the default format is season and year\n* genre: list of genres\n* type: type of anime (for example TV, Movie)\n* episodes: number of episodes\n* studios: list of studios\n* source: source of the anime (for example original, manga, game)\n* scored: score of the anime\n* scoredBy: number of members who scored the anime\n* members: number of members who added the anime to their list\n\n***\n\n### Using Heroku\n\n* Before starting, please watch this video - [Google Sheets and Python](https://youtu.be/vISRn5qFrkM). This setup uses that approach as a base and integrates it with Heroku.\n\n* First, visit [this link](https://console.developers.google.com/cloud-resource-manager) to create a project inside the Google Cloud resource manager.\n\n* Click **CREATE PROJECT**, then give it a name. [If the GIF below is low quality, click here](https://gfycat.com/gifs/detail/VibrantQuarterlyFieldmouse).\n\n\n![](demo/createProject.gif)\n\n* Next, enable the Google Sheets API for your project.\n\n![](demo/enableAPI.gif)\n\n* Then get the credential file. [If the GIF below is low quality, click here](https://gfycat.com/gifs/detail/InsecureExcellentImpala).\n\n![](demo/createClientScretJSON.gif)\n\n* [Get the files for deployment here](https://github.com/Dibakarroy1997/myanimelist-data-set-creator/tree/master/Anime%20Dataset%20Generator%20Script/Using%20Heroku).\n\n* Add client_secret.json and give it access to the spreadsheet. The spreadsheet must contain a header row, which you need to add yourself. 
[Watch how to do that here](https://youtu.be/M-q0ptxOJB0).\n\n* Before deploying to Heroku, you need to create an app. [If the GIF below is low quality, click here](https://gfycat.com/gifs/detail/AggressiveParallelDevilfish).\n\n![](demo/preDeploy.gif)\n\n* Finally, push to heroku master and start the worker dyno. [Watch how to do that here](https://youtu.be/BvlCLwEMKHg)\n\n**NOTE**: If the worker doesn't start on its own, you can start it using the following command: **heroku ps:scale worker=1**\n\n* Final Product:\n\n![](demo/herokuFinal.gif)\n\n***\n\n### Using your own PC/Laptop\n\n#### Syntax\n```\npython getAnime.py starting_index ending_index [output_file.csv]\n```\n\n\n#### Demo:\n\n![](demo/getAnime.gif)\n\n***\n\n# How to use User Dataset Generator\n\nThis script can be used to download a user dataset from [**Myanimelist**](https://myanimelist.net/) using an API, [**Kuristina**](https://github.com/TimboKZ/kuristina).\n\n#### Column metadata:\n\n* userID: MAL user ID\n* animeID: ID of the anime as it appears in the anime URL https://myanimelist.net/anime/ID\n* score: score given by the user to the anime with id = animeID (if the user hasn't scored the anime, this field is 0).\n\n#### Syntax\n```\npython getUser.py UserList.txt [User.csv]\n```\n\n**NOTE**: Make sure you have a **UserList.txt** file containing the user names. 
If you don't have one, use a scraper ([scrape from club]() or [scrape from post]()).\n\n#### How to create User List from forum post:\nFor this you need the topic ID.\nGo to [**MAL**](https://myanimelist.net/) -\u003e [**Community** -\u003e **Forums**](https://myanimelist.net/forum/) -\u003e **Select a forum**\n\nFor example, for the following forum links, the respective IDs are highlighted in bold:\n\n[https://myanimelist.net/forum/?topicid=1699126](https://myanimelist.net/forum/?topicid=1699126) -\u003e **1699126**\n\n[https://myanimelist.net/forum/?topicid=1696289](https://myanimelist.net/forum/?topicid=1696289) -\u003e **1696289**\n\nAfter getting the topic ID, you can use the **createUserListFromPost** script.\n\n###### Syntax:\n```\npython createUserListFromPost.py topicID [UserList.txt]\n```\n\n#### How to create User List from club:\nFor this you need the club ID.\nGo to [**MAL**](https://myanimelist.net/) -\u003e [**Community** -\u003e **Clubs**](https://myanimelist.net/forum/) -\u003e **Select a club**\n\nFor example, for the following club links, the respective IDs are highlighted in bold:\n\n[https://myanimelist.net/clubs.php?cid=72250](https://myanimelist.net/clubs.php?cid=72250) -\u003e **72250**\n\n[https://myanimelist.net/clubs.php?cid=32683](https://myanimelist.net/clubs.php?cid=32683) -\u003e **32683**\n\nAfter getting the club ID, you can use the **createUserListFromClub** script.\n\n###### Syntax:\n```\npython createUserListFromClub.py clubID [UserList.txt]\n```\n\n#### Demo:\n\n###### Create User List from forum\n\n![](demo/createUserListFromForum.gif)\n\n###### Create User List from club\n\n![](demo/createUserListFromClub.gif)\n\n###### Get user dataset\n\n![](demo/getUser.gif)\n\n***\n\n#### TO DO LIST\n* Scraping Locally ✔\n* Scraping using Heroku ✔\n* Creating Heroku Deploy Button 
⌛\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdebakarr%2Fmyanimelist-data-set-creator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdebakarr%2Fmyanimelist-data-set-creator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdebakarr%2Fmyanimelist-data-set-creator/lists"}