https://github.com/ukrbublik/malscan
MyAnimeList scanner, for recommender systems
- Host: GitHub
- URL: https://github.com/ukrbublik/malscan
- Owner: ukrbublik
- Created: 2017-01-16T16:13:54.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2018-07-22T20:38:20.000Z (over 6 years ago)
- Last Synced: 2024-05-02T01:38:34.178Z (7 months ago)
- Language: JavaScript
- Size: 75.2 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
# MyAnimeList scanner #
### About
Scans the MAL site for the data needed by the recommender system [You Can (Not) Recommend](https://github.com/ukrbublik/You-Can-Not-Recommend).

To speed things up, scanning runs in parallel:
- Each scanner instance can perform several HTTP requests at a time (queued).
- Several scanner instances can safely run together from many processes and PCs.
- Can use different "data providers": parsing the MAL site directly or through proxies, or unofficial MAL API servers (see `mal_api_server`). See classes `MalDataProvider` -> `MalParser`, `MalApiClient`.

Safe parallelization is implemented with the help of redis transactions (see class `MalBaseScanner`).
Scanned data is saved to a PostgreSQL db (see `data/db-schema.sql`, class `MalDataProcesser`).
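
The per-instance request queue mentioned in the list above can be illustrated with a small concurrency limiter. This is only a minimal sketch of the general technique: `RequestQueue`, `push` and `limit` are hypothetical names and are not taken from the malscan source, which may implement its queue differently.

```javascript
// Minimal sketch: run at most `limit` async jobs at once; the rest wait in a queue.
// `RequestQueue` and its members are hypothetical, not part of malscan.
class RequestQueue {
  constructor(limit) {
    this.limit = limit; // maximum number of jobs in flight
    this.active = 0;    // jobs currently running
    this.pending = [];  // jobs waiting for a free slot
  }

  // Schedule `job` (a function returning a promise).
  // The returned promise settles with the job's result or error.
  push(job) {
    return new Promise((resolve, reject) => {
      this.pending.push({ job, resolve, reject });
      this._next();
    });
  }

  // Start queued jobs while there are free slots.
  _next() {
    while (this.active < this.limit && this.pending.length > 0) {
      const { job, resolve, reject } = this.pending.shift();
      this.active++;
      Promise.resolve()
        .then(job)
        .then(resolve, reject)
        .finally(() => {
          // Free the slot and let the next queued job start.
          this.active--;
          this._next();
        });
    }
  }
}
```

A scanner could wrap each HTTP call in a function and hand it to `queue.push(...)`, so that no more than `limit` requests hit the site at the same time while the remaining ones wait their turn.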
### Using
- Install the PostgreSQL db schema `data/db-schema.sql`
- Set options in `config/config-scanner.js`
- Run `node index.js`
- Add tasks to redis manually: `rpush mal.queuedTasks `
- Watch the progress in the console logs
### Tasks
List of tasks to grab only new data:
- `GenresOnce` - grab genres, once
- `Animes_New` - grab new animes
- `AnimesUserrecs_New` - grab users' anime-to-anime recommendations
- `UserLogins_New` - grab user id <-> login pairs
- `UserLists_New` - grab user lists, only for users whose list has never been checked
- `UserProfiles_New` - grab user profile data, only for users whose profile has never been checked

List of tasks to check updates:
- `UserListsUpdated_Active` - check updates of active user lists, run frequently
- `UserListsUpdated_WithoutList` - check whether user lists have appeared, run rarely
- `UserListsUpdated_NonActive` - check updates of nonactive user lists, run rarely
- `UserLists_Updated` - grab updated user lists, after `UserListsUpdated_*`
- `AnimesUserrecs_All` - regrab users' anime-to-anime recommendations, run it rarely, e.g. once a week
- `UserProfiles_All` - just to update favs, run it very rarely!
- `Animes_All` - just to check possible updates of genres, relations, run it very rarely!

Special tasks to fix possible problems with login swaps, will be added automatically:
- `SpUserLogins_Re`
- `UserProfiles_Re`

### Todo
Adding tasks from a timer
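
One possible shape for such a timer is a schedule table plus a function that reports which tasks are due. The sketch below is an assumption, not existing malscan code: the task names come from the lists above, but `SCHEDULE`, `dueTasks` and every interval are hypothetical. Only "once a week" for `AnimesUserrecs_All` is suggested by the task descriptions.

```javascript
// Hypothetical schedule: task names are from the README,
// the intervals are assumed values chosen for illustration.
const HOUR = 60 * 60 * 1000;
const DAY = 24 * HOUR;

const SCHEDULE = {
  UserListsUpdated_Active: 1 * HOUR,   // "run frequently" (assumed interval)
  AnimesUserrecs_All: 7 * DAY,         // "once a week"
  UserListsUpdated_NonActive: 7 * DAY, // "run rarely" (assumed interval)
  Animes_All: 30 * DAY,                // "run it very rarely" (assumed interval)
};

// Return the names of tasks whose interval has elapsed since their last run.
// `lastRun` maps task name -> timestamp in ms; a task that has never run is always due.
function dueTasks(now, lastRun) {
  return Object.keys(SCHEDULE).filter((name) => {
    const last = lastRun[name];
    return last === undefined || now - last >= SCHEDULE[name];
  });
}
```

A periodic timer could call `dueTasks(Date.now(), lastRun)`, enqueue each returned task in the `mal.queuedTasks` redis list, and record the current time in `lastRun` for it.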