Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/againpsychox/opgg-data-miner
Toolkit for Op.GG data mining, including crawling pages. The project has an educational purpose: you might be better off using the Riot API than reusing data from aggregator websites like Op.GG.
data-mining league-of-legends leagueoflegends opgg scrapper
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/againpsychox/opgg-data-miner
- Owner: AgainPsychoX
- Created: 2023-06-14T19:46:18.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-05-17T16:14:11.000Z (8 months ago)
- Last Synced: 2024-05-17T17:30:50.336Z (8 months ago)
- Topics: data-mining, league-of-legends, leagueoflegends, opgg, scrapper
- Language: TypeScript
- Homepage:
- Size: 1.53 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Op.GG data miner toolkit
Toolkit for Op.GG data mining, including crawling pages. The project has an educational purpose: you might be better off using the Riot API than reusing data from aggregator websites like Op.GG.
### Usage
1. Install [Node](https://nodejs.org/en/download/) (LTS should be fine). Make sure `node` and `npm` are accessible via the `PATH` environment variable.
2. Clone the repository, then navigate with command prompt to the project root directory.
3. Use `npm install` to install dependencies.
4. Run it (multiple options):
   1. You can use `npm run cli:ts` to run it as TypeScript (`ts-node` mode); passing params should look like: `npm run cli:ts --- --help`.
   2. You can compile it (to JavaScript) by running `npm run build` once, then you can use `npm run cli:js` in a similar fashion as above.
5. (Optional) Use `npm link` (with admin privileges) to make the tool available as `opgg --help`.

#### Examples
```properties
# To collect games for a certain user, outputs `games.json`
opgg history euw Azzapp

# To collect data for all users (infinite process), see `cache` folder; stop with Ctrl+C
opgg spider euw Azzapp
# and to continue after crash/stopping
opgg spider continue
```

### To-do
+ Fix wiki scraper:
    + Shen & Kennen names are bugged/empty
    + Nunu is named differently in OpGG static data
+ progress bars
+ handle URLs
    + regex: `/(?:(\w+)\.)?op\.gg\/summoners?\/(?:(\w+)\/)?(?:userName=)?([^?#\/\s]*)/i` handles well:
        + `op.gg/summoners/euw/Azzapp`
        + `https://www.op.gg/summoners/euw/Azzapp`
        + `https://euw.op.gg/summoner/userName=AgainPsychoX`
        + `https://www.op.gg/summoners/euw/Azzapp/matches/ewOhykeZdeeskvBSovvxqie5BuF8-a1Z515jCKtAw2I%3D/1686681922000`
+ distribute work over multiple proxies to avoid 429 Too Many Requests
+ ...
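
The URL-handling item could look roughly like the sketch below. It reuses the regex from the list verbatim; everything else (the `parseSummonerUrl` helper, the `SummonerRef` shape, and the rule of ignoring a `www` subdomain) is a hypothetical illustration and not code from this repository.

```typescript
// Regex copied verbatim from the to-do list above.
const SUMMONER_URL =
  /(?:(\w+)\.)?op\.gg\/summoners?\/(?:(\w+)\/)?(?:userName=)?([^?#\/\s]*)/i;

// Hypothetical result shape, not taken from the toolkit's source.
interface SummonerRef {
  region: string | undefined;
  name: string;
}

// Hypothetical helper: extracts region and summoner name from an Op.GG URL.
function parseSummonerUrl(input: string): SummonerRef | undefined {
  const match = SUMMONER_URL.exec(input);
  if (match === null) return undefined;

  const subdomain: string | undefined = match[1];
  const pathRegion: string | undefined = match[2];
  const rawName: string | undefined = match[3];
  if (!rawName) return undefined;

  // Prefer the region from the path; fall back to the subdomain, but
  // ignore "www" because the regex captures it like any other subdomain.
  let region = pathRegion;
  if (
    region === undefined &&
    subdomain !== undefined &&
    subdomain.toLowerCase() !== 'www'
  ) {
    region = subdomain;
  }

  // Names may be percent-encoded; keep the raw text if decoding fails.
  let name = rawName;
  try {
    name = decodeURIComponent(rawName);
  } catch {
    // Malformed escape sequence: leave the name as captured.
  }

  return { region: region?.toLowerCase(), name };
}

// Try it against the URL shapes listed above.
for (const url of [
  'op.gg/summoners/euw/Azzapp',
  'https://www.op.gg/summoners/euw/Azzapp',
  'https://euw.op.gg/summoner/userName=AgainPsychoX',
]) {
  console.log(url, parseSummonerUrl(url));
}
```

Note that the first capture group matches any subdomain, so a `www.op.gg` link yields `www` there; the sketch therefore takes the region from the path when one is present and only falls back to the subdomain for old-style `euw.op.gg` links.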