https://github.com/ulyssear/scrap_metacritic
github-workflow nodejs puppeteer scraper
- Host: GitHub
- URL: https://github.com/ulyssear/scrap_metacritic
- Owner: ulyssear
- Created: 2022-12-30T10:33:01.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2023-01-04T13:49:28.000Z (over 3 years ago)
- Last Synced: 2025-01-28T15:39:51.237Z (over 1 year ago)
- Topics: github-workflow, nodejs, puppeteer, scraper
- Language: JavaScript
- Homepage:
- Size: 10.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Metacritic bot scraper
## Installation
1. Clone the repository
```bash
git clone https://github.com/ulyssear/scrap_metacritic
```
2. Install the dependencies with npm
```bash
npm install
```
3. Fix the Puppeteer setup by running the following command (in PowerShell or Bash, not cmd)
```bash
./bin/init
```
## Usage
| Parameter | Description | Default value |
| --- | --- | --- |
| bot_name | Name of the bot | 'metacritic' |
| date | Date of the data | A date in `YYYY-MM-DD` format, e.g. "2023-01-01" |
| root_path | Root path of the project | path.resolve('.') |
| data_directory | Directory where the data will be stored | 'data' |
| executable_path | Path to the executable | '' |
| os | Operating system | Current operating system |
| executable | Executable | 'chrome' |
| headless | Headless mode | 'true' |
| start | Index of first task | 0 |
| end | Index of last task | -1 |
Available browsers:
| Name | Value |
| --- | --- |
| Chrome | 'chrome' |
| Edge | 'edge' |
| Firefox | 'firefox' |
Available operating systems:
| Name | Value |
| --- | --- |
| Windows | 'windows' |
| Linux | 'linux' |
| Mac | 'mac' |
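The parameters above are passed as `--key=value` flags. As a rough sketch of how such flags could be merged with the documented defaults and checked against the allowed browser and operating system values, consider the following. The `parseArgs` helper, the `DEFAULTS` object and the validation rules are hypothetical and are not taken from the project's `index.js`; the `date` and `os` defaults are left out because the table does not pin them to a literal value.

```javascript
const path = require('path');

// Defaults copied from the parameter table; everything else here is an assumption.
const DEFAULTS = {
  bot_name: 'metacritic',
  root_path: path.resolve('.'),
  data_directory: 'data',
  executable_path: '',
  executable: 'chrome',
  headless: 'true',
  start: 0,
  end: -1,
};

const BROWSERS = ['chrome', 'edge', 'firefox'];
const SYSTEMS = ['windows', 'linux', 'mac'];

// Hypothetical helper: turn ['--executable=edge', ...] into an options object.
function parseArgs(argv) {
  const options = { ...DEFAULTS };
  for (const arg of argv) {
    const match = /^--([^=]+)=(.*)$/.exec(arg);
    if (!match) continue; // ignore anything that is not --key=value
    const [, key, raw] = match;
    // Strip one pair of surrounding quotes, in case the shell left them in place.
    const value = raw.replace(/^(["'])(.*)\1$/, '$2');
    options[key] = key === 'start' || key === 'end' ? Number(value) : value;
  }
  if (!BROWSERS.includes(options.executable)) {
    throw new Error(`Unknown browser: ${options.executable}`);
  }
  if (options.os !== undefined && !SYSTEMS.includes(options.os)) {
    throw new Error(`Unknown operating system: ${options.os}`);
  }
  return options;
}

// Example call with a fixed argument list instead of process.argv.slice(2).
console.log(parseArgs(['--executable="edge"', '--start=10']));
```

Rejecting unknown values early is a design choice of this sketch; the actual script may instead fall back to a default.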
Examples:
To run the script with Edge as the browser on the current operating system:
```bash
node index.js --executable="edge"
```
To run the script with headless Chrome on Linux:
```bash
node index.js --executable="chrome" --headless="true" --os="linux"
```
To run a chunk of tasks (from task index 10 to task index 20):
```bash
node index.js --start=10 --end=20
```
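To illustrate how the `start` and `end` parameters could select a chunk of tasks, here is a minimal sketch. It assumes that indices are zero-based, that `end` is inclusive, and that the default `-1` means "up to the last task"; none of this is confirmed by the README, and `selectTasks` is a hypothetical helper rather than a function from the project.

```javascript
// Hypothetical helper: pick the tasks between two indices.
// Assumptions: zero-based indices, inclusive `end`, and -1 meaning "up to the last task".
function selectTasks(tasks, start = 0, end = -1) {
  const last = end === -1 ? tasks.length - 1 : Math.min(end, tasks.length - 1);
  return tasks.slice(start, last + 1);
}

// Thirty placeholder tasks named task-0 ... task-29.
const tasks = Array.from({ length: 30 }, (_, i) => `task-${i}`);

console.log(selectTasks(tasks).length);         // 30 (defaults select everything)
console.log(selectTasks(tasks, 10, 20).length); // 11 (task-10 through task-20)
```

Under these assumptions, splitting a long run across several invocations amounts to choosing adjacent, non-overlapping index ranges.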