https://github.com/agateau/ghi-scraper
GitHub Issue Scraper
- Host: GitHub
- URL: https://github.com/agateau/ghi-scraper
- Owner: agateau
- License: apache-2.0
- Created: 2022-12-04T17:18:02.000Z (about 3 years ago)
- Default Branch: master
- Last Pushed: 2023-08-05T17:30:20.000Z (over 2 years ago)
- Last Synced: 2025-02-13T19:49:51.781Z (11 months ago)
- Topics: git-scraper
- Language: Python
- Homepage:
- Size: 159 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# GitHub Issue Scraper
A Python-based scraper that downloads all issues and pull requests from a GitHub repository and stores them as JSON files.
## Installation
```
pipx install git+https://github.com/agateau/ghi-scraper
```
## Usage
To avoid getting rate-limited, create a GitHub token (it only needs the "read repo" permission) and store it in `$GITHUB_TOKEN`.
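For illustration, here is a minimal sketch of how a script might turn `$GITHUB_TOKEN` into request headers for the GitHub API. The `build_headers` helper is hypothetical and is not taken from ghi-scraper's source; it only shows the general idea of sending the token when it is set and falling back to unauthenticated (more tightly rate-limited) requests otherwise.

```python
import os


def build_headers(env=None):
    """Build GitHub API request headers (hypothetical helper, not ghi-scraper's code)."""
    env = os.environ if env is None else env
    headers = {"Accept": "application/vnd.github+json"}
    token = env.get("GITHUB_TOKEN")
    if token:
        # Authenticated requests get a higher rate limit than anonymous ones.
        headers["Authorization"] = f"Bearer {token}"
    return headers
```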
Run the scraper:
```
ghi-scraper user/repo where/to/store/the/files
```
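Once the scraper has run, the output directory contains JSON files. The sketch below shows one way to load them back into Python. It assumes nothing about file names or directory layout beyond a `.json` extension, and `load_scraped` is a hypothetical helper, not part of ghi-scraper.

```python
import json
from pathlib import Path


def load_scraped(directory):
    """Map each JSON file under `directory` (relative path) to its parsed content.

    Hypothetical helper: the actual layout written by ghi-scraper is not assumed,
    so every *.json file found recursively is loaded.
    """
    root = Path(directory)
    items = {}
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as f:
            items[path.relative_to(root).as_posix()] = json.load(f)
    return items
```

For example, `load_scraped("where/to/store/the/files")` would return a dictionary keyed by relative file path.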
If you plan to run this regularly, you can use the `--since` argument to reduce traffic and avoid getting rate-limited:
```
# Scrape since an absolute date
ghi-scraper user/repo where/to/store/the/files --since 2022-12-02
# Scrape all changes of the last 2 days
ghi-scraper user/repo where/to/store/the/files --since 2d
# Scrape all changes of the last 6 hours
ghi-scraper user/repo where/to/store/the/files --since 6h
```
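To make the three `--since` forms above concrete, here is a rough sketch of how such a value could be resolved to an absolute UTC timestamp. This is an assumption about the semantics, not ghi-scraper's actual parser: `parse_since` is a hypothetical function, and it handles only the `YYYY-MM-DD`, `<n>d`, and `<n>h` forms shown in the examples.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches relative values such as "2d" or "6h" (only the forms shown above).
_RELATIVE = re.compile(r"(\d+)([dh])")


def parse_since(value, now=None):
    """Resolve a --since style value to an aware UTC datetime (hypothetical sketch)."""
    now = now or datetime.now(timezone.utc)
    match = _RELATIVE.fullmatch(value)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        delta = timedelta(days=amount) if unit == "d" else timedelta(hours=amount)
        return now - delta
    # Otherwise expect an absolute date; raises ValueError on anything else.
    return datetime.strptime(value, "%Y-%m-%d").replace(tzinfo=timezone.utc)
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to test.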