https://github.com/airsequel/airput
CLI tool for populating Airsequel with data. Includes a crawler for GitHub repos metadata.
- Host: GitHub
- URL: https://github.com/airsequel/airput
- Owner: Airsequel
- Created: 2020-06-13T12:01:58.000Z (over 5 years ago)
- Default Branch: main
- Last Pushed: 2024-02-20T17:07:24.000Z (over 1 year ago)
- Last Synced: 2025-01-13T16:28:20.708Z (9 months ago)
- Topics: airsequel, crawling, github, haskell
- Language: Haskell
- Homepage:
- Size: 83 KB
- Stars: 3
- Watchers: 3
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
README
# Airput
CLI tool for populating Airsequel with data.
Includes a crawler for metadata of GitHub repos.

## Usage

```txt
⬆️ Airput ⬆️

Usage: airput COMMAND

  CLI tool for populating Airsequel with data.

Available options:
  -h,--help                Show this help text

Available commands:
  upload                   Upload files to a database via the REST API. Expects
                           3 columns: `name`, `filetype`, and `content`.
  github-upload            Upload metadata for a single GitHub repo
  github-search            Search for GitHub repos and upload their metadata.

                           If several search queries are provided, they will be
                           executed one after the other.

                           WARNING: If a search returns more than 1000 repos,
                           the results will be truncated.

                           Good search options are:
                           - language:haskell
                           - stars:>=10
                           - stars:10..50
                           - created:2023-10
                           - archived:true
```

## TODOs

- Add column `is_private` and only crawl private repos if `--private` is passed
- Add subcommand to load list of repos from Airsequel and update them
- Move `bin-calculation.py` to Airsequel
- Store if account is a person or an organization
- Store all languages for a repo
- Repos created per week chart
- Add CLI flag to choose between `OverwriteRepo` and `AddRepo`

## Related

- [GrimoireLab] - Open source tools for software development analytics.
- [SEART GitHub Search Engine] - Platform to crawl, store, and present repos.

[GrimoireLab]: http://chaoss.github.io/grimoirelab/
[SEART GitHub Search Engine]: https://github.com/seart-group/ghs
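
## Example: splitting a search into smaller queries

The usage text warns that a search returning more than 1000 repos is truncated, and that several queries are executed one after the other. One possible way to work within that limit is to split a broad search into several narrower ones by star range. The sketch below is only an illustration: the function name `star_range_queries` and the boundary values are hypothetical and not part of Airput, and whether each range actually stays under 1000 results depends on the data, so the boundaries would need tuning. It only builds query strings from the `stars:10..50` and `stars:>=10` forms listed above; how they are passed to `airput github-search` is not shown here.

```python
def star_range_queries(base_query, boundaries):
    """Build one search query per star range.

    base_query: qualifiers shared by all queries, e.g. "language:haskell".
    boundaries: strictly ascending star counts (hypothetical values),
                e.g. [10, 50, 200].
    """
    if not boundaries:
        raise ValueError("boundaries must not be empty")
    if any(a >= b for a, b in zip(boundaries, boundaries[1:])):
        raise ValueError("boundaries must be strictly ascending")

    queries = []
    # Closed ranges between neighbouring boundaries; the upper end is
    # exclusive of the next boundary so that no repo is matched twice.
    for low, high in zip(boundaries, boundaries[1:]):
        queries.append(f"{base_query} stars:{low}..{high - 1}")
    # Open-ended range for everything at or above the last boundary.
    queries.append(f"{base_query} stars:>={boundaries[-1]}")
    return queries


if __name__ == "__main__":
    for query in star_range_queries("language:haskell", [10, 50, 200]):
        print(query)
```

With the boundaries `[10, 50, 200]`, this prints `language:haskell stars:10..49`, `language:haskell stars:50..199`, and `language:haskell stars:>=200`. Repos with fewer stars than the first boundary are deliberately left out, mirroring the `stars:>=10` example.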