Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dbreunig/git-scraper-extractor
Pull out versions of specific files from a gitscraping repo into individual files.
- Host: GitHub
- URL: https://github.com/dbreunig/git-scraper-extractor
- Owner: dbreunig
- License: mit
- Created: 2021-03-18T22:03:53.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2021-07-14T20:29:13.000Z (about 3 years ago)
- Last Synced: 2024-04-27T17:32:50.410Z (5 months ago)
- Topics: git-scraping
- Language: Ruby
- Homepage:
- Size: 10.7 KB
- Stars: 12
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
# git-scraper-extractor
`git-scraper-extractor` is a handy tool for your gitscraping repositories.
What is [gitscraping](https://simonwillison.net/2020/Oct/9/git-scraping/)? We'll let Simon Willison, who coined the term, explain:
>The internet is full of interesting data that changes over time. These changes can sometimes be more interesting than the underlying static data—The @nyt_diff Twitter account tracks changes made to New York Times headlines for example, which offers a fascinating insight into that publication’s editorial process.
>
>We already have a great tool for efficiently tracking changes to text over time: Git. And GitHub Actions (and other CI systems) make it easy to create a scraper that runs every few minutes, records the current state of a resource and records changes to that resource over time in the commit history.

`git-scraper-extractor` is a small tool for extracting the multiple versions of a file from your git repository into separate, timestamped files. After your gitscraping repository has been updating a JSON or CSV file for a while, use `git-scraper-extractor` to find each change and output that version into a separate file. Then load those files into the tool of your choice.
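To make the idea concrete, here is a minimal Ruby sketch of extracting each committed version of one file into timestamped files. It is not the tool's actual implementation: the helper names (`timestamped_name`, `git`, `extract_versions`), the output naming scheme, and the choice of `git log` plus `git show` are all assumptions made for illustration. It also assumes `git` is on your `PATH` and that `file` is given relative to the repository root.

```ruby
require "fileutils"
require "open3"
require "time"

# Hypothetical naming scheme: "stations.json" committed at
# 2021-03-18 22:03:53 UTC becomes "stations-20210318T220353Z.json".
def timestamped_name(file, time)
  ext  = File.extname(file)
  base = File.basename(file, ext)
  "#{base}-#{time.utc.strftime('%Y%m%dT%H%M%SZ')}#{ext}"
end

# Run a git command inside `repo` and return its stdout, raising on failure.
def git(repo, *args)
  out, err, status = Open3.capture3("git", "-C", repo, *args)
  raise "git #{args.join(' ')} failed: #{err}" unless status.success?
  out
end

# Write every added or modified version of `file` into `out_dir`,
# oldest first, and return the list of paths written.
def extract_versions(repo, file, out_dir)
  FileUtils.mkdir_p(out_dir)
  # One line per commit that touched the file: "<sha> <ISO 8601 commit date>".
  # --diff-filter=AM skips commits that deleted the file.
  log = git(repo, "log", "--reverse", "--diff-filter=AM",
            "--format=%H %cI", "--", file)
  log.each_line.map do |line|
    sha, iso = line.split
    target = File.join(out_dir, timestamped_name(file, Time.iso8601(iso)))
    # "<sha>:<path>" asks git for the file's contents as of that commit.
    File.write(target, git(repo, "show", "#{sha}:#{file}"))
    target
  end
end
```

With this sketch, `extract_versions("/path/to/repo", "data.json", "/path/to/output")` would write one file per commit that changed `data.json`. Two commits made within the same second would map to the same output name, so a more careful version might append the commit SHA.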
## Usage
It's simple. Clone this repo, `cd` into the directory and run:
`$ bundle install`
`$ ./git-scraper-extractor /path/to/repo /path/to/output`