https://github.com/azubieta/appimages.scraper

Search for AppImage releases over the web.

Last synced: 27 days ago

Host: GitHub
URL: https://github.com/azubieta/appimages.scraper
Owner: azubieta
License: gpl-3.0
Created: 2018-05-02T17:51:38.000Z (about 7 years ago)
Default Branch: master
Last Pushed: 2018-10-25T10:42:12.000Z (over 6 years ago)
Last Synced: 2025-03-28T05:05:39.095Z (about 1 month ago)
Language: Python
Size: 23.2 MB
Stars: 12
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-appimage - appimages.scraper - Search for AppImage releases over the web. (AppImage discovery / App scrapers)

README

# appimages.scraper
Search for AppImage releases over the web.

### Dependencies
* Python 3.6
* Scrapy

### Run
* Normal run: `scrapy crawl generic.crawler -a project_file=./projects/org.appimage.appimaged.json`
* Output results to JSON: `scrapy crawl appimage.github.io -o result.json -t json`

### Input
The scraper must be fed a `project_file`, a JSON-formatted file like the following:

```
{
"urls" : ["https://github.com/AppImage/AppImageKit/releases"]
}
```
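
As a quick sanity check before starting a crawl, a project file can be loaded and validated with the Python standard library. This is a minimal sketch and not code from the scraper itself: the helper name `load_project` is hypothetical, and the only thing assumed about the format is the `urls` list shown in the example above.

```python
import json


def load_project(text):
    """Parse project-file JSON and check that `urls` is a non-empty list of strings.

    Hypothetical helper for illustration; the scraper may validate differently.
    """
    project = json.loads(text)
    if not isinstance(project, dict):
        raise ValueError("project file must contain a JSON object")
    urls = project.get("urls")
    if not isinstance(urls, list) or not urls:
        raise ValueError("project file needs a non-empty 'urls' list")
    if not all(isinstance(url, str) for url in urls):
        raise ValueError("every entry in 'urls' must be a string")
    return project


example = '{"urls": ["https://github.com/AppImage/AppImageKit/releases"]}'
project = load_project(example)
print(project["urls"][0])
```

To check a file on disk, read it first and pass its contents to the helper, for example `load_project(open(path).read())`.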

**Missing fields?**

Sometimes authors don't provide good metadata about their project, so we can help by supplying preset values.
In the following example, look at the `presets` field and the `description` field inside it. It will be used
as a fallback value if the author forgets to fill in that field.

```
{
"urls" : ["https://github.com/AppImage/AppImageKit/releases"],
"presets": {
"id" : "org.appimage.appimaged",
"description" : {"null": "Daemon to monitor AppImage files in the user home dir."}
}
}
```
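
The fallback behaviour described here can be pictured as a simple merge: scraped values win, and a preset is used only when a field is missing or empty. The sketch below is an assumption about how such a merge might look, not the scraper's actual implementation. The function name `apply_presets` and the `name` field in the sample metadata are hypothetical.

```python
def apply_presets(scraped, presets):
    """Return a copy of `scraped` with missing or empty fields filled from `presets`."""
    merged = dict(scraped)
    for key, fallback in presets.items():
        # Only fall back when the author left the field out or empty.
        if not merged.get(key):
            merged[key] = fallback
    return merged


presets = {
    "id": "org.appimage.appimaged",
    "description": {"null": "Daemon to monitor AppImage files in the user home dir."},
}

# Hypothetical scraped metadata: a name is present, the description is not.
scraped = {"name": "appimaged", "description": None}

result = apply_presets(scraped, presets)
print(result["id"])
print(result["description"]["null"])
```

A field that the author did fill in is left untouched, so presets never override real metadata in this sketch.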

**Multiple applications released on a single page?**

No problem: use the `match` field. It expects a Python regex
that is used to match the right AppImage download links for the app you are scraping.

```
{
"urls" : ["https://github.com/AppImage/AppImageKit/releases"],
"match" : ".*\/appimagetool.*",
"presets": {
"id" : "org.appimagekit.appimaged",
"description" : {"null": "Daemon to monitor AppImage files in the user home dir."}
}
}
```
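
To see which links a given `match` value would select, the pattern can be tried against sample URLs with Python's `re` module. This is a minimal sketch under stated assumptions: the links are made-up `example.org` URLs, and whether the scraper applies the pattern with `re.match` or `re.search` is not stated in this README, so `match` is used here only for illustration.

```python
import json
import re

# The JSON escape `\/` decodes to a plain `/`, so the regex becomes `.*/appimagetool.*`.
project = json.loads(r'{"match": ".*\/appimagetool.*"}')
pattern = re.compile(project["match"])

# Hypothetical download links found on a single release page.
links = [
    "https://example.org/downloads/appimagetool-x86_64.AppImage",
    "https://example.org/downloads/appimaged-x86_64.AppImage",
    "https://example.org/downloads/AppRun-x86_64",
]

# Keep only the links that the pattern matches from the start of the string.
matching = [link for link in links if pattern.match(link)]
print(matching)
```

With these sample links only the `appimagetool` download is kept, which is the effect the `match` field is meant to have when one page hosts several applications.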
