Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/vorpalblade/any2rss
Framework to convert any website into RSS
- Host: GitHub
- URL: https://github.com/vorpalblade/any2rss
- Owner: VorpalBlade
- License: agpl-3.0
- Created: 2022-10-19T17:58:15.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-07-06T05:34:25.000Z (6 months ago)
- Last Synced: 2024-07-06T06:37:09.089Z (6 months ago)
- Topics: rss, rss-generator, scraper
- Language: Python
- Homepage:
- Size: 32.2 KB
- Stars: 2
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# any2rss: Create RSS feeds from broken websites
This program scrapes HTML and outputs RSS. It is made to work on very,
very broken websites, unlike projects such as [html2rss], which you should
use instead if that works for you.

Unlike other approaches that limit you by using a simplified DSL to
describe how to extract the data, this package gives you the full freedom
of Python. any2rss manages downloading, caching, and parsing, then hands off
to your code for the extraction. After that, any2rss handles sanitising the
HTML and generating the actual RSS document.

## Documentation
This project is in its very early stages and has almost no documentation.
Here is how to run the [example](src/any2rss/examples/the_whiteboard.py):

```console
$ # It is assumed you have set up a venv and installed this package using pip.
$ any2rss -m any2rss.examples.the_whiteboard
[...]
$
```

The idea is that you use a cron job to run any2rss and generate feeds
on your self-hosted web server, which you can then poll using your RSS
reader of choice.

[html2rss]: https://github.com/html2rss/html2rss
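
To make the split between "your extraction code" and "RSS generation" more concrete, here is a minimal, standard-library-only sketch of the general technique. It does not use any2rss's API, which is not documented here. Every name in it (`Item`, `LinkExtractor`, `extract_items`, `build_rss`, `SAMPLE_HTML`) is hypothetical and chosen for illustration only. It also skips the downloading, caching, and HTML sanitising steps that any2rss is described as handling.

```python
"""Illustrative sketch only: NOT any2rss's API. All names are hypothetical."""
from __future__ import annotations

import xml.etree.ElementTree as ET
from dataclasses import dataclass
from html.parser import HTMLParser

# A deliberately broken page: the first <a> is never closed.
SAMPLE_HTML = (
    '<p><a href="https://example.com/1">First post'
    '<a href="https://example.com/2">Second post</a>'
)


@dataclass
class Item:
    title: str
    link: str


class LinkExtractor(HTMLParser):
    """Site-specific extraction step: one feed item per <a href=...>."""

    def __init__(self) -> None:
        super().__init__()
        self.items: list[Item] = []
        self._href = None  # href of the anchor currently being read
        self._text: list[str] = []

    def _flush(self) -> None:
        # Emit the pending anchor (if it has both a link and text), then reset.
        title = "".join(self._text).strip()
        if self._href and title:
            self.items.append(Item(title, self._href))
        self._href, self._text = None, []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._flush()  # tolerate a missing </a> before the next anchor
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a":
            self._flush()


def extract_items(html: str) -> list[Item]:
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


def build_rss(title: str, link: str, items: list[Item]) -> str:
    """Generation step: serialise the items as an RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = f"Generated feed for {title}"
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item.title
        ET.SubElement(node, "link").text = item.link
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(build_rss("Example", "https://example.com/", extract_items(SAMPLE_HTML)))
```

Only the extractor class is specific to one site in this sketch. Serialisation is generic, which mirrors the division of labour the README describes.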