https://github.com/amdmi3/habrss
Proxy which merges, sanitizes and filters habr.ru RSS feeds
https://github.com/amdmi3/habrss
Last synced: 8 months ago
JSON representation
Proxy which merges, sanitizes and filters habr.ru RSS feeds
- Host: GitHub
- URL: https://github.com/amdmi3/habrss
- Owner: AMDmi3
- License: gpl-3.0
- Created: 2021-11-29T19:52:09.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2023-08-21T18:49:33.000Z (almost 3 years ago)
- Last Synced: 2025-01-18T04:44:03.573Z (over 1 year ago)
- Language: Python
- Size: 21.5 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: COPYING
Awesome Lists containing this project
README
# habrss
This is a simple proxy which merges and filters RSS feeds from
[habr.ru](https://habr.ru).
Motivation: habr/geektimes, which were useful sources of knowledge
once, are now clogged with low quality content and corporate blogs.
When reading a feed of articles degrades into skipping through the
garbage, automated solution comes in rescue. Apart from advanced
filtering, this project is capable of skipping duplicate posts when
being subscribed to multiple feeds.
## Usage
```
./habrss.py [-l host] [-p port] -f config
```
where
* `-l` specifies host to listen on (default 127.0.0.1, specify 0.0.0.0 to listen on all interfaces)
* `-p` specifies port to listen on (default 8080)
* `-f` specifies path to config file
The script serves HTTP on a given host/port. Available pages include
a list of configured feeds, feeds themselves and a statistics on
posts blocked and passed by filters.
## Config
Example:
```
- name: myfeed
urls:
- https://habr.com/ru/rss/flows/develop/all/?fl=ru
- https://habr.com/ru/rss/flows/popsci/all/?fl=ru
exclude:
- title: .*iOS.*
- category: Natural Language Processing
- creator: ph_piter
include:
- title: .*python.*
- category: Python
- creator: Zelenyikot
- name: anotherfeed
urls:
...
```
Each config entry specifies an output feed served by the script.
Each such feed may aggregate one or more (in which case duplicates
are removed) habr.ru feeds speficied as a set of `urls`. It's also
possible to specify `exclude` filters which match post title, author,
or a category with a case-insenesitive regexp (so if you want a
partial match, use e.g. `title: .*python.*`). `include` filters
override `exclude` filters.
## License
GPLv3 or later, see [COPYING](COPYING).