{"id":26978738,"url":"https://github.com/reuteras/rssfixer","last_synced_at":"2025-04-03T13:39:49.948Z","repository":{"id":153233675,"uuid":"628573188","full_name":"reuteras/rssfixer","owner":"reuteras","description":"Generate RSS for blogs without a feed.","archived":false,"fork":false,"pushed_at":"2024-04-14T15:52:19.000Z","size":1382,"stargazers_count":7,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-04-17T21:16:24.576Z","etag":null,"topics":["rss","rss-generator"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/rssfixer/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/reuteras.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-04-16T11:28:25.000Z","updated_at":"2024-04-19T02:26:58.593Z","dependencies_parsed_at":"2023-12-25T14:26:58.886Z","dependency_job_id":"76213e1d-d00d-4036-a134-4102e8f5f94b","html_url":"https://github.com/reuteras/rssfixer","commit_stats":null,"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reuteras%2Frssfixer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reuteras%2Frssfixer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reuteras%2Frssfixer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reuteras%2Frssfixer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/reuteras","download_url":"https://codeload.github.com/reutera
s/rssfixer/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247011521,"owners_count":20868906,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["rss","rss-generator"],"created_at":"2025-04-03T13:39:49.214Z","updated_at":"2025-04-03T13:39:49.930Z","avatar_url":"https://github.com/reuteras.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# rssfixer\n\n\u003c!-- CODE:BASH:START --\u003e\n\u003c!-- echo '[![GitHub Super-Linter](https://github.com/reuteras/rssfixer/actions/workflows/linter.yml/badge.svg)](https://github.com/marketplace/actions/super-linter)' --\u003e\n\u003c!-- echo '![PyPI](https://img.shields.io/pypi/v/rssfixer?color=green)' --\u003e\n\u003c!-- echo '[![CodeQL](https://github.com/reuteras/rssfixer/workflows/CodeQL/badge.svg)](https://github.com/reuteras/rssfixer/actions?query=workflow%3ACodeQL)' --\u003e\n\u003c!-- echo '[![Coverage](https://raw.githubusercontent.com/reuteras/rssfixer/main/resources/coverage.svg)](https://github.com/reuteras/rssfixer/)' --\u003e\n\u003c!-- echo '[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/reuteras/rssfixer/main.svg)](https://results.pre-commit.ci/latest/github/reuteras/rssfixer/main)' --\u003e\n\u003c!-- if jq '.metrics._totals | .\"SEVERITY.HI\"' resources/bandit.json|grep -vE '^0' \u003e /dev/null;then cl='red';elif jq '.metrics._totals' resources/bandit.json|grep \"SEVERITY\"|grep -E ' 0,'|wc -l|grep -vE '4$' \u003e /dev/null;then cl='yellow';else cl='green';fi echo -n '[![security: 
bandit](https://img.shields.io/badge/security-bandit-' + $cl + '.svg)](https://github.com/PyCQA/bandit)' --\u003e\n\u003c!-- CODE:END --\u003e\n\u003c!-- OUTPUT:START --\u003e\n\u003c!-- ⚠️ This content is auto-generated by `markdown-code-runner`. --\u003e\n[![GitHub Super-Linter](https://github.com/reuteras/rssfixer/actions/workflows/linter.yml/badge.svg)](https://github.com/marketplace/actions/super-linter)\n![PyPI](https://img.shields.io/pypi/v/rssfixer?color=green)\n[![CodeQL](https://github.com/reuteras/rssfixer/workflows/CodeQL/badge.svg)](https://github.com/reuteras/rssfixer/actions?query=workflow%3ACodeQL)\n[![Coverage](https://raw.githubusercontent.com/reuteras/rssfixer/main/resources/coverage.svg)](https://github.com/reuteras/rssfixer/)\n[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/reuteras/rssfixer/main.svg)](https://results.pre-commit.ci/latest/github/reuteras/rssfixer/main)\n\n\u003c!-- OUTPUT:END --\u003e\n\nA tool to generate an [RSS][rss] feed from some [WordPress][wor] blogs and other sources that for some reason don't generate their own feeds. This tool uses [BeautifulSoup][bso] to parse the HTML and [feedgen][fge] to generate the feed. I created this tool to be able to follow news from companies that have forgotten the usefulness of RSS.\n\n## Installation\n\nCreate a virtual environment and run `python3 -m pip install rssfixer`. A full example is below.\n\n```bash\npython3 -m venv venv\nsource venv/bin/activate\npython3 -m pip install rssfixer\n```\n\n## Supported page types\n\nI've expanded the tool to support blogs that I like to follow. 
If you have suggestions to add/change functionality please open an [issue][iss] or start a new [discussion][dis].\n\nThe basic formats of supported web pages are:\n\n- `--list` - links are in a simple ul-list\n- `--json` - links, titles and sometimes descriptions are available in a JSON structure\n- `--html` - links and titles can be found by some unique HTML element\n- `--release` - similar to `--html` except there are no links and you have to specify a target URL\n\nDuring testing it is useful to use the `--stdout` option to see the generated feed. When I have time (and enough motivation) I might write a tool to try and find the right combination of options for a specified URL.\n\n### Simple list\n\nAn example that generates a feed for [nccgroup][ncc], which has its links in a simple ul-list, by using the `--list` option:\n\n```bash\n$ rssfixer --title nccgroup --list https://research.nccgroup.com/\nRSS feed created: rss_feed.xml\n```\n\nYou can specify a filename and silence output:\n\n```bash\nrssfixer --title nccgroup --output nccgroup.xml --quiet --list https://research.nccgroup.com/\n```\n\nThe resulting file is available [here][exa] as an example.\n\nMost of the time you would run the script from crontab to keep the feed updated. Here is an example with a venv in _/home/user/src/rssfixer_.\n\n```bash\n32 * * * *      /home/user/src/rssfixer/bin/rssfixer --title nccgroup --output /var/www/html/feeds/nccgroup.xml --quiet --list https://research.nccgroup.com\n```\n\n### JSON\n\nSome blogs like [truesec.com][tru] have all blog links in a JSON object. You can use the `--json` option to parse the JSON object and generate a feed. 
The same is true for Apple's [security blog][app] page.\n\nAn example for [Apple][app]:\n\n```bash\nrssfixer --title \"Apple Security\" --output apple.xml --quiet --json --json-entries blogs --json-url slug --base-url https://security.apple.com/blog/ https://security.apple.com/blog\n```\n\nIn this example `--json-entries blogs` specifies that blog entries are located in a key called __blogs__, and `--json-url slug` specifies that URLs are available in a key called __slug__. Since that key only contains the slug and not a full URL, we specify the base URL of the blog with `--base-url https://security.apple.com/blog/`.\n\nAn example for [truesec.com][tru]:\n\n```bash\nrssfixer --title Truesec --json --json-description preamble --quiet --output truesec.xml https://www.truesec.com/hub/blog\n```\n\nHere we must specify `--json-description preamble` to find the description or summary of the blog post.\n\n### General HTML\n\nPages with a more general HTML structure can be parsed with the `--html` option. You can specify the HTML tag for the entries, the URL and the title of the blog entry.\n\nAn example for [tripwire.com][tri]:\n\n```bash\nrssfixer --title Tripwire --output tripwire.xml --quiet --html --base-url https://www.tripwire.com http://www.tripwire.com/state-of-security\n```\n\n### Release\n\nCheck for one element type on release pages like [SQLite][sql] (here `h3`) and generate an RSS feed with links to the download page (the `--release-url` argument is required). 
Easy way to get notified when a new version is released.\n\n```bash\nrssfixer --release --output sqlite.xml --release-entries h3 --release-url https://sqlite.org/download.html https://sqlite.org/changes.html\n```\n\n### Usage\n\nCommand-line options (updated on commit by [markdown-code-runner][mcr]):\n\n\u003c!-- CODE:BASH:START --\u003e\n\u003c!-- echo '```Text' --\u003e\n\u003c!-- poetry run rssfixer --help --\u003e\n\u003c!-- echo '```' --\u003e\n\u003c!-- CODE:END --\u003e\n\n\u003c!-- OUTPUT:START --\u003e\n\u003c!-- ⚠️ This content is auto-generated by `markdown-code-runner`. --\u003e\n```Text\nusage: rssfixer [-h] (--html | --json | --list | --release) [--version]\n                [--atom] [--base-url BASE_URL] [--release-url RELEASE_URL]\n                [--release-entries RELEASE_ENTRIES]\n                [--html-entries HTML_ENTRIES]\n                [--html-entries-class HTML_ENTRIES_CLASS]\n                [--html-url HTML_URL] [--html-title HTML_TITLE]\n                [--html-title-class HTML_TITLE_CLASS]\n                [--title-filter TITLE_FILTER]\n                [--html-description HTML_DESCRIPTION]\n                [--html-description-class HTML_DESCRIPTION_CLASS]\n                [--json-entries JSON_ENTRIES] [--json-url JSON_URL]\n                [--json-title JSON_TITLE]\n                [--json-description JSON_DESCRIPTION] [--output OUTPUT]\n                [--title TITLE] [--user-agent USER_AGENT]\n                [--filter-type FILTER_TYPE] [--filter-name FILTER_NAME] [-q]\n                [-d] [--stdout]\n                url\n\nGenerate RSS feed for blog that don't publish a feed. Default is to find links\nin a simple \u003cul\u003e-list. 
Options are available to find links in other HTML\nelements or JSON strings.\n\npositional arguments:\n  url                   URL for the blog\n\noptions:\n  -h, --help            show this help message and exit\n  --html                Find entries in HTML\n  --json                Find entries in JSON\n  --list                Find entries in HTML \u003cul\u003e-list (default)\n  --release             Find releases in HTML\n  --version             show program's version number and exit\n  --atom                Generate Atom feed\n  --base-url BASE_URL   Base URL for the blog\n  --release-url RELEASE_URL\n                        Release URL for downloads\n  --release-entries RELEASE_ENTRIES\n                        Release selector for entries\n  --html-entries HTML_ENTRIES\n                        HTML selector for entries\n  --html-entries-class HTML_ENTRIES_CLASS\n                        Class name for entries\n  --html-url HTML_URL   HTML selector for URL\n  --html-title HTML_TITLE\n                        HTML selector for title\n  --html-title-class HTML_TITLE_CLASS\n                        Flag to specify title class (regex)\n  --title-filter TITLE_FILTER\n                        Filter for title, ignore entries that don't match\n  --html-description HTML_DESCRIPTION\n                        HTML selector for description\n  --html-description-class HTML_DESCRIPTION_CLASS\n                        Flag to specify description class (regex)\n  --json-entries JSON_ENTRIES\n                        JSON key for entries (default: 'entries')\n  --json-url JSON_URL   JSON key for URL (default: 'url')\n  --json-title JSON_TITLE\n                        JSON key for title\n  --json-description JSON_DESCRIPTION\n                        JSON key for description\n  --output OUTPUT       Name of the output file\n  --title TITLE         Title of the RSS feed (default: \"My RSS Feed\")\n  --user-agent USER_AGENT\n                        User agent to use for HTTP requests\n  
--filter-type FILTER_TYPE\n                        Filter web page\n  --filter-name FILTER_NAME\n                        Filter web page\n  -q, --quiet           Suppress output\n  -d, --debug           Debug selection\n  --stdout              Print to stdout\n```\n\n\u003c!-- OUTPUT:END --\u003e\n\n## Command-line examples for blogs\n\n```bash\n# Apple Security Blog\n# Url: https://security.apple.com/blog/\nrssfixer --title \"Apple Security\" --output apple.xml --quiet --json --json-entries blogs --json-url slug --base-url https://security.apple.com/blog/ https://security.apple.com/blog\n\n# nccgroup\n# Url: https://research.nccgroup.com/\nrssfixer --title nccgroup --output nccgroup.xml --quiet --list https://research.nccgroup.com\n\n# Tripwire\n# Url: https://www.tripwire.com/state-of-security\nrssfixer --title Tripwire --output tripwire.xml --quiet --html --base-url https://www.tripwire.com http://www.tripwire.com/state-of-security\n\n# TrueSec\n# Url: https://www.truesec.com/hub/blog\nrssfixer --title Truesec --output truesec.xml --quiet --json --json-description preamble https://www.truesec.com/hub/blog\n\n# SQLite\n# Url: https://sqlite.org/changes.html\nrssfixer --title SQLite --release --release-entries h3 --release-url https://sqlite.org/download.html https://sqlite.org/changes.html\n\n# Nucleus\n# https://nucleussec.com/category/cisa-kev\nrssfixer --title \"Nucleus CISA KEV\" --output nucleus.xml  --html --filter-type div --filter-name recent-post-widget --html-entries div --html-title div --html-title-class \"post-desc\" --title-filter KEV https://nucleussec.com/category/cisa-kev\n\n# NCSC-SE\n# https://www.ncsc.se/publikationer/\nrssfixer --html --filter-type div --filter-name 'page-container' --html-entries div --html-entries-class \"news-text\" --html-title h2 --html-title-class \"\" --html-description p --html-description-class \"\" --html-url a --base-url https://www.ncsc.se --stdout  --atom --title \"Feed for NCSC-SE\" 
https://www.ncsc.se/publikationer/\n```\n\nIf you have other example use cases, please add them under [show usage examples][sue] in discussions.\n\n\n  [app]: https://security.apple.com/blog/\n  [bso]: https://www.crummy.com/software/BeautifulSoup/\n  [dis]: https://github.com/reuteras/rssfixer/discussions\n  [exa]: https://github.com/reuteras/rssfixer/blob/main/src/tests/data/output/nccgroup.xml\n  [fge]: https://feedgen.kiesow.be/\n  [iss]: https://github.com/reuteras/rssfixer/issues\n  [mcr]: https://github.com/basnijholt/markdown-code-runner\n  [ncc]: https://research.nccgroup.com/\n  [rss]: https://www.rssboard.org/\n  [sql]: https://sqlite.org/changes.html\n  [sue]: https://github.com/reuteras/rssfixer/discussions/categories/show-usage-examples\n  [tri]: https://www.tripwire.com/state-of-security\n  [tru]: https://www.truesec.com/hub/blog\n  [wor]: https://wordpress.org/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freuteras%2Frssfixer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Freuteras%2Frssfixer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freuteras%2Frssfixer/lists"}