{"id":50815289,"url":"https://github.com/liuwhale/oh-my-rss","last_synced_at":"2026-06-16T12:00:48.029Z","repository":{"id":364005286,"uuid":"1264994973","full_name":"LiuWhale/oh-my-rss","owner":"LiuWhale","description":"RSS-driven reading automation for FreshRSS, arXiv, Codex summaries, and static pages.","archived":false,"fork":false,"pushed_at":"2026-06-13T07:23:21.000Z","size":2383,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-14T10:22:38.262Z","etag":null,"topics":["arxiv","codex","freshrss","papers","rss","summarization"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LiuWhale.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-10T11:17:10.000Z","updated_at":"2026-06-14T02:54:26.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/LiuWhale/oh-my-rss","commit_stats":null,"previous_names":["liuwhale/oh-my-rss"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/LiuWhale/oh-my-rss","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiuWhale%2Foh-my-rss","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiuWhale%2Foh-my-rss/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiuWhale%2Foh-my-rss/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiuWhale%2Foh-my-rss/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LiuWhale","download_url":"https://codeload.github.com/LiuWhale/oh-my-rss/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiuWhale%2Foh-my-rss/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34357285,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-15T02:00:07.085Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["arxiv","codex","freshrss","papers","rss","summarization"],"created_at":"2026-06-13T09:00:34.563Z","updated_at":"2026-06-15T11:00:45.529Z","avatar_url":"https://github.com/LiuWhale.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Oh My RSS\n\n![Oh My RSS GitHub cover](assets/github-cover.png)\n\nRSS-native AI research radar. Oh My RSS turns research feeds from FreshRSS into\nChinese paper-story summaries, static public feeds, category feeds, and monthly\ntrend reports that RSS clients such as Reeder can subscribe to.\n\n## Features\n\n- Reads RSS entries from a FreshRSS SQLite database.\n- Detects arXiv, DOI, and generic paper links from RSS titles, links, and\n  content.\n- De-duplicates papers that appear in multiple feeds.\n- Downloads direct PDF links when available and extracts text with `pdftotext`;\n  otherwise falls back to the RSS abstract.\n- Renders a first-page PNG preview and embeds it in the Codex summary page.\n- Classifies papers into research-domain labels such as robot learning,\n  manipulation, humanoids, VLA, navigation, SLAM, perception, safety/control,\n  embodied AI, and benchmarks.\n- Calls Codex CLI to generate Chinese summaries with:\n  - `Motivation`\n  - `Contribution`\n  - `技术原理`\n  - `实验设计及分析`\n  - `原文链接`\n- Produces static HTML with MathJax support.\n- Produces a static public RSS feed at `feed.xml`.\n- Exposes RSS and OPML auto-discovery links from the public index page.\n- Produces a monthly research radar feed with trend tables and SVG charts.\n- Produces a trending-topic feed with one item per hot research direction.\n- Produces a trending-keyword feed for specific terms such as VLA, diffusion\n  policy, humanoid, SLAM, safety filter, and sim-to-real.\n- Uses hash-based detail URLs to avoid stale browser/RSS-client caches.\n- Optionally backs up the FreshRSS DB and updates entries with a clickable summary link.\n- Prints a starter OPML bundle for common paper feeds so a new FreshRSS\n  deployment can start quickly.\n- Validates OPML feed URLs before importing them into FreshRSS.\n\n## Project Goal\n\nOh My RSS is designed to become a self-hosted AI research radar rather than a\nreplacement RSS reader. The long-term goal is to let users connect paper feeds,\nconference feeds, journal feeds, lab blogs, and news feeds, then publish a clean\nknowledge stream with AI summaries, research-domain classification, trend\nreports, and RSS-native distribution.\n\n## Requirements\n\n- Python 3.11+\n- FreshRSS using SQLite\n- `curl`\n- `pdftotext` from Poppler\n- PyMuPDF, installed automatically as a Python dependency\n- Codex CLI authenticated on the machine running the job\n\n## Quick Start\n\n```bash\ngit clone https://github.com/LiuWhale/oh-my-rss.git\ncd oh-my-rss\npython3.11 -m venv .venv\n. .venv/bin/activate\npip install -e \".[dev]\"\noh-my-rss init-config --output config.yaml\n```\n\nEdit `config.yaml`, then run:\n\n```bash\noh-my-rss doctor --config config.yaml\n```\n\nIf the environment checks pass, generate summaries:\n\n```bash\noh-my-rss run --config config.yaml --limit 1\n```\n\nTo preview which papers would be processed without calling Codex:\n\n```bash\noh-my-rss run --config config.yaml --dry-run --limit 5\n```\n\nTo validate generated public site files before publishing or after a run:\n\n```bash\noh-my-rss validate-site --site-dir ./site\n```\n\nTo generate a starter OPML file for FreshRSS paper subscriptions:\n\n```bash\noh-my-rss print-starter-opml --output starter-paper-feeds.opml\n```\n\nValidate it before importing:\n\n```bash\noh-my-rss validate-opml --opml starter-paper-feeds.opml\n```\n\nThen import that OPML into FreshRSS and set `freshrss.category` to the imported\npaper category.\n\n## Configuration\n\nSee [`configs/example.yaml`](configs/example.yaml).\n\nThe key fields are:\n\n- `freshrss.db_path`: path to FreshRSS `db.sqlite`.\n- `freshrss.category`: FreshRSS category to scan, for example `论文` or `Papers`.\n- `site.output_dir`: local directory where HTML files are written.\n- `site.public_base_url`: public URL prefix for generated pages.\n- `codex.command`: command used to invoke Codex CLI.\n- `runtime.state_dir`: state, PDF cache, prompts, logs, and DB backups.\n\n## Public Feed\n\nEach run writes:\n\n- `index.html`: public summary index\n- `feed.xml`: public RSS feed for generated summaries\n- `feeds.json`: machine-readable directory of all public RSS and OPML entry\n  points\n- `status.json`: machine-readable service status summary with summary counts,\n  category counts, report counts, latest summary, and public feed URLs\n- `robots.txt` and `sitemap.xml`: crawler discovery files for the public index,\n  generated summary pages, monthly reports, hot directions, and hot keywords\n- `opml.xml`: complete OPML import file for the main feed, category feeds,\n  monthly report feed, hot direction feed, and hot keyword feed\n- `categories/*.xml`: per-source/category RSS feeds\n- `categories/index.json`: machine-readable category feed list\n- `categories/opml.xml`: category-only OPML import file for RSS clients\n- `reports/monthly.xml`: monthly research trend report RSS feed\n- `reports/monthly/YYYY-MM.html`: monthly report pages with direction bars,\n  source distribution, animated trend charts, and representative papers\n- `reports/trending.xml`: hot research-direction RSS feed\n- `reports/trending/*.html`: direction pages with trend counts, sources, and\n  representative papers\n- `reports/keywords.xml`: hot research-keyword RSS feed\n- `reports/keywords/*.html`: keyword pages with trend counts, sources, and\n  representative papers\n- `manifest.json`: machine-readable summary metadata\n\nCategory feed names are normalized for mixed paper sources: a leading `arXiv `\nprefix is removed before publishing, and stale `categories/arxiv-*.xml` files\nfrom older runs are cleaned up.\n\nNewly generated records include `paper_id`, `source_kind`, and\n`research_domains`. `source_kind` is `arXiv`, `DOI`, or `RSS`. Category feeds\nand monthly reports use research-domain labels first, then fall back to\nnormalized feed names only when no research topic can be inferred.\n\nUsers can subscribe to:\n\n```text\n\u003csite.public_base_url\u003e/feed.xml\n```\n\nThey can also subscribe to category-specific feeds. RSS clients generally do\nnot subscribe to JSON directly; use the complete OPML bundle for one-click\nimport:\n\n```text\n\u003csite.public_base_url\u003e/opml.xml\n```\n\nUse the category-only OPML file when you only want the topic/source feeds:\n\n```text\n\u003csite.public_base_url\u003e/categories/opml.xml\n```\n\nUse the JSON file only for integrations that need a machine-readable list:\n\n```text\n\u003csite.public_base_url\u003e/categories/index.json\n```\n\nFor integrations that need every public RSS and OPML entry point, use:\n\n```text\n\u003csite.public_base_url\u003e/feeds.json\n```\n\nFor monitoring or lightweight health checks, use:\n\n```text\n\u003csite.public_base_url\u003e/status.json\n```\n\nFor crawler discovery, each run also writes:\n\n```text\n\u003csite.public_base_url\u003e/robots.txt\n\u003csite.public_base_url\u003e/sitemap.xml\n```\n\nMonthly trend reports are published as a separate RSS feed:\n\n```text\n\u003csite.public_base_url\u003e/reports/monthly.xml\n```\n\nEach monthly report page includes an animated SVG trend chart, direction bar\nchart, source distribution chart, summary tables, and links back to the\nunderlying Codex paper summaries.\n\nHot research directions are also published as their own feed:\n\n```text\n\u003csite.public_base_url\u003e/reports/trending.xml\n```\n\nEach trending-topic item links to a direction page with source counts, recent\ntrend counts, representative papers, and links back to the generated paper\nsummaries.\n\nSpecific research keywords are published as another RSS feed:\n\n```text\n\u003csite.public_base_url\u003e/reports/keywords.xml\n```\n\nEach keyword item links to a page that tracks term-level trends such as VLA,\ndiffusion policy, humanoid, SLAM, safety filter, and sim-to-real across recent\npaper summaries.\n\nThese feeds are static. They let other people read generated summaries without\nlogging into your FreshRSS account or sharing your read/unread state.\n\n## Scheduling\n\nGenerate a locked cron entry for a 10-minute scheduler:\n\n```bash\noh-my-rss print-cron \\\n  --cwd /opt/oh-my-rss \\\n  --config config.yaml \\\n  --limit 1 \\\n  --interval-minutes 10 \\\n  --log-path state/cron.log \\\n  --venv .venv\n```\n\nThe command prints a cron line like:\n\n```cron\n*/10 * * * * cd /opt/oh-my-rss \u0026\u0026 . .venv/bin/activate \u0026\u0026 flock -n /tmp/oh-my-rss.lock oh-my-rss run --config config.yaml --limit 1 \u003e\u003e state/cron.log 2\u003e\u00261\n```\n\nPaste that line into cron or the equivalent scheduler. Use `--no-venv` if\n`oh-my-rss` is installed on the scheduler's default `PATH`.\n\n## Deployment Notes\n\n- For Synology NAS and FreshRSS Docker setups, see [`docs/synology-freshrss.md`](docs/synology-freshrss.md).\n- For adding RSS subscriptions, grouping feeds, and OPML import, see [`docs/feed-management.md`](docs/feed-management.md).\n- For Reeder/FreshRSS behavior, see [`docs/reeder-workflow.md`](docs/reeder-workflow.md).\n\nFor Docker Compose, copy `.env.example` to `.env` only when you want to override\nthe default run settings. The compose file has built-in defaults and does not\nrequire a local `.env` file.\n\n## Development\n\n```bash\nPYTHONPATH=src pytest -q\nruff check .\n```\n\n## Security\n\nDo not commit:\n\n- FreshRSS DB files\n- Codex auth files\n- real domains, private IPs, proxy credentials, or user accounts\n- generated PDF caches\n\nUse `.env.example` and `configs/example.yaml` as templates.\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fliuwhale%2Foh-my-rss","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fliuwhale%2Foh-my-rss","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fliuwhale%2Foh-my-rss/lists"}