https://github.com/babarot/oksskolten

🏔️ The AI-native RSS reader

Oksskolten (pronounced "ooks-SKOL-ten"): every article, full text, by default.

## Why Oksskolten?

Most RSS readers show only what the feed gives you: a title and maybe a summary. Some (like Miniflux and FreshRSS) can fetch full article text, but it's opt-in per feed and requires configuration. Oksskolten does it for every article automatically: it fetches the original article, extracts the full text using Mozilla's Readability + 500 noise-removal patterns, converts it to clean Markdown, and stores it locally. No per-feed toggles, no manual CSS selectors; it just works.

Because Oksskolten always has the complete text, AI summarization and translation produce meaningful results, full-text search actually covers everything, and you never need to leave the app to read an article.

## See it in action

🕺 Live Demo → [demo.oksskolten.com](https://demo.oksskolten.com)

Screenshots: Home, Inbox, Article, Appearance.

## Features

- **Full-Text Extraction**: Every article is fetched from its source and processed through Readability + 500 noise-removal patterns. You read complete articles inside Oksskolten, never needing to click through to the original site
- **AI Summarization & Translation**: On-demand article processing via Anthropic, Gemini, or OpenAI with SSE streaming. Works on full article text, not RSS excerpts
- **Interactive Chat**: Multi-turn AI conversations with MCP tooling; search articles, get stats, and ask questions about your feeds
- **Full-Text Search**: Meilisearch-powered search across your entire article archive
- **Smart Fetching**: Adaptive per-feed scheduling, conditional HTTP requests (ETag/Last-Modified), content-hash deduplication, exponential backoff, and tracking parameter removal
- **PWA**: Offline reading, background sync, and add-to-home-screen support
- **Multi-Auth**: Password, Passkey (WebAuthn), and GitHub OAuth, each independently configurable
- **Smart Feed Management**: Auto-discovery, CSS selector-based feeds (via RSS Bridge), bot bypass (FlareSolverr), and automatic disabling of dead feeds
- **Article Clipping**: Save any URL as an article, with full content extraction
- **Theming**: 14 built-in color themes + custom theme import via JSON, 9 article fonts, 8 code highlighting styles
- **Single Container**: API, SPA, and cron scheduler all run in one Docker container
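
As a rough illustration of the tracking parameter removal listed under Smart Fetching, the sketch below strips a few well-known parameters with the standard `URL` API. The `stripTrackingParams` name and the short parameter list are assumptions made for this example; they are not taken from Oksskolten's source, which is described elsewhere in this README as covering 60+ parameters.

```typescript
// Hypothetical helper for illustration; not Oksskolten's actual implementation.
// A deliberately short subset of tracking parameters.
const TRACKING_PARAMS = new Set(["fbclid", "gclid"]);
const TRACKING_PREFIXES = ["utm_"];

function stripTrackingParams(rawUrl: string): string {
  const url = new URL(rawUrl);
  // Snapshot the keys first so deleting does not disturb iteration.
  for (const key of Array.from(url.searchParams.keys())) {
    const lower = key.toLowerCase();
    const isTracking =
      TRACKING_PARAMS.has(lower) ||
      TRACKING_PREFIXES.some((prefix) => lower.startsWith(prefix));
    if (isTracking) {
      url.searchParams.delete(key);
    }
  }
  return url.toString();
}
```

Normalizing URLs this way before the duplicate check means two links that differ only in campaign parameters compare as equal.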

## Tech Stack

| Layer | Technology |
|---|---|
| Backend | [Node.js 22](https://nodejs.org/) + [Fastify](https://fastify.dev/) |
| Frontend | [React 19](https://react.dev/) + [Vite](https://vite.dev/) + [Tailwind CSS](https://tailwindcss.com/) + [shadcn/ui](https://ui.shadcn.com/) |
| Database | [SQLite](https://sqlite.org/) via [libsql](https://github.com/tursodatabase/libsql) (WAL mode) |
| AI | [Anthropic](https://docs.anthropic.com/) / [Gemini](https://ai.google.dev/) / [OpenAI](https://platform.openai.com/) |
| Search | [Meilisearch](https://www.meilisearch.com/) |
| Auth | JWT + [Passkey / WebAuthn](https://webauthn.io/) + GitHub OAuth |
| Deployment | Docker Compose + [Cloudflare Tunnel](https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/) |

## Architecture

```mermaid
graph TD
    subgraph host["Docker Host"]
        subgraph app["oksskolten (Node.js 22, port 3000)"]
            fastify["Fastify API<br/>/api/*"]
            spa["SPA static serving<br/>(Vite build)"]
            sqlite["SQLite<br/>(WAL mode)"]
            cron["node-cron<br/>Feed fetch every 5 min"]
            fetcher["Fetcher Pipeline<br/>RSS parse → Readability<br/>→ HTML cleaner → Markdown"]
            ai["AI Provider<br/>Anthropic / Gemini / OpenAI"]
            chat["Chat Service<br/>MCP Server + 4 Adapters"]

            fastify --> sqlite
            fastify --- spa
            fastify --> fetcher
            fastify --> ai
            fastify --> chat
            cron --> fetcher
            fetcher --> sqlite
            chat --> sqlite
            chat --> ai
        end

        bridge["RSS Bridge<br/>(Docker, port 80)"]
        flare["FlareSolverr<br/>(Docker, port 8191)"]
        tunnel["cloudflared<br/>(Cloudflare Tunnel)"]
    end

    user(("User")) -- "HTTPS" --> tunnel
    tunnel --> fastify
    cron -- "HTTP fetch" --> rss(("RSS Feeds"))
    fetcher --> bridge
    fetcher --> flare
    ai -- "API" --> llm_api(("Anthropic / Gemini<br/>/ OpenAI API"))
```

Everything runs in a single long-lived process, because SQLite needs local disk and node-cron needs a process that stays alive. This rules out serverless/edge runtimes but keeps the stack simple: one container, no external queues or coordination. For cloud deployment, a small VM or [Fly.io + Turso](docs/guides/deploying-to-fly-io.md) works well.

Oksskolten also exposes an MCP server, so Claude Code or any MCP client can search, summarize, and query your article archive without opening the app.

> **What's in a name?** Oksskolten is the highest peak in northern Norway, a mountain of knowledge for your feeds.

### Content Pipeline

Unlike traditional RSS readers that rely on feed-provided summaries, Oksskolten fetches every article directly from its source URL and extracts the full text. This makes the reader self-contained: there is no need to leave the app to read an article.

1. **Fetch RSS**: Adaptive per-feed scheduling with conditional requests (ETag/Last-Modified/content hash)
2. **Parse**: RSS/Atom/RDF parsed via feedsmith + fast-xml-parser, tracking parameters stripped
3. **Fetch Article**: Original article URL fetched directly (with FlareSolverr fallback for bot-protected sites)
4. **Extract**: Full article content extracted with Readability in isolated Worker Threads
5. **Clean**: ~500 noise-removal patterns strip ads, nav, sidebars, and tracking elements
6. **Convert**: HTML converted to Markdown with GFM support
7. **Enrich**: Language detection, OGP image extraction, excerpt generation
8. **Index**: Articles indexed in Meilisearch for full-text search
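
The stage order above can be pictured as a chain of functions that each take and return an article record. This is only a sketch: the `Article` shape, the `Stage` type, `runPipeline`, and the stub stages are hypothetical names for illustration, and the real work (Readability, the noise-removal patterns, Markdown conversion) is replaced by trivial placeholders. Real stages would likely be asynchronous; they are synchronous here to keep the example short.

```typescript
// Hypothetical types and names, for illustration only.
interface Article {
  url: string;
  html: string;
  markdown: string;
  excerpt: string;
}

type Stage = (article: Article) => Article;

// Run the stages in order, feeding each result into the next stage.
function runPipeline(article: Article, stages: Stage[]): Article {
  return stages.reduce((current, stage) => stage(current), article);
}

// Stand-in for the noise-removal patterns: drops <nav> blocks only.
const clean: Stage = (a) => ({
  ...a,
  html: a.html.replace(/<nav\b[^>]*>[\s\S]*?<\/nav>/gi, ""),
});

// Stand-in for HTML-to-Markdown conversion: strips tags and trims.
const convert: Stage = (a) => ({
  ...a,
  markdown: a.html.replace(/<[^>]+>/g, "").trim(),
});

// Stand-in for enrichment: takes the first 80 characters as the excerpt.
const enrich: Stage = (a) => ({
  ...a,
  excerpt: a.markdown.slice(0, 80),
});
```

Keeping each step as a separate function of the same shape makes the order explicit and lets a single stage be swapped or tested in isolation.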

### Smart Fetching

The feed fetcher minimizes bandwidth and adapts to each feed's behavior, inspired by best practices from FreshRSS, Miniflux, and CommaFeed:

- **3-layer change detection**: HTTP 304 (ETag/Last-Modified) → content hash (SHA-256) → full parse. Unchanged feeds are caught early without parsing XML
- **Adaptive scheduling**: Each feed gets its own check interval (15min–4h) based on three signals: HTTP `Cache-Control`, RSS `<ttl>`, and actual article frequency. Active blogs are checked often; dormant ones back off automatically
- **Resilient error handling**: Exponential backoff on errors (1h–4h cap), but feeds are never disabled. Rate limits (429/503) respect `Retry-After` headers without counting as errors
- **URL deduplication**: 60+ tracking parameters (utm_*, fbclid, gclid, etc.) are stripped before duplicate checking, preventing the same article from being inserted twice
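
A minimal sketch of two of the behaviours above, under stated assumptions: layered change detection (HTTP 304, then a SHA-256 content hash, then a full parse) and exponential backoff between 1 h and the 4 h cap. The function names, the `FetchResult` shape, and the doubling factor are assumptions for illustration; the list above only states the order of the checks and the backoff range.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shapes and names; not taken from Oksskolten's source.
interface FetchResult {
  status: number; // HTTP status of the feed request
  body: string; // response body (empty on 304)
}

type Decision = "not-modified" | "same-content" | "parse";

function sha256(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

// Layer 1: HTTP 304. Layer 2: body hash equals the stored hash. Layer 3: parse.
function detectChange(result: FetchResult, previousHash: string | null): Decision {
  if (result.status === 304) return "not-modified";
  if (previousHash !== null && sha256(result.body) === previousHash) {
    return "same-content";
  }
  return "parse";
}

const HOUR_MS = 60 * 60 * 1000;

// Assumed policy: 1 h after the first error, doubling per further error, capped at 4 h.
function backoffDelayMs(consecutiveErrors: number): number {
  const exponent = Math.max(0, consecutiveErrors - 1);
  return Math.min(HOUR_MS * 2 ** exponent, 4 * HOUR_MS);
}
```

Only the `"parse"` outcome needs the XML parser, so the two cheaper checks filter out unchanged feeds first.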

## Comparison

| | Oksskolten | [Miniflux](https://github.com/miniflux/v2) | [FreshRSS](https://github.com/FreshRSS/FreshRSS) | [Feedly](https://feedly.com/) |
|---|---|---|---|---|
| **Full-text extraction** | Every article, by default | Opt-in per feed | Opt-in per feed | Auto (best-effort) |
| **Extraction engine** | Readability.js + 500 patterns | Go Readability (~390 lines, ~60 rules) | Manual CSS selectors | Proprietary |
| **JS-rendered sites** | FlareSolverr | - | - | Enterprise only |
| **Sites without RSS** | Auto-discovery → RSS Bridge → LLM inference | - | - | Pro+ (25) / Enterprise (100) |
| **AI summarization** | Built-in (Anthropic/Gemini/OpenAI) | - | - | Pro+ only (Leo) |
| **AI translation** | Built-in (+ Google Translate, DeepL) | - | - | Enterprise only |
| **AI chat** | MCP-powered, searches archive | - | - | - |
| **Search** | Meilisearch (typo-tolerant) | PostgreSQL full-text | SQL LIKE | Pro+ (Power Search) |
| **Database** | SQLite (embedded, WAL) | PostgreSQL (external) | MySQL/PG/SQLite | SaaS |
| **Deployment** | Single container | Binary + PostgreSQL | PHP + web server + DB | SaaS |
| **Offline reading** | PWA with background sync | - | - | Mobile apps only |
| **Auth** | Password + Passkey/WebAuthn + GitHub OAuth | Password + API key | Password + API key | Google/Apple/social + SAML (Enterprise) |
| **Themes** | 14 + custom JSON import | Light/Dark | ~10 themes | - |
| **Language** | Node.js (TypeScript) | Go | PHP | - |
| **Price** | Free / OSS (AGPL-3.0) | Free / OSS (Apache-2.0) | Free / OSS (AGPL-3.0) | $12.99/mo (Pro+) |

Miniflux and FreshRSS are excellent, mature projects. Oksskolten's focus is different: full-text extraction and AI as first-class defaults, not optional add-ons.

## Development

```bash
docker compose up --build # HMR enabled
# Frontend: http://localhost:5173
# Backend: http://localhost:3000

npm test # Run all tests
npm run build # Production build
```

On first startup with an empty database, sample feeds and articles are automatically loaded from the demo seed data (`src/lib/demo/seed/*.json`). This gives you a populated UI to work with immediately. The seed is idempotent: it only runs when no RSS feeds exist in the database. To start with an empty database instead, set `NO_SEED=1`.
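
The seeding rule can be read as a simple guard. The sketch below is an assumption about the logic, not the project's code: `shouldSeed` and its parameters are hypothetical names, and only the `NO_SEED=1` variable and the "no feeds exist" condition come from the paragraph above.

```typescript
// Hypothetical guard mirroring the rule described above.
function shouldSeed(
  feedCount: number,
  env: Record<string, string | undefined>,
): boolean {
  if (env.NO_SEED === "1") return false; // explicit opt-out
  return feedCount === 0; // seed only when no feeds exist yet
}
```

Called as `shouldSeed(existingFeedCount, process.env)` at startup, a check like this is safe to run on every boot because it does nothing once any feed is present.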

See [`.env.example`](.env.example) for available environment variables. AI provider keys are configured through the Settings UI.

## Deployment

Runs anywhere Docker runs: a home NAS, a Raspberry Pi, or a cloud VM.

### Using pre-built images

Pre-built multi-architecture Docker images (amd64/arm64) are published to GHCR on every release:

```bash
docker pull ghcr.io/babarot/oksskolten:latest
```

To use the pre-built image instead of building locally, edit `compose.prod.yaml` and swap the `build` directive for the commented-out `image` line, then:

```bash
docker compose -f compose.yaml -f compose.prod.yaml up -d
```

### Building locally

```bash
# Production with Cloudflare Tunnel
docker compose -f compose.yaml -f compose.prod.yaml up --build -d
```

The production compose file includes a `cloudflared` sidecar that exposes the app via Cloudflare Tunnel, so no port forwarding or static IP is required.

## License

[AGPL-3.0](LICENSE)