{"id":49645617,"url":"https://github.com/hallelx2/notebooklm-web","last_synced_at":"2026-05-06T00:06:49.877Z","repository":{"id":353508249,"uuid":"1216849882","full_name":"hallelx2/notebooklm-web","owner":"hallelx2","description":null,"archived":false,"fork":false,"pushed_at":"2026-05-01T21:30:37.000Z","size":16638,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-01T23:25:56.661Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://notebooklm-web.vercel.app","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hallelx2.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-21T09:43:22.000Z","updated_at":"2026-05-01T21:42:51.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/hallelx2/notebooklm-web","commit_stats":null,"previous_names":["hallelx2/notebooklm-web"],"tags_count":13,"template":false,"template_full_name":null,"purl":"pkg:github/hallelx2/notebooklm-web","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hallelx2%2Fnotebooklm-web","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hallelx2%2Fnotebooklm-web/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hallelx2%2Fnotebooklm-web/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hallelx2%2Fnotebooklm-web/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hallelx2","download_url":"https://codeload.github.com/hallelx2/notebooklm-web/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hallelx2%2Fnotebooklm-web/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32672688,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-05T11:29:49.557Z","status":"ssl_error","status_checked_at":"2026-05-05T11:29:48.587Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-06T00:06:49.241Z","updated_at":"2026-05-06T00:06:49.866Z","avatar_url":"https://github.com/hallelx2.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# notebooklm-web\n\n**An open-source, self-hostable, local-first NotebookLM. Built end to end in four days.**\n\nThree-stage hybrid retrieval. Self-critiquing deep research. Two-voice audio overviews. Mind maps, flashcards, quizzes, study guides. Twelve AI providers with encrypted credentials. One Next.js app, one Postgres, one deploy.\n\n[![Star](https://img.shields.io/github/stars/hallelx2/notebooklm-web?style=flat-square\u0026logo=github\u0026color=2563EB)](https://github.com/hallelx2/notebooklm-web/stargazers)\n[![Fork](https://img.shields.io/github/forks/hallelx2/notebooklm-web?style=flat-square\u0026logo=github\u0026color=2563EB)](https://github.com/hallelx2/notebooklm-web/network/members)\n[![License](https://img.shields.io/badge/license-MIT-2563EB?style=flat-square)](LICENSE)\n[![Whitepaper](https://img.shields.io/badge/whitepaper-45_pages-F59E0B?style=flat-square\u0026logo=adobe-acrobat-reader\u0026logoColor=white)](docs/demo/NotebookLM-in-Four-Days-Whitepaper.pdf)\n\n![demo](docs/demo/demo.gif)\n\n*A 28-second tour at 25× speed. Drop in a PDF, chat with citations, run multi-round web research, generate a two-host audio overview, and walk a mind map of your sources.*\n\n\u003c/div\u003e\n\n---\n\n## What you get\n\n- 📚 **Drop in any source** — PDF, web link, pasted text, or audio. Every source walks the same parse → chunk → embed → store pipe.\n- 💬 **Chat with citations** — three-stage retrieval (query expansion → hybrid pgvector + keyword → LLM rerank). Every claim cites the chunk it came from. Click a citation to jump to the source.\n- 🔬 **Deep research** — multi-round web research with self-critique. The agent plans sub-questions aware of your existing notebook, fetches sources via Exa or Tavily, summarises, drafts the report section by section, scores its own quality, fills gaps in a second round, and cross-references every claim. Output saves as a new source you can chat with.\n- 🎙️ **Two-voice audio overviews** — Deepgram Aura's Orion + Asteria voices, scripted as a podcast conversation, rendered as MP3.\n- 🧠 **Studio outputs** — mind maps (markmap), flashcards, quizzes (with scoring), study guides, briefing docs, FAQs, timelines, and user-authored notes. Eight kinds, one table.\n- 🔌 **Twelve AI providers** — OpenAI, Anthropic, Google Gemini, Mistral, Cohere, Voyage, Groq, Ollama (self-hosted), OpenRouter, Together AI, xAI, and any OpenAI-compatible endpoint. Switch providers per user, no code changes.\n- 🔐 **Encrypted credentials** — AES-GCM at rest with versioned keys and userId as AAD. Plaintext API keys never live in the database.\n- 📐 **Multi-dimensional embeddings** — sibling tables for 768/1024/1536/3072 dims. Switching embedding models is non-destructive.\n- 📡 **Streaming everywhere** — NDJSON over POST for orchestration, AI SDK streamText for chat. Every step the agent takes is a UI event.\n\n## Why this project\n\nMost \"build a NotebookLM clone\" tutorials ship a 200-line demo that breaks the moment you put it in front of a real user. This is the opposite — a real, shipped, opinionated implementation that you can study, fork, run yourself, or use as the substrate for your own product.\n\nThe full architectural breakdown — every chapter, every code path, every decision — is in **[the 45-page whitepaper](docs/demo/NotebookLM-in-Four-Days-Whitepaper.pdf)**. If you are building anything RAG-flavoured right now, that PDF is the cheat sheet.\n\n## ⭐ If this is useful\n\n- **Star the repo** — the cheapest signal that helps the project find more builders\n- **Fork it** — clone it, run it, rip out what you do not need, ship your version\n- **Open issues** — bug reports, feature requests, architectural feedback all welcome\n- **Send PRs** — see the Contributing section below\n- **Share it** — if you build something on top, link back so others can find your work\n\n## Quick start\n\nYou will need: [Bun](https://bun.sh), a [Neon](https://neon.tech) Postgres database (or any Postgres with pgvector), and an API key from at least one of the supported AI providers.\n\n```bash\ngit clone https://github.com/hallelx2/notebooklm-web.git\ncd notebooklm-web\nbun install\ncp .env.example .env       # fill in DATABASE_URL, AUTH_SECRET, MASTER_KEY_V1\nbun run db:push            # push the Drizzle schema + create pgvector extension\nbun run db:hnsw            # create HNSW indexes on the embedding sibling tables\nbun run dev                # http://localhost:3000\n```\n\nFirst-time signup walks you through picking a chat provider and an embedding provider in `/settings`. Drop in your API keys, pick your models, and start building notebooks.\n\n### Desktop\n\nThe same workbench is available as a native Electron app under `apps/desktop` — runs against an embedded PGlite database with local-FS storage, fully offline against Ollama. See [`apps/desktop/README.md`](apps/desktop/README.md) for the quick-start and keyboard shortcuts.\n\n```bash\nbun --filter @notebooklm/desktop dev\n```\n\n## Stack\n\n| Layer | Tool | Notes |\n|---|---|---|\n| Framework | **Next.js 16** | App Router, Route Handlers for streaming, Server Components |\n| Runtime | **Bun** | Package manager + dev server |\n| Auth | **Better Auth** | Sessions, OAuth, email/password |\n| Database | **Drizzle + Neon Postgres** | Serverless HTTP, branching, pgvector |\n| Vectors | **pgvector** ×4 sibling tables | 768 / 1024 / 1536 / 3072 dim, HNSW cosine indexes |\n| AI Layer | **AI SDK v6** | Uniform `LanguageModel` interface across 12 providers |\n| RPC | **tRPC v11** | Notebooks, sources, studio, providers, AI config |\n| TTS | **Deepgram Aura** | Orion (male) + Asteria (female) for two-host audio |\n| Web search | **Exa + Tavily + SearxNG** | Two paid APIs and one OSS fallback. Pluggable order via `SEARCH_PROVIDER_ORDER`. |\n| Web extract | **@mozilla/readability** | Article extraction from arbitrary URLs |\n| Storage | **S3 / R2 / Supabase** | Pluggable behind `StorageService` |\n| Mind maps | **markmap-lib** | Markdown headings → interactive SVG |\n| UI | **React 19 + Tailwind v4** | Three-pane workbench, modal viewers per studio kind |\n\n## Architecture at a glance\n\n```\nBrowser                    Next.js Server                Shared Libs            External\n\nNotebook UI         →      /api/chat (streamText)    →   lib/retrieve.ts   →   12 LLM providers\n3-pane workbench    →      /api/deep-research        →   lib/ingest        →   Deepgram Aura\nsources · chat ·    →      /api/studio/audio         →   lib/ai/factory    →   Neon Postgres\nstudio              →      tRPC routers                                         + pgvector ×4\n```\n\nRead the **[whitepaper](docs/demo/NotebookLM-in-Four-Days-Whitepaper.pdf)** for the full version — sixteen chapters, ~45 pages, every decision explained.\n\n## Project layout\n\n```\nsrc/\n  app/\n    api/                 # Streaming Route Handlers (chat, deep-research, studio, upload, reembed)\n    auth/                # Sign-in, sign-up\n    notebooks/           # Notebook list + workbench\n    settings/            # Provider config, model picker, profile\n  components/            # ui / shared / layout\n  db/\n    schema.ts            # Drizzle schema (users, notebooks, sources, chunks, embeddings ×4, etc.)\n    migrate-hnsw.ts      # HNSW index migrations\n  lib/\n    retrieve.ts          # The three-stage retrieval pipeline (229 lines)\n    ingest/              # parse · chunk · embed · store\n    ai/\n      providers.ts       # The 12-provider registry\n      factory.ts         # User-scoped chat/embed model factory\n    crypto/              # AES-GCM encrypted credentials\n    auth.ts              # Better Auth config\n  module/                # Feature modules (notebook, notebooks, settings, auth, landing)\n  server/\n    routers/             # tRPC routers (notebook, source, studio, provider, aiConfig, message)\ndocs/\n  demo/                  # Demo GIF, MP4, whitepaper PDF\n```\n\n## Configuration\n\nAll env vars live in `.env`. The minimum for local development:\n\n```bash\nDATABASE_URL=\"postgres://...\"             # Neon, Supabase, or any Postgres with pgvector\nAUTH_SECRET=\"...\"                          # 32+ random bytes; openssl rand -base64 32\nMASTER_KEY_V1=\"...\"                        # 32 bytes for AES-GCM credential encryption\nNEXT_PUBLIC_APP_URL=\"http://localhost:3000\"\n```\n\nOptional but useful:\n\n```bash\nDEEPGRAM_API_KEY=\"...\"                          # Required for audio overviews\nEXA_API_KEY=\"...\"                               # Paid; deep-research web search\nTAVILY_API_KEY=\"...\"                            # Paid; fallback web search\nSEARXNG_URL=\"http://localhost:8080\"             # OSS; self-hosted SearxNG instance\nSEARCH_PROVIDER_ORDER=\"exa,tavily,searxng\"      # Comma-separated priority order\nSUPABASE_URL=\"...\"                              # If using Supabase Storage\nSUPABASE_SERVICE_ROLE_KEY=\"...\"\nS3_*                                            # If using S3 / R2 instead\n```\n\nUser AI provider API keys are **not** environment variables — users add them through the settings UI per-user, encrypted at rest.\n\n### Self-hosting SearxNG\n\nSearxNG aggregates Google / Bing / DuckDuckGo / Wikipedia behind one JSON API and is the OSS option that lets users run deep-research without paid keys. A ready-to-run Docker setup lives at [`docker/searxng/`](docker/searxng/):\n\n```bash\ncd docker/searxng\nexport SEARXNG_PORT=8888\nexport SEARXNG_SECRET=$(openssl rand -hex 32)\ndocker compose up -d\n\n# point the app at it\nexport SEARXNG_URL=http://localhost:${SEARXNG_PORT}\n```\n\nThe desktop app **auto-starts the same compose stack on first launch** when Docker is available — no setup needed beyond installing Docker Desktop. See [`docker/searxng/README.md`](docker/searxng/README.md) for details, tear-down, and tuning.\n\nPublic instances (e.g. `https://searx.be`) work for testing but rate-limit unpredictably — self-host for production. Full SearxNG reference: \u003chttps://docs.searxng.org/admin/installation.html\u003e\n\n## Development\n\n```bash\nbun run dev             # Next.js dev server\nbun run db:push         # Push Drizzle schema changes\nbun run db:hnsw         # Create HNSW indexes (idempotent)\nbun run db:reembed      # Backfill embeddings under a different model\nbun run lint            # Biome lint\nbun run format          # Biome format\nbun run typecheck       # tsc --noEmit\nbun run build           # Production build\n```\n\n## Deploy\n\nBuilt for **Vercel**. Push to a connected GitHub repo, set env vars in the Vercel dashboard, and deploy. The streaming routes use `maxDuration: 300` to fit deep-research and audio-generation runs comfortably under the function timeout.\n\nFor self-hosting, the app runs anywhere Node 20+ runs — Docker, Render, Fly, your own VM. The only hard requirement is a Postgres with pgvector enabled.\n\n## Roadmap\n\n\u003e **Three companion documents** live alongside this README and go deeper on each direction:\n\u003e - [`ROADMAP.md`](ROADMAP.md) — phased plan from v1 through marketplace\n\u003e - [`CONTRIBUTING.md`](CONTRIBUTING.md) — how to land a useful PR\n\u003e - [`docs/AGENT-HARNESS.md`](docs/AGENT-HARNESS.md) — the technical vision for pluggable agent runtimes\n\nThe big direction this project is heading is **a fully local desktop app** — same workbench, same retrieval, same studio kinds, but with no cloud dependencies and no API keys required. Drop a folder of PDFs onto your laptop, open the app, and learn from them entirely offline.\n\nTo get there, the repo will move to a **pnpm workspace** so the web app and the desktop app can share a common core (`packages/core`: retrieval, ingest, AI factory, schema, prompts) while each app owns its own surface (`apps/web` for the hosted Next.js version, `apps/desktop` for the local Electron or Tauri shell).\n\nThe local stack we are aiming at:\n\n| Layer | Local choice | Notes |\n|---|---|---|\n| Chat models | **Ollama** | Llama 3.3, Qwen 2.5, Mistral, anything served on `localhost:11434` |\n| Embedding models | **Ollama** (Nomic, mxbai) or **GGUF** | 768- or 1024-dim, local inference |\n| TTS | **Piper** or **Kokoro-82M** | Free, offline, no Deepgram dependency |\n| Database | **Embedded Postgres** or **SQLite + sqlite-vec** | No Neon connection, runs in-process |\n| Web search | **Optional** | Local-only mode skips deep-research's web stage; offline still has retrieval, studio, audio |\n| Storage | **Local filesystem** | The user's own folders are the canonical store |\n| Shell | **Tauri 2** or **Electron** | Native window, drag-and-drop folders, file watcher |\n\nThe shape of the work: the existing `lib/ai/factory.ts`, `lib/retrieve.ts`, `lib/ingest`, and the studio router are already provider-agnostic — Ollama is supported in the registry today. Pulling them into `packages/core`, swapping Neon for an embedded DB, and replacing Deepgram with Piper at the TTS boundary is the bulk of the desktop port.\n\n## Contributing\n\nYes please. The contributions I am most excited to see:\n\n- 💻 **The desktop app.** Tauri or Electron shell that wraps the existing core and runs fully offline. File-system access to user folders, automatic ingest of dropped folders, local Ollama + Piper integration, embedded database. This is the headline contribution.\n- 📦 **pnpm workspace migration.** Turn this repo into a monorepo with `apps/web`, `apps/desktop`, and `packages/core`. Move the shared libraries (`lib/ai`, `lib/retrieve`, `lib/ingest`, `db/schema`) into `packages/core` so both surfaces consume the same code.\n- 🧠 **Local model integrations.** Better Ollama UX (auto-detect installed models, one-click pull). Native llama.cpp bindings for sub-second embedding. GGUF embedding adapters.\n- 🗣️ **Local TTS adapters.** Piper, Kokoro-82M, Coqui XTTS — same interface as the Deepgram adapter, drop-in for audio overviews.\n- 🔌 **More AI providers** (the registry is one file).\n- 🎨 **More studio output kinds** (one prompt + one renderer).\n- 📐 **More embedding-dim sibling tables** (768/1024/1536/3072 today; 4096 and beyond are interesting).\n- 🔎 **More web-search providers** (Exa and Tavily are pluggable).\n- 🐛 **Bug fixes, docs, tests, accessibility improvements.**\n\nFor non-trivial changes — especially the desktop app and the workspace migration — open an issue first so we can talk through the shape before you spend hours on it.\n\n## License\n\nMIT. See [LICENSE](LICENSE). Use it, fork it, ship it, charge for it. If you build something cool, I would love to hear about it.\n\n## Credits\n\nBuilt by [Halleluyah Darasimi Oludele](https://github.com/hallelx2) between 2026-04-21 and 2026-04-24. The whitepaper is **Field Notes Issue 01** from my open-source notes.\n\nIf this saved you a week, [say hi](mailto:halleluyaholudele@gmail.com) or follow on [GitHub](https://github.com/hallelx2).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhallelx2%2Fnotebooklm-web","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhallelx2%2Fnotebooklm-web","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhallelx2%2Fnotebooklm-web/lists"}