{"id":51244693,"url":"https://github.com/its-not-rocket-science/mnemosyne","last_synced_at":"2026-06-29T03:03:32.552Z","repository":{"id":364887760,"uuid":"1203151903","full_name":"its-not-rocket-science/mnemosyne","owner":"its-not-rocket-science","description":"Language learning from any text — FSRS scheduling, morphological analysis, and 17 language plugins.","archived":false,"fork":false,"pushed_at":"2026-06-23T15:55:58.000Z","size":73742,"stargazers_count":0,"open_issues_count":2,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-23T17:29:34.667Z","etag":null,"topics":["fastapi","fsrs","language-learning","nlp","python","self-hosted","spaced-repetition"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/its-not-rocket-science.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":".github/AGENTS.md","dco":null,"cla":null}},"created_at":"2026-04-06T19:16:33.000Z","updated_at":"2026-06-23T15:56:16.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/its-not-rocket-science/mnemosyne","commit_stats":null,"previous_names":["its-not-rocket-science/mnemosyne"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/its-not-rocket-science/mnemosyne","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/its-not-rocket-science%2Fmnemosyne","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/its-not-rocket-science%2Fmnemosyne/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/its-not-rocket-science%2Fmnemosyne/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/its-not-rocket-science%2Fmnemosyne/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/its-not-rocket-science","download_url":"https://codeload.github.com/its-not-rocket-science/mnemosyne/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/its-not-rocket-science%2Fmnemosyne/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34911136,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-29T02:00:05.398Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fastapi","fsrs","language-learning","nlp","python","self-hosted","spaced-repetition"],"created_at":"2026-06-29T03:03:31.716Z","updated_at":"2026-06-29T03:03:32.547Z","avatar_url":"https://github.com/its-not-rocket-science.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Mnemosyne\n\nPaste any text, parse it into sentences, open per-word micro-lessons, and rate your recall. FSRS schedules the next review.\n\n**Current state:** multi-user system with seventeen language plugins. 
Full morphological analysis for Spanish, French, German, Russian, Japanese, Portuguese, and Italian; full grammar and nuance analysis for English (phrasal verbs, tense constructions, register markers via spaCy; shallow tense morphology — present and past only); morphology-light for Latin, Koine Greek, Korean, Hindi, and Turkish; Stanza-primary rich morphology plus grammar-nuance drills for Finnish; vocabulary/dictionary mode for Arabic, Hebrew, and Mandarin Chinese. RTL layout (Arabic, Hebrew) and CJK segmentation (Chinese, Japanese) are supported. User authentication is implemented (JWT); see [ROADMAP.md](ROADMAP.md).\n\n---\n\n## Quick start\n\n### Docker (recommended)\n\n```bash\ncp .env.example .env     # review defaults before starting\nmake build               # builds the image; generates poetry.lock if absent\nmake up                  # starts app + postgres + redis in the background\nmake ready               # prints { \"status\": \"ready\", \"db\": \"ok\", \"redis\": \"ok\" }\n```\n\nThe API listens on `http://localhost:8000`. 
The frontend is static HTML/JS — serve it separately:\n\n```bash\npython -m http.server 8080 -d frontend\n# open http://localhost:8080\n```\n\nBackend source is bind-mounted into the container, so uvicorn hot-reloads on save.\n\n\n### Windows note\n\nUse Docker Desktop and PowerShell:\n\n```powershell\nCopy-Item .env.example .env\ndocker compose build\ndocker compose up -d\nInvoke-WebRequest http://localhost:8000/ready | Select-Object -Expand Content\n```\n\n### Local (no Docker)\n\nRequires PostgreSQL and Redis already running.\n\n```bash\npoetry install\n# Minimum — Spanish only:\npython -m spacy download es_core_news_sm\n# Optional — install additional language models as needed:\n# python -m spacy download fr_core_news_sm de_core_news_sm\n# python -m spacy download ru_core_news_sm ja_core_news_sm\n# python -m spacy download pt_core_news_sm it_core_news_sm\ncp .env.example .env     # set DATABASE_URL and REDIS_URL\npsql -h localhost -U postgres -l\nmake dev                 # uvicorn --reload on :8000\npython -m http.server 8080 -d frontend\n```\n\nTables are managed by Alembic. On first startup the application runs `alembic upgrade head` automatically. 
No manual migration step is needed for fresh or existing databases.\n\n\nWindows PowerShell:\n\n```powershell\npoetry install\npython -m spacy download es_core_news_sm\nCopy-Item .env.example .env\npsql -h localhost -U postgres -l\n```\n\nExample .env for Windows:\n\n```env\nDATABASE_URL=postgresql+asyncpg://postgres:changeme@localhost:5432/mnemosyne\nREDIS_URL=redis://localhost:6379/0\n```\n\nRun:\n\n```powershell\nuvicorn backend.main:app --reload --host 0.0.0.0 --port 8000\n\n# Or run it as a background process:\nStart-Process python -WorkingDirectory \"working directory path\" `\n  -ArgumentList \"-m\",\"uvicorn\",\"backend.main:app\",\"--reload\",\"--host\",\"0.0.0.0\",\"--port\",\"8000\"\n\npython -m http.server 8080 -d frontend\n```\n\n---\n\n## API\n\n### `POST /parse`\n\nParses text into sentences and learnable objects. Caches the result in Redis (1 h TTL); persists to PostgreSQL.\n\n**Request**\n```json\n{\n  \"text\": \"Hola. Yo hablo español.\",\n  \"language\": \"es\",\n  \"source_url\": \"https://example.com\"\n}\n```\n`source_url` is optional; stored for attribution, never fetched.\n\n**Response**\n```json\n{\n  \"sentences\": [\n    {\n      \"text\": \"Hola.\",\n      \"learnable_objects\": [\n        {\n          \"id\": \"a7f3c2d1-e4b5-5678-90ab-cdef12345678\",\n          \"language\": \"es\",\n          \"type\": \"vocabulary\",\n          \"label\": \"hola\",\n          \"lesson_data\": { \"lemma\": \"hola\", \"pos\": \"INTJ\" },\n          \"confidence\": 0.85\n        }\n      ]\n    }\n  ]\n}\n```\n\n`id` is a deterministic UUID-v5 derived from `(language, type, canonical_form)`. The same word in any text always produces the same UUID.\n\n`type` is one of: `vocabulary` `conjugation` `agreement` `idiom` `grammar` `nuance`.\n\n---\n\n### `POST /ingest`\n\nPreferred ingest endpoint. 
Accepts text plus attribution metadata, runs the same parse pipeline as `/parse`, and additionally persists a `SourceDocument` + `SourceChunk` row for reading-progression tracking.\n\n**Request**\n```json\n{\n  \"text\": \"Hola. Yo hablo español.\",\n  \"language\": \"es\",\n  \"content_type\": \"article\",\n  \"title\": \"Mi primer artículo\",\n  \"source_url\": \"https://example.com\",\n  \"author\": null,\n  \"filename\": null\n}\n```\n\n`content_type`: one of `article`, `book`, `lyrics`, `legal`, `conversation`, `other`.\n\n**Response** — same `sentences` array as `/parse`, plus:\n```json\n{\n  \"sentences\": [...],\n  \"source_document_id\": \"a1b2c3d4-...\",\n  \"warnings\": []\n}\n```\n\n`source_document_id` is the stable reference for repeated-exposure tracking and reading-progression queries (`GET /reading/{id}`).\n\n`POST /parse` is retained for backward compatibility. New clients should use `/ingest`.\n\n---\n\n### `GET /lesson/{object_id}?language=es`\n\nReturns lesson content for one learnable object. Checks the database first; falls back to the plugin's in-session store.\n\n---\n\n### `POST /review`\n\nSubmits a recall rating and returns the next scheduled interval.\n\n**Request**\n```json\n{\n  \"object_id\": \"a7f3c2d1-e4b5-5678-90ab-cdef12345678\",\n  \"quality\": 3,\n  \"review_state\": null\n}\n```\n\n`quality`: 1 = Again, 2 = Hard, 3 = Good, 4 = Easy.\n\n`review_state`: send `null` on the first review. 
On subsequent reviews within the same browser session pass back the `review_state` from the previous response so the server can use it as a fallback if the database is unavailable.\n\n**Response**\n```json\n{\n  \"object_id\": \"a7f3c2d1-e4b5-5678-90ab-cdef12345678\",\n  \"next_interval_days\": 3,\n  \"review_state\": { \"stability\": 2.4, \"difficulty\": 5.31, \"reviews\": 1, \"...\": \"...\" }\n}\n```\n\n---\n\n### `GET /dashboard`\n\nReturns a knowledge-state summary for the default user.\n\nOptional query parameter: `?language=es` — scopes results to one language.\n\n```json\n{\n  \"known\": [...],\n  \"weak\": [...],\n  \"new\": [...],\n  \"due_for_review\": [...],\n  \"total_objects\": 42\n}\n```\n\nEach item carries `object_id`, `language`, `status` (`new` / `learning` / `mastered` / `forgotten`), `mastery_score`, `total_reviews`, `last_seen`, and `due_at`.\n\n---\n\n### `GET /metrics`\n\nReturns quantitative learning-effectiveness figures.\n\nOptional query parameter: `?language=es`.\n\n```json\n{\n  \"total_seen\": 84,\n  \"total_reviewed\": 31,\n  \"total_mastered\": 7,\n  \"overall_retention\": 0.74,\n  \"success_rate\": 0.82,\n  \"avg_stability_days\": 4.3,\n  \"overdue_count\": 3,\n  \"by_language\": [{ \"language\": \"es\", \"seen\": 80, \"mastered\": 7, \"retention\": 0.74 }],\n  \"by_type\": [{ \"type\": \"vocabulary\", \"seen\": 60, \"reviewed\": 22, \"mastered\": 5, \"retention\": 0.78 }],\n  \"weakest\": [{ \"object_id\": \"...\", \"type\": \"conjugation\", \"mastery_score\": 0.12, \"lapse_rate\": 0.5 }]\n}\n```\n\n---\n\n### `GET /recommend` or `GET /recommend-text`\n\nReturns sentences from the user's parse history at the difficulty appropriate for their current knowledge state, following the i+1 comprehensible-input principle.\n\nRequired query parameter: `?language=es`  \nOptional: `\u0026limit=10` (1–50)\n\n```json\n{\n  \"sentences\": [\n    {\n      \"sentence_id\": \"...\",\n      \"text\": \"El gato duerme.\",\n      \"difficulty\": 
0.38,\n      \"difficulty_label\": \"ideal\",\n      \"unknown_ratio\": 0.25,\n      \"grammar_score\": 0.14,\n      \"length_score\": 0.12,\n      \"known_count\": 3,\n      \"unknown_count\": 1,\n      \"total_objects\": 4\n    }\n  ],\n  \"user_level\": \"elementary\",\n  \"target_difficulty_min\": 0.15,\n  \"target_difficulty_max\": 0.39,\n  \"total_mastered\": 12,\n  \"total_seen\": 47\n}\n```\n\n`difficulty_label` is `easy` (\u003c 15% unknown), `ideal` (15–40% unknown), or `hard` (\u003e 40% unknown).\n\n---\n\n### `GET /languages`\n\nReturns the list of active language plugins.\n\n```json\n[\n  { \"code\": \"es\", \"display_name\": \"Spanish\",        \"direction\": \"ltr\" },\n  { \"code\": \"fr\", \"display_name\": \"French\",         \"direction\": \"ltr\" },\n  { \"code\": \"ar\", \"display_name\": \"Arabic\",         \"direction\": \"rtl\" },\n  { \"code\": \"en\", \"display_name\": \"English (stub)\", \"direction\": \"ltr\" }\n]\n```\n\n---\n\n### `GET /health`\n\nLiveness probe. Returns `{\"status\": \"ok\"}` when the process is alive. Does not check backing services.\n\n### `GET /ready`\n\nReadiness probe. Queries PostgreSQL and Redis. Returns `{\"status\": \"ready\", \"db\": \"ok\", \"redis\": \"ok\"}` (HTTP 200) or `{\"status\": \"degraded\", ...}` (HTTP 503) with per-service error detail.\n\n---\n\n## Tests\n\n```bash\nmake test\n# or: pytest backend/tests -q\n```\n\nNo external services needed for most tests. See [CONTRIBUTING.md](CONTRIBUTING.md) for the full breakdown.\n\n---\n\n## Configuration\n\n| Variable | Default |\n|---|---|\n| `DATABASE_URL` | `postgresql+asyncpg://postgres:postgres@localhost:5432/mnemosyne` |\n| `REDIS_URL` | `redis://localhost:6379/0` |\n| `DEBUG` | `true` |\n| `CORS_ORIGINS` | `[\"*\"]` |\n| `PLUGIN_PACKAGE` | `backend.plugins` |\n| `ENABLED_LANGUAGES` | *(empty — all plugins loaded)* |\n\n`ENABLED_LANGUAGES` is a comma-separated list (e.g. `es,fr`) that restricts which plugins are registered. 
Unset means load all discovered plugins.\n\nSee `.env.example` for the full list including the `POSTGRES_*` variables used by Docker Compose.\n\n---\n\n## Known limitations\n\n- **Lesson prose is English-only.** `build_lesson()` always produces English explanations (\"The word X is a noun\"). There is no `l1_language` parameter yet; learners whose native language is not English see English metalanguage regardless of the target language.\n- **Background parse is in-process.** `POST /parse/jobs` runs NLP in a thread-pool executor inside the same uvicorn process. Multi-worker deployments (`--workers N \u003e 1`) require sticky sessions (e.g. Nginx `ip_hash`, Traefik sticky cookie) scoped to the job ID so that SSE/polling requests reach the same worker that created the job. Single-worker deployments (`--workers 1`, the default) are unaffected.\n- **Classical morphology is shallow.** Latin and Koine Greek use offline treebank annotations (Universal Dependencies ITTB/PROIEL + MorphGNT) for morphological features. Coverage is limited to attested forms in those corpora (~3 400 Latin, ~27 000 Greek forms). Unattested forms fall back to the curated dictionary with lower confidence. Run `python -m scripts.ingest_classical_morph --lang all` to rebuild the indices from updated corpora.\n- **WCAG 2.1 AA — static audit passes; manual AT test pending.** Static checks run via `pytest backend/tests/test_accessibility_static.py`. 
A human keyboard-only walkthrough and NVDA/VoiceOver smoke test have not been run; see `MANUAL_ACCESSIBILITY_TEST.md` and `WCAG_AUDIT.md` for the checklist and full audit.\n\n---\n\n## Docs\n\n- [ARCHITECTURE.md](ARCHITECTURE.md) — request flows, plugin system, FSRS scheduler, persistence, difficulty scoring\n- [CONTRIBUTING.md](CONTRIBUTING.md) — setup, coding standards, how to write a language plugin\n- [ROADMAP.md](ROADMAP.md) — what is done and what is next\n- [VISION_ALIGNMENT.md](VISION_ALIGNMENT.md) — vision, current state, gaps, and design principles\n- [WCAG_AUDIT.md](WCAG_AUDIT.md) — WCAG 2.1 AA static audit findings and manual test instructions\n- [MANUAL_ACCESSIBILITY_TEST.md](MANUAL_ACCESSIBILITY_TEST.md) — step-by-step keyboard/AT manual test script\n- [docs/offline_scripts.md](docs/offline_scripts.md) — offline data pipeline scripts reference\n- [docs/corpus_pipeline.md](docs/corpus_pipeline.md) — offline corpus ingestion pipeline\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fits-not-rocket-science%2Fmnemosyne","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fits-not-rocket-science%2Fmnemosyne","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fits-not-rocket-science%2Fmnemosyne/lists"}