{"id":31547142,"url":"https://github.com/neurone/discord-archiver","last_synced_at":"2025-10-04T15:57:18.113Z","repository":{"id":317573979,"uuid":"1064207379","full_name":"Neurone/discord-archiver","owner":"Neurone","description":"Download all conversations from a Discord channel and listen for new messages to create a local archive that you can refer to later.","archived":false,"fork":false,"pushed_at":"2025-10-01T17:56:15.000Z","size":23,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-01T19:28:26.428Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Neurone.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-25T17:38:39.000Z","updated_at":"2025-10-01T17:56:19.000Z","dependencies_parsed_at":"2025-10-01T19:28:29.623Z","dependency_job_id":null,"html_url":"https://github.com/Neurone/discord-archiver","commit_stats":null,"previous_names":["neurone/discord-archiver"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/Neurone/discord-archiver","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Neurone%2Fdiscord-archiver","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Neurone%2Fdiscord-archiver/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Neurone%2Fdiscord
-archiver/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Neurone%2Fdiscord-archiver/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Neurone","download_url":"https://codeload.github.com/Neurone/discord-archiver/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Neurone%2Fdiscord-archiver/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278335452,"owners_count":25970129,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-04T02:00:05.491Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-04T15:57:07.332Z","updated_at":"2025-10-04T15:57:18.108Z","avatar_url":"https://github.com/Neurone.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Discord Archiver\n\nArchive Discord channel messages locally. 
Automatically downloads message history and monitors for new messages, edits, and deletions in real time.\n\n## Quickstart\n\n```sh\ngit clone https://github.com/neurone/discord-archiver.git\ncd discord-archiver\nnpm install\nnpx dotenvx set API_TOKEN \"\u003cYOUR_DISCORD_BOT_TOKEN\u003e\"\nnpm start \u003cCHANNEL_ID\u003e\n```\n\nArchives are saved as Markdown files in the `archive/` directory.\n\n## Features\n\n- Incremental archiving with per-channel/thread checkpoints (no re-downloading history each run)\n- Supports regular text channels and forum channels (including archived threads)\n- Optional thread filtering by forum tags via `FILTER_TAGS`\n- Real-time capture of new messages after initial backfill\n- Message edits reflected in place with a `MODIFIED` marker and latest edit timestamp\n- Message deletions remove the original block and automatically update any replies to show `DELETED MESSAGE (id)`\n- Preserves reply context; replies to already-deleted messages are marked during both bulk export and live mode\n- Attachment links preserved (filename + direct URL)\n- Safe filename handling (sanitized channel/thread IDs)\n- Stateless apart from a lightweight JSON checkpoints file\n\n## Environment Variables\n\nSet via `npx dotenvx set \u003cNAME\u003e \u003cVALUE\u003e` or your preferred method.\n\n| Variable | Required | Description |\n|----------|----------|-------------|\n| `API_TOKEN` | Yes | Discord bot token (with Message Content intent enabled) |\n| `CHANNEL_ID` | Yes (unless passed as CLI arg) | ID of the channel (text or forum) to archive |\n| `FILTER_TAGS` | No | Comma-separated list of forum tag names to include (case-insensitive, substring match) |\n| `MAX_FETCH_SIZE` | No | Batch size for each fetch (default 100, Discord max) |\n| `OUTPUT_ROOT` | No | Output directory for markdown (default `./archive`) |\n| `CHECKPOINT_PATH` | No | Path to checkpoints JSON (default `./archive/checkpoints.json`) |\n\nThe CLI argument `\u003cCHANNEL_ID\u003e` can be passed instead of setting 
`CHANNEL_ID` in env.\n\n## Output Format\n\nEach channel or thread is archived to `archive/\u003cCHANNEL_OR_THREAD_ID\u003e.md`.\n\nMessage structure example:\n\n```markdown\n### Message 1234567890123456789\nby alice#0 (111111111111111111)\nat *2025-01-01 10:00:00.000 UTC*\n**MODIFIED** last time at *2025-01-01 10:05:30.000 UTC*\nin reply to **DELETED MESSAGE** (1234567890123000000)\n\nThis is the message body (supports markdown as-is)\n\n**Attachments:**\n- [image.png](https://cdn.discordapp.com/attachments/…/image.png)\n---\n```\n\nNotes:\n- `**MODIFIED**` line appears only if the message has been edited.\n- Reply line appears only for replies; if the parent was deleted, it is annotated as `DELETED MESSAGE`.\n- Deleted messages are fully removed; their former replies update automatically.\n\n## Real-Time Behavior\n\n1. On startup: performs a backfill (only messages newer than the last checkpoint if it exists).\n2. While running: listens for `messageCreate`, `messageUpdate`, `messageDelete`, and `threadUpdate`.\n3. Edits: in-place rewrite with updated edit timestamp; no historical versions retained.\n4. Deletes: message block removed; any existing references rewritten to a deleted marker.\n5. Replies to already-deleted parents (even if deleted before startup) are detected and marked.\n\n## Forum Thread Filtering\n\nWhen archiving a forum channel, all threads (active + archived) are enumerated. 
If `FILTER_TAGS` is set:\n\n- Each applied tag name on a thread is lowercased.\n- A thread is included if ANY tag name contains (or is contained by) ANY filter token.\n- Example: `FILTER_TAGS=\"bug,feature\"` will match `bug`, `bug report`, `feature-request`, etc.\n\nUnset `FILTER_TAGS` to archive every thread.\n\n## Checkpoints \u0026 Incremental Sync\n\nThe file at `CHECKPOINT_PATH` stores the last processed message ID per channel/thread:\n\n```json\n{\n\t\"channels\": {\n\t\t\"1423048371555405876\": \"1423073354352562186\"\n\t}\n}\n```\n\nDuring startup:\n- If a checkpoint exists: only messages with IDs greater than the stored ID are fetched (using Discord's `after` option).\n- If missing/corrupt: a full backfill is performed.\n\nDuring runtime:\n- New messages are appended immediately and checkpoint updated.\n- If a tag reconfiguration or missed gap occurs, the logic re-fetches only the missing span.\n\n## Safety \u0026 Operational Notes\n\n- Filenames are sanitized to alphanumerics / underscore / hyphen.\n- Only minimal state is persisted (checkpoint JSON); archives are append-only except for edit/delete rewrites.\n- The script requires the **Message Content Intent**; ensure it's enabled in the Developer Portal.\n- Large channels: uses a single fetch window per run (checkpoint-based) instead of walking history backwards.\n- Rate limits: relies on discord.js internal backoff; no manual throttling required at current scope.\n\n## Limitations / Future Ideas\n\n- No pagination backwards beyond the first startup snapshot when no checkpoint exists (could add full historical crawl).\n- No preservation of previous edit revisions (could add versioned collapsible blocks).\n- No rich embed capture; only message `content` and attachment URLs.\n- Does not currently export reactions or pin status.\n\n## Configuration Options\n\n### Optional: Filter Forum Threads by Tags\n\nWhen archiving forum channels, you can filter specific threads by 
tags:\n\n```sh\nnpx dotenvx set FILTER_TAGS \"bug,feature-request\"\n```\n\nMultiple tags can be comma-separated. Only threads with matching tags will be archived. Leave unset to archive all threads.\n\n### Optional: Set Channel ID via Environment\n\nInstead of passing the channel ID as an argument, you can set it as an environment variable:\n\n```sh\nnpx dotenvx set CHANNEL_ID \"\u003cYOUR_CHANNEL_ID\u003e\"\nnpx dotenvx run -- node discord-archiver.js\n```\n\n## Setup Details\n\n### Get Your Discord Bot Token\n\n1. Go to [Discord Developer Portal](https://discord.com/developers/applications)\n2. Create a new application\n3. Navigate to the \"Bot\" section and create a bot\n4. Copy the bot token and use it as `API_TOKEN`\n5. Enable **Message Content Intent** in the bot settings\n6. Invite the bot to your server with \"Read Messages\" and \"Read Message History\" permissions\n\n### Get the Channel ID\n\nEnable Developer Mode in Discord (Settings → Advanced → Developer Mode), then right-click any channel and select \"Copy Channel ID\".\n\n## Usage\n\nRun the archiver for a specific channel:\n```sh\nnpx dotenvx run -- node discord-archiver.js \u003cCHANNEL_ID\u003e\n```\n\nThe archiver will:\n- Download all existing messages from the channel\n- Save them as a Markdown file in `archive/`\n- Continue listening for new messages\n- Update the archive in real-time\n\n## Cleaning Archives\n\nRemove all archived data:\n```sh\nnpm run clean\n```\n\nThis removes all markdown files and the checkpoint JSON; subsequent runs will re-backfill from scratch.\n\n## Troubleshooting\n\n| Symptom | Cause | Fix |\n|---------|-------|-----|\n| Script exits immediately | Missing `API_TOKEN` or `CHANNEL_ID` | Set the env vars or pass the channel ID as a CLI argument |\n| No edits detected | Missing `Partials.Message` or permissions | Run the current code and ensure the bot has the Message Content intent enabled |\n| Replies show raw IDs only | Parent message not yet archived | Updates once the parent appears (if it still exists) 
|\n| Deleted parent not marked | Cache race | Restart; ensure deletion happened after bot startup |\n\nEnable verbose logging by temporarily adding custom `console.log` lines where needed.\n\n## License\n\nMIT License. See `LICENSE` file.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneurone%2Fdiscord-archiver","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fneurone%2Fdiscord-archiver","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneurone%2Fdiscord-archiver/lists"}