{"id":19201666,"url":"https://github.com/oott123/telegram-archive-server","last_synced_at":"2025-05-12T12:42:59.019Z","repository":{"id":69739720,"uuid":"417018399","full_name":"oott123/telegram-archive-server","owner":"oott123","description":"Archive and search server for Telegram, adds missing CJK support for Telegram search using MeiliSearch.","archived":false,"fork":false,"pushed_at":"2024-03-22T10:23:50.000Z","size":1702,"stargazers_count":50,"open_issues_count":0,"forks_count":7,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-20T13:36:09.715Z","etag":null,"topics":["meilisearch","search","telegram"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/oott123.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-10-14T06:44:24.000Z","updated_at":"2024-09-03T03:57:27.000Z","dependencies_parsed_at":null,"dependency_job_id":"7aaff5bf-7995-4fa7-b6af-7ee0b2eddfb2","html_url":"https://github.com/oott123/telegram-archive-server","commit_stats":null,"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oott123%2Ftelegram-archive-server","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oott123%2Ftelegram-archive-server/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oott123%2Ftelegram-archive-server/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories
/oott123%2Ftelegram-archive-server/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/oott123","download_url":"https://codeload.github.com/oott123/telegram-archive-server/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253742401,"owners_count":21957018,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["meilisearch","search","telegram"],"created_at":"2024-11-09T12:39:45.472Z","updated_at":"2025-05-12T12:42:58.982Z","avatar_url":"https://github.com/oott123.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Telegram Archive Server\n\n[![Docker](https://github.com/oott123/telegram-archive-server/actions/workflows/ci.yaml/badge.svg)](https://github.com/oott123/telegram-archive-server/actions/workflows/ci.yaml) [![CJK Ready](https://img.shields.io/badge/CJK-ready-66ccff)](./README.md) [![Releases](https://img.shields.io/github/package-json/v/oott123/telegram-archive-server/master?label=version)](https://github.com/oott123/telegram-archive-server/releases) [![quay.io](https://img.shields.io/badge/Browse%20on-quay.io-blue?logo=docker\u0026logoColor=white)](https://quay.io/repository/oott123/telegram-archive-server?tab=tags) [![BSD 3 Clause Licensed](https://img.shields.io/github/license/oott123/telegram-archive-server)](./LICENSE)\n\nA Telegram group chat search and archive bot suited to CJK environments.\n\n## Features\n\n- Authenticates group members, so only members of the group can search\n- Imports historical chat records with automatic deduplication\n- Uses MeiliSearch to search Chinese text, with good indexing quality\n- Includes OCR text from images in search results (new images only; historical images are not yet supported)\n- Provides a simple web UI that can display avatars\n- Search results link back to the message in the chat\n\n## Screenshots\n\n### 
Chat authentication\n\n![](./docs/assets/search-command.jpg)\n\nClick the 【搜索】 (Search) button to authenticate automatically and open the search UI.\n\n### Search UI\n\n![](./docs/assets/search-ui.jpg)\n\nClick a timestamp link to jump to the message in the chat.\n\n![](./docs/assets/search-and-jump.gif)\n\n## Deployment\n\n### Prerequisites\n\nYou need:\n\n- A bot account; obtain its token in advance\n- A publicly reachable HTTPS server; HTTPS is mandatory\n- A **supergroup**; only supergroups are currently supported\n- A MeiliSearch instance, with or without a key configured\n- A Redis instance (optional, but without it messages may be lost on an abnormal restart)\n\n### Configuration\n\nDownload the [`.env.example`](./.env.example) file and configure it according to the comments inside.\n\nYou can save it as `.env`, or set the values as environment variables.\n\n### Running\n\n#### HTTPS\n\nTAS does not provide built-in HTTPS; use Caddy or similar software as a reverse proxy in front of TAS.\n\n#### With Docker\n\n```bash\ndocker run -d --restart=always --env-file=.env quay.io/oott123/telegram-archive-server\n```\n\nYou can also run it with Kubernetes or docker-compose.\n\n#### Using Source Code\n\nIf you do not have Docker or prefer not to use it, you can build and deploy from source. In that case you also need:\n\n- git\n- node 18\n\n```bash\ngit clone https://github.com/oott123/telegram-archive-server.git\ncd telegram-archive-server\n# git checkout vX.X.X\ncp .env.example .env\nvim .env\nyarn\nyarn build\nyarn start\n```\n\n### Usage\n\nSend `/search` in the group. The bot may prompt you to set a Domain; follow the prompt to set it.\n\n![](./docs/assets/bot-set-domain.gif)\n\n#### Fetching user avatars\n\nA user's avatar is shown in search results only if the user meets all of the following conditions:\n\n- The user has interacted with the bot (sent a message, or authorized a login)\n- The user has set their avatar to be publicly visible\n\n#### Indexing rules for new messages\n\nBecause MeiliSearch is relatively inefficient at indexing new messages, a message is indexed only once any of the following conditions is met:\n\n- No new message has been received for 60 seconds\n- 100 messages that have not been indexed have accumulated\n- The main process receives a SIGINT signal\n\nIf Redis is not used to persist the message queue, messages still waiting to be indexed may be lost when the program crashes or the server restarts.\n\n### Importing old chat history\n\n**Only supergroup imports are currently supported.**\n\nIn the desktop client, click the three-dot button, choose Export chat history, and wait for the export to finish to obtain `result.json`.\n\nRun:\n\n```bash\ncurl \\\n  -H \"Content-Type: application/json\" \\\n  -H \"Authorization: Bearer $AUTH_IMPORT_TOKEN\" \\\n  -XPOST -T result.json \\\n  http://localhost:3100/api/v1/import/fromTelegramGroupExport\n```\n\nThis imports the records. Note that only one group's records can be imported at a time.\n\n### OCR text recognition (TBD)\n\nIf the OCR queue is enabled, Redis 
is required (it can share an instance with the cache), and a third-party recognition service must be configured. The recognition flow is as follows:\n\n[![](https://mermaid.ink/img/eyJjb2RlIjoic2VxdWVuY2VEaWFncmFtXG4gIGF1dG9udW1iZXJcbiAgQm905a6e5L6LLT4-K09DUuWunuS-izog6YCa6L-HIE9DUiDpmJ_liJflj5HpgIHlm77niYdcbiAgT0NS5a6e5L6LLT4-K09DUuacjeWKoTog6K-G5Yir5Zu-54mHXG4gIE9DUuacjeWKoS0-Pi1PQ1Llrp7kvos6IOi_lOWbnue7k-aenFxuICBPQ1Llrp7kvostPj4tQm905a6e5L6LOiDpgJrov4flhaXlupPpmJ_liJflj5HpgIHor4bliKvnu5PmnpxcbiAgYWN0aXZhdGUgQm905a6e5L6LXG4gIEJvdOWunuS-iy0-Pi1NZWlsaVNlYXJjaDog5YWl5bqTIiwibWVybWFpZCI6eyJ0aGVtZSI6ImRlZmF1bHQifSwidXBkYXRlRWRpdG9yIjp0cnVlLCJhdXRvU3luYyI6dHJ1ZSwidXBkYXRlRGlhZ3JhbSI6dHJ1ZX0)](https://mermaid.live/edit/#eyJjb2RlIjoic2VxdWVuY2VEaWFncmFtXG4gIGF1dG9udW1iZXJcbiAgQm905a6e5L6LLT4-K09DUuWunuS-izog6YCa6L-HIE9DUiDpmJ_liJflj5HpgIHlm77niYdcbiAgT0NS5a6e5L6LLT4-K09DUuacjeWKoTog6K-G5Yir5Zu-54mHXG4gIE9DUuacjeWKoS0-Pi1PQ1Llrp7kvos6IOi_lOWbnue7k-aenFxuICBPQ1Llrp7kvostPj4tQm905a6e5L6LOiDpgJrov4flhaXlupPpmJ_liJflj5HpgIHor4bliKvnu5PmnpxcbiAgYWN0aXZhdGUgQm905a6e5L6LXG4gIEJvdOWunuS-iy0-Pi1NZWlsaVNlYXJjaDog5YWl5bqTIiwibWVybWFpZCI6IntcbiAgXCJ0aGVtZVwiOiBcImRlZmF1bHRcIlxufSIsInVwZGF0ZUVkaXRvciI6dHJ1ZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0ZURpYWdyYW0iOnRydWV9)\n\nRecognition and indexing can run on instances with different roles: image download and text indexing happen on the Bot instance, while the OCR instance only needs access to the OCR service.\n\nThis design lets maintainers set up offline batch recognition (for example, running the recognition service on a preemptible instance and shutting it down once the queue is empty) to reduce recognition cost.\n\nIf you use a third-party cloud service, you can simply disable the OCR queue, or run both the Bot and OCR roles in the same instance.\n\n#### Recognition services\n\n##### Google Cloud Vision\n\nSee the [Google Cloud Vision OCR documentation](https://cloud.google.com/vision/docs/ocr) and [Google Cloud Vision pricing](https://cloud.google.com/vision/pricing). Configure as follows:\n\n```bash\nOCR_DRIVER=google\nOCR_ENDPOINT=eu-vision.googleapis.com # or us-vision.googleapis.com; determines where Google stores and processes data\nGOOGLE_APPLICATION_CREDENTIALS=/path/to/google/credentials.json # JSON credentials file downloaded from the GCP console\n```\n\n##### PaddleOCR\n\nYou need a [paddleocr-web](https://github.com/lilydjwg/paddleocr-web) instance. Configure as follows:\n\n```bash\nOCR_DRIVER=paddle-ocr-web\nOCR_ENDPOINT=http://127.0.0.1:8980/api\n```\n\n##### Azure OCR\n\nCreate an [Azure 
Vision](https://portal.azure.com/#create/Microsoft.CognitiveServicesComputerVision) resource and configure its details as follows:\n\n```bash\nOCR_DRIVER=azure\nOCR_ENDPOINT=https://tas.cognitiveservices.azure.com\nOCR_CREDENTIALS=000000000000000000000000000000000\n```\n\n#### Starting different roles\n\n```bash\ndocker run [...] dist/main ocr,bot\n# or\nnode dist/main ocr,bot\n```\n\n## Development\n\n```bash\nDEBUG=app:*,grammy* yarn start:debug\n```\n\n### Frontend development\n\nAfter the search service authenticates the user, the server redirects to `$HTTP_UI_URL/index.html` with the following URL parameters:\n\n- `tas_server` - the server base URL, for example `http://localhost:3100/api/v1`\n- `tas_indexName` - the group ID, for example `supergroup1234567890`\n- `tas_authKey` - a JWT issued by the server, which can be used as the MeiliSearch API key.\n\n### MeiliSearch compatibility\n\nThe endpoint at `/api/v1/search/compilable/meili` can be searched as if it were an ordinary MeiliSearch instance.\n\nThe index name should be the group ID, in the form `supergroup1234567890`; the API key is the JWT issued by the server.\n\nNote that filter is currently unavailable for security reasons.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foott123%2Ftelegram-archive-server","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foott123%2Ftelegram-archive-server","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foott123%2Ftelegram-archive-server/lists"}