https://github.com/openclaw/telecrawl
Telegram for Claws
https://github.com/openclaw/telecrawl
Last synced: about 1 month ago
JSON representation
Telegram for Claws
- Host: GitHub
- URL: https://github.com/openclaw/telecrawl
- Owner: openclaw
- License: mit
- Created: 2026-05-08T13:55:14.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2026-05-09T05:15:19.000Z (about 1 month ago)
- Last Synced: 2026-05-12T17:41:18.302Z (about 1 month ago)
- Language: Go
- Size: 54.7 KB
- Stars: 27
- Watchers: 0
- Forks: 2
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
README
# telecrawl
Telegram Desktop archive CLI.
`telecrawl` reads your local Telegram Desktop `tdata` through `opentele2` and
Telethon, stores a searchable SQLite archive in `~/.telecrawl/telecrawl.db`, and
can back it up to GitHub as encrypted age shards.
It is local-first:
- Normal archive/search commands do not upload data.
- `backup push` uploads only age-encrypted shards when you run it explicitly.
- Telegram message text, chat names, sender names, and media metadata stay inside
encrypted backup payloads.
## Install
```bash
brew tap steipete/tap
brew install telecrawl
```
Or install with Go:
```bash
go install github.com/openclaw/telecrawl/cmd/telecrawl@latest
```
## Setup
Install the Python bridge used for Telegram Desktop `tdata` imports:
```bash
telecrawl deps install
```
This creates `~/.telecrawl/venv` and installs `opentele2` plus Telethon.
## Import
```bash
telecrawl doctor
telecrawl import
telecrawl status
```
Import defaults to:
- latest `200` dialogs
- latest `500` messages per dialog
Use `0` for no limit:
```bash
telecrawl import --dialogs-limit 0 --messages-limit 0
```
Useful reads:
```bash
telecrawl folders
telecrawl chats --limit 20
telecrawl chats --folder FOLDER_ID
telecrawl chats --unread
telecrawl topics --chat CHAT_ID
telecrawl messages --limit 20
telecrawl messages --chat CHAT_ID --after 2026-01-01
telecrawl messages --chat CHAT_ID --topic TOPIC_ID
telecrawl messages --chat CHAT_ID --pinned
telecrawl search "query"
telecrawl search "query" --chat CHAT_ID --topic TOPIC_ID
```
Telegram folders, forum topics, reply/thread IDs, pinned messages, edits,
forwards, reactions, view/reply counts, and richer media titles are archived
when Telethon exposes them for the active account. Folder rows include explicit
membership from Telegram dialog filters; dynamic folder rules are recorded as
metadata and may not expand to every matching chat.
Add `--json` before the command for machine-readable output:
```bash
telecrawl --json status
telecrawl --json search "invoice"
```
## Data Paths
Defaults:
- Telegram Desktop source: `~/Library/Application Support/Telegram Desktop/tdata`
- archive DB: `~/.telecrawl/telecrawl.db`
- Python bridge venv: `~/.telecrawl/venv`
- Telethon sessions: `~/.telecrawl/sessions/`
- backup config: `~/.telecrawl/backup.json`
- age identity: `~/.telecrawl/age.key`
- backup checkout: `~/Projects/backup-telecrawl`
Override the archive DB:
```bash
telecrawl --db /tmp/telecrawl.db status
```
Override the Telegram Desktop source:
```bash
telecrawl --source "/path/to/tdata" doctor
telecrawl --source "/path/to/tdata" import
```
## Backup
Create `https://github.com/steipete/backup-telecrawl` first, then initialize:
```bash
telecrawl backup init
telecrawl backup push
```
The default backup config points at:
```json
{
"repo": "~/Projects/backup-telecrawl",
"remote": "https://github.com/steipete/backup-telecrawl.git",
"identity": "~/.telecrawl/age.key"
}
```
Use a different repository or config path:
```bash
telecrawl backup init \
--config ~/.telecrawl/backup.json \
--repo ~/Projects/backup-telecrawl \
--remote https://github.com/steipete/backup-telecrawl.git
```
Inspect backup metadata:
```bash
telecrawl backup status
```
Restore into the current archive DB:
```bash
telecrawl backup pull
telecrawl status
```
Restore into a throwaway DB for validation:
```bash
telecrawl --db /tmp/telecrawl-restore-test.db backup pull
telecrawl --db /tmp/telecrawl-restore-test.db status
```
## Backup Security Model
Backup shards are JSONL, gzip-compressed with deterministic gzip metadata, and
encrypted with age before Git sees them.
Git can still see cleartext metadata:
- export time
- public age recipients
- table names
- row counts
- shard paths
- encrypted byte sizes
- plaintext shard hashes
- backup cadence and which encrypted shards changed
Git cannot read message text, chat names, sender names, or media metadata without
an age identity.
Keep `~/.telecrawl/age.key` private. If you lose it and no other recipient can
decrypt the backup, the encrypted backup cannot be restored.
## Multi-Machine Backups
On another machine:
```bash
telecrawl backup init --no-push
cat ~/.telecrawl/backup.json
```
Copy that machine's public `recipient` into the first machine's
`~/.telecrawl/backup.json`, then re-encrypt current shards:
```bash
telecrawl backup push
```
The private `AGE-SECRET-KEY-...` identity must not be committed or shared.
## Reset
Remove local state:
```bash
rm -rf ~/.telecrawl
```
Remove only the archive:
```bash
rm -f ~/.telecrawl/telecrawl.db ~/.telecrawl/telecrawl.db-*
```
Do not delete `~/.telecrawl/age.key` unless you have another working backup
recipient or you no longer need to restore existing encrypted backups.