{"id":50525546,"url":"https://github.com/maximbilan/gazpacho","last_synced_at":"2026-06-07T11:01:03.961Z","repository":{"id":362031217,"uuid":"1256005638","full_name":"maximbilan/gazpacho","owner":"maximbilan","description":"Personal Telegram school digest bot for Ukrainian parents in Spain","archived":false,"fork":false,"pushed_at":"2026-06-02T16:41:50.000Z","size":115,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-05T09:14:26.001Z","etag":null,"topics":["aws-lambda","dynamodb","openai","python","serverless","telegram","telegram-bot","telethon"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/maximbilan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-01T11:24:48.000Z","updated_at":"2026-06-02T16:34:13.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/maximbilan/gazpacho","commit_stats":null,"previous_names":["maximbilan/gazpacho"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/maximbilan/gazpacho","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maximbilan%2Fgazpacho","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maximbilan%2Fgazpacho/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maximbilan%2Fgazpacho/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maximbilan%2Fgazpacho/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/maximbilan","download_url":"https://codeload.github.com/maximbilan/gazpacho/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maximbilan%2Fgazpacho/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33977371,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-06T02:00:07.033Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws-lambda","dynamodb","openai","python","serverless","telegram","telegram-bot","telethon"],"created_at":"2026-06-03T07:31:20.423Z","updated_at":"2026-06-06T10:00:37.795Z","avatar_url":"https://github.com/maximbilan.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Gazpacho 🥫🥣\n\nPersonal Telegram digest bot for Spanish school updates, summarized in Ukrainian.\n\nGazpacho has two separate Telegram identities:\n\n- A Telegram user client, implemented with Telethon/MTProto, logs in as the parent's own account and reads the configured school chats. This is required because a Telegram bot cannot fetch chat history for chats where it is not present.\n- A normal Telegram bot sends scheduled digests and receives follow-up questions in one or more configured private chats.\n\nThe interactive Telegram user login happens only once on a local machine. Cloud code receives a pre-created Telethon `StringSession` from AWS Secrets Manager and never asks for a phone number, login code, or 2FA password.\n\n## Current State\n\nGazpacho supports the full production flow:\n\n- Daily scheduled digest through EventBridge, SAM, and a container Lambda.\n- Telegram webhook Q\u0026A through API Gateway and a zip Lambda.\n- OpenAI as the default vision-capable LLM provider, with Amazon Bedrock as an optional provider.\n- Multiple private Telegram recipients through `TARGET_CHAT_IDS`.\n- GitHub Actions for pull request CI and manual production deploys.\n\n## Architecture\n\nAWS deployment overview:\n\n```mermaid\nflowchart LR\n  subgraph Telegram\n    SourceChats[\"School Telegram chats\"]\n    Bot[\"@your_bot\"]\n    Recipients[\"Configured private chats\"]\n  end\n\n  subgraph AWS\n    EventBridge[\"EventBridge daily schedule\"]\n    ScheduledLambda[\"ScheduledDigest Lambda\u003cbr/\u003econtainer image\"]\n    ApiGateway[\"API Gateway HTTP API\"]\n    WebhookLambda[\"Webhook Lambda\u003cbr/\u003ezip package\"]\n    DynamoDB[(\"DynamoDB\u003cbr/\u003edigest + Q\u0026A context\")]\n    Secrets[(\"Secrets Manager\u003cbr/\u003eTelegram + LLM secrets\")]\n  end\n\n  subgraph AI\n    VisionLLM[\"OpenAI or Bedrock\u003cbr/\u003evision-capable model\"]\n  end\n\n  EventBridge --\u003e ScheduledLambda\n  ScheduledLambda -- \"MTProto user session\" --\u003e SourceChats\n  ScheduledLambda --\u003e Secrets\n  ScheduledLambda -- \"messages + downloaded images\" --\u003e VisionLLM\n  VisionLLM -- \"Ukrainian digest\" --\u003e ScheduledLambda\n  ScheduledLambda --\u003e DynamoDB\n  ScheduledLambda -- \"send digest\" --\u003e Bot\n  Bot --\u003e Recipients\n\n  Recipients -- \"questions and commands\" --\u003e Bot\n  Bot -- \"Telegram webhook\" --\u003e ApiGateway\n  ApiGateway --\u003e WebhookLambda\n  WebhookLambda --\u003e Secrets\n  WebhookLambda --\u003e DynamoDB\n  WebhookLambda -- \"Q\u0026A prompt\" --\u003e VisionLLM\n  VisionLLM -- \"Ukrainian answer\" --\u003e WebhookLambda\n  WebhookLambda -- \"reply\" --\u003e Bot\n  Bot --\u003e Recipients\n\n  WebhookLambda -- \"/refresh async invoke\" --\u003e ScheduledLambda\n```\n\nThe Telegram bot never reads source chats. Only the Telethon user-client session in the scheduled Lambda reads the configured school chats.\n\nDaily digest flow:\n\n```text\nEventBridge cron, configured with SCHEDULE_HOUR/SCHEDULE_MINUTE in UTC\n  -\u003e ScheduledDigest Lambda container image\n       -\u003e Telethon user client reads messages since the previous stored digest\n          or falls back to LOOKBACK_DAYS\n       -\u003e downloads photo/image notices to /tmp\n       -\u003e configured vision LLM summarizes and translates into Ukrainian\n       -\u003e Telegram Bot API sends digest to each TARGET_CHAT_IDS recipient\n       -\u003e DynamoDB stores raw messages and generated digest\n```\n\nQ\u0026A flow:\n\n```text\nTelegram bot webhook\n  -\u003e API Gateway HTTP API\n       -\u003e Webhook Lambda zip\n       -\u003e verifies Telegram secret-token header\n       -\u003e reads all stored digest summaries, latest raw messages, and short chat history from DynamoDB\n       -\u003e configured LLM answers in Ukrainian\n       -\u003e Telegram Bot API replies\n```\n\nThe webhook Lambda does not import Telethon and does not read Telegram chat history. It answers only from context already stored by the scheduled digest flow.\n\n## Defaults\n\nModel IDs are configurable through environment variables. The default provider is OpenAI, using `gpt-4.1-mini` for scheduled summaries and `gpt-5-mini` for Q\u0026A. The OpenAI client sends image inputs for summaries when image notices are present.\n\nAmazon Bedrock is also supported with `LLM_PROVIDER=bedrock`. In that mode, use Bedrock model or inference-profile IDs and grant the Lambda role Bedrock Runtime permissions.\n\n## Layout\n\n```text\nsrc/common        config, secrets loader, shared clients\nsrc/reader        Telethon reader\nsrc/summarizer    prompt builder, LLM calls, image handling\nsrc/notifier      Telegram Bot API sender and message splitting\nsrc/handlers      AWS Lambda entrypoints\nscripts           local operator scripts\ninfra             SAM template and container Dockerfile\ntests             focused unit tests\n```\n\n## Local Setup\n\n1. Create a Telegram app at `my.telegram.org` and get `api_id` and `api_hash`.\n2. Create the bot via `@BotFather` and get the bot token.\n3. Copy `.env.example` to `.env` and fill in non-secret config values.\n4. Run `python scripts/login.py` locally, enter the phone number, Telegram login code, and 2FA password if set, then copy the printed `StringSession`.\n5. Store secrets in AWS Secrets Manager as one JSON object:\n\n```json\n{\n  \"telegram_api_id\": 123456,\n  \"telegram_api_hash\": \"from-my.telegram.org\",\n  \"telethon_string_session\": \"printed-by-scripts-login\",\n  \"telegram_bot_token\": \"from-botfather\",\n  \"telegram_webhook_secret\": \"random-secret-token\",\n  \"openai_api_key\": \"sk-proj-...\"\n}\n```\n\n6. Configure the GitHub `Deploy` workflow variables, or run `sam build` and `sam deploy` locally from `infra/template.yaml`.\n7. Run `scripts/set_webhook.py` to point Telegram at the API Gateway URL with the secret token.\n8. Message the bot with `/start` from each private recipient chat to confirm its `chat_id`, then set `TARGET_CHAT_IDS`.\n\nIf you do not know every recipient `chat_id` before the first deploy, deploy with one known chat ID, register the webhook, ask each recipient to send `/start`, update `TARGET_CHAT_IDS`, and redeploy.\n\nAWS SSM Parameter Store `SecureString` can replace Secrets Manager later if you want the cheapest possible secret storage at this scale.\n\n## Amazon Bedrock Setup\n\nGazpacho can use Claude through Amazon Bedrock if you prefer AWS-native model access.\n\n1. In AWS Console, open **Amazon Bedrock** in the region you plan to use.\n2. Open the model catalog or playground and confirm the model or inference profile ID you want to use.\n3. For local runs, authenticate with AWS credentials that can call Bedrock Runtime.\n4. For Lambda, grant the execution role permission for `bedrock:Converse` and `bedrock:InvokeModel`.\n\nBedrock model availability and IDs vary by region and AWS account. Set `LLM_PROVIDER=bedrock`, then set `LLM_MODEL_SUMMARY` and `LLM_MODEL_QA` to the exact Bedrock model IDs or inference profile IDs available in your account.\n\n## Environment\n\n`SOURCE_CHAT_IDS` accepts comma-separated values or a JSON list. Values can be `@username`, numeric IDs, or invite links that Telethon can resolve.\n\n`TARGET_CHAT_IDS` accepts comma-separated values or a JSON list of private Telegram chat IDs that should receive scheduled digests. The older singular `TARGET_CHAT_ID` env var is still accepted as a compatibility fallback.\n\nRequired local/cloud config:\n\n- `SOURCE_CHAT_IDS`\n- `TARGET_CHAT_IDS`\n- `TIMEZONE`, default `Europe/Madrid`\n- `SOURCE_LANG`, default `es`\n- `OUTPUT_LANG`, default `uk`\n- `LOOKBACK_DAYS`, default `7`; first-run and maximum catch-up window\n- `SCHEDULE_HOUR`, default `16` UTC for 18:00 Europe/Madrid during CEST\n- `SCHEDULE_MINUTE`, default `0`\n- `LLM_PROVIDER`, default `openai`\n- `LLM_MODEL_SUMMARY`\n- `LLM_MODEL_QA`\n- `SECRETS_MANAGER_SECRET_ID`\n- `DYNAMODB_TABLE_NAME`\n- `SCHEDULED_DIGEST_FUNCTION_NAME`\n\n## Development\n\n```bash\npython -m venv .venv\n. .venv/bin/activate\npip install -e \".[dev]\"\npytest\n```\n\nPull requests run the `CI` GitHub Actions workflow, which installs the package on Python 3.12, runs Ruff, compiles Python files, and runs pytest.\n\n## Bot Commands\n\n- `/start` explains the bot and prints the current Telegram `chat_id`.\n- `/summary` resends the latest stored digest.\n- `/refresh` asynchronously invokes the scheduled digest Lambda to read the configured source chats again.\n- Any non-command text message is treated as a question about school updates. The bot answers in Ukrainian using all stored digest summaries, the latest raw message context, and short per-chat conversation history.\n\n## Deployment\n\nThe `Deploy` GitHub Actions workflow deploys the SAM stack manually through `workflow_dispatch`.\n\nRequired GitHub environment variables:\n\n- `AWS_REGION`, default `eu-west-1`\n- `SOURCE_CHAT_IDS`\n- `TARGET_CHAT_IDS`\n- `LOOKBACK_DAYS`, default `7`; used as the first-run and maximum catch-up window. After a successful stored digest, the next scheduled digest reads only messages since that stored run.\n- `SCHEDULE_HOUR`, default `16` UTC for 18:00 Europe/Madrid during CEST\n- `SCHEDULE_MINUTE`, default `0`\n- `LLM_PROVIDER`, default `openai`\n- `LLM_MODEL_SUMMARY`, default `gpt-4.1-mini`\n- `LLM_MODEL_QA`, default `gpt-5-mini`\n- `SECRETS_MANAGER_SECRET_ID`, default `gazpacho/secrets`\n- `DYNAMODB_TABLE_NAME`, default `gazpacho`\n- `SCHEDULED_DIGEST_FUNCTION_NAME`, default `gazpacho-scheduled-digest`\n\nRequired GitHub environment secrets for the current static-key deploy setup:\n\n- `AWS_ACCESS_KEY_ID`\n- `AWS_SECRET_ACCESS_KEY`\n\nIf you later switch to GitHub OIDC, replace the static AWS secrets with an `AWS_ROLE_TO_ASSUME` variable. That role must trust GitHub OIDC for this repository.\n\nThe deploy principal, whether static IAM user or OIDC role, must allow SAM/CloudFormation, ECR, Lambda, EventBridge, DynamoDB, IAM role creation for the stack, and Secrets Manager read permissions for the configured runtime secret.\n\nThe scheduled digest Lambda is deployed as a container image from `infra/Dockerfile`.\nThe Telegram webhook Lambda is deployed as a lean zip package and intentionally does not import Telethon.\n\nAfter a successful deploy, register the webhook with the `WebhookUrl` stack output:\n\n```bash\npython scripts/set_webhook.py \\\n  --profile gazpacho-deploy \\\n  --region eu-west-1 \\\n  --url \"$(aws cloudformation describe-stacks \\\n    --profile gazpacho-deploy \\\n    --region eu-west-1 \\\n    --stack-name gazpacho \\\n    --query 'Stacks[0].Outputs[?OutputKey==`WebhookUrl`].OutputValue' \\\n    --output text)\"\n```\n\n## One-Time Telegram Login\n\nThe Telethon login must run locally because Telegram sends an interactive login code and may require the account's 2FA password. The script uses an in-memory `StringSession`, so it does not create a `.session` file.\n\nProvide `TELEGRAM_API_ID` and `TELEGRAM_API_HASH` in `.env`, or pass them as flags:\n\n```bash\npython scripts/login.py --api-id 123456 --api-hash abcdef123456\n```\n\nThe script prints the `StringSession` after successful login. Store that exact value as `telethon_string_session` in AWS Secrets Manager.\n\nIf Telegram does not send a login code, use QR login from an already logged-in Telegram mobile app:\n\n```bash\npython scripts/login.py --qr\n```\n\nScan the terminal QR code from Telegram mobile using **Settings \u003e Devices \u003e Link Desktop Device**. If the account has 2FA enabled, the script will still ask for the 2FA password after the QR scan.\n\n## Local Reader Smoke Test\n\nAfter generating a `StringSession`, put `TELEGRAM_API_ID`, `TELEGRAM_API_HASH`, `TELETHON_STRING_SESSION`, and `SOURCE_CHAT_IDS` in `.env`, then run:\n\n```bash\npython scripts/read_chats.py\n```\n\nThe script prints one normalized JSON message per line and a final JSON object with `message_count`, `image_count`, and the image download directory.\n\n## Local Scheduled Digest\n\nAfter `.env` has Telegram, target chat, source chat, and LLM settings, run the local end-to-end digest:\n\n```bash\npython scripts/run_scheduled_digest.py\n```\n\nThe deployed schedule is configured independently with `SCHEDULE_HOUR` and `SCHEDULE_MINUTE` and can run daily or at any other EventBridge cron cadence.\n\nWhen storage is enabled, each scheduled digest analyzes only Telegram messages posted since the previous successful stored digest. If there is no previous run, or the previous run is older than `LOOKBACK_DAYS`, it falls back to the configured `LOOKBACK_DAYS` window.\n\nUse `--dry-run` to read chats and summarize without sending the digest to Telegram.\n\nUse `--store` to also write the digest run to DynamoDB:\n\n```bash\npython scripts/run_scheduled_digest.py --store\n```\n\n## Security Notes\n\n- Never commit `.env` or any secret values.\n- Do not log the Telethon string session, bot token, webhook secret, or direct-provider API keys.\n- Do not log full school message bodies at info level.\n- The Q\u0026A Lambda does not import Telethon and does not use MTProto credentials. For stricter least privilege, split runtime secrets into separate reader and webhook secrets so the webhook role cannot read Telegram account credentials at all.\n- With `LLM_PROVIDER=bedrock`, Lambdas use IAM permissions for `bedrock:InvokeModel` and `bedrock:Converse`; no Anthropic API key is needed.\n- With `LLM_PROVIDER=openai`, store `openai_api_key` in Secrets Manager and do not grant Bedrock permissions.\n\n## Implemented Components\n\n- Repo skeleton, pydantic config, secrets loader, README, and `.env.example`.\n- `scripts/login.py`, including QR login mode, for one-time Telethon `StringSession` generation.\n- Reader for configured chats, normalized messages, image downloads, and Telegram flood-wait handling.\n- Summarizer for Ukrainian digests with image inputs when notices are posted as photos.\n- Telegram notifier with 4096-character message splitting.\n- Scheduled digest Lambda, SAM template, container image build, EventBridge schedule, and DynamoDB storage.\n- Webhook handler, Q\u0026A bot, `/start`, `/summary`, `/refresh`, and webhook registration script.\n- Pull request CI and manual GitHub Actions deploy workflow.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaximbilan%2Fgazpacho","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmaximbilan%2Fgazpacho","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaximbilan%2Fgazpacho/lists"}