{"id":28042808,"url":"https://github.com/hamodywe/telegram-scraper-TeleGraphite","last_synced_at":"2025-05-11T15:01:33.735Z","repository":{"id":287490480,"uuid":"964909544","full_name":"hamodywe/telegram-scraper-TeleGraphite","owner":"hamodywe","description":"A fast and reliable Telegram channel scraper that fetches posts and exports them to JSON.","archived":false,"fork":false,"pushed_at":"2025-04-12T02:44:07.000Z","size":0,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-12T03:29:34.026Z","etag":null,"topics":["chanels","telegram","telegram-channel-scraper","telegram-json","telegram-scrape-channels","telegram-scraper"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hamodywe.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-12T02:30:35.000Z","updated_at":"2025-04-12T02:44:56.000Z","dependencies_parsed_at":"2025-04-12T03:30:01.589Z","dependency_job_id":"caa6ff02-cc87-4934-84cf-dac463df62cc","html_url":"https://github.com/hamodywe/telegram-scraper-TeleGraphite","commit_stats":null,"previous_names":["hamodywe/telegraphite"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hamodywe%2Ftelegram-scraper-TeleGraphite","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hamodywe%2Ftelegram-scraper-TeleGraphite/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/hamodywe%2Ftelegram-scraper-TeleGraphite/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hamodywe%2Ftelegram-scraper-TeleGraphite/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hamodywe","download_url":"https://codeload.github.com/hamodywe/telegram-scraper-TeleGraphite/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253584490,"owners_count":21931547,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chanels","telegram","telegram-channel-scraper","telegram-json","telegram-scrape-channels","telegram-scraper"],"created_at":"2025-05-11T15:01:06.651Z","updated_at":"2025-05-11T15:01:33.684Z","avatar_url":"https://github.com/hamodywe.png","language":"Python","funding_links":[],"categories":["Recently Updated","Social Media Tools"],"sub_categories":["[May 10, 2025](/content/2025/05/10/README.md)","[↑](#-table-of-contents) Telegram"],"readme":"# TeleGraphite: Telegram Scraper \u0026 JSON Exporter \u0026 Telegram Channels Scraper\n\n\nA tool to fetch and save posts from public Telegram channels.\n![TeleGraphite Screenshot](logo.png)\n\n## Features\n\n- Fetch posts from multiple Telegram channels\n- Save posts as JSON files (with contact exports: emails, phone numbers, links)\n- Download and save media files (photos, documents, videos)\n- Deduplicate posts to avoid saving the same content twice\n- Run once or continuously with a specified interval\n- Filter posts by keywords or content type (text-only, media-only)\n- Schedule fetching at specific days 
and times\n\n## Installation\n\n### From Source\n\n```bash\n# Clone the repository\ngit clone https://github.com/hamodywe/telegraphite.git\ncd telegraphite\n\n# Install the package\npip install -e .\n```\n\n### Using pip\n\n```bash\npip install telegraphite\n```\n\n## Setup\n\n1. Create a Telegram API application:\n   - Go to https://my.telegram.org/\n   - Log in with your phone number\n   - Go to 'API development tools'\n   - Create a new application\n   - Note your API ID and API Hash\n\n2. Create a `.env` file in your project directory with the following content:\n\n```\nAPI_ID=your_api_id\nAPI_HASH=your_api_hash\n```\n\n3. Create a `channels.txt` file with one channel username per line:\n\n```\n@channel1\n@channel2\nchannel3\n```\n\n## Usage\n\n### Command Line Interface\n\nTeleGraphite provides a command-line interface for fetching posts:\n\n```bash\n# Fetch posts once and exit\ntelegraphite once\n\n# Fetch posts continuously with a 1-hour interval\ntelegraphite continuous --interval 3600\n```\n\n### Options\n\n```\n-c, --channels-file  Path to file containing channel usernames (default: channels.txt)\n-d, --data-dir       Directory to store posts and media (default: data)\n-e, --env-file       Path to .env file with API credentials (default: .env)\n-l, --limit          Maximum number of posts to fetch per channel (default: 10)\n-v, --verbose        Enable verbose logging\n-i, --interval       Interval between fetches in seconds (default: 3600, only for continuous mode)\n--config             Path to YAML configuration file\n\n# Filter options\n--keywords           Filter posts containing specific keywords\n--media-only         Only fetch posts containing media (photos, documents)\n--text-only          Only fetch posts containing text\n\n# Schedule options\n--days               Days of the week to run the fetcher (monday, tuesday, etc.)\n--times              Times of day to run the fetcher in HH:MM format\n```\n\n### Configuration File\n\nYou can also use a 
YAML configuration file to specify options:\n\n```yaml\n# Directory to store posts and media\ndata_dir: data\n\n# Path to file containing channel usernames\nchannels_file: channels.txt\n\n# Maximum number of posts to fetch per channel\nlimit: 10\n\n# Interval between fetches in seconds (for continuous mode)\ninterval: 3600\n\n# Filters for posts\nfilters:\n  # Keywords to filter posts (only fetch posts containing these keywords)\n  keywords:\n    - important\n    - announcement\n  # Only fetch posts containing media (photos, documents)\n  media_only: false\n  # Only fetch posts containing text\n  text_only: false\n\n# Schedule for fetching posts (for continuous mode)\nschedule:\n  # Days of the week to run the fetcher\n  days:\n    - monday\n    - wednesday\n    - friday\n  # Times of day to run the fetcher (HH:MM format)\n  times:\n    - \"09:00\"\n    - \"18:00\"\n```\n\nTo use a configuration file:\n\n```bash\ntelegraphite --config config.yaml once\n```\n\nCommand-line arguments will override settings in the configuration file.\n\n### Examples\n\n```bash\n# Fetch 20 posts from each channel and save to custom directory\ntelegraphite once --limit 20 --data-dir custom_data\n\n# Use custom channels file and environment file\ntelegraphite once --channels-file my_channels.txt --env-file my_env.env\n\n# Run continuously with 30-minute interval and verbose logging\ntelegraphite continuous --interval 1800 --verbose\n\n# Fetch only posts containing specific keywords\ntelegraphite once --keywords announcement important news\n\n# Fetch only posts containing media\ntelegraphite once --media-only\n\n# Run continuously on specific days and times\ntelegraphite continuous --days monday wednesday friday --times 09:00 18:00\n\n# Combine filters and scheduling\ntelegraphite continuous --keywords important --media-only --days monday friday --times 12:00\n```\n\n## Data Structure\n\nPosts and media are saved in the following structure:\n\n```\ndata/\n  channel1/\n    posts.json\n    
media/\n      20230101_123456_123.jpg\n      20230101_123456_124.pdf\n  channel2/\n    posts.json\n    media/\n      ...\n```\n\nEach `posts.json` file contains an array of post objects with the following structure:\n\n```json\n[\n  {\n    \"channel\": \"channel1\",\n    \"post_id\": 123,\n    \"date\": \"2023-01-01T12:34:56Z\",\n    \"text\": \"Post content\",\n    \"images\": [\"media/20230101_123456_123.jpg\"]\n  },\n  ...\n]\n```\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhamodywe%2Ftelegram-scraper-TeleGraphite","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhamodywe%2Ftelegram-scraper-TeleGraphite","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhamodywe%2Ftelegram-scraper-TeleGraphite/lists"}