{"id":31980105,"url":"https://github.com/instagram-automations/instagram-scraper-github","last_synced_at":"2025-10-14T23:27:12.512Z","repository":{"id":318739660,"uuid":"1075641598","full_name":"Instagram-Automations/instagram-scraper-github","owner":"Instagram-Automations","description":"instagram scraper github automation toolkit","archived":false,"fork":false,"pushed_at":"2025-10-13T19:37:05.000Z","size":1983,"stargazers_count":0,"open_issues_count":3,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-14T19:04:41.952Z","etag":null,"topics":["anti-detect","api","automation","cli","docker","github","instagram","instagram-scraper-github","nodejs","playwright","proxy","puppeteer","python","rate-limits","rotating-proxies","scarper","selenium"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Instagram-Automations.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-13T19:27:49.000Z","updated_at":"2025-10-13T19:37:08.000Z","dependencies_parsed_at":"2025-10-14T19:14:50.462Z","dependency_job_id":null,"html_url":"https://github.com/Instagram-Automations/instagram-scraper-github","commit_stats":null,"previous_names":["instagram-automations/instagram-scraper-github"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/Instagram-Automations/instagram-scraper-github","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories/Instagram-Automations%2Finstagram-scraper-github","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Instagram-Automations%2Finstagram-scraper-github/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Instagram-Automations%2Finstagram-scraper-github/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Instagram-Automations%2Finstagram-scraper-github/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Instagram-Automations","download_url":"https://codeload.github.com/Instagram-Automations/instagram-scraper-github/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Instagram-Automations%2Finstagram-scraper-github/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279024775,"owners_count":26087830,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-14T02:00:06.444Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anti-detect","api","automation","cli","docker","github","instagram","instagram-scraper-github","nodejs","playwright","proxy","puppeteer","python","rate-limits","rotating-proxies","scarper","selenium"],"created_at":"2025-10-14T23:27:11.571Z","updated_at":"2025-10-14T23:27:12.506Z","avatar_url":"https://github.com/Instagram-Automations.png","language":null
,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# instagram scraper github\n\nA production-ready boilerplate to build, test, and ship an Instagram scraping pipeline from a GitHub repository. It focuses on resiliency against UI/API changes, proxy hygiene, and safe scaling.\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://t.me/devpilot1\" target=\"_blank\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Chat%20on-Telegram-2CA5E0?style=for-the-badge\u0026logo=telegram\u0026logoColor=white\" alt=\"Telegram\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://discord.gg/vBu9huKBvy\" target=\"_blank\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Join-Discord-5865F2?style=for-the-badge\u0026logo=discord\u0026logoColor=white\" alt=\"Discord\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://wa.me/447723343390?text=Hi%20Zeeshan%2C%20I%27m%20interested%20in%20automation.\" target=\"_blank\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Chat-WhatsApp-25D366?style=for-the-badge\u0026logo=whatsapp\u0026logoColor=white\" alt=\"WhatsApp\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"mailto:support@appilot.app\" target=\"_blank\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Email-support@appilot.app-EA4335?style=for-the-badge\u0026logo=gmail\u0026logoColor=white\" alt=\"Gmail\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cstrong\u003eFor discussion, queries, and freelance work — reach out 👆\u003c/strong\u003e\n\u003c/p\u003e\n\n---\n\n##  Introduction\n\u003e This repository is a robust template for building an Instagram scraper that you can deploy from GitHub to containers or serverless runners. It handles login, pagination, data extraction, retries, and storage pipelines with proxy rotation and anti-detect best practices. 
Ideal for growth teams, data engineers, and researchers.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"instagram-scraper-github.png\" alt=\"instagram-scraper-github.png\" width=\"90%\"\u003e\n\u003c/p\u003e\n\n###  Key Benefits\n1. Saves time and automates setup.  \n2. Scalable for multiple use cases.  \n3. Safer with anti-detect and proxy logic.  \n\n---\n\n## Features (Table)\n\n| Feature | What it does |\n|---|---|\n| Headless browser layer | Playwright/Puppeteer/Selenium adapters with stealth plugin |\n| Resilient selectors | CSS/XPath fallback + semantic locators to withstand UI shifts |\n| Proxy \u0026 session pool | Rotating residential/mobile proxies, per-session cookies/fingerprints |\n| Rate-limit guard | Token bucket throttling, jittered delays, backoff \u0026 circuit breaker |\n| Pluggable storage | Write to JSON/CSV, SQLite/Postgres, S3/GCS, or Webhooks |\n| Config via .env | Centralized runtime toggles, credentials, and feature flags |\n| Structured logs | JSON logs + request/response tracing for observability |\n| Dockerized runner | One-command local runs and reproducible CI builds |\n\n---\n\n##  Use Cases\n- Competitor monitoring (hashtags, mentions, profiles)  \n- UGC/review collection for sentiment analysis  \n- Influencer discovery and campaign tracking  \n- Academic research \u0026 trend analysis  \n\n---\n\n##  FAQs\n\n**Q:** What happens if the scraper breaks because of Instagram changes?  \n**A:** The boilerplate includes selector fallbacks, semantic locators, and a rules-based parser. When a DOM change happens, the retry layer captures failures, snapshots the HTML, and writes a “break report” to the logs. You can then adjust locators in one place (`/scraper/selectors.*`) without touching business logic. CI smoke tests validate critical paths so breaks are caught early.\n\n**Q:** Can I deploy the scraper in production and scale it?  \n**A:** Yes. Use the included Dockerfile and `docker-compose.yml` for horizontal workers. 
Scale with a queue (Redis/RQ, BullMQ, or Celery) and run N workers per proxy pool. Add a scheduler (GitHub Actions, Cron, or Argo Workflows) and centralize storage (Postgres/S3). The rate-limit guard and session pools keep concurrency safe.\n\n**Q:** What tools or libraries are commonly used for Instagram scraping?  \n**A:** Headless browsers (Playwright, Puppeteer, Selenium), stealth plugins, proxy managers (residential/mobile), HTML parsers (Cheerio/BeautifulSoup), request tooling (Axios/Requests), queues (BullMQ/Celery), and datastores (SQLite/Postgres/S3). This repo shows reference adapters so you can swap stacks easily.\n\n---\n\n## Results\n----------------------------------- \n\u003e 10x faster posting schedules  \n\u003e 80% engagement increase on group campaigns  \n\u003e Fully automated lead response system  \n\n##  Performance Metrics\n-----------------------------------\nAverage Performance Benchmarks:  \n- **Speed:** 2x faster than manual posting  \n- **Stability:** 99.2% uptime  \n- **Ban Rate:** \u003c0.5% with safe automation mode  \n- **Throughput:** 100+ posts/hour per session\n\n---\n\n## Do you have a custom project for us?\nContact Us\n\n\u003cdiv align=\"center\"\u003e\n  \u003ca href=\"https://mail.google.com/mail/u/?authuser=ahmadzee26@gmail.com\"\u003e\n    \u003cimg alt=\"Gmail\" width=\"30px\" src=\"https://edent.github.io/SuperTinyIcons/images/svg/gmail.svg\" /\u003e\n    \u003ccode\u003esupport@appilot.app\u003c/code\u003e\n  \u003c/a\u003e\n  \u003cspan\u003e ┃ \u003c/span\u003e\n  \u003ca href=\"https://t.me/devpilot1\"\u003e\n    \u003cimg alt=\"Telegram\" width=\"30px\" src=\"https://edent.github.io/SuperTinyIcons/images/svg/telegram.svg\" /\u003e\n    \u003ccode\u003epilot\u003c/code\u003e\n  \u003c/a\u003e\n  \u003cspan\u003e ┃ \u003c/span\u003e\n  \u003ca href=\"https://discord.com\"\u003e\n    \u003cimg alt=\"Discord\" width=\"30px\" 
src=\"https://github.com/Zeeshanahmad4/RealEstateMate-WhatsApp-Group-Management-Bot/blob/main/discord-icon-svgrepo-com.svg\" /\u003e\n    \u003ccode\u003ezee#2655\u003c/code\u003e\n  \u003c/a\u003e\n  \u003cspan\u003e ┃ \u003c/span\u003e\n  \u003ca href=\"https://wa.me/447723343390?text=Hi%20Zeeshan%2C%20I%27m%20interested%20in%20automation.\" target=\"_blank\"\u003e\n    \u003cimg alt=\"WhatsApp\" width=\"30px\" src=\"https://cdn.jsdelivr.net/npm/simple-icons@v11/icons/whatsapp.svg\" /\u003e\n    \u003ccode\u003ewhatsapp\u003c/code\u003e\n  \u003c/a\u003e\n  \u003cbr /\u003e\n\u003c/div\u003e\n\n---\n\n##  Installation\n\n###  Pre-requisites\n- Node.js or Python  \n- Git  \n- Docker (optional)  \n\n###  Steps\n```bash\n# Clone the repo\ngit clone https://github.com/yourusername/instagram-scraper-github.git\ncd instagram-scraper-github\n\n# Install dependencies\nnpm install\n# or\npip install -r requirements.txt\n\n# Setup environment\ncp .env.example .env\n\n# Run\nnpm start\n# or\npython main.py\n```\n\n---\n\n##  Example Output\n\n```bash\n$ npm start -- --hashtag \"fitness\" --limit 50 --out data/fitness.json\n# =\u003e scrapes recent posts for #fitness with safe delays and saves JSON\n\n$ python main.py --profile zeeshanahmad --out data/profile.csv\n# =\u003e collects profile metadata, posts, and basic engagement stats\n```\n\n---\n\n##  License\n\nMIT License\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finstagram-automations%2Finstagram-scraper-github","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finstagram-automations%2Finstagram-scraper-github","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finstagram-automations%2Finstagram-scraper-github/lists"}