https://github.com/instagram-automations/scrape-instagram-comments
scrape instagram comments automation toolkit
https://github.com/instagram-automations/scrape-instagram-comments
api automation cli comments docker graphql instagram nodejs playwright proxy puppeteer python rate-limits scrape scrape-instagram-comments selenium
Last synced: 8 months ago
JSON representation
scrape instagram comments automation toolkit
- Host: GitHub
- URL: https://github.com/instagram-automations/scrape-instagram-comments
- Owner: Instagram-Automations
- Created: 2025-10-13T19:40:38.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-10-13T19:50:00.000Z (8 months ago)
- Last Synced: 2025-10-14T19:04:42.170Z (8 months ago)
- Topics: api, automation, cli, comments, docker, graphql, instagram, nodejs, playwright, proxy, puppeteer, python, rate-limits, scrape, scrape-instagram-comments, selenium
- Homepage:
- Size: 1.8 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 3
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# scrape instagram comments
A ready-to-use toolkit that fetches Instagram post comments (and threaded replies) at scale with pagination, proxy rotation, and safe automation options.
For discussion, queries, and freelance work — reach out 👆
---
## Introduction
> Collect post comments and nested replies for analytics, moderation, and research. Built for makers, growth teams, and data engineers who need reliable pagination, nested-thread traversal, and resilient strategies when endpoints or GraphQL identifiers shift.
### Key Benefits
1. Saves time and automates setup.
2. Scalable for multiple use cases.
3. Safer with anti-detect and proxy logic.
---
## Features must be in table
| Feature | Details |
|---|---|
| Comment Fetching | Pull top-level comments with cursor-based pagination. |
| Nested Replies | Recursively collect threaded replies with parent-child linkage. |
| Proxy & Rotation | Supports proxy lists and rotation to reduce blocks. |
| Dual Stack | Works with official API (where eligible) or headless automation fallback. |
| Rate Control | Backoff, delays, and concurrency caps for stability. |
---
## Use Cases
- Social listening & sentiment analysis
- Creator/brand moderation dashboards
- Research datasets (public comments)
- Lead extraction and keyword monitoring
---
## FAQs
**Q:** How many comments can I fetch?
**A:** Practically, you can paginate through all available comments on a public post, constrained by rate limits, session quality, and proxies. With the official API (for eligible Business/Creator accounts), you’ll page using cursors and reasonable limits per request; with headless/browser automation, use slower concurrency, randomized delays, and rotating proxies. At scale, teams commonly fetch thousands of comments across posts by batching requests and persisting cursors.
**Q:** How do I get nested replies?
**A:** Use recursive traversal. For each top-level comment, request its replies (child thread) and attach `parent_id` references. In official endpoints, request fields that include replies/threads; in headless mode, capture the threaded structure from the post’s comment UI or GraphQL response and walk each child list until no more `next_cursor` is returned.
**Q:** Do Instagram APIs / GraphQL doc_ids change?
**A:** Yes—internal GraphQL `doc_id`/query hashes are not stable and can change. To stay resilient: (1) prefer official APIs when eligible, (2) in scraping mode, discover query identifiers dynamically from network traffic at runtime, and (3) design a fallback that scrolls & renders comments in the UI, extracting from the DOM if a query hash breaks.
---
## Results
-----------------------------------
> 10x faster posting schedules
> 80% engagement increase on group campaigns
> Fully automated lead response system
## Performance Metrics
-----------------------------------
Average Performance Benchmarks:
- **Speed:** 2x faster than manual posting
- **Stability:** 99.2% uptime
- **Ban Rate:** <0.5% with safe automation mode
- **Throughput:** 100+ posts/hour per session
---
##Do you have a customize project for us ?
Contact Us
---
## Installation
### Pre-requisites
- Node.js or Python
- Git
- Docker (optional)
### Steps
```bash
# Clone the repo
git clone https://github.com/yourusername/scrape-instagram-comments.git
cd scrape-instagram-comments
# Install dependencies
npm install
# or
pip install -r requirements.txt
# Setup environment
cp .env.example .env
# Run
npm start
# or
python main.py
```
---
## Example Output
```bash
$ cli fetch --post https://www.instagram.com/p/POST_ID/ --limit 500 --replies
Fetched: 500 comments (including 178 nested replies)
Saved to: outputs/POST_ID_comments.jsonl
```
---
## License
MIT License