https://github.com/instagram-automations/scrape-instagram-comments

scrape instagram comments automation toolkit
https://github.com/instagram-automations/scrape-instagram-comments

api automation cli comments docker graphql instagram nodejs playwright proxy puppeteer python rate-limits scrape scrape-instagram-comments selenium

Last synced: 9 months ago
JSON representation

scrape instagram comments automation toolkit

Host: GitHub
URL: https://github.com/instagram-automations/scrape-instagram-comments
Owner: Instagram-Automations
Created: 2025-10-13T19:40:38.000Z (9 months ago)
Default Branch: main
Last Pushed: 2025-10-13T19:50:00.000Z (9 months ago)
Last Synced: 2025-10-14T19:04:42.170Z (9 months ago)
Topics: api, automation, cli, comments, docker, graphql, instagram, nodejs, playwright, proxy, puppeteer, python, rate-limits, scrape, scrape-instagram-comments, selenium
Homepage:
Size: 1.8 MB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 3
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# scrape instagram comments

A ready-to-use toolkit that fetches Instagram post comments (and threaded replies) at scale with pagination, proxy rotation, and safe automation options.

For discussion, queries, and freelance work — reach out 👆

---

## Introduction
> Collect post comments and nested replies for analytics, moderation, and research. Built for makers, growth teams, and data engineers who need reliable pagination, nested-thread traversal, and resilient strategies when endpoints or GraphQL identifiers shift.

### Key Benefits
1. Saves time and automates setup.
2. Scalable for multiple use cases.
3. Safer with anti-detect and proxy logic.

---

## Features must be in table

| Feature | Details |
|---|---|
| Comment Fetching | Pull top-level comments with cursor-based pagination. |
| Nested Replies | Recursively collect threaded replies with parent-child linkage. |
| Proxy & Rotation | Supports proxy lists and rotation to reduce blocks. |
| Dual Stack | Works with official API (where eligible) or headless automation fallback. |
| Rate Control | Backoff, delays, and concurrency caps for stability. |

---

## Use Cases
- Social listening & sentiment analysis
- Creator/brand moderation dashboards
- Research datasets (public comments)
- Lead extraction and keyword monitoring

---

## FAQs

**Q:** How many comments can I fetch?
**A:** Practically, you can paginate through all available comments on a public post, constrained by rate limits, session quality, and proxies. With the official API (for eligible Business/Creator accounts), you’ll page using cursors and reasonable limits per request; with headless/browser automation, use slower concurrency, randomized delays, and rotating proxies. At scale, teams commonly fetch thousands of comments across posts by batching requests and persisting cursors.

**Q:** How do I get nested replies?
**A:** Use recursive traversal. For each top-level comment, request its replies (child thread) and attach `parent_id` references. In official endpoints, request fields that include replies/threads; in headless mode, capture the threaded structure from the post’s comment UI or GraphQL response and walk each child list until no more `next_cursor` is returned.

**Q:** Do Instagram APIs / GraphQL doc_ids change?
**A:** Yes—internal GraphQL `doc_id`/query hashes are not stable and can change. To stay resilient: (1) prefer official APIs when eligible, (2) in scraping mode, discover query identifiers dynamically from network traffic at runtime, and (3) design a fallback that scrolls & renders comments in the UI, extracting from the DOM if a query hash breaks.

---

## Results
-----------------------------------
> 10x faster posting schedules
> 80% engagement increase on group campaigns
> Fully automated lead response system

## Performance Metrics
-----------------------------------
Average Performance Benchmarks:
- **Speed:** 2x faster than manual posting
- **Stability:** 99.2% uptime
- **Ban Rate:** <0.5% with safe automation mode
- **Throughput:** 100+ posts/hour per session

---

##Do you have a customize project for us ?
Contact Us

support@appilot.app

┃

pilot

┃

zee#2655

┃

whatsapp

---

## Installation

### Pre-requisites
- Node.js or Python
- Git
- Docker (optional)

### Steps
```bash
# Clone the repo
git clone https://github.com/yourusername/scrape-instagram-comments.git
cd scrape-instagram-comments

# Install dependencies
npm install
# or
pip install -r requirements.txt

# Setup environment
cp .env.example .env

# Run
npm start
# or
python main.py
```

---

## Example Output

```bash
$ cli fetch --post https://www.instagram.com/p/POST_ID/ --limit 500 --replies
Fetched: 500 comments (including 178 nested replies)
Saved to: outputs/POST_ID_comments.jsonl
```

---

## License

MIT License

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/instagram-automations/scrape-instagram-comments

Awesome Lists containing this project

README