https://github.com/instagram-automations/instagram-scraper-api
instagram scraper api toolkit
https://github.com/instagram-automations/instagram-scraper-api
anti-detect api automation captcha cli docker graphql http httpx instagram instagram-scraper-api nodejs playwright proxy puppeteer python rest scraper
Last synced: 6 months ago
JSON representation
instagram scraper api toolkit
- Host: GitHub
- URL: https://github.com/instagram-automations/instagram-scraper-api
- Owner: Instagram-Automations
- Created: 2025-10-13T16:19:35.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-10-13T16:26:45.000Z (6 months ago)
- Last Synced: 2025-10-14T19:04:41.812Z (6 months ago)
- Topics: anti-detect, api, automation, captcha, cli, docker, graphql, http, httpx, instagram, instagram-scraper-api, nodejs, playwright, proxy, puppeteer, python, rest, scraper
- Homepage:
- Size: 1.11 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 3
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# instagram scraper api
A plug-and-play toolkit to build your own **Instagram Scraping API** with rotating proxies, CAPTCHA handling, and rate-limit friendly strategies. Perfect for analysts, growth teams, and SaaS builders who need reliable IG data pipelines.
For discussion, queries, and freelance work — reach out 👆
---
## Introduction
> This project provides a ready-to-run **REST API** that scrapes public Instagram data (profiles, posts, reels, comments, hashtags) using headless browsers and/or HTTP clients. It focuses on stability, safety, and scale with rotating proxies, session pools, and optional CAPTCHA solving integration.
### Key Benefits
1. Saves time and automates setup.
2. Scalable for multiple use cases.
3. Safer with anti-detect and proxy logic.
---
## Features (Table)
| Feature | Details |
|---|---|
| REST Endpoints | `/v1/profile`, `/v1/posts`, `/v1/post/{shortcode}`, `/v1/hashtag/{tag}`, `/v1/comments/{shortcode}` |
| Dual Engines | **Playwright/Puppeteer** (browser) + **HTTP clients** (requests/httpx) |
| Proxy Rotation | Per-request & sticky sessions, residential/MNO proxy ready |
| CAPTCHA Handling | Pluggable solvers (2Captcha/CapMonster/API hook) |
| Session Management | Cookie jars, device fingerprints, randomized headers/delays |
| Rate Limiting | Token bucket, concurrency caps, backoff & retry |
| Exporters | JSONL/CSV/NDJSON + webhooks/Kafka-ready stubs |
| Dockerized | Single-command local or server deploy |
---
## Use Cases
- Market & competitor analysis (content cadence, engagement rates)
- Influencer discovery (hashtag/topic mining, audience metrics)
- Social listening (keyword/hashtag monitoring, comment sentiment)
- Dataset building for ML/NLP (public captions, comments, metadata)
---
## FAQs
**Q:** What is an Instagram Scraping API?
**A:** It’s a server that exposes endpoints to fetch public Instagram data (profiles, posts, reels, comments, hashtags) by performing automated browsing or HTTP requests under the hood, then returning normalized JSON.
**Q:** What kind of data can be extracted?
**A:** Public profile metadata (username, bio, followers/following counts), posts/reels (shortcode, captions, media URLs, like/comment counts, timestamps), comments (text, author, time), hashtags (top/recent posts), and lightweight engagement metrics—subject to Instagram’s terms and your jurisdiction’s laws.
**Q:** How do Instagram scraping APIs handle proxies and CAPTCHAs?
**A:** Proxies are rotated per request or per session (sticky) to distribute traffic and reduce blocks. User-agents, headers, and delays are randomized to look human. When CAPTCHAs appear, the API routes the challenge to a solver service (e.g., 2Captcha/CapMonster) via a configurable adapter and retries with the solved token.
---
## Results
-----------------------------------
> 10x faster posting schedules
> 80% engagement increase on group campaigns
> Fully automated lead response system
## Performance Metrics
-----------------------------------
Average Performance Benchmarks:
- **Speed:** 2x faster than manual posting
- **Stability:** 99.2% uptime
- **Ban Rate:** <0.5% with safe automation mode
- **Throughput:** 100+ posts/hour per session
---
##Do you have a customize project for us ?
Contact Us
---
## Installation
### Pre-requisites
- Node.js or Python
- Git
- Docker (optional)
### Steps
```bash
# Clone the repo
git clone https://github.com/yourusername/instagram-scraper-api.git
cd instagram-scraper-api
# Install dependencies (Node)
npm install
# OR Python
pip install -r requirements.txt
# Setup environment
cp .env.example .env
# Fill in:
# PROXY_URL=http://user:pass@host:port
# CAPTCHA_PROVIDER=2captcha|capmonster|mock
# CAPTCHA_API_KEY=xxxx
# ENGINE=playwright|puppeteer|httpx|requests
# Run (Node)
npm start
# OR Python
python main.py
```
---
## Example Output
**Fetch profile (REST):**
```bash
curl -s "http://localhost:8080/v1/profile?username=instagram" | jq .
```
**Fetch post by shortcode:**
```bash
curl -s "http://localhost:8080/v1/post/CxYZaBC1234" | jq .
```
**Fetch comments with pagination:**
```bash
curl -G "http://localhost:8080/v1/comments/CxYZaBC1234" --data-urlencode "limit=50" | jq .
```
---
## License
MIT License