https://github.com/instagram-automations/instagram-scraper-api

instagram scraper api toolkit
anti-detect api automation captcha cli docker graphql http httpx instagram instagram-scraper-api nodejs playwright proxy puppeteer python rest scraper

Last synced: 8 months ago

Host: GitHub
URL: https://github.com/instagram-automations/instagram-scraper-api
Owner: Instagram-Automations
Created: 2025-10-13T16:19:35.000Z (8 months ago)
Default Branch: main
Last Pushed: 2025-10-13T16:26:45.000Z (8 months ago)
Last Synced: 2025-10-14T19:04:41.812Z (8 months ago)
Topics: anti-detect, api, automation, captcha, cli, docker, graphql, http, httpx, instagram, instagram-scraper-api, nodejs, playwright, proxy, puppeteer, python, rest, scraper
Homepage:
Size: 1.11 MB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 3
Metadata Files:
- Readme: README.md

README

# Instagram Scraper API

A plug-and-play toolkit for building your own **Instagram Scraping API** with rotating proxies, CAPTCHA handling, and rate-limit-friendly strategies. It is aimed at analysts, growth teams, and SaaS builders who need reliable Instagram data pipelines.

For discussion, queries, and freelance work, reach out.

---

## Introduction
> This project provides a ready-to-run **REST API** that scrapes public Instagram data (profiles, posts, reels, comments, hashtags) using headless browsers and/or HTTP clients. It focuses on stability, safety, and scale with rotating proxies, session pools, and optional CAPTCHA solving integration.

### Key Benefits
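1. Saves setup time: the Dockerized stack deploys locally or on a server with a single command.
2. Scales across use cases: browser and HTTP engines sit behind the same REST endpoints.
3. Reduces blocks through proxy rotation, session management, and rate limiting.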

---

## Features

| Feature | Details |
|---|---|
| REST Endpoints | `/v1/profile`, `/v1/posts`, `/v1/post/{shortcode}`, `/v1/hashtag/{tag}`, `/v1/comments/{shortcode}` |
| Dual Engines | **Playwright/Puppeteer** (browser) + **HTTP clients** (requests/httpx) |
| Proxy Rotation | Per-request and sticky sessions; ready for residential and mobile (MNO) proxies |
| CAPTCHA Handling | Pluggable solvers (2Captcha/CapMonster/API hook) |
| Session Management | Cookie jars, device fingerprints, randomized headers/delays |
| Rate Limiting | Token bucket, concurrency caps, backoff & retry |
| Exporters | JSONL (NDJSON) and CSV, plus webhook and Kafka-ready stubs |
| Dockerized | Single-command local or server deploy |
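
The Rate Limiting row names a token bucket with backoff and retry. As a rough illustration of those two ideas (not the project's actual implementation), the sketch below shows a minimal token bucket and an exponential backoff delay with jitter. The class name, function name, and parameters are all hypothetical.

```python
import random
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the bucket can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter: a random delay in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

A caller would check `try_acquire()` before each outgoing request and, after a failed attempt, sleep for `backoff_delay(attempt)` seconds before retrying.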

---

## Use Cases
- Market & competitor analysis (content cadence, engagement rates)
- Influencer discovery (hashtag/topic mining, audience metrics)
- Social listening (keyword/hashtag monitoring, comment sentiment)
- Dataset building for ML/NLP (public captions, comments, metadata)

---

## FAQs

**Q:** What is an Instagram Scraping API?
**A:** It’s a server that exposes endpoints to fetch public Instagram data (profiles, posts, reels, comments, hashtags) by performing automated browsing or HTTP requests under the hood, then returning normalized JSON.
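
To make the endpoint layout concrete, here is a small sketch that builds request URLs from the path templates listed in the Features table. The base URL, the helper name `endpoint_url`, and the `username` query parameter are assumptions for illustration; the README does not document how parameters are passed.

```python
from urllib.parse import quote, urlencode

# Path templates copied from the Features table.
ENDPOINTS = {
    "profile": "/v1/profile",
    "posts": "/v1/posts",
    "post": "/v1/post/{shortcode}",
    "hashtag": "/v1/hashtag/{tag}",
    "comments": "/v1/comments/{shortcode}",
}


def endpoint_url(base_url: str, name: str, path_params=None, query=None) -> str:
    """Build a request URL for one of the documented endpoints (hypothetical helper)."""
    # Percent-encode path values so tags with spaces or slashes stay in one segment.
    encoded = {k: quote(str(v), safe="") for k, v in (path_params or {}).items()}
    url = base_url.rstrip("/") + ENDPOINTS[name].format(**encoded)
    if query:
        url += "?" + urlencode(query)
    return url


# Assumed local deployment and assumed query parameter name.
print(endpoint_url("http://localhost:8000", "profile", query={"username": "example"}))
print(endpoint_url("http://localhost:8000", "post", {"shortcode": "ABC123"}))
```

The resulting URL can then be fetched with any HTTP client, and the JSON body parsed with `json.loads`.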

**Q:** What kind of data can be extracted?
**A:** Public profile metadata (username, bio, followers/following counts), posts/reels (shortcode, captions, media URLs, like/comment counts, timestamps), comments (text, author, time), hashtags (top/recent posts), and lightweight engagement metrics—subject to Instagram’s terms and your jurisdiction’s laws.
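
One way to picture the normalized output is as typed records serialized to JSON Lines, one of the export formats listed under Features. The field names below are derived from the answer above, but the exact schema is an assumption, as are the class and function names.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Profile:
    username: str
    bio: str
    followers: int
    following: int


@dataclass
class Post:
    shortcode: str
    caption: str
    like_count: int
    comment_count: int
    timestamp: str  # assumed to be an ISO 8601 string; the real format is not documented
    media_urls: List[str] = field(default_factory=list)


def to_jsonl(records) -> str:
    """Serialize dataclass records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(asdict(r), ensure_ascii=False) + "\n" for r in records)
```

Writing the returned string to a `.jsonl` file gives a dataset that can be streamed line by line without loading everything into memory.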

**Q:** How do Instagram scraping APIs handle proxies and CAPTCHAs?
**A:** Proxies are rotated per request or per session (sticky) to distribute traffic and reduce blocks. User-agents, headers, and delays are randomized to look human. When CAPTCHAs appear, the API routes the challenge to a solver service (e.g., 2Captcha/CapMonster) via a configurable adapter and retries with the solved token.
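
The difference between per-request and sticky rotation can be sketched as a small selection policy. This is only an illustration of the idea, not the project's code: `ProxyPool`, its methods, and the example proxy addresses are hypothetical.

```python
import itertools
from typing import Dict, Iterator, List


class ProxyPool:
    """Choose a proxy per request (round-robin) or pin one to a session (sticky)."""

    def __init__(self, proxies: List[str]):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle: Iterator[str] = itertools.cycle(proxies)
        self._sticky: Dict[str, str] = {}

    def next_proxy(self) -> str:
        """Per-request rotation: return the next proxy in round-robin order."""
        return next(self._cycle)

    def for_session(self, session_id: str) -> str:
        """Sticky rotation: reuse the same proxy for every request in a session."""
        if session_id not in self._sticky:
            self._sticky[session_id] = self.next_proxy()
        return self._sticky[session_id]

    def release(self, session_id: str) -> None:
        """Forget a session so its next request is assigned a fresh proxy."""
        self._sticky.pop(session_id, None)
```

Sticky assignment keeps cookies and the outgoing address consistent within a session, while per-request rotation spreads independent requests across the pool.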

---

## Results
- 10x faster posting schedules
- 80% engagement increase on group campaigns
- Fully automated lead response system

## Performance Metrics
Average performance benchmarks:
- **Speed:** 2x faster than manual posting
- **Stability:** 99.2% uptime
- **Ban Rate:** <0.5% with safe automation mode
- **Throughput:** 100+ posts/hour per session

---

## Do you have a custom project for us?
Contact Us