Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/lamaparbat/insta-scraper
Instagram Tagged Post Scraper
https://github.com/lamaparbat/insta-scraper
docker express-joi-validation expressjs joi puppeteer render render-hosting typescript
Last synced: 24 days ago
JSON representation
Instagram Tagged Post Scraper
- Host: GitHub
- URL: https://github.com/lamaparbat/insta-scraper
- Owner: lamaparbat
- Created: 2024-05-05T03:51:30.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-05-19T04:50:10.000Z (7 months ago)
- Last Synced: 2024-08-24T13:58:53.027Z (4 months ago)
- Topics: docker, express-joi-validation, expressjs, joi, puppeteer, render, render-hosting, typescript
- Language: TypeScript
- Homepage:
- Size: 11.6 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Instagram Tagged Post Scraper
This Instagram Tagged Post Scraper is a powerful tool built with Express, Node.js, and TypeScript, making it efficient and easy to use. Leveraging the headless scraping capabilities of Puppeteer and DOM selectors, this scraper extracts valuable data from Instagram's tagged posts.
## Features
- **Efficient Scraping:** Utilizes Puppeteer for efficient headless scraping.
- **Flexible API:** Allows users to specify the target Instagram user ID as an API parameter.
- **Custom Media Server:** Allows users to render insta tagged photos from our customer server instead of instagram server (Due to restriction and auth issues).
- **Optional MongoDB Integration:** Users can optionally specify a MongoDB URL in .env of project to store fetched data directly into a MongoDB database.## API Usage
### Request
`GET /insta/tags?instaId=lamaDev`: Get instagram tagged photos by Instagram User Id.
`GET /insta/media/:filename`: Get/Render single instagram tagged media by Filename.
### Parameters
- `instaId`: The target Instagram user ID whose tagged posts you want to scrape.
### Response
The response will be an array of objects, each containing the URL and link of the scraped images:
```json
[
{ "url": "https://instagram.com/acaks..", "link": "post_link_1" },
{ "url": "https://instagram.com/acaks..", "link": "post_link_2" },
...
]