https://github.com/code3743/social-scraper
Social Scraper is a Node.js console application that leverages Playwright to scrape data from various social media platforms.
nodejs playwright scraping social-media web-scraping
- Host: GitHub
- URL: https://github.com/code3743/social-scraper
- Owner: code3743
- License: MIT
- Created: 2024-09-08T16:54:22.000Z
- Default Branch: main
- Last Pushed: 2024-09-10T20:14:36.000Z
- Last Synced: 2025-04-05T11:42:09.429Z
- Topics: nodejs, playwright, scraping, social-media, web-scraping
- Language: JavaScript
- Homepage:
- Size: 27.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Social Scraper
## Description
**Social Scraper** is a console tool developed in **Node.js** that uses **Playwright** to perform web scraping on various social media platforms.
Currently, **Social Scraper** supports the following providers:
1. **X (Twitter)**
2. **Instagram**
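The provider list above suggests a lookup from platform name to scraper. The following is a minimal sketch of such a registry, not the project's actual code: `registerProvider`, `getProvider`, and the `scrape()` method are hypothetical names chosen for illustration, and the stub providers return empty arrays instead of driving Playwright.

```javascript
// Hypothetical provider registry; names and shapes are illustrative
// assumptions, not taken from the project's source.
const providers = new Map();

// Register a provider under a case-insensitive key.
function registerProvider(name, provider) {
  if (typeof provider.scrape !== "function") {
    throw new TypeError(`Provider "${name}" must expose a scrape() function`);
  }
  providers.set(name.toLowerCase(), provider);
}

// Look up a provider by name, failing loudly for unsupported platforms.
function getProvider(name) {
  const provider = providers.get(name.toLowerCase());
  if (!provider) {
    throw new Error(`Unsupported provider: ${name}`);
  }
  return provider;
}

// Stub providers standing in for real Playwright-based scrapers.
registerProvider("x", { scrape: async () => [] });
registerProvider("instagram", { scrape: async () => [] });
```

With a layout like this, supporting another platform would only require one more `registerProvider` call; the rest of the tool would keep calling `getProvider(name).scrape()`.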
## Features
- **Multi-Platform Support**: Each supported social media platform is handled by its own provider.
- **Structured Storage**: Saves the results in JSON files, organized by provider name and date.
- **Session Management**: Handles active sessions for efficient scraping.
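The README does not say how sessions are kept alive. One common approach with Playwright is to save a browser context's storage state (cookies and local storage) to a file and load it on the next run. The sketch below assumes that approach; `openContext`, `saveSession`, and `statePath` are hypothetical names, and `browser` and `context` are expected to be Playwright `Browser` and `BrowserContext` objects created elsewhere.

```javascript
const fs = require("node:fs");

// Hypothetical helpers, not taken from the project's source.

// Open a context, restoring a saved session from statePath if the file exists.
async function openContext(browser, statePath) {
  const options = fs.existsSync(statePath) ? { storageState: statePath } : {};
  return browser.newContext(options);
}

// Persist the current session so a later run can skip the login flow.
async function saveSession(context, statePath) {
  await context.storageState({ path: statePath });
}
```

Passing `browser` in as an argument keeps the helpers independent of which browser engine was launched.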
## Installation
### Prerequisites
- **Node.js** (version 18 or higher)
- **npm**
### Installation Steps
1. **Clone the Repository**
```bash
git clone https://github.com/code3743/social-scraper.git
```
2. **Navigate to the Project Directory**
```bash
cd social-scraper
```
3. **Install Dependencies**
```bash
npm install
```
## Usage
**Social Scraper** is a console-based tool. To run it, use the following command:
```bash
node app.js
```
### Results
The results are stored in the `/results` folder as JSON files in the format `providerName-date.json`. Each file contains an array of posts with the following structure:
- **id**: The post's identifier.
- **content**: The textual content of the post.
- **media**: An array of URLs for associated media.
- **metadata**: An object containing additional relevant information.
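The file naming and post structure described above can be sketched as a few small helpers for anyone consuming the output. This is an illustration only: `resultFileName`, `isPost`, and `loadPosts` are hypothetical names, the `YYYY-MM-DD` date format is an assumption (the README does not specify how the date is written), and the type of `id` is left unchecked because it is not documented.

```javascript
const fs = require("node:fs");

// Build a file name in the documented `providerName-date.json` pattern.
// Assumption: the date is written as YYYY-MM-DD (UTC).
function resultFileName(providerName, date = new Date()) {
  const day = date.toISOString().slice(0, 10);
  return `${providerName}-${day}.json`;
}

// Check that a value carries the four documented fields.
function isPost(value) {
  return (
    value !== null &&
    typeof value === "object" &&
    "id" in value &&
    typeof value.content === "string" &&
    Array.isArray(value.media) &&
    value.media.every((url) => typeof url === "string") &&
    value.metadata !== null &&
    typeof value.metadata === "object" &&
    !Array.isArray(value.metadata)
  );
}

// Load a results file and keep only the well-formed posts.
function loadPosts(filePath) {
  const parsed = JSON.parse(fs.readFileSync(filePath, "utf8"));
  return Array.isArray(parsed) ? parsed.filter(isPost) : [];
}
```

Filtering with `isPost` keeps a downstream script from crashing on a partially written or hand-edited file.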
## Contribution
If you would like to contribute to **Social Scraper**, please follow these steps:
1. **Fork the Repository**
2. **Create a Branch for Your Feature or Bug Fix**
```bash
git checkout -b feature/new-feature
```
3. **Make Your Changes and Commit Them**
```bash
git add .
git commit -m "Description of changes"
```
4. **Push to Your Branch**
```bash
git push origin feature/new-feature
```
5. **Open a Pull Request**
## License
This project is licensed under the **MIT License**. See the [LICENSE](LICENSE) file for details.