https://github.com/tanosshi/reddit-scraper

Scrape any type of reddit content
api image javascript js media nextjs nodejs reddit scrape scraper video

# reddit-scraper

_@tanosshi/reddit-scraper is a quick and simple Reddit scraper that keeps your own code free of bloat._

## 🚀 How to use (quickly)

### 📦 Installation

```bash
npm install @tanosshi/reddit-scraper
```

> **💡 Info:** Node.js v18 or higher is highly recommended!

### 🛠️ Usage

#### In a file

```js
const { scrape, download } = require("@tanosshi/reddit-scraper");

// `await` needs an async context in CommonJS, so wrap the calls in an async IIFE.
(async () => {
  // Only download the image file, if present.
  const imagePath = await download("https://reddit.com/r/.../comments/...", {
    outDir: "out",
    userAgent: "...",
  });

  // Scrape post content
  const res = await scrape("https://reddit.com/r/.../comments/...", {
    outDir: "out",
    download: true, // true | false
    userAgent: "...", // custom user agent string, or leave empty
    mode: "all", // 'video' | 'image' | 'text' | 'full_media' | 'comments' | 'all'
  });

  // res = { title, selftext?, imageUrl?, imagePath?, textPath?, commentsPath? }
})();
```

#### CLI

```bash
# Basic
npx reddit-scraper

# Output to directory
npx reddit-scraper --out './out/'

# Modes (default is --image)
npx reddit-scraper --text
npx reddit-scraper --full-media
npx reddit-scraper --comments
npx reddit-scraper --all

# Help
npx reddit-scraper --help
```

### Outputs

Console: `{"title":"...","imagePath":"out/img.jpg","textPath":"out/post.txt","commentsPath":"out/post.comments.txt"}`
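Since the console output is a single JSON object, a calling script can parse it to find the files that were written. The sketch below is only an illustration: `listOutputFiles` is a hypothetical helper, not part of the package, and it assumes every key ending in `Path` holds a file path, as in the sample line above.

```javascript
// Hypothetical helper (not part of @tanosshi/reddit-scraper):
// collect every "*Path" value from one line of console output.
function listOutputFiles(consoleLine) {
  const result = JSON.parse(consoleLine);
  return Object.entries(result)
    // Assumption: keys ending in "Path" are file paths written to disk.
    .filter(([key, value]) => key.endsWith("Path") && typeof value === "string")
    .map(([, value]) => value);
}

// Sample line copied from the console output shown above.
const sampleLine =
  '{"title":"...","imagePath":"out/img.jpg","textPath":"out/post.txt","commentsPath":"out/post.comments.txt"}';

console.log(listOutputFiles(sampleLine));
// → ["out/img.jpg", "out/post.txt", "out/post.comments.txt"]
```

Keys such as `title` are ignored, so the helper keeps working if a mode omits some of the optional paths.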

Text file structure:

```text
Title: <title>
Author: u/<author>
Subreddit: r/<subreddit>
URL: https://www.reddit.com/...

```

Comment files contain the full comment threads.
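To read the post text file back into a program, the header lines can be split on the first colon. This is a rough sketch under assumptions: `parsePostHeader` is a hypothetical helper, the header is assumed to end at the first blank line, and the sample values (title, author, subreddit, URL) are made up for illustration.

```javascript
// Hypothetical helper (not part of @tanosshi/reddit-scraper):
// parse the "Key: value" header lines of a post text file.
function parsePostHeader(text) {
  const header = {};
  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === "") break; // assumption: a blank line ends the header
    const idx = line.indexOf(":"); // split on the first colon only, so URLs stay intact
    if (idx === -1) continue;
    const key = line.slice(0, idx).trim().toLowerCase();
    header[key] = line.slice(idx + 1).trim();
  }
  return header;
}

// Made-up sample content following the layout above.
const samplePost = [
  "Title: Example post",
  "Author: u/example_user",
  "Subreddit: r/example",
  "URL: https://www.reddit.com/r/example/comments/abc123/",
  "",
  "Body text goes here.",
].join("\n");

const parsedHeader = parsePostHeader(samplePost);
console.log(parsedHeader.title, parsedHeader.author, parsedHeader.subreddit);
```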

---

### 🤔 Options

#### Output Directory

- `outDir`: Output folder path (created if missing)
- `download`: Whether to download media files at all (only applies to `scrape()`)
- `userAgent`: Custom user agent to use in case the default one is flagged

#### Modes

- `video`: Video only
- `image`: Image only
- `text`: Text only
- `full_media`: Image + Text
- `comments`: Comments only
- `all`: Image + Text + Comments
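The mode list above can be written down as a small lookup table, which is handy for validating a mode string before passing it to `scrape()`. This is a sketch only: `MODE_OUTPUTS` and `outputsForMode` are hypothetical names, and the table simply mirrors the list above rather than the package's internals.

```javascript
// Hypothetical lookup table mirroring the documented modes.
const MODE_OUTPUTS = {
  video: ["video"],
  image: ["image"],
  text: ["text"],
  full_media: ["image", "text"],
  comments: ["comments"],
  all: ["image", "text", "comments"],
};

// Return the kinds of content a mode is documented to produce,
// or throw if the mode string is not one of the documented values.
function outputsForMode(mode) {
  if (!Object.prototype.hasOwnProperty.call(MODE_OUTPUTS, mode)) {
    throw new RangeError(`Unknown mode: ${mode}`);
  }
  return [...MODE_OUTPUTS[mode]]; // copy so callers cannot mutate the table
}

console.log(outputsForMode("full_media")); // → ["image", "text"]
```

Checking the mode up front gives a clear error for typos such as `full-media`, which is the CLI flag spelling rather than the option value.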

_made with ❤️ by tanos_