https://github.com/salman0ansari/sitefetch

Fetch a site and extract its readable content as Markdown (to be used with AI models).

ai chatgpt crawler fetcher golang scraping

Host: GitHub
URL: https://github.com/salman0ansari/sitefetch
Owner: salman0ansari
License: unlicense
Created: 2025-02-04T12:43:19.000Z (5 months ago)
Default Branch: main
Last Pushed: 2025-02-06T08:15:31.000Z (5 months ago)
Last Synced: 2025-03-25T00:44:58.043Z (4 months ago)
Topics: ai, chatgpt, crawler, fetcher, golang, scraping
Language: Go
Homepage:
Size: 30.3 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# sitefetch

Fetch a site and extract its readable content as Markdown (to be used with AI models).

![image](https://github.com/user-attachments/assets/bdd90bfe-ed4e-445f-8121-b8284128e0c4)

## Install

Install globally with `go install`:

```bash
go install github.com/salman0ansari/sitefetch@latest
```

## Usage

![image](https://github.com/user-attachments/assets/4391566d-bb82-4089-a3d8-0ae3b3db46ae)

```bash
sitefetch https://hisalman.in --outfile site.txt

# or fetch more pages in parallel with higher concurrency
sitefetch https://hisalman.in --outfile site.txt --concurrency 10
```
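To illustrate what a `--concurrency` limit typically means, here is a minimal sketch of bounded parallel fetching in Go. It is not taken from the sitefetch source: `fetchAll` and the stand-in `fake` fetcher are hypothetical names, and the real tool may schedule its requests differently.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll is a hypothetical helper: it calls fetch for every URL with at
// most `limit` calls in flight and returns results in the order of urls.
func fetchAll(urls []string, limit int, fetch func(string) string) []string {
	if limit < 1 {
		limit = 1 // a zero-capacity semaphore would block forever
	}
	results := make([]string, len(urls))
	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore
	var wg sync.WaitGroup
	for i, u := range urls {
		wg.Add(1)
		sem <- struct{}{} // blocks while `limit` fetches are already running
		go func(i int, u string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done
			results[i] = fetch(u)    // each goroutine writes its own index
		}(i, u)
	}
	wg.Wait()
	return results
}

func main() {
	urls := []string{
		"https://example.com/a",
		"https://example.com/b",
		"https://example.com/c",
	}
	// Stand-in for a real HTTP request so the sketch runs offline.
	fake := func(u string) string { return "fetched " + u }
	for _, r := range fetchAll(urls, 2, fake) {
		fmt.Println(r)
	}
}
```

A buffered channel is used as the semaphore here because it keeps the sketch within the standard library; a worker pool reading from a job channel would be an equally reasonable design.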

### Match specific pages

Use the `--match` flag to specify the pages you want to fetch:

```bash
sitefetch https://vite.dev --match "/blog/**,/guide/**"
```
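As a rough illustration of how comma-separated patterns like the ones above could be applied to URL paths, the sketch below converts each glob into a regular expression. The semantics are an assumption (`**` matches across `/`, a single `*` stays within one path segment), and `globToRegexp` and `matchAny` are hypothetical helpers, not functions from sitefetch, which may use a different matcher.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globToRegexp converts a glob such as "/blog/**" into an anchored regexp.
// Assumed semantics: "**" matches any characters including "/",
// while a single "*" matches within one path segment only.
func globToRegexp(glob string) *regexp.Regexp {
	runes := []rune(glob)
	var b strings.Builder
	b.WriteString("^")
	for i := 0; i < len(runes); i++ {
		if runes[i] == '*' {
			if i+1 < len(runes) && runes[i+1] == '*' {
				b.WriteString(".*")
				i++ // consume the second '*'
			} else {
				b.WriteString("[^/]*")
			}
			continue
		}
		// Escape everything else so it is matched literally.
		b.WriteString(regexp.QuoteMeta(string(runes[i])))
	}
	b.WriteString("$")
	return regexp.MustCompile(b.String())
}

// matchAny reports whether path matches at least one of the
// comma-separated patterns.
func matchAny(patterns, path string) bool {
	for _, p := range strings.Split(patterns, ",") {
		p = strings.TrimSpace(p)
		if p != "" && globToRegexp(p).MatchString(path) {
			return true
		}
	}
	return false
}

func main() {
	patterns := "/blog/**,/guide/**"
	for _, p := range []string{"/blog/announcing-vite", "/guide/api/plugin", "/team"} {
		fmt.Println(p, matchAny(patterns, p))
	}
}
```

Under these assumed semantics, the first two paths match and `/team` does not, so only pages under `/blog/` and `/guide/` would be kept.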

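### Content selector

Use the `--content-selector` flag to pass a CSS selector for the element that holds the page's main content:
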
```bash
sitefetch https://vite.dev --content-selector ".content"
```

## Credit

This project is a Go implementation of the original [sitefetch](https://github.com/egoist/sitefetch) written in TypeScript by [egoist](https://github.com/egoist).
