https://github.com/salman0ansari/sitefetch

Fetch a site and extract its readable content as Markdown (to be used with AI models).

# sitefetch

Fetch a site and extract its readable content as Markdown (to be used with AI models).

![image](https://github.com/user-attachments/assets/bdd90bfe-ed4e-445f-8121-b8284128e0c4)

## Install

Install globally with `go install` (the binary is placed in `$GOBIN`, or `$GOPATH/bin` if `GOBIN` is unset):

```bash
go install github.com/salman0ansari/sitefetch@latest
```

## Usage

![image](https://github.com/user-attachments/assets/4391566d-bb82-4089-a3d8-0ae3b3db46ae)

```bash
sitefetch https://hisalman.in --outfile site.txt

# or fetch with higher concurrency
sitefetch https://hisalman.in --outfile site.txt --concurrency 10
```

### Match specific pages

Use the `--match` flag to specify the pages you want to fetch, as a comma-separated list of glob patterns:

```bash
sitefetch https://vite.dev --match "/blog/**,/guide/**"
```
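
To make the pattern syntax concrete, the following is a rough sketch of how a comma-separated list of globs could be tested against a URL path. It is an assumption-laden illustration, not sitefetch's actual matcher: `globToRegexp` and `matchAny` are hypothetical helpers, and the sketch assumes that `**` matches any characters including `/` while a single `*` stays within one path segment. The sample paths are invented.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globToRegexp converts a glob into an anchored regular expression.
// Assumed semantics: "**" matches anything (including "/"),
// "*" matches any run of characters except "/".
func globToRegexp(glob string) *regexp.Regexp {
	runes := []rune(glob)
	var b strings.Builder
	b.WriteString("^")
	for i := 0; i < len(runes); i++ {
		if runes[i] == '*' {
			if i+1 < len(runes) && runes[i+1] == '*' {
				b.WriteString(".*")
				i++ // consume the second '*'
			} else {
				b.WriteString("[^/]*")
			}
			continue
		}
		b.WriteString(regexp.QuoteMeta(string(runes[i])))
	}
	b.WriteString("$")
	return regexp.MustCompile(b.String())
}

// matchAny reports whether path matches at least one glob
// in the comma-separated patterns string.
func matchAny(path, patterns string) bool {
	for _, p := range strings.Split(patterns, ",") {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		if globToRegexp(p).MatchString(path) {
			return true
		}
	}
	return false
}

func main() {
	patterns := "/blog/**,/guide/**"
	// Hypothetical paths, used only to exercise the matcher.
	for _, p := range []string{"/blog/announcing-vite", "/guide/features/css", "/config/"} {
		fmt.Println(p, matchAny(p, patterns))
	}
}
```

Under these assumed semantics the two paths beneath `/blog/` and `/guide/` match and `/config/` does not; the real tool may treat edge cases such as a bare `/blog` differently.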

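### Content selector

Use the `--content-selector` flag to give a CSS selector for the element that holds the page's readable content:
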
```bash
sitefetch https://vite.dev --content-selector ".content"
```

## Credit

This project is a Go implementation of the original [sitefetch](https://github.com/egoist/sitefetch), written in TypeScript by [egoist](https://github.com/egoist).