https://github.com/salman0ansari/sitefetch
Fetch a site and extract its readable content as Markdown (to be used with AI models).
- Host: GitHub
- URL: https://github.com/salman0ansari/sitefetch
- Owner: salman0ansari
- License: unlicense
- Created: 2025-02-04T12:43:19.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-06T08:15:31.000Z (over 1 year ago)
- Last Synced: 2025-08-17T11:58:14.553Z (10 months ago)
- Topics: ai, chatgpt, crawler, fetcher, golang, scraping
- Language: Go
- Homepage:
- Size: 30.3 KB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# sitefetch
Fetch a site and extract its readable content as Markdown (to be used with AI models).

## Install
Install globally with `go install`:
```bash
go install github.com/salman0ansari/sitefetch@latest
```
## Usage

```bash
sitefetch https://hisalman.in --outfile site.txt
# or with higher concurrency
sitefetch https://hisalman.in --outfile site.txt --concurrency 10
```
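To illustrate what a `--concurrency` limit means in practice, here is a minimal sketch of bounded-concurrency fetching in Go. It is not taken from the sitefetch source: `fetchAll`, the stub fetcher, and the `/about` path are hypothetical names chosen for the example, and the real tool may well be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll calls fetch for every URL while keeping at most `concurrency`
// calls in flight at once. (Hypothetical helper, not sitefetch's actual code.)
func fetchAll(urls []string, concurrency int, fetch func(string) (string, error)) map[string]string {
	if concurrency < 1 {
		concurrency = 1
	}
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results = make(map[string]string, len(urls))
		sem     = make(chan struct{}, concurrency) // counting semaphore
	)
	for _, u := range urls {
		wg.Add(1)
		sem <- struct{}{} // blocks while `concurrency` fetches are running
		go func(u string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when this fetch ends
			body, err := fetch(u)
			if err != nil {
				return // a real crawler would likely log or retry here
			}
			mu.Lock()
			results[u] = body
			mu.Unlock()
		}(u)
	}
	wg.Wait()
	return results
}

func main() {
	// Stub fetcher so the sketch runs without network access.
	stub := func(u string) (string, error) { return "content of " + u, nil }
	urls := []string{"https://hisalman.in/", "https://hisalman.in/about"}
	pages := fetchAll(urls, 10, stub)
	fmt.Println(len(pages), "pages fetched")
}
```

The buffered channel acts as a semaphore: a higher limit lets more pages download in parallel, at the cost of more simultaneous requests against the target site.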
### Match specific pages
Use the `--match` flag to specify the pages you want to fetch:
```bash
sitefetch https://vite.dev --match "/blog/**,/guide/**"
```
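As a rough illustration of how comma-separated glob patterns like the ones above could be checked against URL paths, the sketch below translates each glob into a regular expression. The helpers `globToRegexp` and `matchAny` are hypothetical, and the semantics (`**` crosses `/`, `*` stays within one path segment) are an assumption of this sketch; sitefetch's actual matching rules may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globToRegexp converts a glob such as "/blog/**" into an anchored regexp.
// Assumed semantics: "**" matches any characters including "/",
// while a single "*" matches any characters except "/".
func globToRegexp(glob string) (*regexp.Regexp, error) {
	var b strings.Builder
	b.WriteString("^")
	for i := 0; i < len(glob); i++ {
		switch {
		case strings.HasPrefix(glob[i:], "**"):
			b.WriteString(".*")
			i++ // skip the second '*'
		case glob[i] == '*':
			b.WriteString("[^/]*")
		default:
			// Escape regexp metacharacters such as '.' or '+'.
			b.WriteString(regexp.QuoteMeta(glob[i : i+1]))
		}
	}
	b.WriteString("$")
	return regexp.Compile(b.String())
}

// matchAny reports whether urlPath matches at least one of the
// comma-separated glob patterns.
func matchAny(patterns, urlPath string) (bool, error) {
	for _, p := range strings.Split(patterns, ",") {
		re, err := globToRegexp(strings.TrimSpace(p))
		if err != nil {
			return false, err
		}
		if re.MatchString(urlPath) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Example paths are made up for the demonstration.
	for _, p := range []string{"/blog/announcing-vite", "/guide/features", "/config/"} {
		ok, err := matchAny("/blog/**,/guide/**", p)
		if err != nil {
			fmt.Println("bad pattern:", err)
			return
		}
		fmt.Println(p, ok)
	}
}
```

Under these assumed rules, a bare `/blog` does not match `/blog/**` because the pattern requires the trailing slash; a production matcher might treat that case differently.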
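### Content selector
Use the `--content-selector` flag to pass a CSS selector for the element that holds the page's main content: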
```bash
sitefetch https://vite.dev --content-selector ".content"
```
## Credit
This project is a Go implementation of the original [sitefetch](https://github.com/egoist/sitefetch) written in TypeScript by [egoist](https://github.com/egoist).