Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/salman0ansari/sitefetch
Fetch a site and extract its readable content as Markdown (to be used with AI models).
ai chatgpt crawler fetcher golang scraping
- Host: GitHub
- URL: https://github.com/salman0ansari/sitefetch
- Owner: salman0ansari
- License: unlicense
- Created: 2025-02-04T12:43:19.000Z (3 days ago)
- Default Branch: main
- Last Pushed: 2025-02-06T08:15:31.000Z (1 day ago)
- Last Synced: 2025-02-06T08:28:05.442Z (1 day ago)
- Topics: ai, chatgpt, crawler, fetcher, golang, scraping
- Language: Go
- Homepage:
- Size: 28.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# sitefetch
Fetch a site and extract its readable content as Markdown (to be used with AI models).
![image](https://github.com/user-attachments/assets/bdd90bfe-ed4e-445f-8121-b8284128e0c4)
## Install
Install globally with `go install`:
```bash
go install github.com/salman0ansari/sitefetch@latest
```
## Usage
![image](https://github.com/user-attachments/assets/4391566d-bb82-4089-a3d8-0ae3b3db46ae)
```bash
sitefetch https://hisalman.in --outfile site.txt
# or with higher concurrency
sitefetch https://hisalman.in --outfile site.txt --concurrency 10
```
### Match specific pages
Use the `--match` flag to specify the pages you want to fetch. Separate multiple glob patterns with commas:
```bash
sitefetch https://vite.dev --match "/blog/**,/guide/**"
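```
### Content selector
Use the `--content-selector` flag to pass a CSS selector for the element that holds the page's readable content: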
```bash
sitefetch https://vite.dev --content-selector ".content"
```
## Credit
This project is a Go implementation of the original [sitefetch](https://github.com/egoist/sitefetch) written in TypeScript by [egoist](https://github.com/egoist).
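To illustrate how a comma-separated `--match` value such as `/blog/**,/guide/**` could select pages, here is a minimal Go sketch. It is not taken from sitefetch's source: the function names `globToRegexp` and `matchAny` are hypothetical, and the glob semantics (`**` crosses `/`, `*` does not) are an assumption that may differ from what the tool actually does.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globToRegexp converts a simplified glob into an anchored regular expression.
// Assumed syntax for this sketch (not necessarily sitefetch's rules):
//
//	**  matches any characters, including "/"
//	*   matches any characters except "/"
func globToRegexp(glob string) (*regexp.Regexp, error) {
	var b strings.Builder
	b.WriteString("^")
	runes := []rune(glob)
	for i := 0; i < len(runes); i++ {
		switch {
		case runes[i] == '*' && i+1 < len(runes) && runes[i+1] == '*':
			b.WriteString(".*")
			i++ // skip the second '*'
		case runes[i] == '*':
			b.WriteString("[^/]*")
		default:
			// Escape everything else so it is matched literally.
			b.WriteString(regexp.QuoteMeta(string(runes[i])))
		}
	}
	b.WriteString("$")
	return regexp.Compile(b.String())
}

// matchAny reports whether path matches at least one of the
// comma-separated glob patterns.
func matchAny(patterns, path string) (bool, error) {
	for _, p := range strings.Split(patterns, ",") {
		re, err := globToRegexp(strings.TrimSpace(p))
		if err != nil {
			return false, err
		}
		if re.MatchString(path) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	patterns := "/blog/**,/guide/**"
	for _, path := range []string{"/blog/announcing-vite6", "/guide/api/plugin", "/config/"} {
		ok, err := matchAny(patterns, path)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-25s %v\n", path, ok)
	}
}
```

In this sketch the two paths under `/blog/` and `/guide/` are reported as matches and `/config/` is not. A bare `/blog` (without a trailing slash) would not match `/blog/**` here, which is one place a real glob library might behave differently.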