https://github.com/egoist/sitefetch
Fetch an entire site and save it as a text file (to be used with AI models).
Last synced: 23 days ago
- Host: GitHub
- URL: https://github.com/egoist/sitefetch
- Owner: egoist
- License: mit
- Created: 2025-01-07T08:43:01.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-18T05:30:04.000Z (about 1 year ago)
- Last Synced: 2026-01-14T11:12:55.341Z (about 2 months ago)
- Language: TypeScript
- Size: 27.3 KB
- Stars: 1,637
- Watchers: 9
- Forks: 142
- Open Issues: 16
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- my-awesome-list - sitefetch
README
# sitefetch
Fetch an entire site and save it as a text file (to be used with AI models).

## Install
One-off usage (choose one of the following):
```bash
bunx sitefetch
npx sitefetch
pnpx sitefetch
```
Install globally (choose one of the following):
```bash
bun i -g sitefetch
npm i -g sitefetch
pnpm i -g sitefetch
```
## Usage
```bash
sitefetch https://egoist.dev -o site.txt
# or with higher concurrency (more pages fetched in parallel)
sitefetch https://egoist.dev -o site.txt --concurrency 10
```
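To give a rough idea of what a concurrency limit means, here is a minimal sketch of a worker pool that keeps at most `limit` tasks in flight. `mapWithConcurrency` is a hypothetical helper written for illustration. It is not sitefetch's internal code, and the "fetch" here is simulated with a timer.

```typescript
// Run `task` over `items` with at most `limit` tasks in flight at once.
// Hypothetical helper for illustration; not sitefetch's internal code.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index]);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Example: process five placeholder paths, two at a time.
const paths = ["/a", "/b", "/c", "/d", "/e"];
mapWithConcurrency(paths, 2, async (path) => {
  // Simulated network delay instead of a real request.
  await new Promise((resolve) => setTimeout(resolve, 10));
  return path.toUpperCase();
}).then((out) => console.log(out)); // results keep the input order
```

A higher limit usually finishes sooner but puts more load on the target site, so raise `--concurrency` with care.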
### Match specific pages
Use the `-m, --match` flag to specify the pages you want to fetch:
```bash
sitefetch https://vite.dev -m "/blog/**" -m "/guide/**"
```
The match pattern is tested against the pathname of each target page using micromatch. See the supported [matching features](https://github.com/micromatch/micromatch#matching-features) for the full pattern syntax.
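As a rough illustration of pathname matching, the sketch below extracts the pathname from a URL and tests it against a list of patterns. `globToRegExp` and `shouldFetch` are hypothetical helpers. They are a simplified stand-in that only understands `*` and `**`, not micromatch itself, so edge cases may differ from what sitefetch actually does.

```typescript
// Simplified stand-in for micromatch (assumption, for illustration only):
// `**` matches any characters including "/", `*` matches any characters except "/".
function globToRegExp(pattern: string): RegExp {
  const source = pattern
    .split("**")
    .map((part) =>
      part
        .split("*")
        // Escape regex metacharacters in the literal pieces.
        .map((literal) => literal.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
        .join("[^/]*"),
    )
    .join(".*");
  return new RegExp(`^${source}$`);
}

// A page is kept when its pathname matches at least one pattern.
// Only the pathname is tested; the host, query string, and hash are ignored.
function shouldFetch(pageUrl: string, patterns: string[]): boolean {
  const { pathname } = new URL(pageUrl);
  return patterns.some((pattern) => globToRegExp(pattern).test(pathname));
}

const patterns = ["/blog/**", "/guide/**"];
console.log(shouldFetch("https://vite.dev/blog/announcing-vite6", patterns)); // true
console.log(shouldFetch("https://vite.dev/guide/features?x=1#y", patterns)); // true
console.log(shouldFetch("https://vite.dev/config/", patterns)); // false
```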
### Content selector
We use [mozilla/readability](https://github.com/mozilla/readability) to extract readable content from each web page. On some pages it may return irrelevant content; in that case, specify a CSS selector so sitefetch knows where to find the readable content:
```bash
sitefetch https://vite.dev --content-selector ".content"
```
## Plug
If you like this, please check out my LLM chat app: https://chatwise.app
## API
```ts
import { fetchSite } from "sitefetch"

// Crawl the site; the available options are defined in src/types.ts.
await fetchSite("https://egoist.dev", {
  // ...options
})
```
Check out options in [types.ts](./src/types.ts).
## License
MIT.