https://github.com/zzstoatzz/raggy

scraping and querying documents for LLMs
llms rag scraping vectorstore

Host: GitHub
URL: https://github.com/zzstoatzz/raggy
Owner: zzstoatzz
License: apache-2.0
Created: 2024-02-03T21:33:20.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-12-22T08:11:22.000Z (6 months ago)
Last Synced: 2024-12-25T19:59:08.353Z (6 months ago)
Topics: llms, rag, scraping, vectorstore
Language: Python
Homepage: https://zzstoatzz.github.io/raggy/
Size: 2.46 MB
Stars: 17
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# raggy

a Python library for scraping and document processing

## installation

```bash
pip install raggy
```

add extras for vectorstore backends and additional document types:

```bash
# quotes keep shells like zsh from expanding the square brackets
pip install "raggy[chroma]"     # ChromaDB support
pip install "raggy[tpuf]"       # TurboPuffer support
pip install "raggy[pdf]"        # PDF processing
```

read the [docs](https://zzstoatzz.github.io/raggy/)

### what is it?

a simple-to-use Python library for:

- scraping the web to produce rich documents

- putting these documents in vectorstores

- querying the vectorstores to find documents similar to a query
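
the three steps above can be sketched with only the standard library. this is a toy illustration of the general pattern, not raggy's API: `TextExtractor`, `html_to_text`, `embed`, `cosine`, and `ToyVectorstore` are hypothetical names, the "scraped" pages are inline HTML strings rather than fetched URLs, and word counts stand in for a real embedding model.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text from HTML, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Step 1 (hypothetical helper): turn raw HTML into a plain-text document."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def embed(text):
    """Toy embedding: lowercase word counts. A real setup would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorstore:
    """Steps 2 and 3 (hypothetical class): store documents, then rank them against a query."""

    def __init__(self):
        self._rows = []

    def add(self, doc_id, text):
        self._rows.append((doc_id, text, embed(text)))

    def query(self, question, n_results=1):
        q = embed(question)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[2]), reverse=True)
        return [(doc_id, text) for doc_id, text, _ in ranked[:n_results]]


# inline HTML stands in for pages that a scraper would fetch
pages = {
    "install": "<html><body><h1>Install</h1><p>Use pip to install the package.</p></body></html>",
    "query": "<html><body><h1>Query</h1><p>Find documents similar to a question.</p>"
             "<script>var x = 1;</script></body></html>",
}

store = ToyVectorstore()
for doc_id, html in pages.items():
    store.add(doc_id, html_to_text(html))

results = store.query("how do I install with pip?")
print(results[0][0])
```

for raggy's actual loaders and vectorstore classes, see the docs and the examples linked in this README.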

> [!TIP]
> See this [example](https://github.com/zzstoatzz/raggy/blob/main/examples/chat_with_X/website.py) to chat with any website, or this [example](https://github.com/zzstoatzz/raggy/blob/main/examples/chat_with_X/repo.py) to chat with any GitHub repo.

### license

this project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.

### contributing

I welcome contributions! See the [contributing guide](https://zzstoatzz.github.io/raggy/contributing) for details.
