https://github.com/zzstoatzz/raggy
scraping and querying documents for LLMs
llms rag scraping vectorstore
- Host: GitHub
- URL: https://github.com/zzstoatzz/raggy
- Owner: zzstoatzz
- License: apache-2.0
- Created: 2024-02-03T21:33:20.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-12-22T08:11:22.000Z (6 months ago)
- Last Synced: 2024-12-25T19:59:08.353Z (6 months ago)
- Topics: llms, rag, scraping, vectorstore
- Language: Python
- Homepage: https://zzstoatzz.github.io/raggy/
- Size: 2.46 MB
- Stars: 17
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# raggy
a Python library for scraping and document processing
## installation
```bash
pip install raggy
```

add extras for vectorstore backends and other document types:
```bash
pip install "raggy[chroma]"  # ChromaDB support
pip install "raggy[tpuf]"    # TurboPuffer support
pip install "raggy[pdf]"     # PDF processing
```

read the [docs](https://zzstoatzz.github.io/raggy/)
### what is it?
a simple-to-use Python library for:
- scraping the web to produce rich documents
- putting these documents in vectorstores
- querying the vectorstores to find documents similar to a query
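
the three steps above can be sketched with the standard library alone. this is a toy illustration of the workflow, not raggy's API: `Document`, `html_to_document`, `embed` and `InMemoryVectorStore` are hypothetical names made up for this sketch, the pages are hard-coded instead of fetched, and word counts stand in for the learned embeddings a real vectorstore would use.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML page, ignoring the tags."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []

    def handle_data(self, data: str) -> None:
        if data.strip():
            self.chunks.append(data.strip())


@dataclass
class Document:  # hypothetical type for this sketch, not raggy's
    text: str
    metadata: dict = field(default_factory=dict)


def html_to_document(html: str, url: str) -> Document:
    """Step 1: turn an already-fetched page into a document with metadata."""
    parser = _TextExtractor()
    parser.feed(html)
    return Document(text=" ".join(parser.chunks), metadata={"url": url})


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:  # hypothetical stand-in for a real vectorstore
    def __init__(self) -> None:
        self._rows: list[tuple[Counter, Document]] = []

    def add(self, documents: list[Document]) -> None:
        """Step 2: store each document next to its vector."""
        for doc in documents:
            self._rows.append((embed(doc.text), doc))

    def query(self, text: str, top_k: int = 1) -> list[Document]:
        """Step 3: return the top_k documents most similar to the query."""
        q = embed(text)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [doc for _, doc in ranked[:top_k]]


# Hard-coded pages stand in for scraped content (no network access here).
pages = {
    "https://example.com/install": "<h1>Install</h1><p>Use pip to install the package.</p>",
    "https://example.com/query": "<h1>Query</h1><p>Search the store for similar documents.</p>",
}
store = InMemoryVectorStore()
store.add([html_to_document(html, url) for url, html in pages.items()])
best = store.query("how do I install with pip?")[0]
print(best.metadata["url"])  # prints https://example.com/install
```

for the library's actual loaders and vectorstore classes, see the docs and the examples linked in this README.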
> [!TIP]
> See this [example](https://github.com/zzstoatzz/raggy/blob/main/examples/chat_with_X/website.py) to chat with any website, or this [example](https://github.com/zzstoatzz/raggy/blob/main/examples/chat_with_X/repo.py) to chat with any GitHub repo.

### license
this project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.
### contributing
I welcome contributions! See the [contributing guide](https://zzstoatzz.github.io/raggy/contributing) for details.