https://github.com/postgresml/example-korvus-firecrawl
- Host: GitHub
- URL: https://github.com/postgresml/example-korvus-firecrawl
- Owner: postgresml
- License: mit
- Created: 2024-09-26T20:36:03.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-26T20:39:39.000Z (about 1 year ago)
- Last Synced: 2025-03-24T08:42:07.093Z (9 months ago)
- Language: Python
- Size: 2.93 KB
- Stars: 1
- Watchers: 2
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Korvus x Firecrawl Example
This example application demonstrates how to perform web crawling, semantic search, and Retrieval-Augmented Generation (RAG) using [Korvus](https://github.com/postgresml/korvus) and [Firecrawl](https://www.firecrawl.dev/).
## Features
- Web crawling using Firecrawl
- Semantic search over crawled content
- Question answering with RAG
## Prerequisites
- Python 3.7+
- Firecrawl API key
- PostgresML database URL
## Installation
1. Clone this repository:
```
git clone https://github.com/postgresml/example-korvus-firecrawl
cd example-korvus-firecrawl
```
2. Install the required packages:
```
pip install -r requirements.txt
```
3. Create a `.env` file in the project root and add your credentials:
```
FIRECRAWL_API_KEY=your_firecrawl_api_key
KORVUS_DATABASE_URL=your_postgresml_database_url
CRAWL_URL=https://example.com
CRAWL_LIMIT=100
```
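The settings above can be read and validated before any crawling starts. The following is a minimal sketch, not the repository's actual code: the `Settings` class and `load_settings` function are hypothetical names, and it assumes the `.env` values have already been loaded into the process environment (for example by a dotenv loader, which this project may or may not use).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the four values in the sample .env file."""
    firecrawl_api_key: str
    korvus_database_url: str
    crawl_url: str
    crawl_limit: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping of variables (defaults to os.environ)."""
    env = os.environ if env is None else env

    # Fail early with one message listing every missing credential.
    required = ("FIRECRAWL_API_KEY", "KORVUS_DATABASE_URL", "CRAWL_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))

    # Assumption: fall back to 100, the value shown in the sample .env.
    raw_limit = env.get("CRAWL_LIMIT", "100")
    try:
        crawl_limit = int(raw_limit)
    except ValueError:
        raise ValueError(f"CRAWL_LIMIT must be an integer, got {raw_limit!r}") from None
    if crawl_limit <= 0:
        raise ValueError("CRAWL_LIMIT must be positive")

    return Settings(
        firecrawl_api_key=env["FIRECRAWL_API_KEY"],
        korvus_database_url=env["KORVUS_DATABASE_URL"],
        crawl_url=env["CRAWL_URL"],
        crawl_limit=crawl_limit,
    )
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to test without touching real credentials.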
## Usage
The application supports three main actions: `crawl`, `search`, and `rag`.
1. Crawl a website:
```
python main.py crawl
```
2. Perform semantic search:
```
python main.py search
```
3. Use RAG for question answering:
```
python main.py rag
```
The `search` and `rag` actions prompt you for queries in a loop. Type `q` to exit the loop.
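The action dispatch and the query loop described above could be structured roughly as follows. This is an illustrative sketch only: `read_queries` and `run` are hypothetical helpers, and the real `main.py` may parse its arguments differently.

```python
import sys
from typing import Callable, Dict, Iterator, List


def read_queries(
    prompt: str = "Query ('q' to quit): ",
    input_fn: Callable[[str], str] = input,
) -> Iterator[str]:
    """Yield non-empty queries until the user types 'q' or input ends."""
    while True:
        try:
            text = input_fn(prompt).strip()
        except EOFError:
            return
        if text == "q":
            return
        if text:
            yield text


def run(argv: List[str], handlers: Dict[str, Callable[[], None]]) -> int:
    """Call the handler named by argv[1]; return a process exit code."""
    if len(argv) != 2 or argv[1] not in handlers:
        usage = "usage: python main.py {" + "|".join(handlers) + "}"
        print(usage, file=sys.stderr)
        return 2
    handlers[argv[1]]()
    return 0
```

A script would typically register one handler per action, for example `run(sys.argv, {"crawl": crawl, "search": search, "rag": rag})`, where each handler is a function you supply and the `search` and `rag` handlers iterate over `read_queries()`.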
## How it works
1. The application uses Firecrawl to crawl the specified website and extract markdown content.
2. Crawled data is processed and stored using Korvus.
3. Semantic search allows you to find relevant documents based on your queries.
4. RAG combines retrieved context with a language model to answer questions.
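To make the data flow in steps 2 to 4 concrete, here is a toy, standard-library-only sketch. It is not how Korvus works internally: Korvus performs embedding and retrieval in the database, while this example substitutes a simple bag-of-words cosine similarity for semantic search. The page shape (a `markdown` field plus `metadata.sourceURL`) is an assumption about the crawl output, and all function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def to_documents(pages: List[dict]) -> List[Dict[str, str]]:
    """Keep pages that have both markdown text and a source URL."""
    docs = []
    for page in pages:
        text = (page.get("markdown") or "").strip()
        url = (page.get("metadata") or {}).get("sourceURL")
        if text and url:
            docs.append({"id": url, "text": text})
    return docs


def _vector(text: str) -> Counter:
    """Count lowercase word tokens (a crude stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, docs: List[Dict[str, str]], limit: int = 3) -> List[Dict[str, str]]:
    """Return up to `limit` documents ranked by similarity to the query."""
    q = _vector(query)
    scored = [(_cosine(q, _vector(d["text"])), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:limit] if score > 0]


def build_prompt(question: str, context_docs: List[Dict[str, str]]) -> str:
    """Combine retrieved documents and the question into one prompt string."""
    context = "\n\n".join(f"[{d['id']}]\n{d['text']}" for d in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

In a RAG flow, the string returned by `build_prompt` would be sent to a language model. That call is left out here because it depends on the model provider.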
## Contributing
Contributions are welcome. Feel free to submit a pull request.