https://github.com/kaarthik108/snowbrain
snowBrain - AI-Driven Insights with Snowflake (New version- https://github.com/kaarthik108/snowbrain-AGUI)
- Host: GitHub
- URL: https://github.com/kaarthik108/snowbrain
- Owner: kaarthik108
- License: other
- Created: 2023-06-17T01:26:34.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-03-23T19:57:10.000Z (10 months ago)
- Last Synced: 2024-10-14T07:27:47.524Z (4 months ago)
- Topics: chatgpt, data, fastapi, langchain, nextjs, pinecone, snowflake, tailwindcss
- Language: TypeScript
- Homepage: https://snowbrain.dev
- Size: 1.81 MB
- Stars: 115
- Watchers: 2
- Forks: 21
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
# snowBrain
[![Deploy with Vercel](https://vercel.com/button)](https://vercel.com/new/clone?repository-url=https://github.com/kaarthik108/snowbrain&project-name=snowbrain&repo-name=snowbrain)
https://github.com/kaarthik108/snowBrain/assets/53030784/582eee20-bcf6-4db7-9343-2941937374f9
snowBrain is an open-source prototype that serves as your personal data analyst. It converses in SQL, remembers previous discussions, and even draws data visualizations for you.
This project combines Snowflake, LangChain, OpenAI, Pinecone, Next.js, and FastAPI, among other technologies, to simplify SQL querying. Dive in and discover a new way of interacting with your data.
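To make the idea concrete, the retrieval and memory steps behind this kind of tool can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not snowBrain's actual implementation: word-count vectors stand in for OpenAI embeddings, an in-memory dictionary stands in for Pinecone, and the table definitions and function names (`embed`, `cosine`, `top_ddl`, `build_prompt`) are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical DDL snippets; a real setup would fetch these from Snowflake.
DDL_DOCS = {
    "ORDERS": "create table orders (order_id int, customer_id int, order_date date, amount number)",
    "PRODUCTS": "create table products (product_id int, category varchar, price number)",
    "CUSTOMERS": "create table customers (customer_id int, name varchar, region varchar)",
}

def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_ddl(question, docs=DDL_DOCS, k=1):
    """Return the names of the k tables whose DDL is most similar to the question."""
    q = embed(question)
    ranked = sorted(docs, key=lambda name: cosine(q, embed(docs[name])), reverse=True)
    return ranked[:k]

def build_prompt(question, history, docs=DDL_DOCS, k=1):
    """Combine retrieved DDL, earlier chat turns, and the new question into one prompt."""
    context = "\n".join(docs[name] for name in top_ddl(question, docs, k))
    memory = "\n".join(f"{role}: {text}" for role, text in history)
    return f"Schema:\n{context}\n\nHistory:\n{memory}\n\nQuestion: {question}\nSQL:"

print(top_ddl("Display the list of products with their prices."))  # ['PRODUCTS']
```

Passing earlier turns as `history` is the simplest form of conversational memory: the model sees the previous exchange alongside the retrieved schema each time it is asked for SQL.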
## Tech Stack
- [Snowflake](https://www.snowflake.com/) - Data Cloud
- [Next.js](https://nextjs.org/) - Frontend & backend
- [Supabase](https://supabase.com/) - DB - Persist chat messages
- [Tailwind CSS](https://tailwindcss.com/) - Styling
- [Pinecone](https://www.pinecone.io/) - Vector database
- [OpenAI](https://www.openai.com/) - LLM
- [LangChain](https://js.langchain.com/docs/) - LLM wrapper
- [Cloudinary](https://cloudinary.com/) - Image data
- [Clerk.dev](https://clerk.dev/) - Auth
- [Upstash Redis](https://upstash.com/) - Rate limiting
- [FastAPI](https://fastapi.tiangolo.com/) - Python backend
- [Modal Labs](https://modal.com/) - Hosts the FastAPI backend
- [Vercel](https://vercel.com/) - Hosting
- [umami](https://umami.is/) - Web analytics
## Features
- **Snowflake to Vector Database**: Automatically converts all Snowflake DDL to vectors and stores them in a vector database.
- **Conversational Memory**: Maintain context and improve the quality of interactions.
- **Snowflake Integration**: Integrate with Snowflake schema for automatic SQL generation and visualization.
- **Pinecone Vector Database**: Leverage Pinecone's vector database management system for efficient searching capabilities.
- **Secure Authentication**: Employ Clerk.dev for secure and hassle-free user authentication.
- **Rate Limit Handling**: Utilize Upstash Redis for managing rate limits.
- **FastAPI**: High-performance Python web framework for building APIs.
## Example Queries
snowBrain is designed to make complex data querying simple. Here are some example queries you can try:
- **Total revenue per product category**: "Show me the total revenue for each product category."
- **Top customers by sales**: "Who are the top 10 customers by sales?"
- **Average order value per region**: "What is the average order value for each region?"
- **Order volume**: "How many orders were placed last week?"
- **Product price listing**: "Display the list of products with their prices."
## Installation
Follow these steps to get **snowBrain** up and running in your local environment.
1. **Update Environment Variables**
Make sure to update the environment variables as necessary. Refer to the example provided:
```bash
.env.example
```
2. **Auto fetch All Schema DDL**
You can do this by running the following command:
```bash
python3 embed/snowflake_ddl_fetcher.py
```
Make sure to install the requirements first:
```bash
pip3 install -r embed/requirements.txt
```
3. **Convert DDL Documents to Vector & Upload to Pinecone**
Use the following command to do this:
```bash
python3 embed/embed.py
```
4. **Install Dependencies for the Code Plugin**
Navigate to the code plugin directory and install the necessary dependencies using Poetry:
```bash
cd code-plugin && poetry install
```
5. **Deploy FastAPI to Modal Labs**
Run the following command to deploy your FastAPI app (make sure to add a secrets file in Modal Labs):
```bash
modal deploy main.py
```
After deploying, make sure to store the endpoint in your environment variables:
```bash
MODAL_API_ENDPOINT=
MODAL_AUTH_TOKEN=random_secret
```
6. **Install packages**
Install packages using the following command:
```bash
bun install
```
7. **Run Locally**
Test the setup locally using the following command:
```bash
bun run dev
```
Test the build:
```bash
bun run build
```
8. **Deploy to Vercel**
Finally, when you're ready, deploy the project to Vercel.
Note: The Vercel build is automatically blocked for the `code-plugin` and `embed` folders and for `readme.md`. You can additionally add a build block command in Vercel's dashboard.
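Before running locally or deploying, it can help to confirm that the variables from step 5 are actually set. The following is a minimal sketch, assuming the variables are exported in the shell environment; `missing_vars` and `REQUIRED_VARS` are hypothetical names and not part of the repository, and the list should be extended with whatever `.env.example` contains.

```python
import os

# Variable names taken from step 5; extend with the entries in .env.example.
REQUIRED_VARS = ("MODAL_API_ENDPOINT", "MODAL_AUTH_TOKEN")

def missing_vars(env, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]

# Check the current process environment and report the result.
missing = missing_vars(os.environ)
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```

The check takes any mapping, so it can be run against `os.environ` as shown or against a dictionary parsed from a `.env` file.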
## One-Click Deploy
[![Deploy with Vercel](https://vercel.com/button)](https://vercel.com/new/clone?repository-url=https://github.com/kaarthik108/snowbrain&project-name=snowbrain&repo-name=snowbrain)
## Contributing
Here's how you can contribute:
- [Open an issue](https://github.com/kaarthik108/snowbrain/issues) if you believe you've encountered a bug.
- Make a [pull request](https://github.com/kaarthik108/snowbrain/pulls) to add new features, make improvements, or fix bugs.
## Credits
Thanks to @jaredpalmer, @shuding_, @shadcn, @thorwebdev