https://github.com/dankgarlic1/summarizemypdf

🚧 Summarize My PDF 🤖 - Upload your PDF, get smart summaries, and chat with AI! 🧠 Powered by Pinecone DB, OpenAI, and PostgreSQL. Currently under construction, but feel free to explore! ✨
aws chatbot chatgpt clerk docker nextjs pdf-ai pinecone postgres rag

# 📄 **Summarize My PDF** 🤖

## 🎥 **Video Preview**

![alt text](image.png)

[Video Overview](https://drive.google.com/file/d/1ov3namORvpsoEraPvM3G052zya859Haw/view?usp=drive_link)

---

_(Currently Under Construction: So Close Yet So Far!)_ 🚧

_This project is like your favorite dish in the oven: it smells great, but you can't eat it just yet! 🍕 While it's still baking, feel free to peek behind the scenes._ 😎

**Update:** This is the hardest project I have built so far. Unfortunately, Stripe has changed its policy, and I can't create an account without an invite, so I won't be deploying this SaaS. You can still tinker with it in local development. Please leave a star if you like it! ⭐

---

Welcome to **Summarize My PDF AI**! Upload a PDF and the app splits the document into chunks, embeds them, and stores the embeddings in Pinecone DB. A chatbot then uses those embeddings to give accurate, contextual answers, and chats are stored in PostgreSQL.
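
To make the splitting step concrete, here is a minimal sketch of chunking text with overlap before it is embedded. The function name `chunkText` and its parameters are hypothetical and are not taken from this repository, which relies on a Pinecone doc-splitter package for the real work.

```typescript
// Hypothetical helper (not from the repo): split text into fixed-size,
// overlapping chunks so each piece can be embedded separately.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be in the range [0, chunkSize)");
  }

  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) {
      break; // this chunk already reached the end of the text
    }
  }
  return chunks;
}
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbours, which tends to help retrieval; the sizes to use depend on the embedding model and are left to the caller.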

### 🎯 **Features**

- 📂 **Drag-and-Drop PDF Upload**
- 🧠 **PDF Content Summarization**
- 🌐 **Pinecone DB for Embeddings**
- 💬 **AI Chatbot with Contextual Understanding**
- 🗃️ **Chat History Stored in PostgreSQL**
- ☁️ **AWS S3 for File Storage**

---

### 🚀 **Getting Started**

Follow the steps below to get the project up and running on your local machine.

#### 1. **Clone the Repository**

```bash
git clone https://github.com/dankgarlic1/SummarizeMyPDF.git
cd SummarizeMyPDF
```

#### 2. **Install Dependencies**

Make sure you have Node.js installed, then install the project dependencies:

```bash
npm install
```

#### 3. **Set Up Environment Variables**

Create a `.env` file in the root directory and add the necessary environment variables. **Do not share your API keys publicly!** Make sure your `.env` file contains something like this:

```bash
# Clerk API Keys
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=
CLERK_SECRET_KEY=

# Database
DATABASE_URL=

# AWS S3 (note: Next.js exposes NEXT_PUBLIC_* variables to the browser bundle)
NEXT_PUBLIC_S3_ACCESS_KEY_ID=
NEXT_PUBLIC_S3_SECRET_ACCESS_KEY=
NEXT_PUBLIC_S3_BUCKET_NAME=

# Pinecone DB
PINECONE_API_KEY=

# OpenAI
OPENAI_API_KEY=
```
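
A missing key usually shows up later as a confusing runtime error, so it can help to check the variables once at startup. The sketch below is an assumption about how that could be done; `REQUIRED_ENV_VARS` and `findMissingEnvVars` are hypothetical names, and only the variable names themselves come from the sample above.

```typescript
// Variable names copied from the sample .env above.
const REQUIRED_ENV_VARS: string[] = [
  "NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY",
  "CLERK_SECRET_KEY",
  "DATABASE_URL",
  "NEXT_PUBLIC_S3_ACCESS_KEY_ID",
  "NEXT_PUBLIC_S3_SECRET_ACCESS_KEY",
  "NEXT_PUBLIC_S3_BUCKET_NAME",
  "PINECONE_API_KEY",
  "OPENAI_API_KEY",
];

// Hypothetical helper: return the names that are unset or blank.
// Taking the environment as a parameter keeps the function easy to test.
function findMissingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js context you could call `findMissingEnvVars(process.env)` early and throw if the returned list is not empty.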

#### 4. **Run the Development Server**

Start the development server with:

```bash
npm run dev
```

Your app will be running at [http://localhost:3000](http://localhost:3000) 🚀.

---

### 🐳 **Docker Setup**

To run this project using Docker, follow these steps:

1. **Create an Empty Project Directory**

Create a directory for your project and navigate into it.

2. **Create a `.env` File**

Inside the directory, create a `.env` file and use the sample provided above to add your environment variables.

3. **Pull the Docker Image**

```bash
docker pull harshitraizada63/summarize-my-pdf
```

4. **Run the Docker Container**

Run the container using the following command:

```bash
docker run -p 3000:3000 --env-file .env harshitraizada63/summarize-my-pdf
```

5. **Access the App**

Your app will be available at [http://localhost:3000](http://localhost:3000).

---

### πŸ› οΈ **Technologies Used**

- **Next.js** - Server-side rendering and static site generation 🌐
- **PostgreSQL** - Robust database for storing chat history πŸ—„οΈ
- **Pinecone** - Vector database for efficient embeddings πŸ“Š
- **OpenAI** - AI models for summarization and contextual chat πŸ€–
- **AWS S3** - File storage for uploaded PDFs ☁️
- **React Dropzone** - Smooth drag-and-drop PDF upload πŸ“‚
- **Drizzle ORM** - Simple, yet powerful ORM for database operations πŸ› οΈ
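
To illustrate what the vector database contributes to the chat, the sketch below ranks stored chunks by cosine similarity to a query embedding and keeps the best `k`. It is an in-memory stand-in under stated assumptions, not the Pinecone API and not code from this repository; `StoredChunk`, `cosineSimilarity`, and `topKChunks` are hypothetical names.

```typescript
// Hypothetical shape of a stored chunk: its text plus its embedding vector.
interface StoredChunk {
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // treat a zero vector as unrelated to everything
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding, best first.
function topKChunks(query: number[], chunks: StoredChunk[], k: number): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

The text of the returned chunks is what a chatbot of this kind would typically place in the prompt as context for the model.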

---

### 🔥 **Running Database Migrations**

To push database changes to PostgreSQL, run:

```bash
npm run db:push
```

To access the database studio:

```bash
npm run db:studio
```

---

### 🎨 **Styling**

TailwindCSS is used for quick and scalable UI development. All components are highly customizable via props.

---

### πŸ§‘β€πŸ’» **Local Development Tips**

- For managing API keys securely, always use environment variables.
- Use **react-hot-toast** for displaying notifications and loading states.
- For custom embeddings and PDF content processing, **@pinecone-database/doc-splitter** handles PDF chunking efficiently.

---

### ⚠️ **Important Note**

In the `FileUpload` component, I hardcoded my email (`[email protected]`) to allow unlimited PDF sessions while other users have a limit of two. To ensure fair use and limit resources, please comment out or remove the part of the code where my email is hardcoded:

```jsx
const isSpecialUser = userEmail === "[email protected]";
```
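
If you would rather keep an exemption without hardcoding an address, one option is to read an allowlist from configuration. This is only a sketch: the names `parseEmailAllowlist` and `canStartPdfSession` and the idea of a comma-separated allowlist string are assumptions, not part of the repository. Only the limit of two sessions comes from the note above.

```typescript
// From the note above: regular users are limited to two PDF sessions.
const FREE_SESSION_LIMIT = 2;

// Hypothetical helper: turn "a@x.com, b@y.com" into normalized addresses.
function parseEmailAllowlist(raw: string | undefined): string[] {
  if (raw === undefined) {
    return [];
  }
  return raw
    .split(",")
    .map((email) => email.trim().toLowerCase())
    .filter((email) => email !== "");
}

// Hypothetical check: allowlisted users are unlimited, everyone else
// may start a session only while below the free limit.
function canStartPdfSession(
  userEmail: string,
  existingSessions: number,
  allowlist: string[]
): boolean {
  const normalized = userEmail.trim().toLowerCase();
  if (allowlist.indexOf(normalized) !== -1) {
    return true;
  }
  return existingSessions < FREE_SESSION_LIMIT;
}
```

The raw string could come from a server-side environment variable of your choosing, which keeps personal addresses out of the source code.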

Feel free to explore the project 🚀