{"id":27994461,"url":"https://github.com/surajsanap/pdfgemini_bot","last_synced_at":"2025-05-08T19:12:16.943Z","repository":{"id":216846180,"uuid":"741917886","full_name":"SurajSanap/PDFGemini_Bot","owner":"SurajSanap","description":"PDFGem Chat is an interactive chat interface designed for querying information from uploaded PDF files.","archived":false,"fork":false,"pushed_at":"2024-01-14T08:59:56.000Z","size":426,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-08T19:12:13.401Z","etag":null,"topics":["ai","faiss","gemini","genrative-ai","google","langchain","nlp","streamlit"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SurajSanap.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2024-01-11T11:33:28.000Z","updated_at":"2024-10-12T06:21:34.000Z","dependencies_parsed_at":"2024-01-13T07:53:48.783Z","dependency_job_id":null,"html_url":"https://github.com/SurajSanap/PDFGemini_Bot","commit_stats":null,"previous_names":["surajsanap/pdfgemini_bot"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SurajSanap%2FPDFGemini_Bot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SurajSanap%2FPDFGemini_Bot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SurajSanap%2FPDFGemini_Bot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SurajSanap%2FPDFGemini_Bot/manifests","owner_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SurajSanap","download_url":"https://codeload.github.com/SurajSanap/PDFGemini_Bot/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253133114,"owners_count":21859112,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","faiss","gemini","genrative-ai","google","langchain","nlp","streamlit"],"created_at":"2025-05-08T19:12:16.320Z","updated_at":"2025-05-08T19:12:16.939Z","avatar_url":"https://github.com/SurajSanap.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PDFGemini_Bot\n\n## Project Overview\n\nPDFGem Chat is an interactive chat interface designed for querying information from uploaded PDF files. This project utilizes Streamlit, PyPDF2, LangChain, Google Generative AI, and FAISS to create a seamless experience for users to ask questions related to the content of PDF documents.\n\n\u003cimg width=\"960\" alt=\"image\" src=\"https://github.com/SurajSanap/PDFGemini_Bot/assets/101057653/15fe59a5-cf8a-4b9b-8e84-536a99bf1cfe\"\u003e\n\n## Components\n\n### 1. User Interface\n\n- Developed using the Streamlit library for a user-friendly experience.\n- Users can ask questions about the content of uploaded PDF files.\n\n### 2. PDF Processing\n\n- Extracts text from PDF files using PyPDF2.\n- Splits extracted text into manageable chunks.\n\n### 3. 
Embedding and Vectorization\n\n- Leverages Google Generative AI Embeddings for converting text into vectors.\n- Applies FAISS (Facebook AI Similarity Search) to create a vector store/index of text chunks.\n\n### 4. Conversational Chain\n\n- Implements a conversational chain for question-answering using the Gemini Generative AI model.\n- Configures the prompt template for providing context and framing questions.\n\n### 5. Workflow\n\n- Users upload PDF files and ask questions through the interface.\n- Text is extracted from PDFs, split into chunks, and converted into vectors.\n- The conversational chain processes user input, searches for similar text chunks, and generates responses.\n\n## Code Structure\n\n- **`main()` function**: Sets up the Streamlit interface and handles user input.\n- **`get_pdf_text(pdf_docs)` function**: Extracts text from PDF files.\n- **`get_text_chunks(text)` function**: Splits text into manageable chunks.\n- **`get_vector_store(text_chunks)` function**: Creates a vector store/index from text chunks.\n- **`get_conversational_chain()` function**: Configures the conversational chain for question-answering.\n- **`user_input(user_question)` function**: Processes user input and generates responses.\n- **Environment variables**: Utilizes the `dotenv` library to securely load the Google API key.\n\n## Usage\n\n1. **Upload PDFs**: Use the sidebar to upload one or more PDF files.\n2. **Ask a Question**: Enter your question in the provided text input.\n3. **Submit \u0026 Process**: Click the button to initiate the processing of PDFs and question-answering.\n4. **View Response**: The system generates a response based on the input question and the content of the PDFs.\n\n## Dependencies\n\n- Streamlit\n- PyPDF2\n- LangChain\n- Google Generative AI\n- FAISS\n- Dotenv\n\n## Setup\n\n1. **Install Dependencies**: Ensure the required Python packages are installed.\n2. 
**Set up Google API Key**: Store the Google API key in a `.env` file so that the `dotenv` library can load it.\n3. **Run the Application**: Execute the script to launch the Streamlit interface.\n\nFeel free to explore and extend this project to suit your requirements.\n\n---\n\n**PDFGemini** - Unleash the Power of Conversational PDF Exploration! 💬✨\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsurajsanap%2Fpdfgemini_bot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsurajsanap%2Fpdfgemini_bot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsurajsanap%2Fpdfgemini_bot/lists"}