https://github.com/leodeveloper/qasimilaritysearchchroma
This project builds a Question and Answer (Q&A) similarity search application in Python, using Jupyter Notebook, the Chroma vector database, LangChain, large language models, RetrievalQA, ChatGPT, OpenAI Embeddings, PyPDFDirectoryLoader, and RecursiveCharacterTextSplitter.
- Host: GitHub
- URL: https://github.com/leodeveloper/qasimilaritysearchchroma
- Owner: leodeveloper
- Created: 2024-05-05T13:02:39.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-05-06T15:43:45.000Z (almost 2 years ago)
- Last Synced: 2024-05-07T14:27:15.983Z (almost 2 years ago)
- Language: Jupyter Notebook
- Size: 7.45 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Question and Answer Similarity Search Application
This project builds a Question and Answer (Q&A) similarity search application in Python. It combines Jupyter Notebook, the Chroma vector database, LangChain, large language models, RetrievalQA, ChatGPT, OpenAI Embeddings, PyPDFDirectoryLoader, and RecursiveCharacterTextSplitter, with the aim of delivering a complete solution for finding similarities between questions and answers.
## Features
- **Semantic Similarity**: Utilize state-of-the-art language models and embeddings to compute semantic similarity between questions and answers.
- **Vector Database**: Store and manage question-answer pairs efficiently using a vector database.
- **Intuitive Interface**: Develop a user-friendly interface, possibly with Jupyter Notebook, to facilitate easy interaction with the application.
- **Extensibility**: Design the application in a modular fashion to allow for easy integration of additional features and enhancements.
- **Efficient Search**: Implement efficient search algorithms to quickly retrieve similar questions and answers from the database.
- **PDF Support**: Incorporate functionality to handle PDF documents using PyPDFDirectoryLoader and RecursiveCharacterTextSplitter.
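
As a rough illustration of the semantic-similarity and search features listed above, the sketch below ranks stored texts by cosine similarity to a query vector. The three-dimensional vectors, the sample questions, and the `top_k` helper are hypothetical stand-ins; in the project itself the vectors would come from OpenAI Embeddings and the search would be delegated to Chroma.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the k (text, score) pairs most similar to query_vec."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in store]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy "vector store": made-up embeddings, for illustration only.
store = [
    ("How do I reset my password?", [0.9, 0.1, 0.0]),
    ("What is the refund policy?", [0.0, 0.8, 0.6]),
    ("How can I change my login password?", [0.8, 0.2, 0.1]),
]

query = [1.0, 0.0, 0.0]  # hypothetical embedding of a password-related query
results = top_k(query, store, k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

A brute-force scan like this is fine for a handful of entries; a vector database exists so that the same lookup stays fast as the collection grows.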
## Technologies Used
- **Python**: The primary programming language for the application logic.
- **Jupyter Notebook**: Possibly used for developing and presenting the application interface.
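- **Chroma Database**: Open-source vector database that stores text chunks together with their embeddings.
- **Vector Database**: Indexes question-answer pairs by their embedding vectors for fast similarity retrieval; Chroma fills this role here.
- **LangChain**: Framework that ties together document loading, text splitting, embeddings, the vector store, and the language model.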
- **Large Language Model**: Leverage a large language model such as ChatGPT to generate answers from the retrieved text; similarity itself is computed on embeddings.
- **RetrievalQA**: LangChain chain that retrieves relevant chunks from the vector store and passes them to the language model to answer a question.
- **OpenAI Embeddings**: Generate embeddings for text data to represent semantic meaning.
- **PyPDFDirectoryLoader**: LangChain document loader that reads the text of every PDF file in a directory.
- **RecursiveCharacterTextSplitter**: LangChain text splitter that breaks long text into smaller, overlapping chunks suitable for embedding.
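
One way the components above might fit together is sketched below. This is an assumption about the overall flow, not the notebook's actual code: the function name `build_qa_chain`, the `persist_dir` default, the chunk sizes, and `k=3` are all illustrative, and the import paths shown match recent LangChain package layouts but differ between releases. Running it requires the `langchain`, `langchain-community`, `langchain-openai`, `langchain-text-splitters`, `chromadb`, and `pypdf` packages plus an `OPENAI_API_KEY` environment variable.

```python
def build_qa_chain(pdf_dir, persist_dir="chroma_db"):
    """Load PDFs, split them, embed the chunks into Chroma, and return a RetrievalQA chain.

    Hypothetical helper for illustration; parameter values are assumptions.
    """
    # Imports are deferred so that merely defining this function does not
    # require the LangChain packages to be installed.
    from langchain_community.document_loaders import PyPDFDirectoryLoader
    from langchain_community.vectorstores import Chroma
    from langchain_openai import ChatOpenAI, OpenAIEmbeddings
    from langchain_text_splitters import RecursiveCharacterTextSplitter
    from langchain.chains import RetrievalQA

    # 1. Read every PDF in the directory into LangChain Document objects.
    documents = PyPDFDirectoryLoader(pdf_dir).load()

    # 2. Break the documents into overlapping chunks small enough to embed.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_documents(documents)

    # 3. Embed each chunk and store it in a Chroma collection on disk.
    vector_store = Chroma.from_documents(
        chunks, OpenAIEmbeddings(), persist_directory=persist_dir
    )

    # 4. Wire the retriever and the chat model into a question-answering chain.
    return RetrievalQA.from_chain_type(
        llm=ChatOpenAI(),
        chain_type="stuff",  # put all retrieved chunks into a single prompt
        retriever=vector_store.as_retriever(search_kwargs={"k": 3}),
    )
```

With the dependencies in place, a call such as `build_qa_chain("pdfs/").invoke({"query": "What does the document say about refunds?"})["result"]` would return the generated answer; the directory name and the question here are placeholders.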
## Installation
1. Clone the repository: