https://github.com/leodeveloper/qasimilaritysearchchroma

This project is dedicated to creating a robust Question and Answer (Q&A) similarity search application using Python including Jupyter Notebook, Chroma Database, Vector Database, LangChain, large language models, RetrievalQA, ChatGPT, OpenAI Embeddings, PyPDFDirectoryLoader, and RecursiveCharacterTextSplitter

# Question and Answer Similarity Search Application

This project builds a Question and Answer (Q&A) similarity search application in Python. It combines Jupyter Notebook, the Chroma vector database, LangChain, a large language model (ChatGPT), RetrievalQA, OpenAI Embeddings, PyPDFDirectoryLoader, and RecursiveCharacterTextSplitter to find questions and answers that are semantically similar to one another.

## Features

- **Semantic Similarity**: Use language models and embeddings to compute semantic similarity between questions and answers.
- **Vector Database**: Store and manage question-answer pairs in a vector database.
- **Intuitive Interface**: Provide a user-friendly interface, possibly with Jupyter Notebook, for interacting with the application.
- **Extensibility**: Keep the application modular so that additional features and enhancements are easy to integrate.
- **Efficient Search**: Retrieve similar questions and answers from the database quickly.
- **PDF Support**: Incorporate functionality to handle PDF documents using PyPDFDirectoryLoader and RecursiveCharacterTextSplitter.
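
The idea behind the semantic similarity and search features can be sketched without any external service. The example below is a minimal illustration and not the project's code: `toy_embed` is a hypothetical word-count stand-in for OpenAI Embeddings, and `TinyVectorStore` is a hypothetical in-memory stand-in for Chroma. Only the ranking by cosine similarity reflects what a vector database does conceptually.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a real embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory stand-in for a vector database such as Chroma."""

    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def add(self, text):
        self._items.append((text, toy_embed(text)))

    def search(self, query, k=2):
        """Return the k stored texts most similar to the query, best first."""
        query_vec = toy_embed(query)
        scored = [(cosine_similarity(query_vec, vec), text)
                  for text, vec in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = TinyVectorStore()
store.add("how do i reset my password")
store.add("what is the refund policy")
store.add("how do i change my email address")

# Each result is a (score, text) pair.
results = store.search("reset password", k=1)
print(results)
```

A real embedding model would also match paraphrases that share no words with the query, which this word-count stand-in cannot do.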

## Technologies Used

- **Python**: The primary programming language for the application logic.
- **Jupyter Notebook**: Possibly used for developing and presenting the application interface.
- **Chroma Database**: Open-source vector database used to store document embeddings.
- **Vector Database**: Stores question-answer pairs as vectors for fast similarity-based retrieval.
- **LangChain**: Framework for composing LLM applications; it supplies the document loader, text splitter, and RetrievalQA chain listed below.
- **Large Language Model**: A large language model such as ChatGPT generates responses from the retrieved text; similarity itself is computed from the embeddings.
- **RetrievalQA**: LangChain chain that retrieves relevant documents and passes them to the language model to answer a question.
- **OpenAI Embeddings**: Generate embeddings for text data to represent semantic meaning.
- **PyPDFDirectoryLoader**: LangChain document loader that reads the PDF files in a directory.
- **RecursiveCharacterTextSplitter**: LangChain text splitter that breaks text into smaller chunks by trying a list of separators in order.
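
To make the splitting step concrete, here is a simplified sketch of recursive character splitting in plain Python. It is not LangChain's implementation: the function name `recursive_split` and its parameters are hypothetical, and it omits chunk overlap. It only illustrates the general idea of trying coarse separators first (paragraphs, then lines, then words) and falling back to a hard cut.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split text into chunks of at most chunk_size characters.

    Simplified illustration: no chunk overlap; separators are dropped at
    split points and re-inserted only when small pieces are merged.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if len(text) <= chunk_size:
        return [text] if text else []

    if not separators or separators[0] == "":
        # Last resort: hard cut at a fixed width.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        if not part:
            continue
        if len(part) > chunk_size:
            # Flush what has been merged so far, then split the oversized
            # part with the next, finer separator.
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(recursive_split(part, chunk_size, finer))
            continue
        # Greedily merge neighbouring pieces while they still fit.
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            chunks.append(current)
            current = part
    if current:
        chunks.append(current)
    return chunks


sample = ("First paragraph about Chroma.\n\n"
          "Second paragraph about embeddings and retrieval.")
chunks = recursive_split(sample, chunk_size=40)
for chunk in chunks:
    print(repr(chunk))
```

In the application itself the resulting chunks would be embedded and stored in the vector database; smaller chunks tend to give more focused matches at the cost of less surrounding context.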

## Installation

1. Clone the repository: