https://github.com/obaskly/docai
GPT-3 based Question Answering System that reads text from PDF, DOCX, or TXT files and answers questions based on the content.
ai chatbot chatgpt gpt gui python
- Host: GitHub
- URL: https://github.com/obaskly/docai
- Owner: obaskly
- Created: 2023-04-08T23:04:37.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-04-09T02:16:31.000Z (almost 2 years ago)
- Last Synced: 2024-11-11T01:35:24.467Z (2 months ago)
- Topics: ai, chatbot, chatgpt, gpt, gui, python
- Language: Python
- Homepage:
- Size: 22.5 KB
- Stars: 4
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Docai
Docai is a GPT-3 based question answering system that answers questions from the content of PDF, DOCX, and TXT files.
Key Features • How To Use • Requirements • Copyright

## Key Features
* File handling
- The script supports PDF, DOCX, and TXT files
- It reads their content with `pdfplumber`, `docx` (the python-docx package), and the built-in `open()` function
* GPT-3 integration
- The script uses the OpenAI GPT-3 model, specifically the `text-davinci-003` engine, to generate answers to questions.
* Confidence scoring
- The script calculates confidence scores for the generated answers using log probabilities returned by the GPT-3 API.
* Concurrency
- The script uses `concurrent.futures.ThreadPoolExecutor` to process questions concurrently, which can reduce the total waiting time.
* Text preprocessing
- The script splits the input document into chunks to fit within GPT-3's token limit, and post-processes the answers to remove duplicate sentences.
* Saving conversation history
- The script allows users to save the conversation history to a text file.
* Caching
- The script uses the `lru_cache` decorator to cache the answers generated by GPT-3, so a repeated question returns the cached answer instead of triggering another API call.
* GUI
- The script provides a graphical user interface built with the tkinter library and `ttkthemes`, allowing users to select a file, enter a question, view the answer, and save the conversation history.

## How To Use
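Before the steps, a rough picture of what the script does with a document may help. The sketch below illustrates the text preprocessing and caching ideas from Key Features using only the standard library. It is not the script's actual code: the function names, the word-based chunk size (a crude stand-in for GPT-3 tokens), and the placeholder `cached_answer` are all assumptions made for illustration.

```python
from __future__ import annotations

import re
from functools import lru_cache


def split_into_chunks(text: str, max_words: int = 1500) -> list[str]:
    """Split text into pieces of at most max_words whitespace-separated words.

    Assumption: word count is only a rough proxy for the model's token limit.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def remove_duplicate_sentences(answer: str) -> str:
    """Drop repeated sentences, keeping the first occurrence and the original order."""
    sentences = re.split(r"(?<=[.!?])\s+", answer.strip())
    seen = set()
    kept = []
    for sentence in sentences:
        key = sentence.lower()
        if sentence and key not in seen:
            seen.add(key)
            kept.append(sentence)
    return " ".join(kept)


@lru_cache(maxsize=128)
def cached_answer(question: str, chunk: str) -> str:
    """Hypothetical stand-in for the GPT-3 call; identical arguments are served from the cache."""
    return f"(answer to {question!r} from a {len(chunk.split())}-word chunk)"
```

With helpers like these, a long document would be split once, each chunk passed to the model, and the combined answer cleaned with `remove_duplicate_sentences` before it is displayed.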
- Put your API key on line 45 of the script
- Run the script
- Select your file
- Enter your question and click submit

It's as simple as that.
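Behind the submit button, the confidence scoring and concurrency features listed under Key Features might look roughly like the sketch below. The README does not say how log probabilities are turned into a score, so the geometric-mean formula here is an assumption, and `fake_model` is a hypothetical stand-in for the real API call so the example runs offline.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def confidence_from_logprobs(token_logprobs):
    """Assumed scoring rule: geometric mean of token probabilities, 0.0 if empty."""
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def fake_model(question, document):
    """Hypothetical stand-in for the API: returns (answer, token log-probabilities)."""
    found = question.lower() in document.lower()
    answer = "mentioned in the document" if found else "not found"
    logprobs = [-0.1, -0.2] if found else [-2.0, -3.0]
    return answer, logprobs


def answer_questions(questions, document, ask=fake_model, max_workers=4):
    """Ask all questions concurrently; results come back in the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda q: ask(q, document), questions))
    return [
        (question, answer, confidence_from_logprobs(logprobs))
        for question, (answer, logprobs) in zip(questions, results)
    ]
```

Threads suit this kind of work because each request spends most of its time waiting on the network, and `pool.map` keeps the answers aligned with the questions that produced them.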
> **Note**
> We will provide an executable version soon.

## Requirements
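`pip install openai pdfplumber python-docx`

The GUI also uses `ttkthemes`, which the command above does not install; it is available with `pip install ttkthemes`.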
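A quick way to confirm the packages are installed is to check that each module can be found before launching the GUI. The helper below is a sketch and not part of Docai: the module list is taken from the libraries this README mentions (python-docx is imported as `docx`), and `missing_modules` is a hypothetical name.

```python
from importlib.util import find_spec

# Import names, not PyPI names: the python-docx package is imported as "docx".
REQUIRED_MODULES = ["openai", "pdfplumber", "docx", "tkinter", "ttkthemes"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```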
## Copyright
All rights reserved to Bropocalypse Team.