https://github.com/harimkang/docsense
An intelligent document assistant powered by Open-Source Large Language Models
document-qa llm nlp qwen qwen2
- Host: GitHub
- URL: https://github.com/harimkang/docsense
- Owner: harimkang
- License: mit
- Created: 2024-12-17T09:49:25.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-12-17T12:16:55.000Z (10 months ago)
- Last Synced: 2025-04-12T08:58:03.592Z (6 months ago)
- Topics: document-qa, llm, nlp, qwen, qwen2
- Language: Python
- Homepage: https://harimkang.github.io/docsense/
- Size: 39.1 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# DocSense 📚
An intelligent document assistant powered by Open-Source Large Language Models 🤖
DocSense is a powerful tool that helps you interact with your documents using natural language. It leverages the open-source Qwen language model (with plans to support more open-source models) to understand and answer questions about your documents with high accuracy and context awareness, all completely free to use.
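To give a feel for the retrieval step that document question answering of this kind relies on, here is a minimal, self-contained sketch. It is not DocSense's code: the `embed`, `cosine`, and `top_k` helpers and the sample chunks are hypothetical, and a toy bag-of-words vector stands in for the model embeddings and FAISS index a real pipeline would use.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical document chunks, for illustration only.
chunks = [
    "Install the package with pip before first use.",
    "The index command builds a vector index from a documents directory.",
    "The license of this project is MIT.",
]

# The best-matching chunk would be passed to the language model as context.
context = top_k("how do I build the index", chunks, k=1)
print(context)
```

In a real setup the ranked chunks would be handed to the language model together with the question; the sketch stops at retrieval to stay dependency-free.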
## Features ✨
- 🔍 Advanced semantic search using FAISS
- 💡 Intelligent question answering with open-source LLMs (currently Qwen)
- 📝 Support for multiple document formats (txt, md, rst, etc.)
- ⚡ GPU acceleration for faster processing
- 🔄 Batch processing for memory efficiency
- 💾 Persistent vector storage
## Installation 🛠️
### CPU Version
```bash
pip install docsense
```
### GPU Version (Recommended)
First, install PyTorch with CUDA support:
```bash
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
```
Then install FAISS with GPU support:
```bash
conda install -c conda-forge faiss-gpu
```
Finally, install DocSense:
```bash
pip install docsense
```
## Usage 🚀
### Creating Document Index
Index your documents directory:
```bash
docsense index /path/to/your/documents
```
### Asking Questions
Ask a question about your documents:
```bash
docsense ask "How to use this library?"
```
### Interactive Mode
Start an interactive session for multiple questions:
```bash
docsense daemon
```
### Command Line Options
All commands support the following options:
- `--model-name`: Specify the Qwen model to use (default: "Qwen/Qwen2-7B")
- `--device`: Choose computing device ("cuda" or "cpu", default: "cuda")
- `--index-path`: Set custom path for the vector index

Example with options:
```bash
docsense index /path/to/your/documents --model-name "Qwen/Qwen2-7B" --device "cuda" --index-path /path/to/your/index
```
## License 📄
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Star History 🌟
[Star History Chart](https://star-history.com/#harimkang/docsense&Date)