Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jmservera/aithena
A system to extract knowledge from books
https://github.com/jmservera/aithena
Last synced: 8 days ago
JSON representation
A system to extract knowledge from books
- Host: GitHub
- URL: https://github.com/jmservera/aithena
- Owner: jmservera
- License: mit
- Created: 2023-07-16T21:43:59.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-06-08T22:41:44.000Z (7 months ago)
- Last Synced: 2024-06-09T00:06:40.503Z (7 months ago)
- Language: Python
- Size: 1.57 MB
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 16
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Aithena
A system to extract knowledge from books in pdf format. It uses the LLaMA embeddings
to generate a vector database of the books in Qdrant. The system is built using
Docker Compose with the following services:* [Qdrant](https://qdrant.tech/) - Vector database
* [LLaMA.cpp Server](https://llama-cpp-python.readthedocs.io/en/latest/) - Embeddings
* [LLaMA model](https://huggingface.co/openlm-research/open_llama_7b) - Embeddings model:
you can download the model and then generate the bin file running the llama-base
container `docker exec -it aithena-llama-base-1 python3 convert.py models/7B`.
* A document lister and document indexer that will go through all the pdf files
stored in an Azure Storage account and index them in Qdrant.## Usage
Configure the volumes folders in the `docker-compose.yml` file.
Run `docker-compose up` to start the system. The first time you run it, it will
fail because you won't have the LLaMA model downloaded. You can download it from
the [Hugging Face model hub](https://huggingface.co/openlm-research/open_llama_7b)
and then generate the bin file running the llama-base container `docker exec -it
aithena-llama-base-1 python3 convert.py models/7B`.# Reference
* https://python.langchain.com/docs/ecosystem/integrations/llamacpp