# Local Llama

This project enables you to chat with your PDFs, TXT files, or Docx files entirely offline, free from OpenAI dependencies. It's an evolution of the gpt_chatwithPDF project, now leveraging local LLMs for enhanced privacy and offline functionality.

## Features

- Offline operation: Run in airplane mode
- Local LLM integration: Uses Ollama for improved performance
- Multiple file format support: PDF, TXT, DOCX, MD
- Persistent vector database: Reusable indexed documents
- Streamlit-based user interface

## New Updates

- Ollama integration for significant performance improvements
- Uses nomic-embed-text and llama3:8b models (can be changed to your liking)
- Upgraded to Haystack 2.0
- Persistent Chroma vector database to enable reuse of previously uploaded docs

## Installation

1. Install Ollama from https://ollama.ai/download
2. Clone this repository
3. Install dependencies:
```
pip install -r requirements.txt
```
4. Pull required Ollama models (an optional verification snippet follows these steps):
```
ollama pull nomic-embed-text
ollama pull llama3:8b
```
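Before launching the app, you can optionally round-trip both models through the local Ollama server to confirm everything is installed. This is a minimal sketch, not part of the project itself: it assumes the `ollama` Python client is installed separately (`pip install ollama`) and that the server is running on its default port.

```python
# Optional sanity check (not part of the project): round-trip both models
# through the local Ollama server before launching the Streamlit app.
# Requires the Ollama Python client: pip install ollama
import ollama

# Embed a test string with the embedding model pulled above.
emb = ollama.embeddings(model="nomic-embed-text", prompt="hello world")
print(f"nomic-embed-text OK, embedding length: {len(emb['embedding'])}")

# Generate a short completion with the chat model pulled above.
gen = ollama.generate(model="llama3:8b", prompt="Say 'ready' and nothing else.")
print(f"llama3:8b OK, response: {gen['response'].strip()}")
```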

## Usage

1. Start the Ollama server:
```
ollama serve
```
2. Run the Streamlit app:
```
python -m streamlit run local_llama_v3.py
```
3. Upload your documents and start chatting!

## How It Works

1. Document Indexing: Uploaded files are processed, split, and embedded using Ollama.
2. Vector Storage: Embeddings are stored in a local Chroma vector database.
3. Query Processing: User queries are embedded and relevant document chunks are retrieved.
4. Response Generation: Ollama generates responses based on the retrieved context and chat history (a sketch of this flow follows).
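The snippet below is a minimal, self-contained sketch of that index → retrieve → generate flow. It deliberately uses the `chromadb` and `ollama` Python clients directly rather than the Haystack 2.0 pipeline the app is actually built on, and the collection name, chunk size, prompts, and file handling are illustrative assumptions, not the project's code.

```python
# Minimal sketch of the index -> retrieve -> generate flow described above.
# NOTE: uses the chromadb and ollama Python clients directly, not the
# project's Haystack 2.0 pipeline; names and parameters are illustrative.
import chromadb
import ollama

EMBED_MODEL = "nomic-embed-text"
CHAT_MODEL = "llama3:8b"

# Steps 1-2: document indexing and persistent vector storage with Chroma.
client = chromadb.PersistentClient(path="./chroma_db")  # survives restarts
collection = client.get_or_create_collection("local_llama_docs")

def index_text(doc_id: str, text: str, chunk_size: int = 500) -> None:
    """Split raw text into chunks, embed each with Ollama, store in Chroma."""
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    for n, chunk in enumerate(chunks):
        embedding = ollama.embeddings(model=EMBED_MODEL, prompt=chunk)["embedding"]
        collection.add(ids=[f"{doc_id}-{n}"], embeddings=[embedding], documents=[chunk])

# Step 3: query processing - embed the question, fetch the most similar chunks.
def retrieve(question: str, top_k: int = 3) -> list[str]:
    query_embedding = ollama.embeddings(model=EMBED_MODEL, prompt=question)["embedding"]
    results = collection.query(
        query_embeddings=[query_embedding],
        n_results=min(top_k, max(collection.count(), 1)),
    )
    return results["documents"][0]

# Step 4: response generation from the retrieved context.
def answer(question: str) -> str:
    context = "\n\n".join(retrieve(question))
    response = ollama.chat(
        model=CHAT_MODEL,
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response["message"]["content"]

if __name__ == "__main__":
    index_text("example", "Local Llama lets you chat with your documents offline.")
    print(answer("What does Local Llama do?"))
```

The actual app builds the same stages as Haystack components (converters, splitter, Ollama embedders, a Chroma document store, and an Ollama generator) wired into pipelines, with Streamlit handling uploads and chat history.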

## License

This project is licensed under the Apache 2.0 License.

## Acknowledgements

- Ollama team for their excellent local LLM solution
- Haystack for providing the RAG framework
- The-Bloke for the GGUF models