Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/run-llama/llama_index
LlamaIndex is a data framework for your LLM applications
agents application data fine-tuning framework llamaindex llm rag vector-database
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/run-llama/llama_index
- Owner: run-llama
- License: mit
- Created: 2022-11-02T04:24:54.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-04-18T06:00:55.000Z (2 months ago)
- Last Synced: 2024-04-18T06:14:37.548Z (2 months ago)
- Topics: agents, application, data, fine-tuning, framework, llamaindex, llm, rag, vector-database
- Language: Python
- Homepage: https://docs.llamaindex.ai
- Size: 172 MB
- Stars: 30,699
- Watchers: 224
- Forks: 4,144
- Open Issues: 677
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Citation: CITATION.cff
- Security: SECURITY.md
Lists
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-llm-agents - Llama Index - Data framework for your LLM application. (Frameworks)
- awesome-ml-python-packages - LlamaIndex
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-stars - run-llama/llama_index
- awesome-LLM-resourses - LlamaIndex
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-ChatGPT-repositories - llama_index - LlamaIndex (formerly GPT Index) is a data framework for your LLM applications (Langchain)
- awesome - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- my-awesome - run-llama/llama_index - LlamaIndex (formerly GPT Index) is a data framework for your LLM applications (Python)
- awesome-stars - run-llama/llama_index - LlamaIndex (formerly GPT Index) is a data framework for your LLM applications (Python)
- my-awesome-stars - run-llama/llama_index - LlamaIndex (formerly GPT Index) is a data framework for your LLM applications (Python)
- awesome-from-stars - run-llama/llama_index
- awesome-stars - llama_index - llama | 32669 | (Python)
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-starts - run-llama/llama_index - LlamaIndex (formerly GPT Index) is a data framework for your LLM applications (Python)
- awesome-stars - llama_index - llama | 26962 | (Python)
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- my-awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (llm)
- awesome-starred - llama_index - llama | 32669 | (Python)
- my-awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- my-awesome-starred - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-stars - llama_index - llama | 32670 | (Python)
- awesome-stars - run-llama/llama_index - `★32724` LlamaIndex is a data framework for your LLM applications (Python)
- awesome-LLMs-finetuning - LlamaIndex
- my-awesome-starred - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- stars - run-llama/llama_index - LlamaIndex (formerly GPT Index) is a data framework for your LLM applications (Python)
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- awesome-llm-json - LlamaIndex - defined Pydantic programs for specific output types. (Python Libraries)
- awesome-llm-and-aigc - LlamaIndex - LlamaIndex is a data framework for your LLM applications. [docs.llamaindex.ai](https://docs.llamaindex.ai/) (Summary)
- awesome-stars - llama_index - llama | 32652 | (Python)
- awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- Awesome-RAG - LlamaIndex
- awesome-stars - llama_index - llama | 32688 | (Python)
- my-awesome-stars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- jimsghstars - run-llama/llama_index - LlamaIndex is a data framework for your LLM applications (Python)
- my-awesome - run-llama/llama_index - tuning,framework,llamaindex,llm,rag,vector-database pushed_at:2024-06 star:32.6k fork:4.5k LlamaIndex is a data framework for your LLM applications (Python)
README
# 🗂️ LlamaIndex 🦙
[![PyPI - Downloads](https://img.shields.io/pypi/dm/llama-index)](https://pypi.org/project/llama-index/)
[![GitHub contributors](https://img.shields.io/github/contributors/jerryjliu/llama_index)](https://github.com/jerryjliu/llama_index/graphs/contributors)
[![Discord](https://img.shields.io/discord/1059199217496772688)](https://discord.gg/dGcwcsnxhU)

LlamaIndex (GPT Index) is a data framework for your LLM application.
PyPI:
- LlamaIndex: https://pypi.org/project/llama-index/.
- GPT Index (duplicate): https://pypi.org/project/gpt-index/.

LlamaIndex.TS (TypeScript/JavaScript): https://github.com/run-llama/LlamaIndexTS.
Documentation: https://docs.llamaindex.ai/en/stable/.
Twitter: https://twitter.com/llama_index.
Discord: https://discord.gg/dGcwcsnxhU.
### Ecosystem
- LlamaHub (community library of data loaders): https://llamahub.ai
- LlamaLab (cutting-edge AGI projects using LlamaIndex): https://github.com/run-llama/llama-lab

## 🚀 Overview
**NOTE**: This README is not updated as frequently as the documentation. Please check out the documentation above for the latest updates!
### Context
- LLMs are a phenomenal piece of technology for knowledge generation and reasoning. They are pre-trained on large amounts of publicly available data.
- How do we best augment LLMs with our own private data?

We need a comprehensive toolkit to help perform this data augmentation for LLMs.
### Proposed Solution
That's where **LlamaIndex** comes in. LlamaIndex is a "data framework" to help you build LLM apps. It provides the following tools:
- Offers **data connectors** to ingest your existing data sources and data formats (APIs, PDFs, docs, SQL, etc.)
- Provides ways to **structure your data** (indices, graphs) so that this data can be easily used with LLMs.
- Provides an **advanced retrieval/query interface over your data**: Feed in any LLM input prompt, get back retrieved context and knowledge-augmented output.
- Allows easy integrations with your outer application framework (e.g. with LangChain, Flask, Docker, ChatGPT, anything else).

LlamaIndex provides tools for both beginner users and advanced users. Our high-level API allows beginner users to use LlamaIndex to ingest and query their data in 5 lines of code. Our lower-level APIs allow advanced users to customize and extend any module (data connectors, indices, retrievers, query engines, reranking modules) to fit their needs.

## 💡 Contributing
Interested in contributing? See our [Contribution Guide](CONTRIBUTING.md) for more details.
## 📄 Documentation
Full documentation can be found here: https://docs.llamaindex.ai/en/latest/.
Please check it out for the most up-to-date tutorials, how-to guides, references, and other resources!
## 💻 Example Usage
```
pip install llama-index
```

Examples are in the `examples` folder. Indices are in the `indices` folder (see list of indices below).
To build a simple vector store index using OpenAI:
```python
import os

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("YOUR_DATA_DIRECTORY").load_data()
index = VectorStoreIndex.from_documents(documents)
```

To build a simple vector store index using non-OpenAI LLMs, e.g. Llama 2 hosted on [Replicate](https://replicate.com/), where you can easily create a free trial API token:
```python
import os

os.environ["REPLICATE_API_TOKEN"] = "YOUR_REPLICATE_API_TOKEN"

from llama_index.llms import Replicate

llama2_7b_chat = "meta/llama-2-7b-chat:8e6975e5ed6174911a6ff3d60540dfd4844201974602551e10e9e87ab143d81e"
llm = Replicate(
    model=llama2_7b_chat,
    temperature=0.01,
    additional_kwargs={"top_p": 1, "max_new_tokens": 300},
)

# set tokenizer to match LLM
from llama_index import set_global_tokenizer
from transformers import AutoTokenizer

set_global_tokenizer(
    AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-chat-hf").encode
)

from llama_index.embeddings import HuggingFaceEmbedding
from llama_index import ServiceContext

embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
service_context = ServiceContext.from_defaults(
    llm=llm, embed_model=embed_model
)

from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("YOUR_DATA_DIRECTORY").load_data()
index = VectorStoreIndex.from_documents(
    documents, service_context=service_context
)
```

To query:
```python
query_engine = index.as_query_engine()
# keep the response so the answer can be inspected or printed
response = query_engine.query("YOUR_QUESTION")
print(response)
```

By default, data is stored in-memory.
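As a rough mental model of the build-and-query flow above, the toy sketch below keeps an index in memory and retrieves the best-matching document by cosine similarity. It is not LlamaIndex's implementation: `ToyVectorIndex`, `embed`, and `cosine` are hypothetical names introduced for illustration, and the bag-of-words "embedding" is an assumption standing in for a real embedding model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding' (assumption): a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorIndex:
    """Hypothetical in-memory index: stores each document with its vector."""

    def __init__(self, documents):
        self.entries = [(doc, embed(doc)) for doc in documents]

    def retrieve(self, question, top_k=1):
        """Return the top_k documents most similar to the question."""
        q = embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [doc for doc, _ in ranked[:top_k]]


docs = [
    "LlamaIndex offers data connectors for APIs, PDFs, and SQL.",
    "Indices structure your data so an LLM can use it.",
    "Storage can be persisted to disk and reloaded later.",
]
index = ToyVectorIndex(docs)
top = index.retrieve("How do I persist storage to disk?")
print(top)
```

Here `top` holds the sentence about persistence, since it shares the most terms with the question. A real vector store would replace the term counts with dense embeddings from a model and pass the retrieved context to an LLM to produce the answer.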
To persist to disk (under `./storage`):

```python
index.storage_context.persist()
```

To reload from disk:
```python
from llama_index import StorageContext, load_index_from_storage

# rebuild storage context
storage_context = StorageContext.from_defaults(persist_dir="./storage")
# load index
index = load_index_from_storage(storage_context)
```

## 🔧 Dependencies
The main third-party package requirements are `tiktoken`, `openai`, and `langchain`.
All requirements should be contained within the `setup.py` file.
To run the package locally without building the wheel, simply run:

```bash
pip install poetry
poetry install --with dev
```

## 📖 Citation
Reference to cite if you use LlamaIndex in a paper:
```
@software{Liu_LlamaIndex_2022,
author = {Liu, Jerry},
doi = {10.5281/zenodo.1234},
month = {11},
title = {{LlamaIndex}},
url = {https://github.com/jerryjliu/llama_index},
year = {2022}
}
```