https://github.com/langchain-ai/rag-from-scratch
Last synced: 4 days ago
- Host: GitHub
- URL: https://github.com/langchain-ai/rag-from-scratch
- Owner: langchain-ai
- Created: 2024-01-31T01:23:48.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-07-09T21:45:44.000Z (9 months ago)
- Last Synced: 2025-04-11T06:17:54.289Z (4 days ago)
- Language: Jupyter Notebook
- Size: 3.17 MB
- Stars: 3,864
- Watchers: 41
- Forks: 1,135
- Open Issues: 35
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- jimsghstars - langchain-ai/rag-from-scratch - (Jupyter Notebook)
- awesome-llm-and-aigc - langchain-ai/rag-from-scratch - Retrieval augmented generation (RAG) is a general methodology for connecting LLMs with external data sources. These notebooks accompany a video series that builds up an understanding of RAG from scratch, starting with the basics of indexing, retrieval, and generation. (Summary)
README
# RAG From Scratch
LLMs are trained on a large but fixed corpus of data, limiting their ability to reason about private or recent information. Fine-tuning is one way to mitigate this, but is often [not well-suited for factual recall](https://www.anyscale.com/blog/fine-tuning-is-for-form-not-facts) and [can be costly](https://www.glean.com/blog/how-to-build-an-ai-assistant-for-the-enterprise).
Retrieval augmented generation (RAG) has emerged as a popular and powerful mechanism to expand an LLM's knowledge base, using documents retrieved from an external data source to ground the LLM generation via in-context learning.
These notebooks accompany a [video playlist](https://youtube.com/playlist?list=PLfaIDFEXuae2LXbO1_PKyVJiQ23ZztA0x&feature=shared) that builds up an understanding of RAG from scratch, starting with the basics of indexing, retrieval, and generation.
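
To make the three stages concrete, here is a minimal sketch of indexing, retrieval, and generation in plain Python. It is not taken from the notebooks: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the sample documents are made up, and the helper names (`build_index`, `retrieve`, `build_prompt`) are hypothetical. The generation step is left as a prompt string that would be passed to whichever LLM you use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(documents):
    """Indexing: store each document alongside its vector."""
    return [(doc, embed(doc)) for doc in documents]


def retrieve(index, question, k=2):
    """Retrieval: return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(question, context_docs):
    """Generation input: place the retrieved documents in the LLM's context."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Made-up example data, for illustration only.
documents = [
    "The quarterly report was published on Monday.",
    "Our office cafeteria serves lunch from noon to two.",
    "The deployment pipeline runs tests before every release.",
]

index = build_index(documents)
question = "When does the cafeteria serve lunch?"
top_docs = retrieve(index, question, k=1)
prompt = build_prompt(question, top_docs)
print(prompt)  # this string would be sent to an LLM of your choice
```

In a real pipeline the count vectors would typically be replaced by dense embeddings stored in a vector store, but the flow stays the same: index once, retrieve per question, then generate with the retrieved text in context.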
