Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-azure-openai-llm
"Awesome-AOAI-LLM: a curated list of Azure OpenAI & Large Language Models" 🔎References to Azure OpenAI, 🦙Large Language Models, and related 🌌 services and 🎋libraries.
https://github.com/kimtth/awesome-azure-openai-llm
-
**Section 1: RAG, LlamaIndex, and Vector Storage**
-
**What is the RAG (Retrieval-Augmented Generation)?**
-
**Retrieval-Augmented Generation: Research Papers**
- Self-RAG - Components include the `Generator model M`: the main language model that generates task outputs and reflection tokens. It leverages the data labeled by the critic model during training. `Retriever model R`: retrieves relevant passages. The LM decides whether external passages (retriever) are needed for text generation. [git](https://github.com/AkariAsai/self-rag) [17 Oct 2023]
- Retrieval Augmented Generation or Long-Context LLMs? - Long-context consistently outperforms RAG in terms of average performance. However, RAG's significantly lower cost remains a distinct advantage. [23 Jul 2024]
- RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval - Retrieval-augmented language models built by constructing a recursive tree structure from documents. [git](https://github.com/run-llama/llama_index/blob/main/llama-index-packs/llama-index-packs-raptor/README.md) `pip install llama-index-packs-raptor` / [git](https://github.com/profintegra/raptor-rag) [31 Jan 2024]
- Astute RAG
- A Survey on Retrieval-Augmented Text Generation - Surveys retrieval-augmented text generation, highlighting its advantages and state-of-the-art performance in many NLP tasks. These tasks include dialogue response generation, machine translation, summarization, paraphrase generation, text style transfer, and data-to-text generation. [2 Feb 2022]
- Retrieval meets Long Context LLMs - Retrieval-augmentation significantly improves the performance of 4K context LLMs. Perhaps surprisingly, this simple retrieval-augmented baseline can perform comparably to 16K long context LLMs. [4 Oct 2023]
- FreshLLMs - augmented prompting methods such as Self-Ask (Press et al., 2022) as well as commercial systems such as Perplexity.AI. [git](https://www.github.com/freshllms/freshqa) [5 Oct 2023]
- RAG for LLMs - Retrieval-Augmented Generation for Large Language Models: A Survey: `Three paradigms of RAG: Naive RAG > Advanced RAG > Modular RAG`
- Benchmarking Large Language Models in Retrieval-Augmented Generation - The Retrieval-Augmented Generation Benchmark (RGB) is proposed to assess LLMs on 4 key abilities. [4 Sep 2023]
- Active Retrieval Augmented Generation - Forward-Looking Active REtrieval augmented generation (FLARE): FLARE iteratively generates a temporary next sentence and checks whether it contains low-probability tokens. If so, the system retrieves relevant documents and regenerates the sentence. Low-probability tokens are determined from `token_logprobs in OpenAI API response`. [git](https://github.com/jzbjyb/FLARE/blob/main/src/templates.py) [11 May 2023]
- Retrieval-Augmentation for Long-form Question Answering - existent information.` [18 Oct 2023]
- INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning - document relationship understanding. The dataset is designed for instruction tuning, a method that fine-tunes LLMs on natural language instructions. [git](https://github.com/DaoD/INTERS) [12 Jan 2024]
- RAG vs Fine-tuning
- The Power of Noise: Redefining Retrieval for RAG Systems - 5 relevant docs + some amount of random noise to the LLM context maximizes the accuracy of the RAG. [26 Jan 2024]
- Corrective Retrieval Augmented Generation (CRAG) - ai/langgraph/blob/main/examples/rag/langgraph_crag.ipynb)
- OP-RAG: Order-preserve RAG
- Graph Retrieval-Augmented Generation: A Survey
- Retrieval Augmented Generation (RAG) and Beyond - Fine-tuning LLMs for specialized tasks. [23 Sep 2024]
- Adaptive-RAG - Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity [git](https://github.com/starsuzi/Adaptive-RAG) [21 Mar 2024]
- PlanRAG - Plan -> Retrieve -> Make a decision (PlanRAG) [git](https://github.com/myeon9h/PlanRAG) [18 Jun 2024]
- RECOMP: Improving Retrieval-Augmented LMs with Compressors - Targets retrieval-augmented language models (RALMs). Presents two compressors: an `extractive compressor`, which selects useful sentences from retrieved documents, and an `abstractive compressor`, which generates summaries by synthesizing information from multiple documents. Both compressors are trained. [6 Oct 2023]
- Searching for Best Practices in Retrieval-Augmented Generation
- CRAG: Comprehensive RAG Benchmark - Question-answer pairs and mock APIs to simulate web and Knowledge Graph (KG) search [ref](https://www.aicrowd.com/challenges/meta-comprehensive-rag-benchmark-kdd-cup-2024) [7 Jun 2024]
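The FLARE entry above describes a loop: draft a sentence, check its token probabilities, and retrieve and regenerate only when some token looks uncertain. A minimal sketch of one such step follows. It is not the FLARE repository's code: `draft` and `retrieve` are hypothetical callables standing in for the LLM call (returning tokens with log-probabilities) and the search backend, and the `0.4` threshold is an arbitrary illustrative value.

```python
import math
from typing import Callable, List, Tuple

Token = Tuple[str, float]  # (token text, log-probability)


def has_low_confidence(tokens: List[Token], threshold: float = 0.4) -> bool:
    """Return True if any token's probability falls below the threshold."""
    return any(math.exp(logprob) < threshold for _, logprob in tokens)


def flare_step(
    draft: Callable[[str], List[Token]],      # hypothetical: LLM call returning tokens + logprobs
    retrieve: Callable[[str], List[str]],     # hypothetical: search backend returning passages
    context: str,
    threshold: float = 0.4,                   # assumed value, tune per model
) -> str:
    """Generate one sentence; retrieve and regenerate if the draft is uncertain."""
    tokens = draft(context)
    sentence = "".join(text for text, _ in tokens)
    if has_low_confidence(tokens, threshold):
        # Use the tentative sentence as the retrieval query (an assumption of this sketch).
        passages = retrieve(sentence)
        augmented = "\n".join(passages) + "\n" + context
        tokens = draft(augmented)
        sentence = "".join(text for text, _ in tokens)
    return sentence
```

With a real model, the `(token, logprob)` pairs would come from the log-probability fields of the completion response; the exact field names depend on the API version in use.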
-
**RAG Pipeline & Advanced RAG**
- ref - to-improve-rag-peformance-advanced-rag-patterns-part2-0c84e2df66e6) [17 Oct 2023]
- 9 Effective Techniques To Boost Retrieval Augmented Generation (RAG) Systems: ReRank, Prompt Compression, Hypothetical Document Embedding (HyDE), Query Rewrite and Expansion, Enhance Data Quality, Optimize Index Structure, Add Metadata, Align Query with Documents, Mixed Retrieval (Hybrid Search) [2 Jan 2024]
- Evaluation with Ragas
- ref
- cite - > Tune Chunks -> Rerank & Classify -> Prompt Engineering. In `llama_index`... [Youtube](https://www.youtube.com/watch?v=ahnGLM-RC1Y)
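Mixed Retrieval (Hybrid Search), named in the techniques list above, needs some way to merge a keyword ranking with a vector ranking. The source does not say which method to use; one common choice is reciprocal rank fusion (RRF), sketched below under that assumption. The function name and the constant `k = 60` are illustrative and not taken from the linked article.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document ids into one fused ranking.

    Each document scores 1 / (k + rank) per list it appears in; scores are summed.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["a", "b", "c"]   # e.g. from a BM25 index
vector_hits = ["b", "d", "a"]    # e.g. from an embedding index
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents that rank well in both lists rise to the top, which is the usual motivation for hybrid search; a reranker can then be applied to the fused list.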
-
**Advanced RAG**
- git
- Contextual Retrieval
- Indexing optimization
- HyDE
- Graph RAG (by NebulaGraph) - graph.io/demo) [8 Sep 2023]
- LangChain: HypotheticalDocumentEmbedder - `query -> generate n hypothetical documents -> document embeddings (average of embeddings) -> retrieve -> final result.` [ref](https://www.jiang.jp/posts/20230510_hyde_detailed/index.html)
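The HyDE flow listed above (generate hypothetical documents, average their embeddings, retrieve with the averaged vector) can be sketched in plain Python. This is not LangChain's `HypotheticalDocumentEmbedder`: `generate` and `embed` are hypothetical callables standing in for an LLM and an embedding model, and the toy usage at the end only counts two words so the example stays self-contained.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Element-wise average of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hyde_retrieve(
    query: str,
    generate: Callable[[str], str],   # hypothetical: LLM that writes a plausible answer document
    embed: Callable[[str], Vector],   # hypothetical: embedding model
    corpus: List[str],
    n: int = 3,
    top_k: int = 1,
) -> List[str]:
    """Search with the averaged embedding of n generated documents instead of the raw query."""
    hypothetical_docs = [generate(query) for _ in range(n)]
    search_vector = mean_vector([embed(doc) for doc in hypothetical_docs])
    ranked = sorted(corpus, key=lambda doc: cosine(search_vector, embed(doc)), reverse=True)
    return ranked[:top_k]


# Toy stand-ins so the sketch runs without any model.
toy_embed = lambda text: [float(text.count("cat")), float(text.count("dog"))]
toy_generate = lambda query: "a cat sits on the mat"
hits = hyde_retrieve("pets that purr?", toy_generate, toy_embed,
                     ["dog park rules", "cat care guide"])
```

The query itself contains neither toy keyword, yet the generated document steers retrieval to the relevant entry, which is the intuition behind HyDE.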
-
**The Problem with RAG**
-
**RAG Solution Design & Application**
- ref
- ref
- RAG capabilities of LlamaIndex to QA about SEC 10-K & 10-Q documents - Full-stack application using LlamaIndex [Sep 2023]
- RAGxplorer
- Danswer
- llm-answer-engine - Perplexity-Inspired Answer Engine Using Next.js, Groq, Mixtral, LangChain, OpenAI, Brave & Serper [Mar 2024]
- turboseek
- quivr
- Cognita
- Perplexica - ai/marqo) / [txtai](https://github.com/neuml/txtai) / [Typesense](https://github.com/typesense/typesense) / [Morphic](https://github.com/miurla/morphic)
- AutoRAG
- RAGflow
- ref
- PaperQA2
- ref
- LightRAG
- ref
- MindSearch - Open-source AI Search Engine Framework [Jul 2024]
- ref
- ref
- Haystack - Production-ready LLM applications. [5 May 2020]
- RAGChecker - A Fine-grained Framework For Diagnosing RAG [git](https://github.com/amazon-science/RAGChecker) [15 Aug 2024]
- HybridRAG
- MedGraphRAG - graph-rag) [8 Aug 2024]
- ref
- ref
- ref
- ref
- "From Local to Global" GraphRAG with Neo4j and LangChain
- Advanced RAG Techniques - Retrieval-Augmented Generation (RAG) [Jul 2024]
- ref
- ref
- Galileo eBook - compressed.pdf) [Sep 2024]
- SWIRL AI Connect - Pilot. [Apr 2022]
- ref
- Advanced RAG with Azure AI Search and LlamaIndex
- RAG at scale
- LlamaIndex Building Performant RAG Applications for Production
- Advanced RAG on Hugging Face documentation using LangChain
- Papers with code
- RAGApp
- ref
- A Practical Approach to Retrieval Augmented Generation (RAG) Systems
- STORM - Wikipedia-like articles from scratch based on Internet search. [Mar 2024]
- FlashRAG
- Canopy - Open-source RAG framework and context engine built on top of the Pinecone vector database. [Aug 2023]
- kotaemon - Open-source clean & customizable RAG UI for chatting with your documents. [Mar 2024]
- ref
- Azure OpenAI chat baseline architecture in an Azure landing zone
- ref
- What AI Engineers Should Know about Search
- ref
- ref
- ref
- RAG Builder - Production-ready Retrieval-Augmented Generation (RAG) setup for your data. [Jun 2024]
- ref
- Learn RAG with Langchain
- Genie: Uber’s Gen AI On-Call Copilot
- Introduction to Large-Scale Similarity Search: HNSW, IVF, LSH
- 5 Chunking Strategies For RAG
- ref
- ref
- github:topic
- ref
- GraphRAG Implementation with LlamaIndex
- ref
-
**LlamaIndex**
- git - 2.ai/llamaindex-cli/) [Nov 2023] / `LlamaParse`: A unique parsing tool for intricate documents [git](https://github.com/run-llama/llama_parse) [Feb 2024]
- cite
- cite
- LlamaIndex Overview (Japanese)
- LlamaIndex Tutorial
- CallbackManager (Japanese) - tutorial-002-text-splitter/) [27 May 2023]
- ref
- ref - index.readthedocs.io/en/latest/index.html) / blog:[ref](https://www.llamaindex.ai/blog) / [git](https://github.com/run-llama/llama_index) [Nov 2022]
- Chat engine ReAct mode
- Fine-Tuning a Linear Adapter for Any Embedding Model - Fine-tuning the embeddings model requires you to reindex your documents. With this approach, you do not need to re-embed your documents. Simply transform the query instead. [7 Sep 2023]
- ref - index.readthedocs.io/en/latest/index.html): Docs / High-Level Concept: [ref](https://docs.llamaindex.ai/en/latest/getting_started/concepts.html): Concepts / [git](https://github.com/run-llama/llama_index) [Nov 2022]
- Building and Productionizing RAG - to-Big 3. Agents 4. Fine-Tuning 5. Evaluation [Nov 2023]
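The linear-adapter entry above rests on one idea: leave the indexed document vectors untouched and apply a learned linear map to the query vector only. The sketch below shows just the query-time step under that reading. It is not LlamaIndex's implementation; `weight` and `bias` are assumed to come from a separate training run that is not shown, and all names are illustrative.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def apply_adapter(weight: Matrix, bias: Vector, query_vec: Vector) -> Vector:
    """Compute weight @ query_vec + bias with plain lists."""
    return [
        sum(w * x for w, x in zip(row, query_vec)) + b
        for row, b in zip(weight, bias)
    ]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def search(query_vec: Vector, doc_vecs: List[Vector], weight: Matrix, bias: Vector) -> int:
    """Return the index of the best-matching document.

    Only the query is transformed; doc_vecs are the original, already indexed embeddings.
    """
    adapted = apply_adapter(weight, bias, query_vec)
    return max(range(len(doc_vecs)), key=lambda i: dot(adapted, doc_vecs[i]))


doc_vecs = [[1.0, 0.0], [0.0, 1.0]]   # stays as indexed, never re-embedded
identity = [[1.0, 0.0], [0.0, 1.0]]   # adapter before any training
swapped = [[0.0, 1.0], [1.0, 0.0]]    # stand-in for a trained adapter
before = search([1.0, 0.0], doc_vecs, identity, [0.0, 0.0])
after = search([1.0, 0.0], doc_vecs, swapped, [0.0, 0.0])
```

Because the document index is never rebuilt, swapping in a newly trained adapter is a query-time change only.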
-
**Vector Database Comparison**
- Not All Vector Databases Are Made Equal [2 Oct 2021]
- git
- Weaviate
- Chroma - Open-source embedding database [Oct 2022]
- Qdrant
- A SQLite extension for efficient vector search, based on Faiss!
- pgvector - Open-source vector similarity search for Postgres [Apr 2021] / [pgvectorscale](https://github.com/timescale/pgvectorscale): 75% cheaper than Pinecone [Jul 2023]
- Contextual Document Embedding (CDE) - the-power-of-contextual-document-embeddings-enhancing-search-relevance-01abfa814c76) [3 Oct 2024]
- Faiss
- Vector Search in Azure Cosmos DB for MongoDB vCore
- A Comprehensive Survey on Vector Database - Approaches covered: hash-based, tree-based, graph-based, and quantization-based. [18 Oct 2023]
- text-embedding-ada-002
- Vector Search with OpenAI Embeddings: Lucene Is All You Need
- git
- lancedb - Built on Lance, an open-source columnar format. [Feb 2023]
- Azure SQL's support for natively storing and querying vectors
- Is Cosine-Similarity of Embeddings Really About Similarity? - The first objective applies L2-norm regularization to the product of matrices A and B, a process similar to dropout. The second objective applies L2-norm regularization to each individual matrix, similar to the weight decay technique used in deep learning. [8 Mar 2024]
- Pinecone
- Fine-tuning Embeddings for Specific Domains
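Most of the vector stores and embedding entries in this list rank documents by cosine similarity between embedding vectors. The following is a minimal brute-force sketch of that idea in plain Python. The function names `cosine_similarity` and `top_k` and the toy three-dimensional vectors are illustrative assumptions, not the API of any product listed here; real vector databases use approximate indexes rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=2):
    """Return the k (doc_id, score) pairs most similar to the query (full scan)."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy 3-dimensional "embeddings" (hypothetical values); real models
# produce vectors with hundreds or thousands of dimensions.
docs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
results = top_k([1.0, 0.0, 0.0], docs, k=2)
```

Because the score depends only on direction, rescaling a vector leaves its similarity to every other vector unchanged.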
-
**LlamaIndex example**
- A Cheat Sheet and Some Recipes For Building Advanced RAG - rag-diagram-llama-index.png) [Jan 2024]
- Fine-Tuning a Linear Adapter for Any Embedding Model - tuning the embeddings model requires you to reindex your documents. With this approach, you do not need to re-embed your documents. Simply transform the query instead. [7 Sep 2023]
- ref
- Training code
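The linear-adapter entry above notes that only the query is transformed, so stored document embeddings never need to be recomputed. A minimal sketch of that retrieval-time step follows. The adapter matrices, the helper names `apply_adapter` and `best_match`, and the two-dimensional vectors are all hypothetical; in practice the adapter weights would be learned from query and relevant-document pairs, which is not shown here.

```python
def apply_adapter(weights, query_embedding):
    """Multiply the query embedding by a square adapter matrix (list of rows)."""
    return [sum(w * q for w, q in zip(row, query_embedding)) for row in weights]


def dot(a, b):
    """Dot product, used here as the relevance score."""
    return sum(x * y for x, y in zip(a, b))


# Document embeddings are computed once and are never touched by the adapter.
doc_embeddings = {"doc_a": [1.0, 0.0], "doc_b": [0.0, 1.0]}

# Hypothetical adapter weights. The identity matrix leaves queries unchanged;
# `swap` stands in for a trained adapter and simply exchanges the two axes.
identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]

query = [0.9, 0.1]


def best_match(adapter):
    """Adapt the query, then pick the document with the highest score."""
    adapted = apply_adapter(adapter, query)
    return max(doc_embeddings, key=lambda doc_id: dot(adapted, doc_embeddings[doc_id]))
```

With the identity adapter the query matches `doc_a`; with the swapped adapter the same query matches `doc_b`, while `doc_embeddings` stays exactly as indexed.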
-
**Vector Database Options for Azure**
- Pgvector extension on Azure Cosmos DB for PostgreSQL
- Azure Cache for Redis Enterprise
- ![Deploy to Azure - openai-elastic-vector-langchain%2Fmain%2Finfra%2Fdeployment.json)
-
**Lucene based search engine with text-embedding-ada-002**
-
**RAG Solution Design guide**
-
-
**Section 2** : Azure OpenAI and Reference Architecture
-
**Microsoft Azure OpenAI relevant LLM Framework**
- Prompt Engine - engine-py)
- DeepSpeed
- LMOps - to-image generation and Structured Prompting.
- LLMLingua - Cache, achieving up to 20x compression with minimal performance loss. LLMLingua-2 was released in Mar 2024.
- FLAML
- TaskWeaver - first agent framework for converting natural language requests into executable code with support for rich data structures and domain-adapted planning.
- JARVIS
- Autogen - us/research/blog/autogen-enabling-next-generation-large-language-model-applications/) / [Autogen Studio](https://www.microsoft.com/en-us/research/blog/introducing-autogen-studio-a-low-code-interface-for-building-multi-agent-workflows/) (June 2024)
- AI Central
- PromptBench
- UFO - focused agent for Windows OS interaction.
- PyRIT
- Semantic Kernel - source SDK for integrating AI services like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages such as C# and Python. It's an LLM orchestrator, similar to LangChain. / [git](https://github.com/microsoft/semantic-kernel) / [x-ref](#semantic-kernel)
- Prompty
- Microsoft Fabric
- ref
- Semantic Workbench - ai-platform-blog/introducing-semantic-workbench-your-gateway-to-agentic-ai/ba-p/4212695)
- Kernel Memory - source service and plugin for efficient dataset indexing through custom continuous data hybrid pipelines.
- Azure Machine Learning Prompt flow - machine-learning-blog/harness-the-power-of-large-language-models-with-azure-machine/ba-p/3828459) / [git](https://github.com/microsoft/promptflow) [Jun 2023]
- TypeChat
- SAMMO - purpose framework for prompt optimization. / [ref](https://www.microsoft.com/en-us/research/blog/sammo-a-general-purpose-framework-for-prompt-optimization/)
- OmniParser
-
**Microsoft Copilot Product Lineup**
- Copilot Pro
- Microsoft Copilot for Azure - infrastructure-blog/simplify-it-management-with-microsoft-copilot-for-azure-save/ba-p/3981106) [Nov 2023]
- Security Copilot - microsoft-security-copilot-empowering-defenders-at-the-speed-of-ai/) [March 2023]
- Dynamics 365 Copilot - microsoft-dynamics-365-copilot/) [March 2023]
- Copilot in Azure Quantum
- Microsoft 365 Copilot - microsoft-365-copilot-your-copilot-for-work/) [Nov 2023]
- Power App AI Copilot - us/blog/copilot-in-power-automate-new-time-saving-experiences-announced-at-microsoft-ignite-2023/): [Copilot in cloud flows](https://learn.microsoft.com/en-us/power-automate/get-started-with-copilot), [Copilot in Process Mining ingestion](https://learn.microsoft.com/en-us/power-automate/process-mining-copilot-in-ingestion), [Copilot in Power Automate for desktop](https://learn.microsoft.com/en-us/power-automate/desktop-flows/copilot-in-power-automate-for-desktop) ... [Nov 2023]
- Sales Copilot
- Service Copilot
- blog
- youtube
- blog - us/fabric/get-started/copilot-fabric-overview) / [PowerBI Copilot](https://learn.microsoft.com/en-us/power-bi/create-reports/copilot-introduction) [March 2024]
- Microsoft Copilot
- Microsoft Clarity Copilot - copilot/) [March 2023]
- Microsoft AI
- blog
- Azure AI Studio - us/products/ai-studio) + Promptflow + Azure AI Content safety / [youtube](https://www.youtube.com/watch?v=Qes7p5w8Tz8) / [SDK and CLI](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/sdk-generative-overview)
- Microsoft Copilot Studio - us/copilot/microsoft-copilot-studio) [Nov 2023]
- Microsoft Copilot Dashboard - viva-blog/new-ways-microsoft-copilot-and-viva-are-transforming-the/ba-p/3982293)
- Microsoft Office Copilot: Natural Language Commanding via Program Synthesis - friendly AI system for productivity software such as Microsoft Office that leverages large language models (LLMs) to execute user intent across application features. [6 Jun 2023]
- AutoGen Studio - Code Developer Tool for Building and Debugging Multi-Agent Systems [9 Aug 2024]
- An AI companion for everyone
- Team Copilot
- Copilot+ PC - powered and NPU-equipped Windows PCs [May 2024]
- Windows Copilot Runtime - device models, a new layer of Windows. [May 2024]
- Copilot Scenario Library
- GitHub Copilot
- Microsoft Copilot in Windows
- Nuance DAX Copilot
- NL2KQL
- Copilot Pages
- GraphRAG (by Microsoft) - based approach to efficiently answer both specific and broad questions over large text corpora. [ref](https://microsoft.github.io/graphrag) [git](https://github.com/microsoft/graphrag) [24 Apr 2024]
- SpreadsheetLLM
-
**Azure Reference Architectures**
- Azure OpenAI Embeddings QnA - Samples/cosmosdb-chatgpt) C# Blazor [Mar 2023]
- C# Implementation - at-scale) TypeScript, ReactJs and Flask [Apr 2023]
- Azure-Cognitive-Search-Azure-OpenAI-Accelerator
- Conversational-Azure-OpenAI-Accelerator
- ref
- git
- git - azureai) [Jan 2024]
- Azure SQL DB + AOAI - Samples/openai-aca-lb) / [Azure Functions (C#) bindings for OpenAI](https://github.com/Azure/azure-functions-openai-extension) / [Microsoft Entra ID Authentication for AOAI](https://github.com/Azure-Samples/openai-chat-app-entra-auth-builtin) / [Azure OpenAI workshop](https://github.com/microsoft/OpenAIWorkshop) / [RAG for Azure Data](https://github.com/microsoft/AzureDataRetrievalAugmentedGenerationSamples) / [AI-Sentry](https://github.com/microsoft/ai-sentry): A lightweight, pluggable facade layer for AOAI
- AI-in-a-Box - in-a-Box aims to provide an "Azure AI/ML Easy Button" for common scenarios [Sep 2023]
- AI Samples for .NET
- OpenAI Official .NET Library
- Smart Components - to-end AI features for .NET apps [Mar 2024]
- Azure OpenAI Design Patterns
- Azure AI Services Landing Zone - architecture-blog/azure-openai-landing-zone-reference-architecture/ba-p/3882102) [24 Jul 2023]
- git
- Azure Multimodal AI + LLM Processing Accelerator
- Retrieval Augmented Fine Tuning - tuning (SFT) [25 Sep 2024]
- Azure AI CLI - line tool for ai [Jul 2023]
- git
- GPT-RAG - Augmented Generation pattern running in Azure [Jun 2023]
- ref
- git
- Can ChatGPT work with your enterprise data?
- Azure Video Indexer demo - Samples/miyagi) Integration demo for multiple LangChain libraries [Feb 2023]
- Azure Command Companion - 3.5 Turbo for Azure CLI Command Generation [10 Dec 2023]
- Chat with your Azure DevOps data
- Baseline OpenAI end-to-end chat reference architecture
- Grounding LLMs - Augmented Generation (RAG) [09 Jun 2023]
- Revolutionize your Enterprise Data with ChatGPT
- Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback
- Security Best Practices for GenAI Applications (OpenAI) in Azure
- Authentication and Authorization in Generative AI applications with Entra ID and Azure AI Search
- Integrate private access to your Azure Open AI Chatbot
- An Introduction to LLMOps
- ref
- ChatGPT + Enterprise data with Azure OpenAI and Cognitive Search
- Baseline Agentic AI Systems Architecture
- Considering the combination of Azure OpenAI and Azure Cognitive Search (Azure OpenAI と Azure Cognitive Search の組み合わせを考える)
- Integrated vectorization
- Prompt Buddy
- AI Agent-Driven Auto Insurance Claims RAG Pipeline
- VoiceRAG - 4o Realtime API for Audio [ref](https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/voicerag-an-app-pattern-for-rag-voice-using-azure-ai-search-and/ba-p/4259116) [Sep 2024]
- ARGUS - Vision to get better results without any pre-training. [Jun 2024]
- azure-llm-fine-tuning - tuning on Azure [May 2024]
- OpenAI Chat Application with Microsoft Entra Authentication
- Using keyless authentication with Azure OpenAI
- Designing and developing a RAG solution
- NL to SQL Architecture Alternative - memory/tree/NL2SQL/examples/200-dotnet-nl2sql)
- Azure OpenAI Best Practices Insights from Customer Journeys
- AI Feed - ai-platform-blog/bg-p/AIPlatformBlog)
- Responsible AI Transparency Report
- Safeguard and trustworthy generative AI applications
- Optimize Azure OpenAI Applications with Semantic Caching
- Azure OpenAI and Call Center Modernization
- An open-source template gallery
- Evaluating a RAG Chat App
- Microsoft.Extensions.AI
- Microsoft Copilot Studio Samples
- eShopSupport - kernel/eshop-infused-with-ai-a-comprehensive-intelligent-app-sample-with-semantic-kernel/) [Apr 2024]
-
**Azure Enterprise Services**
- 18 Jul 2023
- ref
- Models as a Service (MaaS) - based AI approach that provides developers and businesses with access to pre-built, pre-trained machine learning models. [July 2023]
- Assistants API
-
**Azure AI Search**
-
-
**Section 3** : Microsoft Semantic Kernel and Stanford NLP DSPy
-
**Micro-orchestration**
- git
- ref - kernel-bot-in-a-box) [26 Oct 2023]
- ref
- Prompt Template language
- skills, - us/semantic-kernel/concepts-sk/memories) and [connectors](https://learn.microsoft.com/en-us/semantic-kernel/concepts-sk/connectors)
- Handlebars - 5: Stepwise Planner supports Function Calling. [ref](https://devblogs.microsoft.com/semantic-kernel/semantic-kernels-ignite-release-beta8-for-the-net-sdk/) [16 Nov 2023]
- The kernel
- The planner
- Architecting AI Apps with Semantic Kernel
- A Pythonista’s Intro to Semantic Kernel
- Step-by-Step Guide to Building a Powerful AI Monitoring Dashboard with Semantic Kernel and Azure Monitor - by-step guide to building an AI monitoring dashboard using Semantic Kernel and Azure Monitor to track token usage and custom metrics. [23 Aug 2024]
- Chat Copilot - kernel/microsoft-hackathon-project-micronaire-using-semantic-kernel/): A Semantic Kernel RAG Evaluation Pipeline [git](https://github.com/microsoft/micronaire) [3 Oct 2024]
- Learning Paths for Semantic Kernel
- Glossary in Git - us/semantic-kernel/whatissk#sk-is-a-kit-of-parts-that-interlock)
- ref
- The future of Planners in Semantic Kernel
-
**Semantic Kernel**
- A Pythonista’s Intro to Semantic Kernel
- ref
- Semantic Kernel Feature Matrix - us/semantic-kernel) / blog:[ref](https://devblogs.microsoft.com/semantic-kernel/) / [git](https://aka.ms/sk/repo) [Feb 2023]
- Project Micronaire
-
**DSPy**
- git
- ref
- Prompt Like a Data Scientist: Auto Prompt Optimization and Testing with DSPy
- DSPy - Improving Pipelines [5 Oct 2023] / [git](https://github.com/stanfordnlp/dspy)
- youtube
-
**Optimizer frameworks**
-
-
**Section 4** : LangChain Features, Usage, and Comparisons
-
**LangChain Feature Matrix & Cheatsheet**
- Cheatsheet
- RAG From Scratch
- Awesome LangChain
- ref - courses/langchain-chat-with-your-data/)
- LangChain Cheatsheet KD-nuggets - nuggets [doc](files/LangChain_kdnuggets.pdf) [Aug 2023]
- LangChain Tutorial
- LangChain AI Handbook
-
**LangChain features and related libraries**
- Flowise
- OpenGPTs
- LangGraph - ai.github.io/langgraph/) [Aug 2023]
- LangSmith
- LangChain Template
- LangChain/context-aware-splitting
- LangChain Expression Language
- LangChain/cache
-
**LangChain vs Competitors**
- LangChain - kernel) [Feb 2023] | [Microsoft guidance](https://github.com/microsoft/guidance) [Nov 2022] | [Azure ML Prompt flow](https://github.com/microsoft/promptflow) [Jun 2023] | [DSPy](https://github.com/stanfordnlp/dspy) [Jan 2023]
- Prompting Framework (PF) - Framework-Survey)
- ref - us/semantic-kernel/prompt-engineering/prompt-template-syntax)
- ref
- cite
- What Are Tools Anyway? - used tools incl. Tool creation and reuse. Tools are not useful for machine translation, summarization, and sentiment analysis (among others). 3. Evaluation metrics [18 Mar 2024]
-
**LangChain Agent & Memory**
-
**Optimizer frameworks**
-
**Macro and Micro-orchestration**
-
**LangChain chain type: Chains & Summarizer**
-
-
**Section 5: Prompt Engineering, Finetuning, and Visual Prompts**
-
**Prompt Engineering**
- Plan-and-Solve Prompting
- promptbase
- `Iteration of Thought (IoT)` - analysis of 100+ papers shows CoT significantly improves performance in math and logic tasks. [18 Sep 2024]
- Tree of Thought (ToT) - evaluate the progress intermediate thoughts make towards solving a problem [17 May 2023] [git](https://github.com/ysymyth/tree-of-thought-llm) / Agora: Tree of Thoughts (ToT) [git](https://github.com/kyegomez/tree-of-thoughts)
- Re-Reading Improves Reasoning in Large Language Models - Reading), which involves re-reading the question as input to enhance the LLM's understanding of the problem. `Read the question again` [12 Sep 2023]
- Is the new norm for NLP papers "prompt engineering" papers?
- Language Models as Compilers - and-Execute is effective. It enhances large language models’ reasoning by using task-level logic and pseudocode, outperforming instance-specific methods. [20 Mar 2023]
- Large Language Models are Zero-Shot Reasoners
- Self-Consistency (CoT-SC) - consistency method: 1) prompt the language model using CoT prompting, 2) sample a diverse set of reasoning paths from the language model, and 3) marginalize out reasoning paths to aggregate final answers and choose the most consistent answer. [21 Mar 2022]
- Recursively Criticizes and Improves (RCI)
- Large Language Models as Optimizers - by-step.` to improve its accuracy. Optimization by PROmpting (OPRO) [7 Sep 2023]
- ref
- Prompt Principle for Instructions
- git
- ref
- GPT-4 with Medprompt - 4, using a method called Medprompt that combines several prompting strategies, has surpassed MedPaLM 2 on the MedQA dataset without the need for fine-tuning. [ref](https://www.microsoft.com/en-us/research/blog/the-power-of-prompting/) [28 Nov 2023]
- Graph of Thoughts (GoT) - of-thoughts) [18 Aug 2023]
- cite
- Chain-of-Verification reduces Hallucination in LLMs - step process that consists of generating a baseline response, planning verification questions, executing verification questions, and generating a final verified response based on the verification results. [20 Sep 2023]
- FireAct - tuning. 1. This work takes an initial step to show multiple advantages of fine-tuning LMs for agentic uses. 2. During fine-tuning, the successful trajectories are converted into the ReAct format to fine-tune a smaller LM. 3. This work is an initial step toward language agent fine-tuning.
- Skeleton Of Thought - Skeleton-of-Thought (SoT) reduces generation latency by first creating an answer's skeleton, then filling each skeleton point in parallel via API calls or batched decoding. [28 Jul 2023]
- NLEP (Natural Language Embedded Programs) for Hybrid Language Symbolic Reasoning - 4. [19 Sep 2023]
- Chain of Thought (CoT) - of-Thought Prompting Elicits Reasoning in Large Language Models [[cnt](https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=arxiv%3A+2201.11903)]: ReAct and Self Consistency also inherit the CoT concept. [28 Jan 2022]
- Promptist - to-image generation.
- ReAct - lm.github.io/) [6 Oct 2022]
- cite
- RankPrompt - ranking method. Direct Scoring
- Many-Shot In-Context Learning - shot to many-shot In-Context Learning (ICL) can lead to significant performance gains across a wide variety of generative and discriminative tasks [17 Apr 2024]
- Retrieval Augmented Generation (RAG) - intensive tasks. RAG combines an information retrieval component with a text generator model. [22 May 2020]
- git
- ref
- git
- Reflexion
- A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications
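The Self-Consistency (CoT-SC) entry in this list describes three steps: prompt with CoT, sample a diverse set of reasoning paths, and choose the most consistent answer. The aggregation step can be sketched as a simple majority vote. This is a minimal illustration that assumes each sampled path has already been reduced to a final answer string; `self_consistency` and `fake_sampler` are hypothetical names, and the canned answers stand in for real model calls.

```python
from collections import Counter


def self_consistency(sample_answer, question, n_samples=5):
    """Sample several reasoning paths and return the most frequent final answer.

    `sample_answer` is any callable that takes the question and returns the
    final answer extracted from one sampled chain of thought.
    """
    answers = [sample_answer(question) for _ in range(n_samples)]
    most_common_answer, votes = Counter(answers).most_common(1)[0]
    return most_common_answer, votes


# Hypothetical stand-in for a sampled LLM call: it replays canned final
# answers so the example runs without any model or network access.
canned = iter(["18", "18", "26", "18", "17"])


def fake_sampler(question):
    return next(canned)


answer, votes = self_consistency(fake_sampler, "How many eggs are sold per day?", n_samples=5)
```

In a real setup the sampler would call a model with a non-zero temperature so that the reasoning paths actually differ.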
-
**Prompt Tuner**
- Claude Prompt Engineer - metaprompt-experimental) / [Claude Sonnet 3.5 for Coding](https://www.reddit.com/r/ClaudeAI/comments/1dwra38/sonnet_35_for_coding_system_prompt/)
- Cohere’s new Prompt Tuner
- Automatic Prompt Engineer (APE) - shot Chain-of-Thought (CoT) prompts superior to human-designed prompts like “Let’s think through this step-by-step” (Kojima et al., 2022). The prompt “To get the correct answer, let’s think step-by-step.” triggers a chain of thought. Two approaches to generate high-quality candidates: forward mode and reverse mode generation. [3 Nov 2022] [git](https://github.com/keirp/automatic_prompt_engineer) / [ref](https://towardsdatascience.com/automated-prompt-engineering-78678c6371b9) [Mar 2024]
-
**Prompt Guide & Leaked prompts**
- Awesome ChatGPT Prompts
- Awesome Prompt Engineering
- Awesome-GPTs-Prompts
- Prompts for Education
- GPTs
- LLM Prompt Engineering Simplified
- Power Platform GPT Prompts
- Fabric
- Copilot prompts
- In-The-Wild Jailbreak Prompts on LLMs - source datasets (including 1,405 jailbreak prompts). Collected from December 2022 to December 2023 [Aug 2023]
- LangChainHub
- Prompt Engineering - Context Prompting ... [Mar 2023]
- Prompt Engineering Guide
- Azure OpenAI Prompt engineering techniques
- OpenAI Prompt example
- OpenAI Best practices for prompt engineering
- DeepLearning.ai ChatGPT Prompt Engineering for Developers
- Anthropic courses > Prompt engineering interactive tutorial - by-step guide to key prompting techniques / prompt evaluations [Aug 2024]
- Anthropic Prompt Library
-
**RLHF (Reinforcement Learning from Human Feedback) & SFT (Supervised Fine-Tuning)**
- git
- cite
- ref - rlhf-because-dpo-is-what-you-actually-need-f10ce82c9b95) [1 Jul 2023]
- ref
- Reinforcement Learning from AI Feedback (RLAIF) - of-Thought, Improved), Few-shot (Not improved). Only explores the task of summarization. After training on a few thousand examples, performance is close to training on the full dataset. RLAIF vs RLHF: In many cases, the two policies produced similar summaries. [1 Sep 2023]
- InstructGPT: Training language models to follow instructions with human feedback
- TRL
- SFTTrainer
- ref
- Direct Preference Optimization (DPO)
- Reinforcement Learning from Human Feedback (RLHF) - provided grading, direct human feedback is no longer needed, and the language model continues learning and improving using algorithmic grading alone. [18 Sep 2019] [ref](https://huggingface.co/blog/rlhf) [9 Dec 2022]
- ORPO (odds ratio preference optimization) - tuning and preference alignment into one process [git](https://github.com/xfactlab/orpo) [12 Mar 2024] [Fine-tune Llama 3 with ORPO](https://towardsdatascience.com/fine-tune-llama-3-with-orpo-56cfab2f9ada) [Apr 2024]
- A Comprehensive Survey of LLM Alignment Techniques: RLHF, RLAIF, PPO, DPO and More
-
**Finetuning**
- ref
- git
- Youtube
- PEFT - Parameter-Efficient Fine-Tuning. PEFT is an approach that fine-tunes only a small number of parameters. [10 Feb 2023]
- LoRA: Low-Rank Adaptation of Large Language Models - rank decomposition. [git](https://github.com/microsoft/LoRA) [17 Jun 2021]
- Practical Tips for Finetuning LLMs Using LoRA (Low-Rank Adaptation)
- QLoRA: Efficient Finetuning of Quantized LLMs - bit quantized pre-trained language model into Low Rank Adapters (LoRA). [git](https://github.com/artidoro/qlora) [23 May 2023]
- LIMA: Less Is More for Alignment - tuned with the standard supervised loss on `only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling.` LIMA demonstrates remarkably strong performance, either equivalent or strictly preferred to GPT-4 in 43% of cases. [18 May 2023]
- Efficient Streaming Language Models with Attention Sinks - tuning. 2. We neither expand the LLMs' context window nor enhance their long-term memory. [git](https://github.com/mit-han-lab/streaming-llm) [29 Sep 2023]
- LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models - research/LongLoRA) [21 Sep 2023]
- Fine-tuning a GPT - LoRA - tuning_a_GPT_LoRA.pdf) [20 Jun 2023]
- Comprehensive Guide for LLaMA with RLHF - on guide to train LLaMA with RLHF [5 Apr 2023]
- QA-LoRA - Quantization-Aware Low-Rank Adaptation of Large Language Models. A method that integrates quantization and low-rank adaptation for large language models. [git](https://github.com/yuhuixu1993/qa-lora) [26 Sep 2023]
- Llama 1 - ref](#open-source-large-language-models) / Llama3 > Build an llms from scratch [x-ref](#build-an-llms-from-scratch-picogpt-and-lit-gpt)
- Multi-query attention (MQA)
- Youtube - llama) [03 Sep 2023]
- LoRA+ - tuning speed by setting different learning rates for the LoRA adapter matrices. [19 Feb 2024]
- LoTR
- The Expressive Power of Low-Rank Adaptation
- DoRA - Weight-Decomposed Low-Rank Adaptation. Decomposes the pre-trained weight into two components, magnitude and direction, for fine-tuning. [14 Feb 2024]
- ref - FA, VeRA, Delta-LoRA, LoRA+ [May 2024]
- LoRA learns less and forgets less
- How to continue pretraining an LLM on new data
- Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning
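The LoRA entry in this list refers to low-rank decomposition: the pre-trained weight matrix stays frozen and a product of two small matrices is trained instead. The sketch below shows the arithmetic in plain Python under the usual formulation `W + (alpha / r) * B @ A`, with `B` initialised to zero. The dimensions, the `alpha` value, and the helper names are illustrative assumptions; this is not the `microsoft/LoRA` or PEFT implementation.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def lora_effective_weight(w, b, a, alpha, rank):
    """Return W + (alpha / rank) * B @ A without modifying the frozen W."""
    delta = matmul(b, a)
    scale = alpha / rank
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]


d_out, d_in, rank, alpha = 6, 8, 2, 4  # hypothetical sizes
rng = random.Random(0)

w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]    # frozen weight
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # trainable, rank x d_in
b = [[0.0] * rank for _ in range(d_out)]                              # trainable, d_out x rank, zero init

# Because B starts at zero, the adapted layer initially equals the original.
w_start = lora_effective_weight(w, b, a, alpha, rank)

full_params = d_out * d_in            # parameters updated by full fine-tuning
lora_params = rank * (d_out + d_in)   # parameters updated by LoRA
```

The saving grows with layer size: the trainable count scales with `rank * (d_out + d_in)` instead of `d_out * d_in`.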
-
**Finetuning & Model Compression**
-
**2. Finetuning & Model Compression**
-
**Llama 2 Finetuning**
-
-
**Model Compression for Large Language Models**
-
**Memory Optimization**
- TokenAttention
- PagedAttention
- CPU vs GPU vs TPU - bandwidth memories (HBM). `HBM Bandwidth: 1.5-2.0TB/s vs SRAM Bandwidth: 19TB/s ~ 10x HBM` [27 May 2024]
- Flash Attention - 2](https://arxiv.org/abs/2307.08691): [[cnt](https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=arxiv%3A+2307.08691)] [17 Jul 2023]: A method that reorders the attention computation and leverages classical techniques (tiling, recomputation). Instead of storing each intermediate result, use kernel fusion and run every operation in a single kernel in order to avoid memory read/write overhead. [git](https://github.com/Dao-AILab/flash-attention) -> Compared to a standard attention implementation in PyTorch, FlashAttention-2 can be up to 9x faster / [FlashAttention-3](https://arxiv.org/abs/2407.08608) [11 Jul 2024]
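The Flash Attention entry above mentions tiling and avoiding stored intermediate results. The sketch below illustrates only that numerical idea: keys and values are processed block by block while a running maximum and running sum keep the softmax exact, so the full score vector is never held at once. It is a plain-Python illustration for a single query vector, omits the usual scaling by the square root of the head dimension, and says nothing about kernel fusion or GPU memory; the function names and toy data are assumptions.

```python
import math


def naive_attention(q, keys, values):
    """Reference: softmax(q . K) applied to V, with all scores materialised."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total for j in range(dim)]


def tiled_attention(q, keys, values, block_size=2):
    """Same result, computed one key/value block at a time (online softmax)."""
    dim = len(values[0])
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * dim
    for start in range(0, len(keys), block_size):
        k_block = keys[start:start + block_size]
        v_block = values[start:start + block_size]
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in k_block]
        new_max = max(running_max, max(scores))
        # Rescale everything accumulated under the previous maximum.
        correction = math.exp(running_max - new_max)
        running_sum *= correction
        acc = [x * correction for x in acc]
        for s, v in zip(scores, v_block):
            w = math.exp(s - new_max)
            running_sum += w
            for j in range(dim):
                acc[j] += w * v[j]
        running_max = new_max
    return [x / running_sum for x in acc]


# Hypothetical toy data: one query, five keys and values of dimension 2.
q = [1.0, 0.5]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -1.0], [2.0, 0.3]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]

expected = naive_attention(q, keys, values)
tiled = tiled_attention(q, keys, values, block_size=2)
```

The two results agree up to floating-point rounding for any block size, which is what allows the blockwise form to replace the standard one.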
-
**Other techniques and LLM patterns**
- Model merging
- Lamini Memory Tuning - memory-tuning) [Jun 2024]
- RouteLLM
- Differential Transformer
- What We’ve Learned From A Year of Building with LLMs
- LLM patterns - patterns-og.png)
- Large Transformer Model Inference Optimization
- Mixture of experts models
- Huggingface Mixture of Experts Explained
- Sakana.ai: Evolutionary Optimization of Model Merging Recipes. - model-merge) [19 Mar 2024]
- Mixture-of-Depths - k tokens for processing, and the rest skip it. [ref](https://www.linkedin.com/embed/feed/update/urn:li:share:7181996416213372930) [2 Apr 2024]
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces - spaces/mamba): 1. Structured State Space (S4) - Class of sequence models, encompassing traits from RNNs, CNNs, and classical state space models. 2. Hardware-aware (Optimized for GPU) 3. Integrating selective SSMs and eliminating attention and MLP blocks [ref](https://www.unite.ai/mamba-redefining-sequence-modeling-and-outforming-transformers-architecture/) / A Visual Guide to Mamba and State Space Models [ref](https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-mamba-and-state) [19 FEB 2024]
- Better & Faster Large Language Models via Multi-token Prediction
- Scaling Synthetic Data Creation with 1,000,000,000 Personas - driven data synthesis methodology using Text-to-Persona and Persona-to-Persona. [28 Jun 2024]
- Mamba-2 - 8X faster [31 May 2024]
- Simplifying Transformer Blocks - blocks and normalisation layers without loss of training speed. [3 Nov 2023]
- KAN or MLP: A Fairer Comparison
- Kolmogorov-Arnold Networks (KANs) - Layer Perceptrons (MLPs) do. Each weight in KANs is replaced by a learnable 1D spline function. KANs’ nodes simply sum incoming signals without applying any non-linearities. [git](https://github.com/KindXiaoming/pykan) [30 Apr 2024] / [ref](https://www.dailydoseofds.com/a-beginner-friendly-introduction-to-kolmogorov-arnold-networks-kan/): A Beginner-friendly Introduction to Kolmogorov Arnold Networks (KAN) [19 May 2024]
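The KAN entry above states that each weight is replaced by a learnable 1D function and that nodes simply sum their incoming signals. The sketch below shows one such layer in plain Python. As a loudly labelled simplification, it uses piecewise-linear interpolation over a fixed grid in place of the B-splines used by KANs; the names `piecewise_linear` and `kan_layer` and the coefficient values are hypothetical, and this is not the `pykan` API.

```python
def piecewise_linear(grid, coeffs, x):
    """Evaluate a 1D piecewise-linear function with knots `grid` and values `coeffs`.

    Inputs outside the grid are clamped to the boundary values.
    """
    if x <= grid[0]:
        return coeffs[0]
    if x >= grid[-1]:
        return coeffs[-1]
    for left in range(len(grid) - 1):
        x0, x1 = grid[left], grid[left + 1]
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return (1 - t) * coeffs[left] + t * coeffs[left + 1]


def kan_layer(inputs, edge_coeffs, grid):
    """edge_coeffs[j][i] holds the learnable values of the function on edge i -> j.

    Each output node only sums its edge functions; no extra non-linearity is applied.
    """
    return [sum(piecewise_linear(grid, edge_coeffs[j][i], x) for i, x in enumerate(inputs))
            for j in range(len(edge_coeffs))]


grid = [-1.0, 0.0, 1.0]
# One output node with two incoming edges: the first edge function is |x|
# on the grid, the second is the identity (hypothetical learned values).
edge_coeffs = [
    [[1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]],
]
out = kan_layer([-0.5, 0.5], edge_coeffs, grid)
```

Training would adjust the values in `edge_coeffs`, which play the role that scalar weights play in an MLP.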
-
**Other optimization techniques**
-
**Pruning and Sparsification**
- Wanda Pruning - model-pruning-large-language-models-wandas-ayoub-kirouane)
-
**Knowledge Distillation: Reducing Model Size with Textbooks**
- git
- phi-1.5
- phi-1 - 1 attained 50.6% on HumanEval and 55.5% on MBPP. Textbooks Are All You Need. [ref](https://analyticsindiamag.com/microsoft-releases-1-3-bn-parameter-language-model-outperforms-llama/) [20 Jun 2023]
- Orca 2 - by-step thought processes; and other complex instructions, guided by teacher assistance from ChatGPT. [ref](https://www.microsoft.com/en-us/research/blog/orca-2-teaching-small-language-models-how-to-reason/) [18 Nov 2023]
- Mistral 7B - query attention (GQA) for faster inference. Uses Sliding Window Attention (SWA) to handle longer sequences at smaller cost. [ref](https://mistral.ai/news/announcing-mistral-7b/) [10 Oct 2023]
- Zephyr 7B - 7B-β is the second model in the series, and is a fine-tuned version of mistralai/Mistral-7B-v0.1 that was trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO). [ref](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) [25 Oct 2023]
- ref
- Apr 2024
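The Mistral 7B entry in this list mentions Sliding Window Attention (SWA) as a way to handle longer sequences at smaller cost. A minimal sketch of the attention mask behind that idea follows. It assumes a causal decoder and a window that includes the current token; the function name `sliding_window_mask` and the sizes are illustrative, not taken from the Mistral code.

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j.

    A position sees itself and at most `window - 1` earlier positions.
    """
    return [[0 <= i - j < window for j in range(seq_len)] for i in range(seq_len)]


mask = sliding_window_mask(seq_len=5, window=3)

# Number of (query, key) pairs that are actually computed.
attended = sum(sum(row) for row in mask)
full_causal = 5 * (5 + 1) // 2
```

For long sequences the number of attended pairs grows roughly as `seq_len * window` rather than quadratically in `seq_len`.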
-
**3. Visual Prompting & Visual Grounding**
- Visual Prompting
- What is Visual prompting - trained vision transformers have made it possible for us to implement Visual Prompting. [doc](files/vPrompt.pdf) [26 Apr 2023]
- Andrew Ng’s Visual Prompting Livestream
- What is Visual Grounding
- Screen AI
-
**Quantization Techniques**
- The Era of 1-bit LLMs - {-1, 0, 1}. [27 Feb 2024]
- git
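The 1-bit LLMs entry above refers to weights restricted to the three values -1, 0, and 1. One way to map full-precision weights onto that set is to divide by the mean absolute weight, round, and clip. The sketch below assumes that absmean scheme, which the entry itself does not spell out; `ternary_quantize` and the sample matrix are hypothetical.

```python
def ternary_quantize(weights, eps=1e-8):
    """Quantize a weight matrix (list of rows) to values in {-1, 0, 1}.

    Assumed scheme: scale by the mean absolute weight, round to the nearest
    integer, then clip to the range [-1, 1]. Returns (quantized, scale).
    """
    flat = [abs(w) for row in weights for w in row]
    scale = sum(flat) / len(flat)
    quantized = [[max(-1, min(1, round(w / (scale + eps)))) for w in row]
                 for row in weights]
    return quantized, scale


# Hypothetical full-precision weights.
weights = [
    [0.9, -0.05, -1.2],
    [0.1, 0.6, -0.7],
]
quantized, scale = ternary_quantize(weights)
```

Small weights collapse to 0 and larger ones keep only their sign, so matrix multiplication reduces to additions and subtractions plus one rescale by `scale`.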
-
**RLHF (Reinforcement Learning from Human Feedback) & SFT (Supervised Fine-Tuning)**
-
**Other techniques and patterns**
-
-
**Section 6** : Large Language Model: Challenges and Solutions
-
**OpenAI's Roadmap and Products**
- Jun 2018
- 14 Feb 2019
- 11 Jun 2020
- The Timeline of the OpenAI's Founder Journeys
- DALL·E 3 - E 3 [git](https://github.com/openai/dall-e) [Sep 2023]
- Humanloop Interview 2023 - plans.pdf) [29 May 2023]
- ref
- ref
- ref
- ref
- The Dawn of LMMs - Preliminary Explorations with GPT-4V(ision) [29 Sep 2023]
- ref
- OpenAI DevDay 2023 - GPT-4 Turbo with 128K context, Assistants API (Code interpreter, Retrieval, and function calling), GPTs (Custom versions of ChatGPT: [ref](https://openai.com/blog/introducing-gpts)), Copyright Shield, Parallel Function Calling, JSON Mode, Reproducible outputs [6 Nov 2023]
- ChatGPT Function calling
- Introducing the GPT Store
- Structured Outputs in the API - Model-generated outputs will exactly match JSON Schemas provided by developers. [6 Aug 2024]
- ref
- A new series of reasoning models - The reasoning-specialized OpenAI o1 series excels in math, coding, and science, outperforming GPT-4o on key benchmarks. [12 Sep 2024] / [ref](https://github.com/hijkzzz/Awesome-LLM-Strawberry): Awesome LLM Strawberry (OpenAI o1)
- ref
- Custom instructions - A cross-session memory that allows ChatGPT to retain key instructions across chat sessions. [20 Jul 2023]
- OpenAI DevDay 2024 - Real-time API (speech-to-speech), Vision Fine-Tuning, Prompt Caching, and Distillation (fine-tuning a small language model using a large language model). [ref](https://community.openai.com/t/devday-2024-san-francisco-live-ish-news/963456) [1 Oct 2024]
- Sora - Text-to-video model. Sora can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt. [15 Feb 2024]
- ref
- ref
- ChatGPT Memory
- ref
- ChatGPT can now see, hear, and speak
- GPT-3.5 Turbo Fine-tuning - Fine-tuning for GPT-3.5 Turbo is now available, with fine-tuning for GPT-4 coming this fall. [22 Aug 2023]
- New embedding models - `text-embedding-3-small`: embedding size 512 or 1536; `text-embedding-3-large`: embedding size 256, 1024, or 3072. [25 Jan 2024]
- ref
- GPT-4o - [GPT-4o mini](https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/): 15 cents per million input tokens, 60 cents per million output tokens, MMLU of 82%, and fast. [18 Jul 2024]
- ChatGPT Plugin
- CriticGPT - A version of GPT-4 fine-tuned to critique code generated by ChatGPT [27 Jun 2024]
- A Comparative Study on Reasoning Patterns of OpenAI's o1 Model
- SearchGPT
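The multiple embedding sizes listed for the `text-embedding-3` models reflect that a full-length vector can be shortened. The sketch below shows the general idea on a toy vector, assuming the shortening is done by keeping the leading dimensions and re-normalizing to unit length; `shorten_embedding` is a hypothetical helper and no API call is made.

```python
import math


def shorten_embedding(vector, dims):
    """Keep the first `dims` values and rescale the result to unit length.

    Assumption: the leading dimensions carry the most information, so
    truncation followed by L2 normalization yields a usable shorter vector.
    """
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    truncated = vector[:dims]
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0.0:
        return truncated
    return [x / norm for x in truncated]


full = [3.0, 4.0, 12.0]          # toy stand-in for a 1536- or 3072-dim vector
short = shorten_embedding(full, dims=2)
print(short)
```

Re-normalizing matters when similarity is computed with a dot product, since the truncated vector is no longer unit length on its own.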
-
**Numbers LLM**
- tiktoken
- Numbers every LLM Developer should know
- Open AI Tokenizer - GPT-3, Codex token counting
- What are tokens and how to count them?
- 5 Approaches To Solve LLM Token Limits - limits-5-approaches.pdf) [2023]
- Byte-Pair Encoding (BPE) - pair-encoding-subword-based-tokenization-algorithm-77828a70bee0) [13 Aug 2021]
- Tokencost
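To make the Byte-Pair Encoding entry above concrete, here is a minimal training loop in plain Python. It is a sketch under simplifying assumptions: it merges characters rather than bytes, uses no pre-tokenization, and the helper names (`most_frequent_pair`, `merge_pair`, `train_bpe`) are illustrative rather than taken from tiktoken or any other library.

```python
from collections import Counter


def most_frequent_pair(tokens):
    """Return the most common adjacent token pair, or None if there is none."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(tokens, pair):
    """Replace every non-overlapping occurrence of `pair` with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def train_bpe(text, num_merges):
    """Start from single characters and greedily merge the most frequent pair."""
    tokens = list(text)
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        tokens = merge_pair(tokens, pair)
        merges.append(pair)
    return tokens, merges


text = "aaabdaaabac"
tokens, merges = train_bpe(text, num_merges=3)
print(tokens, merges)
```

Each merge shortens the token sequence, which is why frequent substrings end up costing fewer tokens than rare ones.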
-
**Trustworthy, Safe and Secure LLM**
- NeMo Guardrails
- AI models collapse when trained on recursively generated data - Indiscriminate use of model-generated content in training causes irreversible defects in the resulting models, in which tails of the original content distribution disappear. [24 Jul 2024]
- Political biases of LLMs
- ref
- The Foundation Model Transparency Index
- Hallucinations
- Mapping the Mind of a Large Language Model - mind-language-model) [21 May 2024]
- Frontier Safety Framework
- OpenAI Weak-to-strong generalization - Supervising a GPT-4 or 3.5-level model using a GPT-2-level model. It finds that while strong models supervised by weak models can outperform the weak models, they still don’t perform as well as when supervised by ground truth. [git](https://github.com/openai/weak-to-strong) [14 Dec 2023]
- NIST AI Risk Management Framework
- Anthropic Many-shot jailbreaking - A long-context attack that bypasses safety guardrails by bombarding the model with unsafe or harmful questions and answers. [3 Apr 2024]
- LLMs Will Always Hallucinate, and We Need to Live With This - Hallucinations cannot be fully eliminated through fact-checking mechanisms, due to fundamental mathematical and logical limitations. [9 Sep 2024]
- A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models
- FactTune - Fine-tuning of a separate LLM using methods such as DPO and RLAIF, guided by preferences generated by [FActScore](https://github.com/shmsw25/FActScore). [14 Nov 2023] `FActScore` works by breaking down a generation into a series of atomic facts and then computing the percentage of these atomic facts supported by a reliable knowledge source.
- The Instruction Hierarchy - Models follow lower-level instructions conditionally, based on their alignment with higher-level instructions. [19 Apr 2024]
- Trustworthy LLMs
- Large Language Models Reflect the Ideology of their Creators - Chinese figures; Western LLMs similarly align more with Western values, even in English prompts. [24 Oct 2024]
- Extracting Concepts from GPT-4 - Extracts 16 million interpretable features, using GPT-4's outputs as input for training. [6 Jun 2024]
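The FActScore computation described in the FactTune entry (split a generation into atomic facts, then take the share supported by a knowledge source) reduces to a simple ratio. The sketch below assumes the atomic facts are already extracted and uses a toy set lookup as the "knowledge source"; in practice both steps are done with models and retrieval, and the function name `fact_precision` is hypothetical.

```python
def fact_precision(atomic_facts, is_supported):
    """Fraction of atomic facts that the knowledge source supports."""
    if not atomic_facts:
        return 0.0
    supported = sum(1 for fact in atomic_facts if is_supported(fact))
    return supported / len(atomic_facts)


# Toy knowledge source: an exact-match set of lower-cased statements.
knowledge = {
    "paris is the capital of france",
    "the seine flows through paris",
}
facts = [
    "Paris is the capital of France",
    "The Seine flows through Paris",
    "Paris has 40 million residents",
]
score = fact_precision(facts, lambda fact: fact.lower() in knowledge)
print(round(score, 3))
```

A score like this can then serve as the preference signal when ranking candidate generations for fine-tuning.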
-
**Large Language Model Is: Abilities**
- Testing theory of mind in large language models and humans - models-can-outperform-humans-in-tests-to-identify-mental-states) [20 May 2024]
- A Survey on Employing Large Language Models for Text-to-SQL Tasks - to-SQL tasks [21 Jul 2024]
- Emergent Abilities of Large Language Models - These abilities can be enhanced using few-shot and augmented prompting techniques. [ref](https://www.jasonwei.net/blog/emergence) [15 Jun 2022]
- Multitask Prompted Training Enables Zero-Shot Task Generalization - shot manner. [15 Oct 2021]
- Language Modeling Is Compression - Chinchilla 70B, while trained primarily on text, compresses ImageNet patches to 43.4% and LibriSpeech samples to 16.4% of their raw size, beating domain-specific compressors like PNG (58.5%) or FLAC (30.3%). [19 Sep 2023]
- WizardMath - Developed by adapting Evol-Instruct and Reinforcement Learning techniques, these models excel in math-related instructions like GSM8k and MATH. [git](https://github.com/nlpxucan/WizardLM) [18 Aug 2023] / Math solving Plugin: [Wolfram alpha](https://www.wolfram.com/wolfram-plugin-chatgpt/)
- LLMs Represent Space and Time - only training. [3 Oct 2023]
- Improving mathematical reasoning with process supervision
- Large Language Models for Software Engineering
- LLMs for Chip Design - Domain-Adapted LLMs for Chip Design [31 Oct 2023]
- Can LLMs Generate Novel Research Ideas? - A Large-Scale Human Study with 100+ NLP Researchers. We find LLM-generated ideas are judged as more novel (p < 0.05) than human expert ideas. However, the study revealed a lack of diversity in AI-generated ideas. [6 Sep 2024]
- Design2Code - End Engineering? `64% of cases GPT-4V
-
**Context constraints**
- Introducing 100K Context Windows
- Lost in the Middle: How Language Models Use Long Contexts
- Rotary Positional Embedding (RoPE) - embeddings/) / [doc](files/RoPE.pdf) [20 Apr 2021]
- Structured Prompting: Scaling In-Context Learning to 1,000 Examples
- keys
- Ring Attention - 1. Uses blockwise computation of self-attention to distribute long sequences across multiple devices while overlapping the communication of key-value blocks with the computation of blockwise attention. 2. Reduces the memory requirements of Transformers, enabling training on sequences more than 500 times longer than prior memory-efficient state-of-the-art methods, including sequences that exceed 100 million in length, without making approximations to attention. 3. Proposed as an enhancement to the blockwise parallel transformers (BPT) framework. [git](https://github.com/lhao499/llm_large_context) [3 Oct 2023]
- LLM Maybe LongLM - Self-Extend LLM Context Window Without Tuning. With only four lines of code modification, the proposed method can effortlessly extend existing LLMs' context window without any fine-tuning. [2 Jan 2024]
- Giraffe - long-context-llms/) [2 Jan 2024]
- “Needle in a Haystack” Analysis - 4](https://github.com/gkamradt/LLMTest_NeedleInAHaystack); [Long context prompting for Claude 2.1](https://www.anthropic.com/index/claude-2-1-prompting) `adding just one sentence, “Here is the most relevant sentence in the context:”, to the prompt resulted in near complete fidelity throughout Claude 2.1’s 200K context window.` [6 Dec 2023]
- Leave No Context Behind - Infini-attention incorporates a compressive memory into the vanilla attention mechanism and integrates both local and global attention. [10 Apr 2024]
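The "Needle in a Haystack" entry above describes hiding one fact inside a long context and, for Claude 2.1, appending a steering sentence to the prompt. A minimal prompt builder for such a test might look like the following; the function name, the filler text, and the needle are all hypothetical, and only the quoted steering sentence comes from the entry itself.

```python
def build_haystack_prompt(filler_sentences, needle, depth, question):
    """Insert `needle` at a relative depth (0.0 = start, 1.0 = end) of the context.

    The returned prompt ends with the steering sentence quoted in the
    Claude 2.1 long-context prompting note.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    position = round(depth * len(filler_sentences))
    context = filler_sentences[:position] + [needle] + filler_sentences[position:]
    return (
        "\n".join(context)
        + f"\n\nQuestion: {question}\n"
        + "Here is the most relevant sentence in the context:"
    )


filler = [f"Filler sentence number {i}." for i in range(10)]
needle = "The secret code is 7421."
prompt = build_haystack_prompt(filler, needle, depth=0.5, question="What is the secret code?")
print(prompt)
```

Sweeping `depth` and the number of filler sentences, then checking whether the model's answer contains the needle, gives the retrieval-by-position grid such analyses usually report.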
-
**OpenAI's Roadmap and Future Plans**
-
-
**Section 7** : Large Language Model: Landscape
-
**Large Language Models (in 2023)**
- LLMprices.dev - Compare prices for models like GPT-4, Claude Sonnet 3.5, Llama 3.1 405b and many more.
- AI Model Review
- Artificial Analysis
- Inside language models (from GPT to Olympus)
- LLM Pre-training and Post-training Paradigms
-
**Open-Source Large Language Models**
- KoAlpaca
- Open-Sora
- MEGALODON
- Huggingface Open LLM Leaderboard
- The LLM Index
- Chatbot Arena
- Alpaca - Fine-tuned from the LLaMA 7B model [Mar 2023]
- Llama 2
- Falcon LLM
- Koala
- dolly
- Cerebras-GPT
- GPT4All Download URL
- Pythia - A suite of decoder-only autoregressive language models ranging from 70M to 12B parameters [git](https://github.com/EleutherAI/pythia) [Apr 2023]
- Gemma - deepmind/gemma) [Feb 2024]
- OLMo - A fully open LLM that leverages sparse Mixture-of-Experts [Sep 2024]
- Qualcomm’s on-device AI models
- OpenELM - based language model. Four sizes of the model: 270M, 450M, 1.1B, and 3B parameters. [April 2024]
- Grok - A Mixture-of-Experts (MoE) model. Released under the Apache 2.0 license. Training code is not included. Developed with JAX [git](https://github.com/xai-org/grok) [March 17, 2024]
- LLM Collection
- ref
- DBRX - A general-purpose LLM created by Databricks. [git](https://github.com/databricks/dbrx) [27 Mar 2024]
- Jamba - A hybrid SSM-Transformer model. Mamba + Transformer + MoE [28 Mar 2024]
- The Open Source AI Definition
- Llama 3.2 - Text-only models (1B, 3B) and text-image models (11B, 90B), with quantized versions of 1B and 3B [Sep 2024]
- NotebookLlama
- ollama - supported models
- Nemotron-4 340B
- ref
- Llama 3.1 - 4o. [23 Jul 2024] / [Llama 3.2](https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/): Multimodal 11B and 90B models support image reasoning, alongside lightweight 1B and 3B models. [25 Sep 2024]
- NeMo
- Llama-3-Groq-Tool-Use
-
**Navigating the Generative AI Landscape**
- The Generative AI Revolution: Exploring the Current Landscape - ai-landscape.pdf) [28 Jun 2023]
- Diffusion Models vs. GANs vs. VAEs: Comparison of Deep Generative Models
-
**Evolutionary Tree of Large Language Models**
-
**A Taxonomy of Natural Language Processing**
-
**GPT for Domain Specific**
- Huggingface StarCoder: A State-of-the-Art LLM for Code
- Code Llama - llama-large-language-model-coding/) / [git](https://github.com/facebookresearch/codellama) [24 Aug 2023]
- BioGPT - Generative Pre-trained Transformer for Biomedical Text Generation and Mining [git](https://github.com/microsoft/BioGPT) [19 Oct 2022]
- SaulLM-7B
- Devin AI
- FrugalGPT - Requests are cascaded from low-cost to high-cost LLMs. [git](https://github.com/stanford-futuredata/FrugalGPT) [9 May 2023]
- BloombergGPT
- Galactica
- EarthGPT - A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain [30 Jan 2024]
- OpenDevin - An open-source project aiming to replicate Devin [Mar 2024]
- TimeGPT
-
**MLLM (multimodal large language model)**
- Multimodal Foundation Models: From Specialists to General-Purpose Assistants - language capabilities. Specific-Purpose 1. Visual understanding tasks 2. Visual generation tasks General-Purpose 3. General-purpose interface. [18 Sep 2023]
- LLaVA-1.5 - liu/LLaVA): Changing from a linear projection to an MLP cross-modal. [5 Oct 2023]
- Video-ChatGPT - oryx/Video-ChatGPT) [8 Jun 2023]
- CLIP - CLIP (Contrastive Language-Image Pretraining) is trained on a large number of internet text-image pairs and can be applied to a wide range of tasks with zero-shot learning. [git](https://github.com/openai/CLIP) [26 Feb 2021]
- MiniGPT-4 & MiniGPT-v2 - Enhancing Vision-language Understanding with Advanced Large Language Models [git](https://minigpt-4.github.io/) [20 Apr 2023]
- LLaVa - Large Language-and-Vision Assistant [git](https://llava-vl.github.io/) [17 Apr 2023]
- TaskMatrix, aka VisualChatGPT - anything.git) [8 Mar 2023]
- GroundingDINO - Marrying DINO with Grounded Pre-Training for Open-Set Object Detection [git](https://github.com/IDEA-Research/GroundingDINO) [9 Mar 2023]
- BLIP-2 - Former) / [git](https://github.com/salesforce/LAVIS/blob/main/lavis/models/blip2_models/blip2_qformer.py) / [ref](https://huggingface.co/blog/blip-2) / [Youtube](https://www.youtube.com/watch?v=k0DAtZCCl1w) / [BLIP](https://arxiv.org/abs/2201.12086): [[cnt](https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=arxiv%3A+2201.12086)]: [git](https://github.com/salesforce/BLIP) [28 Jan 2022]
- ref
- facebookresearch/ImageBind
- facebookresearch/segment-anything(SAM) - anything) [5 Apr 2023]
- facebookresearch/SeamlessM4T - An all-in-one multilingual multimodal AI translation and transcription model. This single model can perform speech-to-text, speech-to-speech, text-to-speech, and text-to-text translations for up to 100 languages depending on the task. [ref](https://about.fb.com/news/2023/08/seamlessm4t-ai-translation-model/) [22 Aug 2023]
- Models and libraries
- Kosmos-1
- Kosmos-2
- Kosmos-2.5
- BEiT-3 - Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks [22 Aug 2022]
- Gemini 1.5
- TaskMatrix.AI
- ref - memory-optim) [2 Jul 2023]
- Claude 3 Opus - 4 and Google’s Gemini 1.0 Ultra. Three variants: Opus, Sonnet, and Haiku. [Mar 2024]
- Chameleon - Early-fusion token-based mixed-modal models capable of understanding and generating images and text in any arbitrary sequence. The unified approach uses fully token-based representations for both image and textual modalities. [16 May 2024]
- Foundation Models
- ref
-
**Generative AI Landscape**
- The Generative AI Revolution: Exploring the Current Landscape - ai-landscape.pdf) [28 Jun 2023]
- Diffusion Models vs. GANs vs. VAEs: Comparison of Deep Generative Models
-
-
**Section 9: Relevant Solutions and Frameworks**
-
**Application Development and User Interface (UI/UX)**
- Generative AI Design Patterns: A Comprehensive Guide
- Opencopilot - Build and embed open-source AI Copilots into your product with ease. [Aug 2023]
-
**Solutions and Frameworks**
-
**Awesome demo**
- FRVR Official Teaser - AI-powered end-to-end game creation [16 Jun 2023]
- Mobile ALOHA
- Sora - Text-to-video model [Feb 2024]
-
-
What's the difference between Azure OpenAI and OpenAI?
-
**Section 4** : Langchain - Features, Usage, and Comparisons
-
**Langchain Impressive Features**
-
**Langchain Quick Start: How to Use**
-
-
**Section 4** : Langchain Features, Usage, and Comparisons
-
**Langchain Impressive Features**
-
**DSPy**
-
**Comparison: Langchain vs Its Competitors**
- LangChain - ai/langchain) [Oct 2022]
-
-
**Section 9: Applications and Frameworks**
-
**Agents: AutoGPT and Communicative Agents**
- Gorilla: An API store for LLMs
- cite
- Meta: Toolformer - pytorch) [9 Feb 2023]
- ToolLLM - Facilitating Large Language Models to Master 16000+ Real-world APIs [git](https://github.com/OpenBMB/ToolBench) [31 Jul 2023]
- The Rise and Potential of Large Language Model Based Agents: A Survey - based agents [[cnt](https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=arxiv%3A+2309.07864)] / [git](https://github.com/WooooDyy/LLM-Agent-Paper-List) [14 Sep 2023]
- AgentBench - Evaluating LLM-as-Agent’s reasoning and decision-making abilities. [7 Aug 2023]
- Hugging Face (camel-agents)
- Project Astra
- ref
- Self-Refine
- CRITIC
- MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action
- Efficient Tool Use with Chain-of-Abstraction Reasoning
- HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
- Understanding the planning of LLM agents: A survey
- Communicative Agents for Software Development
- AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
- MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework
- AI Agents That Matter
- APIGen - Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets [26 Jun 2024]
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
- ref
- AgentCoder: Multiagent-Code Generation with Iterative Testing and Optimisation
- LDB: A Large Language Model Debugger via Verifying Runtime Execution Step by Step
-
**Applications, Frameworks, and User Interface (UI/UX)**
- openai/shap-e - e)
- Drag Your GAN - Interactive Point-based Manipulation on the Generative Image Manifold [git](https://github.com/Zeqiang-Lai/DragGAN) [18 May 2023]
- Table to Markdown - LLMs can recognize Markdown-formatted tables more effectively than raw table formats.
- LM Studio
- Open-source GPT Wrappers - Next-Web](https://github.com/ChatGPTNextWeb/ChatGPT-Next-Web) 2. [FastGPT](https://github.com/labring/FastGPT) 3. [Lobe Chat](https://github.com/lobehub/lobe-chat) [Jan 2024]
- Generative AI Design Patterns: A Comprehensive Guide
- Pytorch
- Sentence Transformers - State-of-the-art sentence, text and image embeddings. Useful for semantic textual similarity, semantic search, or paraphrase mining. [git](https://github.com/UKPLab/sentence-transformers) [27 Aug 2019]
- MathPix - OCR](https://github.com/lukas-blecher/LaTeX-OCR) [Jan 2021]
- Nougat
-
**Defensive UX**
-
**LLM for Robotics: Bridging AI and Robotics**
- ref
- Mobile ALOHA - mobile-aloha-robot-learns-from-humans-to-cook-clean-do-laundry/) [4 Jan 2024] / [ALOHA](https://www.trossenrobotics.com/aloha.aspx): A Low-cost Open-source Hardware System for Bimanual Teleoperation.
- LeRobot - Machine learning for real-world robotics in PyTorch. [git](https://github.com/huggingface/lerobot) [Jan 2024]
- Figure 01 + OpenAI
-
**Awesome demo**
-
**Learning and Supplementary Materials**
-
**Application Development and User Interface (UI/UX)**
-
-
**Section 8: Survey and Reference**
-
**Survey on Large Language Models**
- SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension
- Large Language Models: A Survey
- A Survey of Transformers
- A Comprehensive Survey of AI-Generated Content (AIGC)
- Summary of ChatGPT/GPT-4 Research and Perspective Towards the Future of Large Language Models
- A Survey on Language Models for Code
- From Google Gemini to OpenAI Q* (Q-Star)
- Retool
- The State of Generative AI in the Enterprise
- Gemini - media/gemini/gemini_1_report.pdf)
- Research at Microsoft 2023
- A Survey on Multimodal Large Language Models
- Survey of Hallucination in Natural Language Generation
- Data Management For Large Language Models: A Survey
- Evaluating Large Language Models: A Comprehensive Survey
- A Survey of Techniques for Optimizing Transformer Inference
- A Cookbook of Self-Supervised Learning
- A Survey on In-context Learning
- An Overview on Language Models: Recent Developments and Outlook
- Efficient Guided Generation for Large Language Models
- Challenges & Application of LLMs
- A Survey on LLM-based Autonomous Agents
- A Survey on Efficient Training of Transformers
- Survey of Aligned LLMs
- Survey on Instruction Tuning for LLMs
- A Survey on Transformers in Reinforcement Learning
- Foundation Models in Vision
- Multimodal Deep Learning
- Universal and Transferable Adversarial Attacks on Aligned Language Models
- A Survey of LLMs for Healthcare
- Overview of Factuality in LLMs
- A Comprehensive Survey of Compression Algorithms for Language Models
- Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
- Stanford AI Index Annual Report
- Looking Back at 2020, and Forward to 2021
- Retool: Status of AI - of-ai-2023) -> [2024](https://retool.com/blog/state-of-ai-h1-2024)
-
**Build an LLM from scratch: picoGPT and lit-gpt**
- 1977
- ref - Read Starter Guide to Mastering Attention Mechanisms in Machine Learning [12 Jun 2023]
- ref
- ref - attention-from-scratch.html) [9 Feb 2023]
- ref
- Andrej Karpathy - Reproducing GPT-2 (124M) from scratch. [June 2024] / [SebastianRaschka](https://www.youtube.com/watch?v=kPGTx4wcm_w): Developing an LLM: Building, Training, Finetuning [June 2024]
-
**LLM Materials for East Asian Languages**
- LLM Research Project (LLM 研究プロジェクト)
- rinna
- rinna: bilingual-gpt-neox-4b
- What should be done to control LLMs? (LLM を制御するには何をするべきか?)
- Matsuo Lab - tokyo.ac.jp/人工知能・深層学習を学ぶためのロードマップ/) / [doc](files/archive/Matsuo_Lab_LLM_2023_Slide_pdf.7z) [Dec 2023]
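- ML system development transformed by large-scale language models (大規模言語モデルで変わる ML システム開発) [Mar 2023]
- A review of papers and technologies on ChatGPT/LLM that have appeared since the release of GPT-4 (GPT-4 登場以降に出てきた ChatGPT/LLM に関する論文や技術の振り返り) [Jun 2023]
- Survey of quantization techniques for efficient LLM inference (LLM の推論を効率化する量子化技術調査)
- AI Guidelines for Business (AI事業者ガイドライン)
- Collection of Qiita articles posted by BrainPad employees (ブレインパッド社員が投稿した Qiita 記事まとめ)
- Law: Guidelines for the use of generative AI (法律:生成 AI の利用ガイドライン)
- New Era of Computing - The new era brought about by ChatGPT (ChatGPT がもたらした新時代)
- 1. What multimodal generative AI models can do (生成 AI のマルチモーダルモデルでできること)
- On LLM output control and new models (LLM の出力制御や新モデルについて)
- LLM agents with code generation (コード生成を伴う LLM エージェント)
- Introducing the AI data analyst ‘물어보새’: using RAG and Text-To-SQL (AI 데이터 분석가 ‘물어보새’ 등장 – RAG와 Text-To-SQL 활용)
- Survey of trends in papers on tool augmentation for generative AI and LLMs (生成 AI・LLM のツール拡張に関する論文の動向調査)
- Technical survey on making LLM training and inference more efficient and faster (LLM の学習・推論の効率化・高速化に関する技術調査)
- Organizing "evaluation" as it relates to LLMs (LLMにまつわる"評価"を整理する)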
-
**Learning and Supplementary Materials**
- 12 Jun 2017
- Must read: the 100 most cited AI papers in 2022 - cited-2020-2021-2022-papers.pdf) [8 Mar 2023]
- Attention Is All You Need
- The Best Machine Learning Resources
- What are the most influential current AI Papers? - Arxiv) [31 Jul 2023]
- Comparing Adobe Firefly, Dalle-2, OpenJourney, Stable Diffusion, and Midjourney
- DAIR.AI
- Deep Learning cheatsheets for Stanford's CS 230
- LLM Visualization
- Open Problem and Limitation of RLHF
- But what is a GPT?
- Foundational concepts like Transformers, Attention, and Vector Database
-
-
**Learning and Supplementary Materials**
-
**Japanese Language Materials for LLMs**
-
-
**Section 10: General AI Tools and Extensions**
-
**Section 11: Datasets for LLM Training**
-
**Awesome demo**
- Self-Instruct - Starts from a seed task pool of human-written instructions. [20 Dec 2022]
- Self-Alignment with Instruction Backtranslation - response pairs. The process involves two steps: self-augmentation and self-curation. [11 Aug 2023]
- git - QA pairs or Dialog
- Anthropic human-feedback - Chosen and Rejected pairs
- Summary of datasets for large language models (大規模言語モデルのデータセットまとめ)
- FineWeb - High-quality web data from the summer of 2013 to March 2024. [Apr 2024]
-
-
**Section 12: Evaluating Large Language Models & LLMOps**
-
**Evaluating Large Language Models**
- HumanEval - Hand-Written Evaluation Set for a code generation benchmark. 164 human-written programming problems. [ref](https://paperswithcode.com/task/code-generation) / [git](https://github.com/openai/human-eval) [7 Jul 2021]
- LLM Model Evals vs LLM Task Evals
- Artificial Analysis LLM Performance Leaderboard
- A Survey on Evaluation of Large Language Models
- ChatGPT’s One-year Anniversary: Are Open-Source Large Language Models Catching up? - Open-Source LLMs vs. ChatGPT; Benchmarks and Performance of LLMs [28 Nov 2023]
- MMLU (Massive Multi-task Language Understanding)
- BIG-bench - Bottom-up approach; anyone can submit an evaluation task. [git](https://github.com/google/BIG-bench) [9 Jun 2022]
- HELM - Top-down approach; experts curate and decide what tasks to evaluate models on. [git](https://github.com/stanford-crfm/helm) [16 Nov 2022]
- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models - An open-source large language model with 13 billion parameters, designed specifically for evaluation tasks. [12 Oct 2023]
- LLM-as-a-Judge - LLM-as-a-Judge offers a quick, cost-effective way to develop models aligned with human preferences and is easy to implement with just a prompt, but should be complemented by human evaluation to address biases. [Jul 2024]
- Can Large Language Models Be an Alternative to Human Evaluations?
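Code-generation benchmarks such as HumanEval are commonly scored with pass@k: the probability that at least one of k sampled completions passes the unit tests. The listing above does not spell out the metric, so the following is only a minimal sketch of the usual unbiased estimator, assuming `n` samples were generated per problem and `c` of them passed; the function name `pass_at_k` is illustrative and not taken from any listed repository.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: total completions sampled for the problem (assumed n >= k)
    c: number of those completions that passed the tests
    k: budget of attempts being scored
    """
    if n - c < k:
        # Fewer than k failing samples exist, so every size-k subset
        # contains at least one passing completion.
        return 1.0
    # 1 - P(all k drawn samples are failures), drawing without replacement.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems given (n, c) pairs per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

A benchmark-level score is the mean over all problems, for example `mean_pass_at_k([(10, 5), (10, 0)], k=1)` for two problems with ten samples each.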
-
**Evaluation metrics**
- ref - us/azure/machine-learning/prompt-flow/how-to-bulk-test-evaluate-flow)
-
**Challenges in evaluating AI systems**
- cite
- Pretraining on the Test Set Is All You Need
- Your AI Product Needs Evals - ai.com/blog/how-to-evaluate-llm-applications) [7 Nov 2023]
-
**LLMOps: Large Language Model Operations**
- ref
- Prompt flow - A suite of development tools for LLM-based AI applications [Sep 2023]
-
**LLM Evaluation Benchmarks**
- TruthfulQA
- WMT
- Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
- SWE-bench - Real-world software issues sourced from GitHub.
- MBPP
- Chatbot Arena - Human-ranked Elo ranking.
- MT Bench - Multi-turn open-ended questions
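The Elo ranking mentioned for Chatbot Arena can be illustrated with the classic pairwise update rule. This is a minimal sketch of textbook Elo, not the leaderboard's actual implementation; the K-factor of 32, the starting rating of 1000, and the function names are assumptions chosen for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the Elo model (base 10, scale 400)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> tuple[float, float]:
    """Return updated ratings after one comparison.

    score_a is 1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie.
    k is an assumed K-factor controlling how fast ratings move.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's actual and expected scores are the complements of A's.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Two hypothetical models start level; model A wins one human vote.
model_a, model_b = update_elo(1000.0, 1000.0, score_a=1.0)
```

Because the two updates are symmetric, the sum of both ratings is unchanged by each comparison; only the gap between the models moves.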
-
**Evaluation Benchmark**
-