https://github.com/warner-benjamin/modernllmstudygroup

Host: GitHub
URL: https://github.com/warner-benjamin/modernllmstudygroup
Owner: warner-benjamin
License: mit
Created: 2023-07-31T22:24:38.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2024-05-19T21:47:55.000Z (about 2 years ago)
Last Synced: 2025-03-30T17:04:26.127Z (about 1 year ago)
Language: Jupyter Notebook
Size: 29.6 MB
Stars: 22
Watchers: 5
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Modern LLM Study Group Resources

This repository is for the fastai Modern LLM paper reading study group. Here you can find the papers we covered along with any extra resources.

We are working our way through the seminal LLM papers starting with the GPT-3 paper, [*Language Models are Few-Shot Learners*](https://arxiv.org/abs/2005.14165).

The plan is to read our way through all the modern LLM methods mentioned by Andrej Karpathy in his [*The State of GPT*](https://www.youtube.com/watch?v=bZQun8Y4L2A) talk, along with any new developments since then.

The study group is coordinated through the [fastai Discord](https://forums.fast.ai/t/discord-live-coding-details/75370) in the #cluster-of-stars text channel and currently meets weekly on Fridays at 23:00 UTC (7 pm Eastern) in the #fastai-study-groups voice channel.

## LLM Paper Reading List

Each paper has its own README with a direct link to the paper, a summary, further reading (for most papers), and some supporting materials in the section's references folder.

### Intro to Modern LLMs

1. [*Language Models are Few-Shot Learners*](Intro_to_Modern_LLMs/Language_Models_are_Few_Shot_Learners.md)
2. [*Finetuned Language Models Are Zero-Shot Learners*](Intro_to_Modern_LLMs/Finetuned_Language_Models_Are_Zero_Shot_Learners.md)
3. [*Chain-of-Thought Prompting Elicits Reasoning in Large Language Models*](Intro_to_Modern_LLMs/Chain_of_Thought_Prompting_Elicits_Reasoning_in_Large_Language_Models.md)
4. [*Training language models to follow instructions with human feedback*](Intro_to_Modern_LLMs/Training_Language_Models_to_Follow_Instructions_with_Human_Feedback.md)
5. [*LoRA: Low-Rank Adaptation of Large Language Models*](Intro_to_Modern_LLMs/LoRA_Low_Rank_Adaptation_of_Large_Language_Models.md)
6. [*Evaluating Large Language Models Trained on Code*](Intro_to_Modern_LLMs/Evaluating_Large_Language_Models_Trained_on_Code.md)

### Retrieval, Chain of Thought, & Tool Use

7. [*Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering*](Retrieval_Chain_of_Thought_Tool_Use/Leveraging_Passage_Retrieval_with_Generative_Models_for_Open_Domain_Question_Answering.md)
8. [*Atlas: Few-shot Learning with Retrieval Augmented Language Models*](Retrieval_Chain_of_Thought_Tool_Use/Atlas_Few_shot_Learning_with_Retrieval_Augmented_Language_Models.md)
9. [*In-Context Retrieval-Augmented Language Models*](Retrieval_Chain_of_Thought_Tool_Use/In_Context_Retrieval_Augmented_Language_Models.md)
10. [*ReAct: Synergizing Reasoning and Acting in Language Models*](Retrieval_Chain_of_Thought_Tool_Use/ReAct_Synergizing_Reasoning_and_Acting_in_Language_Models.md)
11. [*Toolformer: Language Models Can Teach Themselves to Use Tools*](Retrieval_Chain_of_Thought_Tool_Use/Toolformer_Language_Models_Can_Teach_Themselves_to_Use_Tools.md)
12. [*SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking*](Retrieval_Chain_of_Thought_Tool_Use/SequenceMatch_Imitation_Learning_for_Autoregressive_Sequence_Modelling_with_Backtracking.md)
13. [Chain of Papers: Multiple Chain of Thought Papers](Retrieval_Chain_of_Thought_Tool_Use/Chain_of_Papers_Multiple_Chain_of_Thought_Papers.md)
14. [*DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines*](Retrieval_Chain_of_Thought_Tool_Use/DSPy_Compiling_Declarative_Language_Model_Calls_into_Self_Improving_Pipelines.md)
15. [*Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection*](Retrieval_Chain_of_Thought_Tool_Use/Self_RAG_Learning_to_Retrieve_Generate_and_Critique_through_Self_Reflections.md)
16. [*TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise*](Retrieval_Chain_of_Thought_Tool_Use/TeacherLM_Teaching_to_Fish_Rather_Than_Giving_the_Fish_Language_Modeling_Likewise.md)

### Pretraining Data

17. [*The Pile: An 800GB Dataset of Diverse Text for Language Modeling*](Pretraining_Data/The_Pile_An_800GB_Dataset_of_Diverse_Text_for_Language_Modeling.md)
18. [*TinyStories: How Small Can Language Models Be and Still Speak Coherent English?*](Pretraining_Data/TinyStories_How_Small_Can_Language_Models_Be_and_Still_Speak_Coherent_English.md)
19. [*LLaMA: Open and Efficient Foundation Language Models*](Pretraining_Data/LLaMA_Open_and_Efficient_Foundation_Language_Models.md)
20. [*D4: Improving LLM Pretraining via Document De-Duplication and Diversification*](Pretraining_Data/D4_Improving_LLM_Pretraining_via_Document_De_Duplication_and_Diversification.md)
21. [*DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining*](Pretraining_Data/DoReMi_Optimizing_Data_Mixtures_Speeds_Up_Language_Model_Pretraining.md)
22. [*Training Data for the Price of a Sandwich: Common Crawl's Impact on Generative AI*](Pretraining_Data/Training_Data_for_the_Price_of_a_Sandwich_Common_Crawl_Impact_on_Generative_AI.md)
23. [*How to Train Data-Efficient LLMs*](Pretraining_Data/How_to_Train_Data_Efficient_LLMs.md)

### Synthetic Data

24. [*Textbooks Are All You Need and Textbooks Are All You Need II: phi-1.5 technical report*](Synthetic_Data/Textbooks_Are_All_You_Need_and_Textbooks_Are_All_You_Need_II_phi_1.5_technical_report.md)
25. [*Cosmopedia*](Synthetic_Data/Cosmopedia.md)

### RLHF (Reinforcement Learning from Human Feedback)

26. [*Training Language Models to Follow Instructions with Human Feedback*](RLHF/Training_Language_Models_to_Follow_Instructions_with_Human_Feedback.md)
27. [*Constitutional AI: Harmlessness from AI Feedback*](RLHF/Constitutional_AI_Harmlessness_from_AI_Alignment.md)
28. [*Direct Preference Optimization: Your Language Model is Secretly a Reward Model*](RLHF/Direct_Preference_Optimization_Your_Language_Model_is_Secretly_a_Reward_Model.md)
29. [*KTO: Model Alignment as Prospect Theoretic Optimization*](RLHF/KTO_Model_Alignment_as_Prospect_Theoretic_Optimization.md)
30. [*ORPO: Monolithic Preference Optimization without Reference Model*](RLHF/ORPO_Monolithic_Preference_Optimization_without_Reference_Model.md)
31. [*RewardBench: Evaluating Reward Models for Language Modeling*](RLHF/RewardBench_Evaluating_Reward_Models_for_Language_Modeling.md)

### Finetuning

32. [*Orca: Progressive Learning from Complex Explanation Traces of GPT-4*](Finetuning/Orca_Progressive_Learning_from_Complex_Explanation_Traces_of_GPT_4.md)
33. [*QLoRA: Efficient Finetuning of Quantized LLMs*](Finetuning/QLoRA_Efficient_Finetuning_of_Quantized_LLMs.md)
