https://github.com/warner-benjamin/modernllmstudygroup
- Host: GitHub
- URL: https://github.com/warner-benjamin/modernllmstudygroup
- Owner: warner-benjamin
- License: mit
- Created: 2023-07-31T22:24:38.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2024-05-19T21:47:55.000Z (about 2 years ago)
- Last Synced: 2025-03-30T17:04:26.127Z (about 1 year ago)
- Language: Jupyter Notebook
- Size: 29.6 MB
- Stars: 22
- Watchers: 5
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Modern LLM Study Group Resources
This repository is for the fastai Modern LLM paper reading study group. Here you can find the papers we covered along with any extra resources.
We are working our way through the seminal LLM papers starting with the GPT-3 paper, [*Language Models are Few-Shot Learners*](https://arxiv.org/abs/2005.14165).
The plan is to read our way through all the modern LLM methods mentioned by Andrej Karpathy in his [*The State of GPT*](https://www.youtube.com/watch?v=bZQun8Y4L2A) talk, along with any new developments since then.
The study group is coordinated through the [fastai discord](https://forums.fast.ai/t/discord-live-coding-details/75370) in the #cluster-of-stars text channel and currently meets weekly on Fridays at 2300 UTC (7pm Eastern) in the #fastai-study-groups voice channel.
## LLM Paper Reading List
Each paper has its own README with a direct link, a summary, further reading (for most papers), and supporting materials in its section's references folder.
### Intro to Modern LLMs
1. [*Language Models are Few-Shot Learners*](Intro_to_Modern_LLMs/Language_Models_are_Few_Shot_Learners.md)
2. [*Finetuned Language Models Are Zero-Shot Learners*](Intro_to_Modern_LLMs/Finetuned_Language_Models_Are_Zero_Shot_Learners.md)
3. [*Chain-of-Thought Prompting Elicits Reasoning in Large Language Models*](Intro_to_Modern_LLMs/Chain_of_Thought_Prompting_Elicits_Reasoning_in_Large_Language_Models.md)
4. [*Training language models to follow instructions with human feedback*](Intro_to_Modern_LLMs/Training_Language_Models_to_Follow_Instructions_with_Human_Feedback.md)
5. [*LoRA: Low-Rank Adaptation of Large Language Models*](Intro_to_Modern_LLMs/LoRA_Low_Rank_Adaptation_of_Large_Language_Models.md)
6. [*Evaluating Large Language Models Trained on Code*](Intro_to_Modern_LLMs/Evaluating_Large_Language_Models_Trained_on_Code.md)
### Retrieval, Chain of Thought, & Tool Use
7. [*Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering*](Retrieval_Chain_of_Thought_Tool_Use/Leveraging_Passage_Retrieval_with_Generative_Models_for_Open_Domain_Question_Answering.md)
8. [*Atlas: Few-shot Learning with Retrieval Augmented Language Models*](Retrieval_Chain_of_Thought_Tool_Use/Atlas_Few_shot_Learning_with_Retrieval_Augmented_Language_Models.md)
9. [*In-Context Retrieval-Augmented Language Models*](Retrieval_Chain_of_Thought_Tool_Use/In_Context_Retrieval_Augmented_Language_Models.md)
10. [*ReAct: Synergizing Reasoning and Acting in Language Models*](Retrieval_Chain_of_Thought_Tool_Use/ReAct_Synergizing_Reasoning_and_Acting_in_Language_Models.md)
11. [*Toolformer: Language Models Can Teach Themselves to Use Tools*](Retrieval_Chain_of_Thought_Tool_Use/Toolformer_Language_Models_Can_Teach_Themselves_to_Use_Tools.md)
12. [*SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking*](Retrieval_Chain_of_Thought_Tool_Use/SequenceMatch_Imitation_Learning_for_Autoregressive_Sequence_Modelling_with_Backtracking.md)
13. [Chain of Papers: Multiple Chain of Thought Papers](Retrieval_Chain_of_Thought_Tool_Use/Chain_of_Papers_Multiple_Chain_of_Thought_Papers.md)
14. [*DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines*](Retrieval_Chain_of_Thought_Tool_Use/DSPy_Compiling_Declarative_Language_Model_Calls_into_Self_Improving_Pipelines.md)
15. [*Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection*](Retrieval_Chain_of_Thought_Tool_Use/Self_RAG_Learning_to_Retrieve_Generate_and_Critique_through_Self_Reflections.md)
16. [*TeacherLM: Teaching to Fish Rather Than Giving the Fish: Language Modeling Likewise*](Retrieval_Chain_of_Thought_Tool_Use/TeacherLM_Teaching_to_Fish_Rather_Than_Giving_the_Fish_Language_Modeling_Likewise.md)
### Pretraining Data
17. [*The Pile: An 800GB Dataset of Diverse Text for Language Modeling*](Pretraining_Data/The_Pile_An_800GB_Dataset_of_Diverse_Text_for_Language_Modeling.md)
18. [*TinyStories: How Small Can Language Models Be and Still Speak Coherent English?*](Pretraining_Data/TinyStories_How_Small_Can_Language_Models_Be_and_Still_Speak_Coherent_English.md)
19. [*LLaMA: Open and Efficient Foundation Language Models*](Pretraining_Data/LLaMA_Open_and_Efficient_Foundation_Language_Models.md)
20. [*D4: Improving LLM Pretraining via Document De-Duplication and Diversification*](Pretraining_Data/D4_Improving_LLM_Pretraining_via_Document_De_Duplication_and_Diversification.md)
21. [*DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining*](Pretraining_Data/DoReMi_Optimizing_Data_Mixtures_Speeds_Up_Language_Model_Pretraining.md)
22. [*Training Data for the Price of a Sandwich: Common Crawl's Impact on Generative AI*](Pretraining_Data/Training_Data_for_the_Price_of_a_Sandwich_Common_Crawl_Impact_on_Generative_AI.md)
23. [*How to Train Data-Efficient LLMs*](Pretraining_Data/How_to_Train_Data_Efficient_LLMs.md)
### Synthetic Data
24. [*Textbooks Are All You Need* and *Textbooks Are All You Need II: phi-1.5 technical report*](Synthetic_Data/Textbooks_Are_All_You_Need_and_Textbooks_Are_All_You_Need_II_phi_1.5_technical_report.md)
25. [*Cosmopedia*](Synthetic_Data/Cosmopedia.md)
### RLHF (Reinforcement Learning from Human Feedback)
26. [*Training Language Models to Follow Instructions with Human Feedback*](RLHF/Training_Language_Models_to_Follow_Instructions_with_Human_Feedback.md)
27. [*Constitutional AI: Harmlessness from AI Feedback*](RLHF/Constitutional_AI_Harmlessness_from_AI_Alignment.md)
28. [*Direct Preference Optimization: Your Language Model is Secretly a Reward Model*](RLHF/Direct_Preference_Optimization_Your_Language_Model_is_Secretly_a_Reward_Model.md)
29. [*KTO: Model Alignment as Prospect Theoretic Optimization*](RLHF/KTO_Model_Alignment_as_Prospect_Theoretic_Optimization.md)
30. [*ORPO: Monolithic Preference Optimization without Reference Model*](RLHF/ORPO_Monolithic_Preference_Optimization_without_Reference_Model.md)
31. [*RewardBench: Evaluating Reward Models for Language Modeling*](RLHF/RewardBench_Evaluating_Reward_Models_for_Language_Modeling.md)
### Finetuning
32. [*Orca: Progressive Learning from Complex Explanation Traces of GPT-4*](Finetuning/Orca_Progressive_Learning_from_Complex_Explanation_Traces_of_GPT_4.md)
33. [*QLoRA: Efficient Finetuning of Quantized LLMs*](Finetuning/QLoRA_Efficient_Finetuning_of_Quantized_LLMs.md)