https://github.com/rraghavkaushik/nlp-learning-resources
List of latest papers and blogs for NLP
llm-papers llms mechanistic-interpretability mlsys natural-language-processing nlp-learning-resources nlp-papers reinforcement-learning rlhf scaling-laws transformers
- Host: GitHub
- URL: https://github.com/rraghavkaushik/nlp-learning-resources
- Owner: rraghavkaushik
- Created: 2025-06-23T11:53:24.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-10-05T09:20:48.000Z (9 months ago)
- Last Synced: 2025-10-05T11:32:17.282Z (9 months ago)
- Topics: llm-papers, llms, mechanistic-interpretability, mlsys, natural-language-processing, nlp-learning-resources, nlp-papers, reinforcement-learning, rlhf, scaling-laws, transformers
- Homepage:
- Size: 13.7 KB
- Stars: 7
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Learning-Resources
A compilation of resources for keeping up with the latest trends in NLP.
> **Note:** This resource list is a work in progress. More papers and topics will be added regularly. Contributions and suggestions are welcome!
## Some Fundamental Transformers
1. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding - https://arxiv.org/abs/1810.04805
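2. GPT-1: Improving Language Understanding by Generative Pre-Training - https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf
3. GPT-2: Language Models are Unsupervised Multitask Learners - https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
4. T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer - https://arxiv.org/abs/1910.10683
5. XLNet: Generalized Autoregressive Pretraining for Language Understanding - https://arxiv.org/pdf/1906.08237
6. RoBERTa: A Robustly Optimized BERT Pretraining Approach - https://arxiv.org/abs/1907.11692
7. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations - https://arxiv.org/abs/1909.11942
8. Longformer: The Long-Document Transformer - https://arxiv.org/abs/2004.05150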
## Papers for Understanding Fundamentals
1. Attention Is All You Need - https://arxiv.org/pdf/1706.03762
2. Memory Is All You Need - https://arxiv.org/pdf/2406.08413
3. Language Models are Few-Shot Learners - https://arxiv.org/abs/2005.14165
## Reinforcement Learning for LLMs
1. Basics of RL - OpenAI Spinning Up - https://spinningup.openai.com/en/latest/spinningup/rl_intro.html
2. Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism - https://arxiv.org/abs/2305.18438
3. InstructGPT: Training language models to follow instructions with human feedback - https://arxiv.org/abs/2203.02155
DPO:
1. DPO paper: https://arxiv.org/pdf/2305.18290
2. Blog - Math behind DPO - https://www.tylerromero.com/posts/2024-04-dpo/
PPO:
1. Proximal Policy Optimization Algorithms - https://arxiv.org/pdf/1707.06347
2. PPO docs - OpenAI Spinning Up - https://spinningup.openai.com/en/latest/algorithms/ppo.html
GRPO:
1. DeepSeekMath - https://arxiv.org/abs/2402.03300
2. Blog - GRPO Explained - https://aipapersacademy.com/deepseekmath-grpo/
3. DeepSeek-R1 - https://arxiv.org/pdf/2501.12948
## Mechanistic Interpretability
1. Basic Mech Interp Essay - https://www.transformer-circuits.pub/2022/mech-interp-essay
2. Neural Networks, Manifolds, and Topology (toy neural nets with low-dimensional inputs) - https://colah.github.io/posts/2014-03-NN-Manifolds-Topology/
3. Mechanistic Interpretability for AI Safety -- A Review - https://arxiv.org/abs/2404.14082
4. A Mathematical Framework for Transformer Circuits - https://transformer-circuits.pub/2021/framework/index.html
5. Circuit Tracing: Revealing Computational Graphs in Language Models - https://transformer-circuits.pub/2025/attribution-graphs/methods.html#evaluating-model
## Scaling Laws
1. Scaling Laws for Neural Language Models - https://arxiv.org/pdf/2001.08361
2. Scaling Laws for Autoregressive Generative Modeling - https://arxiv.org/pdf/2010.14701
## MLSys
1. Matrix Multiplication Background - NVIDIA Deep Learning Performance docs - https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html
2. GPU Performance Background - NVIDIA Deep Learning Performance docs - https://docs.nvidia.com/deeplearning/performance/dl-performance-gpu-background/index.html#gpu-arch__fig2