https://github.com/jawherkl/llm-foundations
A structured learning path for software engineers to master Large Language Models through theory, practical exercises, and project implementation.
https://github.com/jawherkl/llm-foundations
foundations llm models
Last synced: 9 months ago
JSON representation
A structured learning path for software engineers to master Large Language Models through theory, practical exercises, and project implementation.
- Host: GitHub
- URL: https://github.com/jawherkl/llm-foundations
- Owner: JawherKl
- License: mit
- Created: 2025-09-09T09:14:15.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-09-10T14:24:00.000Z (9 months ago)
- Last Synced: 2025-09-10T14:28:32.106Z (9 months ago)
- Topics: foundations, llm, models
- Homepage:
- Size: 567 KB
- Stars: 2
- Watchers: 0
- Forks: 2
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
- Security: SECURITY.md
- Roadmap: roadmap.md
Awesome Lists containing this project
README
# LLM Foundations: From Theory to Production π







A structured learning path and project portfolio for software engineers to master **Large Language Models**. This repository moves beyond theory, focusing on the practical application of LLMs through prompt engineering, API integration, and building scalable applications.
> **π‘ For Developers, By a Developer:** This isn't just a list of concepts. It's a hands-on curriculum designed to take you from foundational understanding to building production-ready LLM applications.
## π Overview
The field of Large Language Models is moving fast. This repository provides a structured path to not just keep up, but to become proficient. It's organized into a **28-step curriculum** that balances deep theoretical understanding with immediate, practical application.
Whether you're building AI-powered features into your product, automating workflows, or launching a new AI-based service, this guide will help you develop the necessary skills.
## π§ The 28-Day Learning Path
The curriculum is divided into six logical parts:
| Part | Focus Area | What You'll Achieve |
|:---|:---|:---|
| **1. Theory Foundations** | Core Concepts | Understand how LLMs work under the hood |
| **2. Prompt Engineering** | Communication | Master the art of guiding LLMs to desired outputs |
| **3. Practical Applications** | API Integration | Build functional applications using various LLM APIs |
| **4. Advanced Topics** | Production Systems | Implement RAG, work with vector DBs, and build agents |
| **5. Build Projects** | Portfolio Development | Create showcase projects for your portfolio |
| **6. Next Steps** | Career Planning | Define your specialization and next learning goals |
## ποΈ Repository Structure
```bash
llm-foundations/
βββ 01-theory-foundations/ # Days 1-8: How LLMs work
βββ 02-prompt-engineering/ # Days 9-16: Effective prompting
βββ 03-practical-applications/ # Days 17-18: API integration & simple apps
βββ 04-advanced-topics/ # Days 19-27: RAG, vector DBs, agents
βββ 05-build-projects/ # Portfolio project development
βββ 06-reflection-next-steps/ # Day 28: Planning your path forward
βββ resources/ # Cheatsheets, tools, reading lists
βββ quizzes/ # Self-assessment tools
```
## π Learning Resources
To make the most of this curriculum, you'll want to be familiar with these core technologies and have these tools ready.
### π Prerequisite Knowledge
* **Python Programming**: Intermediate proficiency (functions, classes, decorators, async/await)
* **API Concepts**: REST APIs, HTTP requests, authentication (API keys)
* **Basic Command Line**: Navigating directories, running scripts, managing environments
* **Git & GitHub**: Cloning repositories, making commits, creating pull requests
### π§ Essential Tools & Accounts
| Category | Tools & Services | Description |
|:---|:---|:---|
| **Development** | Python 3.10+, VS Code, Jupyter Notebook | Core coding environment |
| **API Access** | [OpenAI](https://platform.openai.com/), [Anthropic](https://console.anthropic.com/), [Cohere](https://dashboard.cohere.com/) | Accounts for LLM API access (some offer free credits) |
| **Open Source LLMs** | [Ollama](https://ollama.ai/), [LM Studio](https://lmstudio.ai/) | Run models locally on your machine |
| **Vector Databases** | [Pinecone](https://www.pinecone.io/), [Chroma](https://www.trychroma.com/), [Weaviate](https://weaviate.io/) | For RAG implementations (free tiers available) |
| **UI Frameworks** | [Streamlit](https://streamlit.io/), [Gradio](https://www.gradio.app/) | For building web interfaces for your LLM apps |
| **Prompt Tools** | [PromptHero](https://prompthero.com/), [FlowGPT](https://flowgpt.com/) | For prompt inspiration and testing |
### π Recommended Learning Path
1. **Setup Your Environment**: Install Python, create a virtual environment, and install key packages (`openai`, `langchain`, `streamlit`)
2. **Get API Access**: Sign up for OpenAI/Anthropic and get your API keys
3. **Install Ollama**: Follow the [Ollama installation guide](https://github.com/ollama/ollama) to run models locally
4. **Clone This Repo**: `git clone https://github.com/JawherKl/llm-foundations.git`
5. **Explore the Structure**: Review the repository organization and learning path
### π‘ Pro Tips
* **Start Small**: Begin with simple API calls before tackling complex frameworks
* **Use Free Tiers**: Most LLM APIs offer free credits to get started
* **Experiment Locally**: Use Ollama with smaller models (like Llama 3) for experimentation without API costs
* **Document Your Learning**: Keep notes on what works and what doesn't - this becomes valuable reference material
* **Join Communities**: Participate in Discord servers and subreddits like r/LocalLLaMA, r/LangChain, and AI developer communities
### π Free Resources to Supplement Learning
* [DeepLearning.AI Short Courses](https://www.deeplearning.ai/short-courses/) - Free courses on LLMs, ChatGPT, and LangChain
* [Andrew Ng's YouTube Channel](https://www.youtube.com/channel/UCep6Rpvw3PtOMJWAFpKl8Yw) - Excellent explanations of AI concepts
* [Full Stack LLM Bootcamp](https://fullstackdeeplearning.com/llm-bootcamp/) - Comprehensive video series on building LLM applications
* [Hugging Face Course](https://huggingface.co/course/chapter1) - Great for understanding transformers and open-source models
---
## π Getting Started
### For the Structured Learner (Recommended)
1. **Start with Theory**: Begin with `01-theory-foundations/README.md`
2. **Follow the Path**: Progress through each section in order
3. **Build as You Learn**: Implement projects in `05-build-projects/` as you acquire relevant skills
4. **Assess Your Knowledge**: Use the quizzes to validate your understanding
### For the Project-Focused Learner
1. **Skim the Theory**: Review `01-theory-foundations/04-key-terminologies.md`
2. **Master Prompting**: Study `02-prompt-engineering/` thoroughly
3. **Pick a Project**: Choose a project from `05-build-projects/project-ideas.md`
4. **Learn as You Build**: Reference specific sections as needed for your project
### For the Experienced Developer
1. **Assessment First**: Take the `quizzes/final-assessment.md` to identify knowledge gaps
2. **Targeted Learning**: Focus on sections where you need reinforcement
3. **Contribute**: Share your expertise by improving content or adding new examples
## π οΈ Tech Stack & Tools
This curriculum prepares you to work with:
- **LLM APIs**: OpenAI GPT, Anthropic Claude, Cohere, OpenRouter
- **Frameworks**: LangChain, LlamaIndex, Haystack
- **Vector Databases**: Pinecone, Chroma, Weaviate, Qdrant
- **UI Tools**: Streamlit, Gradio, Chainlit
- **Open Source Models**: Llama 2/3, Mistral, Phi via Ollama
- **Development**: Python, Jupyter Notebooks, Docker
## π€ How to Contribute
We welcome contributions! Here's how you can help:
1. **Fix Errors**: Found a mistake? Submit a PR with corrections
2. **Add Examples**: Share your prompt engineering examples or code samples
3. **Improve Explanations**: Help make complex concepts more accessible
4. **Share Projects**: Add your LLM projects to the build-projects section
5. **Suggest Resources**: Recommend great learning materials
Please read our [Contributing Guidelines](CONTRIBUTING.md) before submitting a pull request.
## π Bibliography & Further Reading
This repository synthesizes knowledge from a wide array of exceptional resources. The following books, articles, papers, and documentation were instrumental in its creation and serve as recommended reading for those who wish to dive deeper.
### Foundational Papers
* **[[1706.03762] Attention Is All You Need](https://arxiv.org/abs/1706.03762)** - Vaswani et al. (2017) - The seminal paper introducing the Transformer architecture.
* **[[1810.04805] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805)** - Devlin et al. (2018) - Introduced the encoder-only Transformer and masked language modeling.
* **[[2005.14165] Language Models are Few-Shot Learners (GPT-3 Paper)](https://arxiv.org/abs/2005.14165)** - Brown et al. (2020) - Demonstrated the remarkable scaling and few-shot abilities of large autoregressive models.
* **[[1910.10683] Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (T5 Paper)](https://arxiv.org/abs/1910.10683)** - Raffel et al. (2019) - Reframed all NLP tasks into a text-to-text format.
### Essential Books & Online Books
* **["Natural Language Processing with Transformers"](https://www.oreilly.com/library/view/natural-language-processing/9781098136789/)** by Tunstall, von Werra, & Wolf - The definitive practical guide to using the Hugging Face ecosystem.
* **["Transformers for Natural Language Processing"](https://www.packtpub.com/product/transformers-for-natural-language-processing-second-edition/9781803247335)** by Denis Rothman - A comprehensive guide to Transformer models.
* **["Hands-On Large Language Models"](https://www.oreilly.com/library/view/hands-on-large-language/9781098150952/)** by Suraj Patil & others - A very practical, project-based approach.
* **["The OpenAI API Book"](https://www.linkedin.com/pulse/openai-api-book-build-ai-products-ship-faster-smarper-michael-king/)** by Michael King - A great resource focused on practical API usage.
### Influential Blogs & Articles
* **Jay Alammar's Blog ([The Illustrated Transformer](https://jalammar.github.io/illustrated-transformer/))** - Legendary visual explanations of complex ML concepts.
* **Lil'Log ([Prompt Engineering](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/))** by Lilian Weng - In-depth and technical overview of prompt engineering techniques.
* **Andrej Karpathy's Blog ([AI for Full-Self Driving](https://karpathy.github.io/))** - While focused on AI for cars, his writing on software 2.0 and NN training is foundational.
* **Simon Willison's Blog ([LLM tag](https://simonwillison.net/tags/llm/))** - A prolific writer on practical LLM applications and emerging patterns.
* **EMAXX.IO ([RTF and CRISPE Frameworks](https://emaxx.io/blog/posts/rtf_crispe_frameworks_for_prompt_engineering.html))** - Excellent breakdown of prompt engineering frameworks.
### Official Documentation
* **[OpenAI API Documentation](https://platform.openai.com/docs/introduction)** - The source for all things GPT, embeddings, and fine-tuning on OpenAI's platform.
* **[Anthropic API Documentation](https://docs.anthropic.com/claude/docs)** - Comprehensive guide to using Claude models.
* **[LangChain Documentation](https://python.langchain.com/docs/get_started/introduction)** - Essential for building complex, multi-step LLM applications.
* **[LlamaIndex Documentation](https://docs.llamaindex.ai/en/stable/)** - The best resource for learning about Retrieval-Augmented Generation (RAG).
* **[Hugging Face Transformers Documentation](https://huggingface.co/docs/transformers/index)** - The go-to resource for working with open-source models.
### Courses & Video Series
* **[Full Stack LLM Bootcamp](https://fullstackdeeplearning.com/llm-bootcamp/)** - A free, excellent video series on building production LLM apps.
* **[DeepLearning.AI Short Courses](https://www.deeplearning.ai/short-courses/)** - Specifically "ChatGPT Prompt Engineering for Developers" and "LangChain for LLM Application Development".
* **[CS324 - Large Language Models](https://stanford-cs324.github.io/winter2022/)** - Stanford's course on LLMs, covering fundamentals and advanced topics.
### Community & Inspiration
* **r/LocalLLaMA** - The central Reddit community for open-source LLMs.
* **Hugging Face Discord** - A vibrant community for discussion and help with open-source models.
* **LangChain Discord** - Great for getting help with the LangChain framework.
* **AI Engineer Summit Talks ([YouTube](https://www.youtube.com/@aiDotEngineer))** - Talks from practitioners building the cutting edge of LLM applications.
---
*This bibliography represents a living list. If you have a resource that was foundational to your understanding, please consider contributing to this section.*
## π License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## π Acknowledgments
- Inspired by various learning paths and roadmaps
- Built upon the work of researchers and developers in the LLM space
- Thanks to all contributors who help improve this resource
---
**β If you find this repository helpful, please give it a star!** This helps others discover it and encourages further development.
## πΊοΈ What's Next?
Ready to begin your LLM journey? Start here: **[Theory Foundations](./01-theory-foundations/README.md)**
---
*This repository is maintained by [JawherKl](https://github.com/JawherKl). For questions or suggestions, please open an issue or discussion.*