{"id":29068559,"url":"https://github.com/incept5/llm_bootcamp","last_synced_at":"2025-06-29T13:02:05.307Z","repository":{"id":300793401,"uuid":"1006914003","full_name":"Incept5/llm_bootcamp","owner":"Incept5","description":null,"archived":false,"fork":false,"pushed_at":"2025-06-26T10:47:27.000Z","size":61202,"stargazers_count":3,"open_issues_count":0,"forks_count":2,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-26T11:45:40.936Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Incept5.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-23T07:25:26.000Z","updated_at":"2025-06-26T10:47:30.000Z","dependencies_parsed_at":"2025-06-26T11:55:58.212Z","dependency_job_id":null,"html_url":"https://github.com/Incept5/llm_bootcamp","commit_stats":null,"previous_names":["incept5/munich_ai_2025","incept5/llm_bootcamp"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Incept5/llm_bootcamp","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Incept5%2Fllm_bootcamp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Incept5%2Fllm_bootcamp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Incept5%2Fllm_bootcamp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Incept5%2Fllm_bootcamp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Incept5","download_url":"https://codeload.github.com/Incept5/llm_bootcamp/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Incept5%2Fllm_bootcamp/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262244892,"owners_count":23281027,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-27T11:07:55.715Z","updated_at":"2025-06-28T12:02:14.903Z","avatar_url":"https://github.com/Incept5.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# LLM Bootcamp: Hands-on GenAI Development Workshop\n\nThis repository contains a comprehensive collection of code samples, interactive demos, and educational materials for understanding and working with Large Language Models (LLMs). The workshop takes you from basic LLM interactions to advanced techniques like Retrieval-Augmented Generation (RAG) and data extraction.\n\n## Learning Path\n\nFollow these directories in order for the optimal learning experience. Each directory focuses on specific concepts and builds upon previous knowledge:\n\n1. **[intro-to-llms/](intro-to-llms/)** - Start here: Basic LLM interactions, cloud vs local models\n2. **[ui-for-local-llms/](ui-for-local-llms/)** - Building user interfaces for LLM applications  \n3. **[machine-learning/](machine-learning/)** - Traditional ML concepts and TensorFlow integration\n4. **[tokens-in-llms/](tokens-in-llms/)** - Understanding tokenization and text generation mechanics\n5. **[embedding/](embedding/)** - Vector embeddings, semantic similarity, and 3D visualizations\n6. **[reasoning/](reasoning/)** - Chain-of-thought reasoning and AI decision-making processes\n7. **[retrieval-augmented-generation/](retrieval-augmented-generation/)** - RAG systems with ChromaDB and semantic search\n8. **[extracting-data/](extracting-data/)** - Structured data extraction and web scraping with AI\n9. **[mcp/](mcp/)** - Model Context Protocol for advanced integrations\n10. **[extras/](extras/)** - Additional examples and advanced techniques\n\n## Directory Overview\n\n### Core Learning Modules\n\n- **intro-to-llms/** - Foundation concepts: local vs cloud LLMs, basic interactions, performance testing\n- **ui-for-local-llms/** - Building web interfaces with Gradio for LLM applications\n- **machine-learning/** - Traditional neural networks and TensorFlow integration to understand LLM foundations\n- **tokens-in-llms/** - Deep dive into tokenization, next token prediction, and text generation mechanics\n- **embedding/** - Vector embeddings, semantic similarity, word relationships, and 3D visualizations\n- **reasoning/** - Chain-of-thought reasoning, AI astrology examples, and complex decision processes\n- **retrieval-augmented-generation/** - Complete RAG systems using ChromaDB, document chunking, and semantic search\n- **extracting-data/** - AI-powered data extraction, web scraping, and structured output generation\n- **mcp/** - Model Context Protocol demonstrations and advanced integrations\n- **extras/** - Additional examples including fill-in-middle, formatted output, and advanced techniques\n\n### Supporting Files\n\n- **Text Files** - Sample documents for RAG demonstrations (Alice in Wonderland, Grimm Fairy Tales)\n- **Configuration** - Environment setup, requirements, and model configurations\n- **Documentation** - Individual README files in each directory provide detailed explanations\n\n## Quick Start\n\n### Prerequisites\n- Python 3.8+ with pip\n- 8GB+ RAM (16GB recommended for local models)\n- Internet connection for cloud APIs and model downloads\n\n### Installation\n\n1. **Clone and install dependencies:**\n   ```bash\n   git clone [repository-url]\n   cd llm_bootcamp\n   pip install -r requirements.txt\n   ```\n\n2. **Set up Ollama (for local models):**\n   - Install from [ollama.ai](https://ollama.ai)\n   - Start service: `ollama serve`\n   - Pull basic models:\n     ```bash\n     ollama pull qwen3:0.6b\n     ollama pull nomic-embed-text\n     ```\n\n3. **Optional: Set up cloud APIs:**\n   - **Groq**: Get API key from [console.groq.com](https://console.groq.com), add to `.env` file\n   - **Kaggle**: Get API credentials from [kaggle.com](https://www.kaggle.com), place `kaggle.json` in `~/.kaggle/`\n\n### Getting Started\n\nStart with the first directory and follow the learning path:\n```bash\ncd intro-to-llms\npython local_llm_using_ollama.py\n```\n\nEach directory contains its own README with detailed setup instructions and explanations.\n\n## What You'll Learn\n\nThis bootcamp covers the full spectrum of LLM development:\n\n### Core Concepts\n- **LLM Fundamentals**: Tokenization, transformer architecture, text generation mechanics\n- **Embeddings**: Vector representations, semantic similarity, 3D visualizations\n- **Local vs Cloud**: Trade-offs between local models (Ollama) and cloud APIs (Groq)\n- **User Interfaces**: Building web interfaces with Gradio for LLM applications\n\n### Advanced Techniques\n- **Reasoning**: Chain-of-thought prompting, complex decision-making processes\n- **RAG Systems**: Document chunking, semantic search, context-aware generation\n- **Data Extraction**: Structured output generation, web scraping with AI\n- **Integrations**: Model Context Protocol (MCP) and advanced tool usage\n\n### Practical Skills\n- Performance optimization and model comparison\n- Traditional ML foundations with TensorFlow\n- Real-world data processing with Kaggle datasets\n- Production-ready RAG implementations with ChromaDB\n- AI-powered SQL query generation\n\n## Troubleshooting\n\n### Common Issues\n- **Model not found**: Ensure Ollama is running (`ollama serve`) and models are pulled\n- **API errors**: Check API keys in `.env` file and internet connection\n- **Memory issues**: Use smaller models (qwen3:0.6b) and reduce batch sizes\n- **Missing files**: Download required text files (Alice in Wonderland, Grimm Fairy Tales) as noted in individual READMEs\n\n### Performance Tips\n- Use GPU acceleration if available for local models\n- Embedding results are cached to avoid recomputation\n- Start with smaller models and parameters for faster experimentation\n\n## Resources\n\n- [Ollama Documentation](https://ollama.ai/docs) - Local LLM setup and management\n- [Groq API Documentation](https://console.groq.com/docs) - Cloud LLM service\n- [Hugging Face Transformers](https://huggingface.co/transformers/) - Model library and tools\n- [Sentence Transformers](https://www.sbert.net/) - Embedding models and techniques\n\n## Contributing\n\nContributions are welcome! Please submit improvements, additional examples, or bug fixes via pull requests.\n\n---\n\n*This educational content is provided for learning purposes. Please respect the licenses of the underlying models and libraries used.*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fincept5%2Fllm_bootcamp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fincept5%2Fllm_bootcamp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fincept5%2Fllm_bootcamp/lists"}