{"id":26686779,"url":"https://github.com/mlane/llm-engineering-cheatsheet","last_synced_at":"2025-10-30T10:10:10.882Z","repository":{"id":284446094,"uuid":"954970289","full_name":"mlane/llm-engineering-cheatsheet","owner":"mlane","description":"Timeless principles and best practices for working with language models — tooling-agnostic, future-proof, and clear.","archived":false,"fork":false,"pushed_at":"2025-04-24T18:26:27.000Z","size":16,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-26T19:06:52.772Z","etag":null,"topics":["ai-best-practices","ai-cheatsheet","ai-patterns","ai-reference","anthropic","chatgpt","context-management","few-shot-learning","generative-ai","langchain","language-models","llm","llm-engineering","openai","prompt-design","prompt-engineering","python","python3","system-prompts","zero-shot"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mlane.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-03-25T22:44:04.000Z","updated_at":"2025-05-06T02:16:25.000Z","dependencies_parsed_at":null,"dependency_job_id":"a8277e5c-5998-4aac-9d8e-baa963887c6a","html_url":"https://github.com/mlane/llm-engineering-cheatsheet","commit_stats":null,"previous_names":["mlane/llm-engineering-cheatsheet"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mlane/llm-engineering-cheatsheet","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/mlane%2Fllm-engineering-cheatsheet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlane%2Fllm-engineering-cheatsheet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlane%2Fllm-engineering-cheatsheet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlane%2Fllm-engineering-cheatsheet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mlane","download_url":"https://codeload.github.com/mlane/llm-engineering-cheatsheet/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlane%2Fllm-engineering-cheatsheet/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267709600,"owners_count":24131921,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-29T02:00:12.549Z","response_time":2574,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-best-practices","ai-cheatsheet","ai-patterns","ai-reference","anthropic","chatgpt","context-management","few-shot-learning","generative-ai","langchain","language-models","llm","llm-engineering","openai","prompt-design","prompt-engineering","python","python3","system-prompts","zero-shot"],"created_at":"2025-03-26T12:14:29.705Z","updated_at":"2025-10-30T10:10:05.847Z","avatar_url":"https://github.com/mlane.png","language":null,"funding_links":[],"categories":[],"sub_categories"
:[],"readme":"# LLM Engineering Cheatsheet\n\nA timeless guide to **thinking and building like a prompt engineer**. This cheatsheet focuses on core principles and patterns that apply across any model, provider, or tool — whether you're using OpenAI, Claude, Llama, or something that doesn't exist yet.\n\n\u003e This is not a cookbook or quickstart. It's a mindset guide — built for those who want to reason clearly and build reliably with LLMs.\n\n---\n\n## Core Philosophy\n\nLLMs are **probabilistic next-token predictors**, not deterministic logic machines. Prompt engineering is about:\n\n- Designing **clear, structured inputs**\n- Working within **context and token limits**\n- Thinking **iteratively**, not magically\n- Debugging failures like a **system**, not like a mystery\n\nTreat prompts as **interfaces**, not incantations.\n\n---\n\n## Prompting Patterns (Universal)\n\n### Zero-Shot\n\nAsk the model to do a task with no examples.\n\n```txt\n\"Summarize the following article in 3 bullet points: ...\"\n```\n\n### One-Shot / Few-Shot\n\nGive one or more examples to improve reliability.\n\n```txt\nReview: \"Great product, but shipping was late.\"\nResponse: \"Thanks for your feedback! Sorry about the delay...\"\n\nReview: \"Terrible quality.\"\nResponse: \"We're sorry to hear that. Could you share more details so we can improve?\"\n```\n\n### Role-Based Prompting\n\nSet a role for the model to adopt.\n\n```txt\nSystem: You are a technical support agent who speaks clearly and concisely.\nUser: My internet keeps cutting out. What should I do?\n```\n\n### Constrained Output\n\nAsk for output formats explicitly.\n\n```txt\n\"List the steps as JSON: [step1, step2, step3]\"\n```\n\n---\n\n### Sampling Controls\n\nControls randomness. 
`temperature` is the main knob. In the OpenAI API it ranges from `0.0` to `2.0`.\n\n- `0.0` → near-deterministic, mostly repeatable\n- `0.2 – 0.5` → reliable tasks (Ng recommendation)\n- `0.7` → balanced (a common choice; the OpenAI API default is `1.0`)\n- `1.0+` → creative, less reliable\n\n`top_p` (nucleus sampling) limits sampling to the smallest set of tokens whose cumulative probability reaches `p`. Adjust `temperature` or `top_p`, not both at once.\n\n---\n\n## Prompt Structure: The Anatomy\n\nAlways structure prompts with these components:\n\n1. **Role** – Who is the model?\n2. **Task** – What do you want?\n3. **Input** – What info does it need?\n4. **Constraints** – What form should the output take?\n5. **Examples** _(optional)_ – Show what success looks like\n\n### Example Prompt (all parts applied)\n\n```txt\nSystem: You are a helpful travel assistant that gives concise city guides.\nUser: I’m visiting Tokyo for 3 days. Suggest an itinerary with 3 activities per day.\nConstraints:\n- Format your response as bullet points grouped by day.\n- Keep each activity description under 20 words.\nExample:\nDay 1:\n- Visit Meiji Shrine in the morning\n- Eat sushi at Tsukiji Market\n- Explore Shibuya Crossing at night\n```\n\n---\n\n## Context Management\n\n- Be **aware of token limits** (e.g. 4k, 8k, 128k)\n- Use **summarization** for long chat histories\n- Drop irrelevant history when possible\n- **Explicit \u003e implicit** — don't assume the model remembers everything\n- Fewer tokens = faster responses and lower cost\n- Use tools like [OpenAI Tokenizer](https://platform.openai.com/tokenizer) to inspect prompt size\n\n---\n\n## Mindsets That Scale\n\nThe best results come when you treat LLMs as tools that **augment your thinking**, not replace it.\n\n### 1. Use AI for Fast, Focused Tasks\n\nAI thrives when the task is something a human could do in a second or two — things like renaming files, summarizing short content, or generating scaffolding. Don’t force it to solve problems that are too vague or complex. 
**Break hard problems into small, clear ones.** That’s when AI shines.\n\n\u003e _This is aligned with Andrew Ng’s “One-Second Rule” — tasks that a human can perform in under one second are great candidates for automation._\n\n### 2. Prioritize Accuracy, But Accept Imperfection\n\nWhen you evaluate a model's performance, **accuracy is key** — but perfect accuracy is not realistic. Ambiguity, nuance, and subjectivity are baked into language. Instead of aiming for 100%, aim for **consistent and explainable behavior**, and iterate. Treat errors as feedback loops, not failures.  \n_(In practice, many real-world tasks operate safely with 70–90% accuracy — just make sure you know your risk tolerance.)_\n\n\u003e _Andrew Ng emphasizes setting high but achievable accuracy standards, and treating improvement as an ongoing process._\n\n### 3. You and the LLM Are Partners\n\nDon't outsource your learning. Let the LLM **guide, question, and collaborate** — not do everything for you. Ask it for scaffolding, instructions, or alternatives, then build it yourself. That way, you stay in control and deepen your understanding. **If you can't maintain the code later, you're not really building.**\n\n\u003e _Experts recommend a human-in-the-loop mindset where you learn with the AI, not through it._\n\n---\n\n## Evaluation Principles\n\nLLM output is fuzzy. 
Define quality like this:\n\n- Does it meet the **task objective**?\n- Is the output **formatted correctly**?\n- Would a human say it's **reasonable**?\n- Can you detect regressions with **A/B comparisons**?\n\n### Perspective Shift\n\nLLM success isn’t about perfection — it’s about clarity, consistency, and feedback loops.\\\n**Reliable \u003e perfect.** Iterate like a product, not like a test.\n\n---\n\n## Common Failure Modes\n\n| Symptom         | Likely Cause                             |\n| --------------- | ---------------------------------------- |\n| Hallucination   | Vague or underspecified prompts          |\n| Repetition      | Poor constraint or unclear output format |\n| Refusal         | Misalignment between task and role       |\n| Loss of context | Too much history or poor summarization   |\n\n---\n\n## Debugging Checklist (When Output Fails)\n\n- ❓ Is the **task clearly stated**?\n- 🧩 Are you using the right **prompting pattern**?\n- 🧠 Is there a clear **role and structure**?\n- 🧱 Could the context window be too full?\n- 📣 Try asking the model: _“Why did you respond this way?”_ (treat the answer as a hint, not ground truth)\n\n---\n\n## Structured Output \u0026 Tool Use (Advanced)\n\nSome models support **function calling** or **structured tool use** — great for API responses or JSON output.\n\n\u003e Example: OpenAI’s `tools` parameter (which superseded the deprecated `functions` / `function_call`) or `response_format={\"type\": \"json_object\"}`\n\n---\n\n## Recommended Resources\n\n- [OpenAI Best Practices](https://platform.openai.com/docs/guides/prompt-engineering)\n- [LangChain Docs](https://python.langchain.com/docs/introduction)\n- [Ollama for Local Models](https://ollama.com)\n\n---\n\n## Minimal Python Example\n\n```python\nimport os\n\nfrom openai import OpenAI\n\nclient = OpenAI(\n    api_key=os.environ.get(\"OPENAI_API_KEY\"),\n)\n\nresponse = client.chat.completions.create(\n    model=\"gpt-4o\",\n    messages=[\n        {\"role\": \"system\", \"content\": \"You are a concise technical writer.\"},\n        {\n            \"role\": \"user\",\n            
\"content\": \"Explain what a vector database is in simple terms.\",\n        },\n    ],\n    temperature=0.3,  # Lower = more deterministic\n)\n\nprint(response.choices[0].message.content)\n```\n\n---\n\n## Final Thought\n\nThis guide helps you stay grounded when everything else is changing. Focus on clarity. Prompt with intent. And always think like an engineer.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlane%2Fllm-engineering-cheatsheet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmlane%2Fllm-engineering-cheatsheet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlane%2Fllm-engineering-cheatsheet/lists"}