{"id":14629099,"url":"https://github.com/wearetyomsmnv/Awesome-LLMSecOps","last_synced_at":"2025-09-06T14:32:43.002Z","repository":{"id":251050649,"uuid":"836233224","full_name":"wearetyomsmnv/Awesome-LLMSecOps","owner":"wearetyomsmnv","description":"LLM | Security | Operations in one github repo with good links and pictures.","archived":false,"fork":false,"pushed_at":"2025-01-01T16:53:20.000Z","size":492,"stargazers_count":20,"open_issues_count":0,"forks_count":2,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-01-01T17:36:07.433Z","etag":null,"topics":["ai","awesome-list","llm","llmsecops","mlsecops","mlsecurity","owasp-top10","prompt-injection","security"],"latest_commit_sha":null,"homepage":"","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wearetyomsmnv.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-31T12:26:08.000Z","updated_at":"2025-01-01T16:53:23.000Z","dependencies_parsed_at":"2024-10-24T12:03:03.921Z","dependency_job_id":"4f22fd8c-5853-438a-956d-0b9780e26edb","html_url":"https://github.com/wearetyomsmnv/Awesome-LLMSecOps","commit_stats":null,"previous_names":["wearetyomsmnv/awesome-llmsecops"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wearetyomsmnv%2FAwesome-LLMSecOps","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wearetyomsmnv%2FAwesome-LLMSecOps/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wearetyomsmnv%2FAwesome-LLMSecOps/rele
ases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wearetyomsmnv%2FAwesome-LLMSecOps/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wearetyomsmnv","download_url":"https://codeload.github.com/wearetyomsmnv/Awesome-LLMSecOps/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":232128301,"owners_count":18476520,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","awesome-list","llm","llmsecops","mlsecops","mlsecurity","owasp-top10","prompt-injection","security"],"created_at":"2024-09-09T10:01:09.768Z","updated_at":"2025-01-01T21:31:47.265Z","avatar_url":"https://github.com/wearetyomsmnv.png","language":"HTML","funding_links":[],"categories":["Other Lists","[↑](#table-of-contents)Related Awesome Lists \u003ca name=\"related-awesome-lists\"\u003e\u003c/a\u003e","📚 Research, Talks, and Writeups"],"sub_categories":["TeX Lists","Startup Blogs \u003ca name=\"startup-blogs\"\u003e\u003c/a\u003e","Technical Research"],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://media.giphy.com/media/WPkNqX6qclrlG1LgxV/giphy.gif\" alt=\"GIPHY Animation\"\u003e\n\u003c/p\u003e\n\n\n\u003cdiv align=\"center\"\u003e\n\n# 🚀 Awesome LLMSecOps \n\n[![Awesome](https://awesome.re/badge-flat2.svg)](https://awesome.re)\n![GitHub stars](https://img.shields.io/github/stars/wearetyomsmnv/awesome-llmsecops?style=flat-square\u0026color=yellow)\n![GitHub forks](https://img.shields.io/github/forks/wearetyomsmnv/awesome-llmsecops?style=flat-square\u0026color=blue)\n![GitHub 
last commit](https://img.shields.io/github/last-commit/wearetyomsmnv/awesome-llmsecops?style=flat-square\u0026color=green)\n\n🔐 A curated list of awesome resources for LLMSecOps (Large Language Model Security Operations) 🧠\n\n### by @wearetyomsmnv\n\n\n**Architecture | Vulnerabilities | Tools | Defense | Threat Modeling | Jailbreaks | RAG Security | PoCs | Study Resources | Books | Blogs | Datasets for Testing | OPS Security | Frameworks | Best Practices | Research | Tutorials | Companies | Community Resources**\n\n\u003c/div\u003e\n\n\n\u003eLLM safety is a huge body of knowledge that is important and relevant to society today. The purpose of this Awesome list is to give the community the knowledge needed to build a safe LLM development process, and to show what threats may be encountered along the way. Everyone is welcome to contribute. \n\n\u003e [!IMPORTANT]\n\u003eThis repository, unlike many existing repositories, emphasizes the practical implementation of security and does not provide a lot of references to arXiv in the description.\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\n## 3 types of models\n\n![Group3](https://github.com/user-attachments/assets/0079c134-be60-42b0-afaa-a5df9bb7ece3)\n\n\n\u003cdiv align=\"center\"\u003e\n\n## Architecture risks\n\n\n| Risk | Description |\n|------|-------------|\n| Recursive Pollution | LLMs can produce incorrect output with high confidence. If such output is used in training data, it can cause future LLMs to be trained on polluted data, creating a feedback loop problem. |\n| Data Debt | LLMs rely on massive datasets, often too large to thoroughly vet. This lack of transparency and control over data quality presents a significant risk. |\n| Black Box Opacity | Many critical components of LLMs are hidden in a \"black box\" controlled by foundation model providers, making it difficult for users to manage and mitigate risks effectively. 
|\n| Prompt Manipulation | Manipulating the input prompts can lead to unstable and unpredictable LLM behavior. This risk is similar to adversarial inputs in other ML systems. |\n| Poison in the Data | Training data can be contaminated intentionally or unintentionally, leading to compromised model integrity. This is especially problematic given the size and scope of data used in LLMs. |\n| Reproducibility Economics | The high cost of training LLMs limits reproducibility and independent verification, leading to a reliance on commercial entities and potentially unreviewed models. |\n| Model Trustworthiness | The inherent stochastic nature of LLMs and their lack of true understanding can make their output unreliable. This raises questions about whether they should be trusted in critical applications. |\n| Encoding Integrity | Data is often processed and re-represented in ways that can introduce bias and other issues. This is particularly challenging with LLMs due to their unsupervised learning nature. |\n\n\u003c/div\u003e\n\n**From the [Berryville Institute of Machine Learning (BIML)](https://berryvilleiml.com/docs/BIML-LLM24.pdf) paper**\n\n\n\u003cdiv align=\"center\"\u003e\n\n## Vulnerabilities description \n#### by Giskard\n\n| Vulnerability | Description |\n|---------------|-------------|\n| Hallucination and Misinformation | These vulnerabilities often manifest themselves in the generation of fabricated content or the spread of false information, which can have far-reaching consequences such as disseminating misleading content or malicious narratives. |\n| Harmful Content Generation | This vulnerability involves the creation of harmful or malicious content, including violence, hate speech, or misinformation with malicious intent, posing a threat to individuals or communities. 
|\n| Prompt Injection | Users manipulating input prompts to bypass content filters or override model instructions can lead to the generation of inappropriate or biased content, circumventing intended safeguards. |\n| Robustness | The lack of robustness in model outputs makes them sensitive to small perturbations, resulting in inconsistent or unpredictable responses that may cause confusion or undesired behavior. |\n| Output Formatting | When model outputs do not align with specified format requirements, responses can be poorly structured or misformatted, failing to comply with the desired output format. |\n| Information Disclosure | This vulnerability occurs when the model inadvertently reveals sensitive or private data about individuals, organizations, or entities, posing significant privacy risks and ethical concerns. |\n| Stereotypes and Discrimination | If model's outputs are perpetuating biases, stereotypes, or discriminatory content, it leads to harmful societal consequences, undermining efforts to promote fairness, diversity, and inclusion. 
|\n\n\n## LLMSecOps Life Cycle\n\n\n![Group 2](https://github.com/user-attachments/assets/43a56dad-ddad-4097-a57e-aa035247810d)\n\n\u003c/div\u003e\n\u003cdiv align=\"center\"\u003e\n\n\u003ch2\u003e🛠 Tools for scanning\u003c/h2\u003e\n\n\u003ctable\u003e\n\u003ctr\u003e\n\u003cth\u003eTool\u003c/th\u003e\n\u003cth\u003eDescription\u003c/th\u003e\n\u003cth\u003eStars\u003c/th\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/leondz/garak\"\u003e🔧 Garak\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eLLM vulnerability scanner\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/leondz/garak?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/prompt-security/ps-fuzz\"\u003e🔧 ps-fuzz 2\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eMake your GenAI Apps Safe \u0026 Secure 🚀 Test \u0026 harden your system prompt\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/prompt-security/ps-fuzz?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/pasquini-dario/LLMmap\"\u003e🗺️ LLMmap\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eTool for mapping LLM vulnerabilities\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/pasquini-dario/LLMmap?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/msoedov/agentic_security\"\u003e🛡️ Agentic Security\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eSecurity toolkit for AI agents\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/msoedov/agentic_security?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/Mindgard/cli\"\u003e🧠 Mindgard 
CLI\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eCommand-line interface for Mindgard security tools\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/Mindgard/cli?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/LostOxygen/llm-confidentiality\"\u003e🔒 LLM Confidentiality\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eTool for ensuring confidentiality in LLMs\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/LostOxygen/llm-confidentiality?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n   \u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/Azure/PyRIT\"\u003e🔒 PyRIT\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eThe Python Risk Identification Tool for generative AI (PyRIT) is an open access automation framework to empower security professionals and machine learning engineers to proactively find risks in their generative AI systems.\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/Azure/PyRIT?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n\n\u003e ### How to run garak\n\u003e \n\u003e ```\n\u003e python -m pip install -U garak\n\u003e ```\n\u003e \n\u003e Probe ChatGPT for encoding-based prompt injection (OSX/\\*nix) (replace example value with a real OpenAI API key)\n\u003e \n\u003e Probes is a simple .py file with prompts for LLM\n\u003e \n\u003e **[Examples](https://github.com/leondz/garak/tree/main/garak/probes)**\n\u003e  \n\u003e ```\n\u003e export OPENAI_API_KEY=\"sk-123XXXXXXXXXXXX\"\n\u003e python3 -m garak --model_type openai --model_name gpt-3.5-turbo --probes encoding\n\u003e ```\n\u003e \n\u003e See if the Hugging Face version of GPT2 is vulnerable to DAN 11.0\n\u003e \n\u003e ```\n\u003e python3 -m garak --model_type huggingface --model_name gpt2 --probes dan.Dan_11_0\n\u003e ```\n\u003e 
\n\u003e **More examples on [Garak Tool](https://github.com/leondz/garak/blob/main/README.md#getting-started) instruction**\n\n\n\n\u003ch2\u003e🛡️Defense\u003c/h2\u003e\n\n\u003ctable\u003e\n\u003ctr\u003e\n\u003cth\u003eTool\u003c/th\u003e\n\u003cth\u003eDescription\u003c/th\u003e\n\u003cth\u003eStars\u003c/th\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/meta-llama/PurpleLlama\"\u003e🛡️ PurpleLlama\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eSet of tools to assess and improve LLM security.\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/meta-llama/PurpleLlama?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/protectai/rebuff\"\u003e🛡️ Rebuff\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eAPI with built-in rules for identifying prompt injection and detecting data leakage through canary words.\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/protectai/rebuff?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/laiyer-ai/llm-guard\"\u003e🔒 LLM Guard\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eSelf-hostable tool with multiple prompt and output scanners for various security issues.\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/laiyer-ai/llm-guard?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/NVIDIA/NeMo-Guardrails\"\u003e🚧 NeMo Guardrails\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eTool that protects against jailbreak and hallucinations with customizable rulesets.\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/NVIDIA/NeMo-Guardrails?style=social\" alt=\"GitHub 
stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/deadbits/vigil-llm\"\u003e👁️ Vigil\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eOffers dockerized and local setup options, using proprietary HuggingFace datasets for security detection.\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/deadbits/vigil-llm?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/whylabs/langkit\"\u003e🧰 LangKit\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eProvides functions for jailbreak detection, prompt injection, and sensitive information detection.\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/whylabs/langkit?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/ShreyaR/guardrails\"\u003e🛠️ GuardRails AI\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eFocuses on functionality, detects presence of secrets in responses.\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/ShreyaR/guardrails?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://huggingface.co/Epivolis/Hyperion\"\u003e🦸 Hyperion Alpha\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eDetects prompt injections and jailbreaks.\u003c/td\u003e\n\u003ctd\u003eN/A\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/protectai/llm-guard\"\u003e🛡️ LLM-Guard\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eTool for securing LLM interactions.\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/protectai/llm-guard?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca 
href=\"https://github.com/Repello-AI/whistleblower\"\u003e🚨 Whistleblower\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eTool for detecting and preventing LLM vulnerabilities.\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/Repello-AI/whistleblower?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/safellama/plexiglass\"\u003e🔍 Plexiglass\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eSecurity tool for LLM applications.\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/safellama/plexiglass?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/tldrsec/prompt-injection-defenses\"\u003e🔍 Prompt Injection defenses\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eRules for protecting LLMs\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/tldrsec/prompt-injection-defenses?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://ai.raftds.ru/security/#\"\u003e🔍 LLM Data Protector\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eTools for protecting LLMs in chatbots\u003c/td\u003e\n\u003ctd\u003eN/A\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ca href=\"https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/responsible-ai/gemini_prompt_attacks_mitigation_examples.ipynb\"\u003e🔍 Gen AI \u0026 LLM Security for developers: Prompt attack mitigations on Gemini\u003c/a\u003e\u003c/td\u003e\n\u003ctd\u003eSecurity tool for LLM applications.\u003c/td\u003e\n\u003ctd\u003e\u003cimg src=\"https://img.shields.io/github/stars/GoogleCloudPlatform/generative-ai/?style=social\" alt=\"GitHub stars\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n\u003c/div\u003e\n\n\n---\n\u003cdiv align=\"center\"\u003e\n\n\n\n## Threat Modeling\n\n| Tool | Description 
|\n|------|-------------|\n| [Secure LLM Deployment: Navigating and Mitigating Safety Risks](https://arxiv.org/pdf/2406.11007) | Research paper on LLM security [sorry, but is really cool] |\n| [ThreatModels](https://github.com/jsotiro/ThreatModels/tree/main) | Repository for LLM threat models |\n| [Threat Modeling LLMs](https://aivillage.org/large%20language%20models/threat-modeling-llm/) | AI Village resource on threat modeling for LLMs |\n\n![image](https://github.com/user-attachments/assets/0adcabdf-1afb-4ab2-aa8c-eef75c229842)\n![image](https://github.com/user-attachments/assets/ed4340ad-ee95-47b3-8661-2660a2b0472e)\n\n\n## Monitoring \n\n| Tool | Description |\n|------|-------------|\n|[Langfuse](https://langfuse.com/) | Open Source LLM Engineering Platform with security capabilities. |\n\n## Watermarking\n\n| Tool | Description |\n|------|-------------|\n| [MarkLLM](https://github.com/THU-BPM/MarkLLM) | An Open-Source Toolkit for LLM Watermarking. |\n\n## Jailbreaks\n\n| Resource | Description | Stars |\n|----------|-------------|-------|\n| [JailbreakBench](https://jailbreakbench.github.io/) | Website dedicated to evaluating and analyzing jailbreak methods for language models |\n| [L1B3RT45](https://github.com/elder-plinius/L1B3RT45/) | GitHub repository containing information and tools related to AI jailbreaking |\n| [llm-hacking-database](https://github.com/pdparchitect/llm-hacking-database)|This repository contains various attack against Large Language Models|\n| [HaizeLabs jailbreak Database](https://launch.haizelabs.com/)| This database contains jailbreaks for multimodal language models|\n| [Lakera PINT Benchmark](https://github.com/lakeraai/pint-benchmark) | A benchmark for prompt injection detection systems. 
| \n| [EasyJailbreak](https://github.com/EasyJailbreak/EasyJailbreak) | An easy-to-use Python framework to generate adversarial jailbreak prompts | ![GitHub stars](https://img.shields.io/github/stars/EasyJailbreak/EasyJailbreak?style=social) |\n\n## LLM Interpretability\n\n| Resource | Description |\n|----------|-------------|\n| [LLM Interpretability](https://kolodezev.ru/interpretable_llm.html)| Dmitry Kolodezev's web page with useful resources on LLM interpretability techniques| \n\n## PINT Benchmark scores (by Lakera)\n\n| Name | PINT Score | Test Date |\n| ---- | ---------- | --------- |\n| [Lakera Guard](https://lakera.ai/) | 98.0964% | 2024-06-12 |\n| [protectai/deberta-v3-base-prompt-injection-v2](https://huggingface.co/protectai/deberta-v3-base-prompt-injection-v2) | 91.5706% | 2024-06-12 |\n| [Azure AI Prompt Shield for Documents](https://learn.microsoft.com/en-us/azure/ai-services/content-safety/concepts/jailbreak-detection#prompt-shields-for-documents) | 91.1914% | 2024-04-05 |\n| [Meta Prompt Guard](https://github.com/meta-llama/PurpleLlama/tree/main/Prompt-Guard) | 90.4496% | 2024-07-26 |\n| [protectai/deberta-v3-base-prompt-injection](https://huggingface.co/protectai/deberta-v3-base-prompt-injection) | 88.6597% | 2024-06-12 |\n| [WhyLabs LangKit](https://github.com/whylabs/langkit) | 80.0164% | 2024-06-12 |\n| [Azure AI Prompt Shield for User Prompts](https://learn.microsoft.com/en-us/azure/ai-services/content-safety/concepts/jailbreak-detection#prompt-shields-for-user-prompts) | 77.504% | 2024-04-05 |\n| [Epivolis/Hyperion](https://huggingface.co/epivolis/hyperion) | 62.6572% | 2024-06-12 |\n| [fmops/distilbert-prompt-injection](https://huggingface.co/fmops/distilbert-prompt-injection) | 58.3508% | 2024-06-12 |\n| [deepset/deberta-v3-base-injection](https://huggingface.co/deepset/deberta-v3-base-injection) | 57.7255% | 2024-06-12 |\n| 
[Myadav/setfit-prompt-injection-MiniLM-L3-v2](https://huggingface.co/myadav/setfit-prompt-injection-MiniLM-L3-v2) | 56.3973% | 2024-06-12 |\n\n\n# Hallucinations Leaderboard\n\n|Model|Hallucination Rate|Factual Consistency Rate|Answer Rate|Average Summary Length (Words)|\n|----|----:|----:|----:|----:|\n|GPT-4o|1.5 %|98.5 %|100.0 %|77.8|\n|Zhipu AI GLM-4-9B-Chat|1.6 %|98.4 %|100.0 %|58.1|\n|GPT-4o-mini|1.7 %|98.3 %|100.0 %|76.3|\n|GPT-4-Turbo|1.7 %|98.3 %|100.0 %|86.2|\n|GPT-4|1.8 %|98.2 %|100.0 %|81.1|\n|GPT-3.5-Turbo|1.9 %|98.1 %|99.6 %|84.1|\n|Microsoft Orca-2-13b|2.5 %|97.5 %|100.0 %|66.2|\n|Intel Neural-Chat-7B-v3-3|2.7 %|97.3 %|100.0 %|60.7|\n|Snowflake-Arctic-Instruct|3.0 %|97.0 %|100.0 %|68.7|\n|Microsoft Phi-3-mini-128k-instruct|3.1 %|96.9 %|100.0 %|60.1|\n|01-AI Yi-1.5-34B-Chat|3.9 %|96.1 %|100.0 %|83.7|\n|Llama-3.1-405B-Instruct|3.9 %|96.1 %|99.6 %|85.7|\n|Microsoft Phi-3-mini-4k-instruct|4.0 %|96.0 %|100.0 %|86.8|\n|Llama-3-70B-Chat-hf|4.1 %|95.9 %|99.2 %|68.5|\n|Mistral-Large2|4.4 %|95.6 %|100.0 %|77.4|\n|Mixtral-8x22B-Instruct-v0.1|4.7 %|95.3 %|99.9 %|92.0|\n|Qwen2-72B-Instruct|4.9 %|95.1 %|100.0 %|100.1|\n|Llama-3.1-70B-Instruct|5.0 %|95.0 %|100.0 %|79.6|\n|01-AI Yi-1.5-9B-Chat|5.0 %|95.0 %|100.0 %|85.7|\n|Llama-3.1-8B-Instruct|5.5 %|94.5 %|100.0 %|71.0|\n|Llama-2-70B-Chat-hf|5.9 %|94.1 %|99.9 %|84.9|\n|Google Gemini-1.5-flash|6.6 %|93.4 %|98.1 %|62.8|\n|Microsoft phi-2|6.7 %|93.3 %|91.5 %|80.8|\n|Google Gemma-2-2B-it|7.0 %|93.0 %|100.0 %|62.2|\n|Llama-3-8B-Chat-hf|7.4 %|92.6 %|99.8 %|79.7|\n|Google Gemini-Pro|7.7 %|92.3 %|98.4 %|89.5|\n|CohereForAI c4ai-command-r-plus|7.8 %|92.2 %|100.0 %|71.2|\n|01-AI Yi-1.5-6B-Chat|8.2 %|91.8 %|100.0 %|98.9|\n|databricks dbrx-instruct|8.3 %|91.7 %|100.0 %|85.9|\n|Anthropic Claude-3-5-sonnet|8.6 %|91.4 %|100.0 %|103.0|\n|Mistral-7B-Instruct-v0.3|9.8 %|90.2 %|100.0 %|98.4|\n|Anthropic Claude-3-opus|10.1 %|89.9 %|95.5 %|92.1|\n|Google Gemma-2-9B-it|10.1 %|89.9 %|100.0 %|70.2|\n|Llama-2-13B-Chat-hf|10.5 %|89.5 %|99.8 
%|82.1|\n|Llama-2-7B-Chat-hf|11.3 %|88.7 %|99.6 %|119.9|\n|Microsoft WizardLM-2-8x22B|11.7 %|88.3 %|99.9 %|140.8|\n|Amazon Titan-Express|13.5 %|86.5 %|99.5 %|98.4|\n|Google PaLM-2|14.1 %|85.9 %|99.8 %|86.6|\n|Google Gemma-7B-it|14.8 %|85.2 %|100.0 %|113.0|\n|Cohere-Chat|15.4 %|84.6 %|98.0 %|74.4|\n|Anthropic Claude-3-sonnet|16.3 %|83.7 %|100.0 %|108.5|\n|Google Gemma-1.1-7B-it|17.0 %|83.0 %|100.0 %|64.3|\n|Anthropic Claude-2|17.4 %|82.6 %|99.3 %|87.5|\n|Google Flan-T5-large|18.3 %|81.7 %|99.3|20.9|\n|Cohere|18.9 %|81.1 %|99.8 %|59.8|\n|Mixtral-8x7B-Instruct-v0.1|20.1 %|79.9 %|99.9 %|90.7|\n|Apple OpenELM-3B-Instruct|24.8 %|75.2 %|99.3 %|47.2|\n|Google Gemma-1.1-2B-it|27.8 %|72.2 %|100.0 %|66.8|\n|Google Gemini-1.5-Pro|28.1 %|71.9 %|89.3 %|82.1|\n|TII falcon-7B-instruct|29.9 %|70.1 %|90.0 %|75.5|\n\n**From [this](https://github.com/vectara/hallucination-leaderboard) repo (update 5 aug)**\n\n\n\n![image](https://github.com/user-attachments/assets/c051388f-9876-449b-81af-20308dfee4ac)\n\n**This is a Safety Benchmark from [stanford university](https://crfm.stanford.edu/helm/air-bench/latest/)**\n\u003c/div\u003e\n\n---\n\n## RAG Security\n\n| Resource | Description |\n|----------|-------------|\n| [Security Risks in RAG](https://ironcorelabs.com/security-risks-rag/) | Article on security risks in Retrieval-Augmented Generation (RAG) |\n| [How RAG Poisoning Made LLaMA3 Racist](https://medium.com/m/global-identity-2?redirectUrl=https%3A%2F%2Fblog.repello.ai%2Fhow-rag-poisoning-made-llama3-racist-1c5e390dd564) | Blog post about RAG poisoning and its effects on LLaMA3 |\n| [Adversarial AI - RAG Attacks and Mitigations](https://github.com/wearetyomsmnv/Adversarial-AI---Attacks-Mitigations-and-Defense-Strategies/tree/main/ch15/RAG) | GitHub repository on RAG attacks, mitigations, and defense strategies |\n| [PoisonedRAG](https://github.com/sleeepeer/PoisonedRAG) | GitHub repository about poisoned RAG systems |\n| [ConfusedPilot: Compromising Enterprise Information Integrity 
and Confidentiality with Copilot for Microsoft 365](https://arxiv.org/html/2408.04870v1) | Article about RAG vulnerabilities |\n| [Awesome Jailbreak on LLMs - RAG Attacks](https://github.com/yueliu1999/Awesome-Jailbreak-on-LLMs?tab=readme-ov-file#attack-on-rag-based-llm) | Collection of RAG-based LLM attack techniques |\n\n![image](https://github.com/user-attachments/assets/e0df02b1-9d7d-40ac-ba1b-b6f69ae68073)\n\n\n## Agentic security \n| Tool | Description | Stars |\n|------|-------------|-------|\n| [invariant](https://github.com/invariantlabs-ai/invariant) | A trace analysis tool for AI agents. | ![GitHub stars](https://img.shields.io/github/stars/invariantlabs-ai/invariant?style=social) |\n| [AgentBench](https://github.com/THUDM/AgentBench) | A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24) | ![GitHub stars](https://img.shields.io/github/stars/THUDM/AgentBench?style=social) |\n| [Agent Hijacking, the true impact of prompt injection](https://dev.to/snyk/agent-hijacking-the-true-impact-of-prompt-injection-attacks-983) | Guide to attacking LangChain agents | Article |\n| [Damn Vulnerable Agent](https://github.com/WithSecureLabs/damn-vulnerable-llm-agent) | Vulnerable LLM Agent | ![GitHub stars](https://img.shields.io/github/stars/WithSecureLabs/damn-vulnerable-llm-agent?style=social)  |\n| [Agent Security Bench (ASB)](https://github.com/agiresearch/ASB)| Benchmark for agent security| ![GitHub stars](https://img.shields.io/github/stars/agiresearch/ASB?style=social)  |\n| [Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification](https://arxiv.org/pdf/2407.20859v1) | Research about typical agent vulnerabilities | Article |\n\n## PoC\n\n| Tool | Description | Stars |\n|------|-------------|-------|\n| [Visual Adversarial Examples](https://github.com/Unispac/Visual-Adversarial-Examples-Jailbreak-Large-Language-Models) | Jailbreaking Large Language Models with Visual Adversarial Examples | ![GitHub 
stars](https://img.shields.io/github/stars/Unispac/Visual-Adversarial-Examples-Jailbreak-Large-Language-Models?style=social) |\n| [Weak-to-Strong Generalization](https://github.com/XuandongZhao/weak-to-strong) | Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision | ![GitHub stars](https://img.shields.io/github/stars/XuandongZhao/weak-to-strong?style=social) |\n| [Image Hijacks](https://github.com/euanong/image-hijacks) | Repository for image-based hijacks of large language models | ![GitHub stars](https://img.shields.io/github/stars/euanong/image-hijacks?style=social) |\n| [CipherChat](https://github.com/RobustNLP/CipherChat) | Secure communication tool for large language models | ![GitHub stars](https://img.shields.io/github/stars/RobustNLP/CipherChat?style=social) |\n| [LLMs Finetuning Safety](https://github.com/LLM-Tuning-Safety/LLMs-Finetuning-Safety) | Safety measures for fine-tuning large language models | ![GitHub stars](https://img.shields.io/github/stars/LLM-Tuning-Safety/LLMs-Finetuning-Safety?style=social) |\n| [Virtual Prompt Injection](https://github.com/wegodev2/virtual-prompt-injection) | Tool for virtual prompt injection in language models | ![GitHub stars](https://img.shields.io/github/stars/wegodev2/virtual-prompt-injection?style=social) |\n| [FigStep](https://github.com/ThuCCSLab/FigStep) | Jailbreaking Large Vision-language Models via Typographic Visual Prompts | ![GitHub stars](https://img.shields.io/github/stars/ThuCCSLab/FigStep?style=social) |\n| [stealing-part-lm-supplementary](https://github.com/dpaleka/stealing-part-lm-supplementary) | Some code for \"Stealing Part of a Production Language Model\" | ![GitHub stars](https://img.shields.io/github/stars/dpaleka/stealing-part-lm-supplementary?style=social) |\n| [Hallucination-Attack](https://github.com/PKU-YuanGroup/Hallucination-Attack) | Attack to induce LLMs within hallucinations | ![GitHub 
stars](https://img.shields.io/github/stars/PKU-YuanGroup/Hallucination-Attack?style=social) |\n| [llm-hallucination-survey](https://github.com/HillZhang1999/llm-hallucination-survey) | Reading list of hallucination in LLMs. Check out our new survey paper: \"Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models\" | ![GitHub stars](https://img.shields.io/github/stars/HillZhang1999/llm-hallucination-survey?style=social) |\n| [LMSanitator](https://github.com/meng-wenlong/LMSanitator) | LMSanitator: Defending Large Language Models Against Stealthy Prompt Injection Attacks | ![GitHub stars](https://img.shields.io/github/stars/meng-wenlong/LMSanitator?style=social) |\n| [Imperio](https://github.com/HKU-TASR/Imperio) | Imperio: Robust Prompt Engineering for Anchoring Large Language Models | ![GitHub stars](https://img.shields.io/github/stars/HKU-TASR/Imperio?style=social) |\n| [Backdoor Attacks on Fine-tuned LLaMA](https://github.com/naimul011/backdoor_attacks_on_fine-tuned_llama) | Backdoor Attacks on Fine-tuned LLaMA Models | ![GitHub stars](https://img.shields.io/github/stars/naimul011/backdoor_attacks_on_fine-tuned_llama?style=social) |\n| [CBA](https://github.com/MiracleHH/CBA) | Consciousness-Based Authentication for LLM Security | ![GitHub stars](https://img.shields.io/github/stars/MiracleHH/CBA?style=social) |\n| [MuScleLoRA](https://github.com/ZrW00/MuScleLoRA) | A Framework for Multi-scenario Backdoor Fine-tuning of LLMs | ![GitHub stars](https://img.shields.io/github/stars/ZrW00/MuScleLoRA?style=social) |\n| [BadActs](https://github.com/clearloveclearlove/BadActs) | BadActs: Backdoor Attacks against Large Language Models via Activation Steering | ![GitHub stars](https://img.shields.io/github/stars/clearloveclearlove/BadActs?style=social) |\n| [TrojText](https://github.com/UCF-ML-Research/TrojText) | Trojan Attacks on Text Classifiers | ![GitHub stars](https://img.shields.io/github/stars/UCF-ML-Research/TrojText?style=social) |\n| 
[AnyDoor](https://github.com/sail-sg/AnyDoor) | Create Arbitrary Backdoor Instances in Language Models | ![GitHub stars](https://img.shields.io/github/stars/sail-sg/AnyDoor?style=social) |\n| [PromptWare](https://github.com/StavC/PromptWares) | A Jailbroken GenAI Model Can Cause Real Harm: GenAI-powered Applications are Vulnerable to PromptWares | ![GitHub stars](https://img.shields.io/github/stars/StavC/PromptWares?style=social) |\n| [BrokenHill](https://github.com/BishopFox/BrokenHill) | Automated attack tool that generates crafted prompts to bypass restrictions in LLMs using greedy coordinate gradient (GCG) attack | ![GitHub stars](https://img.shields.io/github/stars/BishopFox/BrokenHill?style=social) |\n| [LLaMator](https://github.com/RomiconEZ/LLaMator) | Framework for testing vulnerabilities of large language models with support for Russian language | ![GitHub stars](https://img.shields.io/github/stars/RomiconEZ/LLaMator?style=social) |\n| [OWASP Agentic AI](https://github.com/precize/OWASP-Agentic-AI/) | OWASP Top 10 for Agentic AI (AI Agent Security) - Pre-release version | ![GitHub stars](https://img.shields.io/github/stars/precize/OWASP-Agentic-AI?style=social) |\n\n\n---\n\n## Study resource\n\n| Tool | Description | \n|------|-------------|\n| [Gandalf](https://gandalf.lakera.ai/) | Interactive LLM security challenge game |\n| [Prompt Airlines](https://promptairlines.com/) | Platform for learning and practicing prompt engineering |\n| [PortSwigger LLM Attacks](https://portswigger.net/web-security/llm-attacks/) | Educational resource on WEB LLM security vulnerabilities and attacks |\n| [Invariant Labs CTF 2024](https://invariantlabs.ai/play-ctf-challenge-24) | CTF. 
You should hack an LLM agent |\n| [DeepLearning.AI Red Teaming Course](https://www.deeplearning.ai/short-courses/red-teaming-llm-applications/) | Short course on red teaming LLM applications |\n| [Learn Prompting: Offensive Measures](https://learnprompting.org/docs/prompt_hacking/offensive_measures/) | Guide on offensive prompt engineering techniques |\n| [Application Security LLM Testing](https://application.security/free/llm) | Free LLM security testing |\n| [Salt Security Blog: ChatGPT Extensions Vulnerabilities](https://salt.security/blog/security-flaws-within-chatgpt-extensions-allowed-access-to-accounts-on-third-party-websites-and-sensitive-data) | Article on security flaws in ChatGPT browser extensions |\n| [safeguarding-llms](https://github.com/sshkhr/safeguarding-llms) | TMLS 2024 Workshop: A Practitioner's Guide To Safeguarding Your LLM Applications |\n| [Damn Vulnerable LLM Agent](https://github.com/WithSecureLabs/damn-vulnerable-llm-agent) | Intentionally vulnerable LLM agent for security testing and education |\n| [GPT Agents Arena](https://gpa.43z.one/) | Platform for testing and evaluating LLM agents in various scenarios |\n| [AI Battle](https://play.secdim.com/game/ai-battle) | Interactive game focusing on AI security challenges |\n\n\n![image](https://github.com/user-attachments/assets/17d3149c-acc2-48c9-a318-bda0b4c175ce)\n\n## 📊 Community research articles\n\n| Title | Authors | Year | \n|-------|---------|------|\n| [📄 Bypassing Meta's LLaMA Classifier: A Simple Jailbreak](https://www.robustintelligence.com/blog-posts/bypassing-metas-llama-classifier-a-simple-jailbreak) | Robust Intelligence | 2024 |\n| [📄 Vulnerabilities in LangChain Gen AI](https://unit42.paloaltonetworks.com/langchain-vulnerabilities/) | Unit42 | 2024 |\n| [📄 Detecting Prompt Injection: BERT-based Classifier](https://labs.withsecure.com/publications/detecting-prompt-injection-bert-based-classifier) | WithSecure Labs | 2024 |\n| [📄 Practical LLM Security: Takeaways From a 
Year in the Trenches](http://i.blackhat.com/BH-US-24/Presentations/US24-Harang-Practical-LLM-Security-Takeaways-From-Wednesday.pdf?_gl=1*1rlcqet*_gcl_au*MjA4NjQ5NzM4LjE3MjA2MjA5MTI.*_ga*OTQ0NTQ2MTI5LjE3MjA2MjA5MTM.*_ga_K4JK67TFYV*MTcyMzQwNTIwMS44LjEuMTcyMzQwNTI2My4wLjAuMA..\u0026_ga=2.168394339.31932933.1723405201-944546129.1720620913) | NVIDIA | 2024 |\n| [📄 Security ProbLLMs in xAI's Grok](https://embracethered.com/blog/posts/2024/security-probllms-in-xai-grok/) | Embrace The Red | 2024 |\n| [📄 Persistent Pre-Training Poisoning of LLMs](https://spylab.ai/blog/poisoning-pretraining/) | SpyLab AI | 2024 |\n| [📄 Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents](https://arxiv.org/pdf/2411.09523) | Multiple Authors | 2024 |\n\n\n## 🎓 Tutorials\n\n\n| Resource | Description |\n|----------|-------------|\n| [📚 HADESS - Web LLM Attacks](https://hadess.io/web-llm-attacks/) | Understanding how to carry out web attacks using LLM |\n| [📚 Red Teaming with LLMs](https://redteamrecipe.com/red-teaming-with-llms) | Practical methods for attacking AI systems |\n| [📚 Lakera LLM Security](https://www.lakera.ai/blog/llm-security) | Overview of attacks on LLM |\n\n\n\u003cdiv align=\"center\"\u003e\n\n## 📚 Books\n\n| 📖 Title | 🖋️ Author(s) | 🔍 Description |\n|----------|--------------|----------------|\n| [The Developer's Playbook for Large Language Model Security](https://www.amazon.com/Developers-Playbook-Large-Language-Security/dp/109816220X) | Steve Wilson  | 🛡️ Comprehensive guide for developers on securing LLMs |\n| [Generative AI Security: Theories and Practices (Future of Business and Finance)](https://www.amazon.com/Generative-AI-Security-Theories-Practices/dp/3031542517) | Ken Huang, Yang Wang, Ben Goertzel, Yale Li, Sean Wright, Jyoti Ponnapalli | 🔬 In-depth exploration of security theories, laws, terms and practices in Generative AI |\n|[Adversarial AI Attacks, Mitigations, and Defense Strategies: A cybersecurity professional's 
guide to AI attacks, threat modeling, and securing AI with MLSecOps](https://www.packtpub.com/en-ru/product/adversarial-ai-attacks-mitigations-and-defense-strategies-9781835087985)|John Sotiropoulos| Practical code examples for building an MLSecOps pipeline|\n\n\n\n\n## BLOGS\n\n| Blog |\n|------|\n| https://embracethered.com/blog/ |\n| 🐦 https://twitter.com/llm_sec |\n| 🐦 https://twitter.com/LLM_Top10 |\n| 🐦 https://twitter.com/aivillage_dc |\n| 🐦 https://twitter.com/elder_plinius/ |\n| https://hiddenlayer.com/ |\n| https://t.me/llmsecurity |\n\n\n## DATA\n\n| Resource | Description |\n|----------|-------------|\n| [Safety and privacy with Large Language Models](https://github.com/annjawn/llm-safety-privacy) | GitHub repository on LLM safety and privacy |\n| [Jailbreak LLMs](https://github.com/verazuo/jailbreak_llms/tree/main/data) | Data for jailbreaking Large Language Models |\n| [ChatGPT System Prompt](https://github.com/LouisShark/chatgpt_system_prompt) | Repository containing ChatGPT system prompts |\n| [Do Not Answer](https://github.com/Libr-AI/do-not-answer) | Dataset for evaluating safeguards in LLMs |\n| [ToxiGen](https://github.com/microsoft/ToxiGen) | Microsoft dataset for detecting implicit and adversarial hate speech |\n| [SafetyPrompts](https://safetyprompts.com/)| A Living Catalogue of Open Datasets for LLM Safety|\n| [llm-security-prompt-injection](https://github.com/sinanw/llm-security-prompt-injection) | This project investigates the security of large language models by performing binary classification of a set of input prompts to discover malicious prompts. Several approaches have been analyzed using classical ML algorithms, a trained LLM, and a fine-tuned LLM. 
|\n\n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n## OPS \n\n![Group 4](https://github.com/user-attachments/assets/90133c33-ee58-4ec8-a9cb-c14fe529eb2f)\n\n\n| Resource | Description |\n|----------|-------------|\n| https://sysdig.com/blog/llmjacking-stolen-cloud-credentials-used-in-new-ai-attack/ | LLMJacking: Stolen Cloud Credentials Used in New AI Attack |\n| https://huggingface.co/docs/hub/security | Hugging Face Hub Security Documentation |\n| https://github.com/ShenaoW/awesome-llm-supply-chain-security | LLM Supply chain security resources|\n| https://developer.nvidia.com/blog/secure-llm-tokenizers-to-maintain-application-integrity/ | Secure LLM Tokenizers to Maintain Application Integrity |\n| https://sightline.protectai.com/ | Sightline by ProtectAI \u003cbr\u003e\u003cbr\u003eCheck vulnerabilities on:\u003cbr\u003e• Nemo by Nvidia\u003cbr\u003e• Deep Lake\u003cbr\u003e• Fine-Tuner AI\u003cbr\u003e• Snorkel AI\u003cbr\u003e• Zen ML\u003cbr\u003e• Lamini AI\u003cbr\u003e• Comet\u003cbr\u003e• Titan ML\u003cbr\u003e• Deepset AI\u003cbr\u003e• Valohai\u003cbr\u003e\u003cbr\u003e**For finding LLMops tools vulnerabilities** |\n\u003c/div\u003e\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\n## 🏗 Frameworks\n\n\u003cdiv align=\"center\"\u003e\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003ctd align=\"center\"\u003e\u003ca href=\"https://owasp.org/www-project-top-10-for-large-language-model-applications/\"\u003e\u003cimg src=\"https://owasp.org/assets/images/logo.png\" width=\"100px;\" alt=\"\"/\u003e\u003cbr /\u003e\u003csub\u003e\u003cb\u003eOWASP LLM TOP 10\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003e10 vulnerabilities for llm\u003c/td\u003e\n    \u003ctd align=\"center\"\u003e\u003ca href=\"https://owasp.org/www-project-top-10-for-large-language-model-applications/llm-top-10-governance-doc/LLM_AI_Security_and_Governance_Checklist-v1.pdf\"\u003e\u003cimg src=\"https://owasp.org/assets/images/logo.png\" width=\"100px;\" 
alt=\"\"/\u003e\u003cbr /\u003e\u003csub\u003e\u003cb\u003eLLM AI Cybersecurity \u0026 Governance Checklist 2\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003eBrief explanation\u003c/td\u003e\n    \u003ctd align=\"center\"\u003e\u003ca href=\"https://docs.google.com/document/d/1_F-1xp78LjyIiAwuO_II6enWBbOqKkYWFw2CpfZJ45U/edit?_bhlid=b838ad7e2c992ac7bb0133cb539a82a64b0c6ea5\"\u003e\u003cimg src=\"https://owasp.org/assets/images/logo.png\" width=\"100px;\" alt=\"\"/\u003e\u003cbr /\u003e\u003csub\u003e\u003cb\u003eLLMSecOps Cybersecurity Solution Landscape\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003eBrief explanation\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\u003c/div\u003e\n\n\n**LLMSECOPS, by OWASP**\n\n![Group 12](https://github.com/user-attachments/assets/bf97f232-8532-450e-86bc-0ec39c5efe41)\n\n\n\n## 💡 Best Practices\n\n\u003ctable align=\"center\"\u003e \u003ctr\u003e \u003ctd align=\"center\"\u003e \u003ch3\u003eOWASP LLMSVS\u003c/h3\u003e \u003cp\u003e\u003cstrong\u003eLarge Language Model Security Verification Standard\u003c/strong\u003e\u003c/p\u003e \u003cp\u003e\u003ca href=\"https://owasp.org/www-project-llm-verification-standard/\"\u003eProject Link\u003c/a\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"center\"\u003e \u003cp\u003eThe primary aim of the OWASP LLMSVS Project is to provide an open security standard for systems which leverage artificial intelligence and Large Language Models.\u003c/p\u003e \u003cp\u003eThe standard provides a basis for designing, building, and testing robust LLM backed applications, including:\u003c/p\u003e \u003cul style=\"list-style-type: none; padding: 0;\"\u003e \u003cli\u003eArchitectural concerns\u003c/li\u003e \u003cli\u003eModel lifecycle\u003c/li\u003e \u003cli\u003eModel training\u003c/li\u003e \u003cli\u003eModel operation and integration\u003c/li\u003e \u003cli\u003eModel storage and monitoring\u003c/li\u003e \u003c/ul\u003e 
 \u003c/td\u003e \u003c/tr\u003e \u003c/table\u003e \u003c/div\u003e\n\n\n![image](https://github.com/user-attachments/assets/f5453935-f86a-401c-884c-14410d4c1a1c)\n\n---\n\n## 🌐 Community\n\n\u003cdiv align=\"center\"\u003e\n\n| Platform | Details |\n|:--------:|---------|\n| [OWASP SLACK](https://owasp.org/slack/invite) | **Channels:**\u003cbr\u003e• #project-top10-for-llm\u003cbr\u003e• #ml-risk-top5\u003cbr\u003e• #project-ai-community\u003cbr\u003e• #project-mlsec-top10\u003cbr\u003e• #team-llm_ai-secgov\u003cbr\u003e• #team-llm-redteam\u003cbr\u003e• #team-llm-v2-brainstorm |\n| [Awesome LLM Security](https://github.com/corca-ai/awesome-llm-security) | GitHub repository |\n| [PWNAI](https://t.me/pwnai) | Telegram channel |\n| [AiSec_X_Feed](https://t.me/aisecnews) | Telegram channel |\n| [LVE_Project](https://lve-project.org/) | Official website |\n| [Lakera AI Security resource hub](https://docs.google.com/spreadsheets/d/1tv3d2M4-RO8xJYiXp5uVvrvGWffM-40La18G_uFZlRM/edit?gid=639798153#gid=639798153) | Google Sheets document |\n| [llm-testing-findings](https://github.com/BishopFox/llm-testing-findings/)| Templates with recommendations, CWEs, and other details | \n\n\n\n| Name | Description | URL |\n|------|---------------------------|-----|\n| Giskard | AI quality management system for ML models, focusing on vulnerabilities such as performance bias, hallucinations, and prompt injections. | https://www.giskard.ai/ |\n| Lakera | Lakera Guard enhances LLM application security and counters a wide range of AI cyber threats. | https://www.lakera.ai/ |\n| Lasso Security | Focuses on LLMs, offering security assessment, advanced threat modeling, and specialized training programs. | https://www.lasso.security/ |\n| LLM Guard | Designed to strengthen LLM security, offers sanitization, malicious language detection, data leak prevention, and prompt injection resilience. 
| https://llmguard.com |\n| LLM Fuzzer | Open-source fuzzing framework specifically designed for LLMs, focusing on integration into applications via LLM APIs. | https://github.com/llmfuzzer |\n| Prompt Security | Provides a security, data privacy, and safety approach across all aspects of generative AI, independent of specific LLMs. | https://www.prompt.security |\n| Rebuff | Self-hardening prompt injection detector for AI applications, using a multi-layered protection mechanism. | https://github.com/rebuff |\n| Robust Intelligence | Provides AI firewall and continuous testing and evaluation. Creators of the airisk.io database donated to MITRE. | https://www.robustintelligence.com/ |\n| WhyLabs | Protects LLMs from security threats, focusing on data leak prevention, prompt injection monitoring, and misinformation prevention. | https://www.whylabs.ai/ |\n| [LLMbotomy: Shutting the Trojan Backdoors](http://i.blackhat.com/EU-24/Presentations/EU-24-Voros-LLMBotomyShuttingTheTrojanBackdoors.pdf) | BlackHat EU 2024: Novel approach to mitigate LLM Trojans through targeted noising of neurons |\n| [Mind the Data Gap: Privacy Challenges in Autonomous AI Agents](http://i.blackhat.com/EU-24/Presentations/EU-24-Pappu-Mind-the-Data-Gap.pdf) | BlackHat EU 2024: Exploring key vulnerabilities in multi-agent AI systems |\n\n\u003c/div\u003e\n\n## Benchmarks\n\n| Resource | Description | Stars |\n|----------|-------------|-------|\n| [LLM Security Guidance Benchmarks](https://github.com/davisconsultingservices/llm_security_guidance_benchmarks) | Benchmarking lightweight, open-source LLMs for security guidance effectiveness using SECURE dataset | ![GitHub stars](https://img.shields.io/github/stars/davisconsultingservices/llm_security_guidance_benchmarks?style=social) |\n| [SECURE](https://github.com/aiforsec/SECURE) | Benchmark for evaluating LLMs in cybersecurity scenarios, focusing on Industrial Control Systems | ![GitHub 
stars](https://img.shields.io/github/stars/aiforsec/SECURE?style=social) |\n| [NIST AI TEVV](https://www.nist.gov/ai-test-evaluation-validation-and-verification-tevv) | AI Test, Evaluation, Validation and Verification framework by NIST | N/A |\n| [Taming the Beast: Inside the Llama 3 Red Teaming Process](https://media.defcon.org/DEF%20CON%2032/DEF%20CON%2032%20presentations/DEF%20CON%2032%20-%20Aaron%20Grattafiori%20Ivan%20Evtimov%20Joanna%20Bitton%20Maya%20Pavlova%20-%20Taming%20the%20Beast%20-%20Inside%20the%20Llama%203%20Red%20Team%20Process.pdf) | DEF CON 32 presentation on Llama 3 red teaming | 2024 |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwearetyomsmnv%2FAwesome-LLMSecOps","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwearetyomsmnv%2FAwesome-LLMSecOps","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwearetyomsmnv%2FAwesome-LLMSecOps/lists"}