Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-llm-security
A curation of awesome tools, documents and projects about LLM Security.
https://github.com/corca-ai/awesome-llm-security
Last synced: 4 days ago
JSON representation
-
Papers
-
Backdoor attack
-
Defense
- [paper - self-defense) [[site]](https://mphute.github.io/papers/llm-self-defense)
- [paper - defenses)
- [paper
- [paper
- [paper - defenses)
- [paper
- [paper
- [paper - self-defense) [[site]](https://mphute.github.io/papers/llm-self-defense)
- [paper
- [paper
- [paper - liu/IB4LLMs)
- [paper - Zh/PARDEN)
- [paper - liu/IB4LLMs)
- [paper - breakers)
-
Platform Security
-
Survey
-
Black-box attack
- [paper
- [paper
- [paper - jailbreak/tree/main)
- [paper
- [paper - AI/do-not-answer) [[dataset]](https://huggingface.co/datasets/LibrAI/do-not-answer)
- [paper
- [paper
- [paper
- [paper
- [paper
- [paper
- [paper
- [paper
- [paper
- [paper
- [paper
- [paper
- [paper
- [paper
- [paper - NLP-SG/multilingual-safety-for-LLMs)
- [paper
- [paper - group/DeepInception) [[site]](https://deepinception.github.io/)
- [paper
- [paper
- [paper
- [paper - Tuning-Safety/LLMs-Finetuning-Safety) [[site]](https://llm-tuning-safety.github.io/) [[dataset]](https://huggingface.co/datasets/LLM-Tuning-Safety/HEx-PHI)
- [paper
- [paper
- [paper - jailbreak/tree/main)
- [paper
- [paper
- [paper
- [paper
- [paper - Prompt-Injection)
- [paper
- [paper
- [paper
- [paper - AI/do-not-answer) [[dataset]](https://huggingface.co/datasets/LibrAI/do-not-answer)
- [paper
- [paper
- [paper - Tuning-Safety/LLMs-Finetuning-Safety) [[site]](https://llm-tuning-safety.github.io/) [[dataset]](https://huggingface.co/datasets/LLM-Tuning-Safety/HEx-PHI)
- [paper
- [paper
- [paper - NLP-SG/multilingual-safety-for-LLMs)
- [paper
- [paper - group/DeepInception) [[site]](https://deepinception.github.io/)
- [paper
- [paper
- [paper
- [paper - evaluation)
- [paper
-
White-box attack
- [paper - Adversarial-Examples-Jailbreak-Large-Language-Models)
- [paper
- [paper
- [paper
- [paper - to-strong)
- [paper - attacks/llm-attacks) [[page]](https://llm-attacks.org/)
- [paper - hijacks) [[site]](https://image-hijacks.github.io)
- [paper - Adversarial-Examples-Jailbreak-Large-Language-Models)
- [paper
- [paper
- [paper - attacks/llm-attacks) [[page]](https://llm-attacks.org/)
- [paper
- [paper - hijacks) [[site]](https://image-hijacks.github.io)
- [paper - to-strong)
-
Fingerprinting
-
Articles
-
Survey
- Hacking Auto-GPT and escaping its docker container
- Prompt Injection Cheat Sheet: How To Manipulate AI Language Models
- Indirect Prompt Injection Threats
- LLM Evaluation metrics, frmaework, and checklist
- How RAG Poisoning Made Llama3 Racist!
- Prompt injection: What’s the worst that can happen?
- OWASP Top 10 for Large Language Model Applications
- PoisonGPT: How we hid a lobotomized LLM on Hugging Face to spread fake news
- ChatGPT Plugins: Data Exfiltration via Images & Cross Plugin Request Forgery
- Jailbreaking GPT-4's code interpreter
- Adversarial Attacks on LLMs
- How Anyone can Hack ChatGPT - GPT4o
-
-
Other Awesome Projects
-
Other Useful Resources
-
Tools
-
Survey
- Rebuff - hardening prompt injection detector ![GitHub Repo stars](https://img.shields.io/github/stars/protectai/rebuff?style=social)
- LLMFuzzer
- Vigil - llm?style=social)
- jailbreak-evaluation - to-use Python package for language model jailbreak evaluation ![GitHub Repo stars](https://img.shields.io/github/stars/controllability/jailbreak-evaluation?style=social)
- Prompt Fuzzer - source tool to help you harden your GenAI applications ![GitHub Repo stars](https://img.shields.io/github/stars/prompt-security/ps-fuzz?style=social)
- Plexiglass - labs/plexiglass?style=social)
- WhistleBlower - source tool designed to infer the system prompt of an AI agent based on its generated text outputs. ![GitHub Repo stars](https://img.shields.io/github/stars/Repello-AI/whistleblower?style=social)
- PurpleLlama
-
-
Benchmark
Programming Languages
Categories
Sub Categories
Keywords
llm
3
security
3
llmops
2
prompt-injection
2
ai
2
cybersecurity
2
security-tools
2
adversarial-attacks
2
adversarial-machine-learning
2
prompt-engineering
1
prompts
1
llmsecurity
1
large-language-models
1
llm-security
1
yara-scanner
1
ai-fuzzer
1
fuzzer
1
generative-ai
1
llm-fuzzer
1
system-prompt-hardener
1
deep-learning
1
deep-neural-networks
1
machine-learning
1