Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
openredteaming
Papers about red teaming LLMs and Multimodal models.
https://github.com/libr-ai/openredteaming
Last synced: 2 days ago
- Our Survey: Against The Achilles’ Heel: A Survey on Red Teaming for Generative Models [[Paper](https://arxiv.org/abs/2404.00629)]
- Surveys
- Surveys on Attacks
- Surveys on Risks
- Training Time Defenses
- Defense
- Suffix Searchers
- Completion Compliance
- Taxonomies
- Positions
- Phenomenons
- Attack Strategies
- Generalization Glide
- Instruction Indirection
- Inference Time Defenses
- Prompting
- Ensemble
- Guardrails
- Adversarial Suffix Defenses
- Decoding Defenses
- Model Manipulation
- Backdoor Attacks
- Fine-tuning Risks
- Prompt Searchers
- Language Model
- Decoding
- Genetic Algorithm
- Reinforcement Learning
- Attack Searchers
- Application Risks
- Evaluation Benchmarks
- Application
- Evaluation Metrics
- Application Domains
- Benchmarks
Categories
- Inference Time Defenses (85)
- Surveys (48)
- Model Manipulation (45)
- Prompt Searchers (38)
- Evaluation Benchmarks (37)
- Generalization Glide (36)
- Training Time Defenses (33)
- Application Domains (27)
- Suffix Searchers (25)
- Phenomenons (24)
- Completion Compliance (20)
- Attack Strategies (20)
- Attack Searchers (20)
- Instruction Indirection (18)
- Defense (16)
- Taxonomies (14)
- Application Risks (14)
- Application (11)
- Evaluation Metrics (10)
- Positions (10)
- Benchmarks (4)
- Our Survey: Against The Achilles’ Heel: A Survey on Red Teaming for Generative Models [[Paper](https://arxiv.org/abs/2404.00629)] (2)
Sub Categories
- Surveys on Risks (102)
- Fine-tuning Risks (43)
- Defense Metrics (41)
- Guardrails (34)
- Backdoor Attacks (27)
- Surveys on Attacks (26)
- Language Model (26)
- Prompting (26)
- Agent (23)
- Fine-tuning (23)
- Instruction Indirection (16)
- Agents (15)
- Languages (12)
- Personification (12)
- Cipher (12)
- Adversarial Suffix Defenses (11)
- Guardrail Defenses (10)
- RLHF (10)
- Ensemble (10)
- Cross Modality Searchers (8)
- Image Searchers (8)
- Prompt Injection (8)
- Prompt Extraction (6)
- Attack Metrics (6)
- Other Defenses (6)
- Reinforcement Learning (5)
- Decoding Defenses (4)
- Programming (4)
- Completion Compliance (4)
- Decoding (4)
- Others (4)
- Genetic Algorithm (3)