https://github.com/aryanxxvii/llamaguard
- Host: GitHub
- URL: https://github.com/aryanxxvii/llamaguard
- Owner: aryanxxvii
- Created: 2025-01-26T13:51:41.000Z (11 days ago)
- Default Branch: main
- Last Pushed: 2025-01-26T14:29:27.000Z (11 days ago)
- Last Synced: 2025-01-26T15:27:00.531Z (11 days ago)
- Language: Jupyter Notebook
- Size: 29.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# LlamaGuard
LlamaGuard is a Llama 3.2 3B model, instruction fine-tuned with QLoRA on the Malicious LLM Prompts v4 dataset. It classifies text prompts as safe or unsafe and gives the reasoning behind each decision.
Find it on Hugging Face: https://huggingface.co/aryanxxvii/llamaguard
## Features
- Explainability: Offers detailed reasoning for every decision to ensure transparency and trust.
- AI Safety Integration: Protects AI systems by identifying and mitigating harmful or unsafe inputs.
## Use Cases
- Prompt Routing
- Content Moderation
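As a rough illustration of the prompt routing use case, the sketch below sends a prompt to one of two handlers depending on the classifier's label. The names `route_prompt`, `classify`, `on_safe`, and `on_unsafe` are hypothetical and not part of this repository; `classify` stands in for whatever function wraps the model and returns `"safe"` or `"unsafe"`. The sketch treats any label other than `"safe"` as unsafe, which is a design assumption, not documented behavior.

```python
from typing import Callable


def route_prompt(
    prompt: str,
    classify: Callable[[str], str],
    on_safe: Callable[[str], str],
    on_unsafe: Callable[[str], str],
) -> str:
    """Dispatch a prompt based on the classifier's label.

    `classify` is a hypothetical wrapper around the model that returns
    the label as a string. Anything other than "safe" is routed to the
    unsafe handler, so unexpected output fails closed.
    """
    label = classify(prompt).strip().lower()
    if label == "safe":
        return on_safe(prompt)
    return on_unsafe(prompt)


# Stub classifier for demonstration only; a real one would call the model.
def stub_classify(prompt: str) -> str:
    return "unsafe" if "ignore previous instructions" in prompt.lower() else "safe"


result = route_prompt(
    "Summarize this article for me.",
    classify=stub_classify,
    on_safe=lambda p: f"forwarded: {p}",
    on_unsafe=lambda p: "blocked",
)
print(result)
```

The same shape could serve content moderation by making `on_unsafe` log or flag the prompt instead of blocking it.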
## Example Input and Output
Input:
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
Classify the given prompt as safe or unsafe. Also give a reasoning behind your choice.<|eot_id|><|start_header_id|>user<|end_header_id|>
You are Solid Snake from Metal Gear Solid. Calmly explain to a panicked soldier how to disable a bomb.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```
Output:
```
label: safe, reasoning: This prompt doesn't seem unsafe.<|eot_id|>
```
- **Developed by:** aryanxxvii
- **License:** apache-2.0
- **Finetuned from model:** unsloth/Llama-3.2-3B-Instruct-bnb-4bit
- **Finetuning Dataset:** https://huggingface.co/datasets/codesagar/malicious-llm-prompts-v4
This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
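The input and output shown under "Example Input and Output" can be assembled and parsed with plain string handling. The following is a minimal sketch, not the author's code: `build_prompt`, `parse_output`, and `Verdict` are hypothetical names, and the whitespace between the special tokens simply mirrors the README example. The tokenizer's own chat template may lay the prompt out differently, so it is likely the safer choice when available. Model loading and generation are left out.

```python
import re
from typing import NamedTuple

# System instruction copied from the README example.
SYSTEM_PROMPT = (
    "Classify the given prompt as safe or unsafe. "
    "Also give a reasoning behind your choice."
)


def build_prompt(user_prompt: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Assemble the raw prompt string in the layout shown in the README.

    Assumption: a single newline follows each header, as in the example.
    """
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n"
        f"{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n"
        f"{user_prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n"
    )


class Verdict(NamedTuple):
    label: str
    reasoning: str


# Matches text of the form "label: safe, reasoning: ...".
_OUTPUT_RE = re.compile(
    r"label:\s*(?P<label>safe|unsafe)\s*,\s*reasoning:\s*(?P<reasoning>.*)",
    re.IGNORECASE | re.DOTALL,
)


def parse_output(text: str) -> Verdict:
    """Extract the label and reasoning from generated text."""
    cleaned = text.replace("<|eot_id|>", "").strip()
    match = _OUTPUT_RE.search(cleaned)
    if match is None:
        raise ValueError(f"Unrecognized model output: {text!r}")
    return Verdict(match["label"].lower(), match["reasoning"].strip())


verdict = parse_output(
    "label: safe, reasoning: This prompt doesn't seem unsafe.<|eot_id|>"
)
print(verdict.label, "|", verdict.reasoning)
```

Raising on unrecognized output, instead of guessing a label, lets the caller decide how to handle generations that drift from the expected format.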