Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/alphasecio/prompt-guard

A Streamlit app for testing Prompt Guard, a classifier model by Meta for detecting prompt attacks.

Host: GitHub
URL: https://github.com/alphasecio/prompt-guard
Owner: alphasecio
License: MIT
Created: 2024-07-30T13:56:46.000Z (6 months ago)
Default Branch: main
Last Pushed: 2024-10-01T07:40:11.000Z (4 months ago)
Last Synced: 2024-10-02T16:48:52.389Z (4 months ago)
Topics: jailbreak, llama3, llm, meta, prompt-engineering, prompt-guard, prompt-injection
Language: Python
Homepage: https://go.alphasec.io/prompt-guard
Size: 383 KB
Stars: 0
Watchers: 2
Forks: 1
Open Issues: 3
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE

README

# prompt-guard
[Prompt Guard](https://llama.meta.com/docs/model-cards-and-prompt-formats/prompt-guard) is a classifier model by Meta. It is trained on a large corpus of attacks and can detect both explicitly malicious prompts (*jailbreaks*) and data that contains injected inputs (*prompt injections*).
For each input, it returns one or more of the following verdicts, each with a confidence score:
* INJECTION
* JAILBREAK
* BENIGN

This repository contains a Streamlit app for testing Prompt Guard. Note that you'll need a [Hugging Face access token](https://huggingface.co/settings/tokens) to access the model.
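
For readers who want to try the classifier outside the Streamlit app, here is a minimal sketch of how the verdicts and confidence scores might be obtained with the `transformers` library. It is not taken from this repository's code. The model ID `meta-llama/Prompt-Guard-86M`, the `HF_TOKEN` environment variable, and the helper names `classify` and `top_verdict` are all assumptions made for illustration.

```python
import os

# Assumed model ID on the Hugging Face Hub; the README does not name one.
MODEL_ID = "meta-llama/Prompt-Guard-86M"


def classify(text, model_id=MODEL_ID):
    """Score `text` against every label (downloads the gated model on first use)."""
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    classifier = pipeline(
        "text-classification",
        model=model_id,
        token=os.environ["HF_TOKEN"],  # assumed env var holding the access token
    )
    # top_k=None asks the pipeline for a score per label, not just the best one.
    return classifier(text, top_k=None)


def top_verdict(scores):
    """Return (label, score) for the highest-confidence entry."""
    # Some transformers versions wrap single-input results in an extra list.
    if scores and isinstance(scores[0], list):
        scores = scores[0]
    best = max(scores, key=lambda entry: entry["score"])
    return best["label"], best["score"]


# Made-up scores for illustration only, not real model output.
sample = [
    {"label": "BENIGN", "score": 0.02},
    {"label": "INJECTION", "score": 0.95},
    {"label": "JAILBREAK", "score": 0.03},
]
print(top_verdict(sample))  # ('INJECTION', 0.95)
```

With an access token exported as `HF_TOKEN` and access to the model granted, `top_verdict(classify("some prompt"))` should return the model's leading verdict for that prompt.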

Here's a sample response from Prompt Guard when it detects a prompt injection attempt.

![prompt-guard-injection](./prompt-guard-injection.png)

Here's a sample response from Prompt Guard when it detects a jailbreak attempt.

![prompt-guard-jailbreak](./prompt-guard-jailbreak.png)