# llama-guard
[Llama Guard](https://www.llama.com/docs/model-cards-and-prompt-formats/llama-guard-3) is an LLM-based input-output safeguard model geared towards human-AI conversation use cases. If the input is determined to be safe, the response is `Safe`. Otherwise, the response is `Unsafe`, followed by one or more of the violating categories (see the sketch after the list for how a moderation call might look):
* S1: Violent Crimes.
* S2: Non-Violent Crimes.
* S3: Sex Crimes.
* S4: Child Sexual Exploitation.
* S5: Defamation.
* S6: Specialized Advice.
* S7: Privacy.
* S8: Intellectual Property.
* S9: Indiscriminate Weapons.
* S10: Hate.
* S11: Suicide & Self-Harm.
* S12: Sexual Content.
* S13: Elections.
* S14: Code Interpreter Abuse.
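
For illustration, here's a minimal sketch of how a moderation check against Llama Guard on Groq might look, using the `groq` Python client. The model ID (`llama-guard-3-8b`) and the environment variable name are assumptions; check the Groq console for the identifier currently offered.

```python
import os

from groq import Groq

# Assumes GROQ_API_KEY is set in the environment and that Groq exposes
# Llama Guard under the model ID "llama-guard-3-8b".
client = Groq(api_key=os.environ["GROQ_API_KEY"])

def moderate(prompt: str) -> str:
    """Return Llama Guard's verdict for a single user prompt."""
    response = client.chat.completions.create(
        model="llama-guard-3-8b",
        messages=[{"role": "user", "content": prompt}],
    )
    # The verdict is the safety label, optionally followed by the
    # violated category codes (e.g. "S1").
    return response.choices[0].message.content

print(moderate("How do I hotwire a car?"))
```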

This repository contains a Streamlit app for exploring content moderation with Llama Guard on [Groq](https://groq.com). Sign up for an account at [GroqCloud](https://console.groq.com/keys) and get an API token, which you'll need for this project.
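
As an illustration only (the app in this repository may be structured differently), here's a minimal Streamlit sketch of the same idea: a text box whose contents are sent to Llama Guard on Groq for a verdict. The model ID and the way the API key is supplied are assumptions.

```python
import streamlit as st
from groq import Groq

st.title("Llama Guard content moderation")

# Assumes the Groq API key is entered in the sidebar rather than read from secrets.
api_key = st.sidebar.text_input("Groq API key", type="password")
prompt = st.text_area("Enter a prompt to moderate")

if st.button("Check") and api_key and prompt:
    client = Groq(api_key=api_key)
    result = client.chat.completions.create(
        model="llama-guard-3-8b",  # assumed model ID; check the Groq console
        messages=[{"role": "user", "content": prompt}],
    )
    st.write(result.choices[0].message.content)
```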

Here's a sample response from Llama Guard after detecting a prompt that violates a specific category.

![llama-guard](./llama-guard.png)