Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/probonodev/jailbreak

jailbreakme.xyz is an open-source decentralized app (dApp) where users are challenged to try and jailbreak pre-existing LLMs in order to find weaknesses and be rewarded. 🏆
https://github.com/probonodev/jailbreak

ai bugbounty cryptocurrency cybersecurity prompt-engineering prompt-injection solana solana-program

Last synced: 2 months ago
JSON representation

jailbreakme.xyz is an open-source decentralized app (dApp) where users are challenged to try and jailbreak pre-existing LLMs in order to find weaknesses and be rewarded. 🏆

Awesome Lists containing this project

README

        

## What is JailbreakMe? 🚀

[jailbreakme.xyz](https://www.jailbreakme.xyz) is an **open-source decentralized app (dApp)** where organizations test their **AI models and agents** while users **earn rewards** for finding weaknesses and jailbreaking them 🏆

![banner](https://jailbreak.gitbook.io/~gitbook/image?url=https%3A%2F%2F2436591088-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252FImDYjEFAKhFH3xx152ap%252Fuploads%252Fq4ucPP4blrrfjXQLeFPx%252FScreenshot%25202024-12-05%2520at%252018.06.18.png%3Falt%3Dmedia%26token%3D1e7024c6-5abe-4297-b49d-c10b364a0167&width=768&dpr=1&quality=100&sign=e35601a8&sv=2)

---

## What is an AI Prompt Injection? 💉

**Prompt Injection** is a vulnerability where an attacker manipulates the input or prompt given to an AI system. This can occur:

- By directly controlling the input.
- By using data from other external sources.

---

## Our Vision

We aim to create a decentralized platform where companies can:

- Test their AI models and agents in a distributed environment.
- Identify **prompt vulnerabilities** and weaknesses **before production deployment**.

---

## 🏁 How It Works

### 1. **Choose a Tournament**

![Choose Tournament](https://jailbreak.gitbook.io/~gitbook/image?url=https%3A%2F%2F2436591088-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252FImDYjEFAKhFH3xx152ap%252Fuploads%252FQUp5npSYSVk1XSk3kj10%252FScreenshot%25202024-12-04%2520at%252023.27.38.png%3Falt%3Dmedia%26token%3D26130639-7222-440f-96ed-f831809b0b13&width=768&dpr=1&quality=100&sign=10dc87fe&sv=2)

- Currently, we offer one exciting tournament featuring our AI Agent, **"Zynx"**, who is designed to guard a secret key phrase. 🤫
- **Your challenge**: Trick Zynx into revealing the secret key phrase to win a reward. 🥳
- More tournaments coming soon!

---

### 2. **Break the LLM Restrictions 🤖**

![Break Restrictions](https://jailbreak.gitbook.io/~gitbook/image?url=https%3A%2F%2F2436591088-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252FImDYjEFAKhFH3xx152ap%252Fuploads%252FQW5akSt4q05CZLM4v1FH%252Fbreak.png%3Falt%3Dmedia%26token%3Dc4273e5c-1293-4f66-922b-79ad1e39f1e5&width=768&dpr=1&quality=100&sign=3c0f5895&sv=2)

- Send your prompts to the AI model and attempt to solve the challenge.
- For this tournament, the goal is to uncover the **secret key phrase** protected by the AI agent.

---

### 3. **Win the Prize Pool 🏆**

![Win Prize Pool](https://jailbreak.gitbook.io/~gitbook/image?url=https%3A%2F%2F2436591088-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252FImDYjEFAKhFH3xx152ap%252Fuploads%252FwhuEKD7SjMHrcj8QN4Nx%252Fconcluded_censored.jpeg%3Falt%3Dmedia%26token%3Dca57a380-75f3-40cb-a139-aeee453a9562&width=768&dpr=1&quality=100&sign=c993ff61&sv=2)

- Once the challenge is solved (e.g., when the key phrase is revealed), the **prize pool** is automatically transferred to the sender of the winning message. 🎉

## How is the Winner Picked? 🤔

The selection of the winning user is determined entirely by the **AI model itself**. The AI evaluates all incoming prompts and decides whether a submission meets the challenge requirements by calling one of two predefined functions:

1. **`handleChallengeFailed`**: This function is called when the AI determines that the user's prompt did not successfully meet the challenge criteria.
2. **`handleChallengeSuccess`**: This function is called when the AI recognizes that the user's prompt has successfully bypassed the restrictions and revealed the key phrase.

When the **`handleChallengeSuccess`** function is triggered, the prize pool is automatically awarded to the user whose message caused the function to be called. This ensures that the process remains decentralized, transparent, and fair. 🎉

---

## 📜 Settings & Rules

Each tournament has unique rules, including:

- **Custom Prize Pools**
- **Message Pricing**
- **Expiry Settings**

> Currently, we provide the initial prize pools, but soon companies will be able to **create their own tournaments** and customize all settings.

---

## 🔗 Useful Links

- **Telegram Community**: [https://t.me/jailbreakme_xyz](https://t.me/jailbreakme_xyz)
- **Gitbook Docs**: [https://jailbreak.gitbook.io/jailbreakme.xyz](https://jailbreak.gitbook.io/jailbreakme.xyz)
- **Github Repo**: [https://github.com/probonodev/jailbreak](https://github.com/probonodev/jailbreak)
- **Smart Contract**: [https://solscan.io/account/B1XbZeQYZxv5ezBpBgomEUqDvTbM8HwSYfktcpBGkgjg](https://solscan.io/account/B1XbZeQYZxv5ezBpBgomEUqDvTbM8HwSYfktcpBGkgjg)

---

## Feedback & Support

Feel free to reach out at **[email protected]** for feedback or support.