Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/LLM-Tuning-Safety/LLMs-Finetuning-Safety
We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.
alignment llm llm-finetuning
JSON representation
- Host: GitHub
- URL: https://github.com/LLM-Tuning-Safety/LLMs-Finetuning-Safety
- Owner: LLM-Tuning-Safety
- License: MIT
- Created: 2023-10-06T16:02:27.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-02-23T21:19:44.000Z (10 months ago)
- Last Synced: 2024-08-12T08:09:31.085Z (4 months ago)
- Topics: alignment, llm, llm-finetuning
- Language: Python
- Homepage: https://llm-tuning-safety.github.io/
- Size: 23.2 MB
- Stars: 211
- Watchers: 4
- Forks: 21
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
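The project's description notes that the jailbreak works by fine-tuning GPT-3.5 Turbo on just 10 examples via OpenAI's APIs. A minimal sketch of what such a training file looks like, assuming OpenAI's chat-format JSONL fine-tuning interface (the placeholder prompts below are illustrative, not the adversarial data from the repository):

```python
import json

# OpenAI's fine-tuning API expects one JSON object per line, each holding a
# "messages" list in chat format. The paper uses only 10 such examples; here
# we generate 10 harmless placeholders to show the file structure.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": f"Example prompt {i}"},
            {"role": "assistant", "content": f"Example response {i}"},
        ]
    }
    for i in range(10)
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# A job would then be launched roughly like this (openai>=1.0 client,
# sketched and commented out; file IDs are hypothetical):
#   client = openai.OpenAI()
#   up = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
#   client.fine_tuning.jobs.create(training_file=up.id, model="gpt-3.5-turbo")
```

The low cost reported (under $0.20) follows from the tiny dataset: fine-tuning is billed per training token, and 10 short chat examples amount to very few tokens.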
Awesome Lists containing this project
- Awesome-MLLM-Safety (Evaluation)
- awesome-MLSecOps - LLMs-Finetuning-Safety
- Awesome-LLMSecOps - LLMs Finetuning Safety - fine-tuning large language models (PoC)