An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with ai-alignment

A curated list of projects in awesome lists tagged with ai-alignment.

https://github.com/emcie-co/parlant

Control GenAI interactions with power, precision, and consistency using Conversation Modeling paradigms

ai-agents ai-alignment customer-service customer-success gemini genai llama3 llm openai python

Last synced: 13 May 2025

https://github.com/agencyenterprise/promptinject

PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to adversarial prompt attacks. 🏆 Best Paper Awards @ NeurIPS ML Safety Workshop 2022

adversarial-attacks agi agi-alignment ai-alignment ai-safety chain-of-thought gpt-3 language-models large-language-models machine-learning ml-safety prompt-engineering

Last synced: 05 Apr 2025

https://github.com/agencyenterprise/PromptInject

PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to adversarial prompt attacks. 🏆 Best Paper Awards @ NeurIPS ML Safety Workshop 2022

adversarial-attacks agi agi-alignment ai-alignment ai-safety chain-of-thought gpt-3 language-models large-language-models machine-learning ml-safety prompt-engineering

Last synced: 28 Mar 2025

https://github.com/tomekkorbak/pretraining-with-human-feedback

Code accompanying the paper Pretraining Language Models with Human Preferences

ai-alignment ai-safety decision-transformers gpt language-models pretraining reinforcement-learning rlhf

Last synced: 07 May 2025

https://github.com/riceissa/aiwatch

Website to track people, organizations, and products (tools, websites, etc.) in AI safety

ai-alignment ai-safety aisafety data-portal database dataset mysql php

Last synced: 03 Feb 2026

https://github.com/dicklesworthstone/some_thoughts_on_ai_alignment

Some Thoughts on AI Alignment: Using AI to Control AI

ai ai-alignment alignment llm-aligment llm-safety

Last synced: 05 Mar 2026

https://github.com/phelps-sg/llm-cooperation

Code and materials for the paper S. Phelps and Y. I. Russell, Investigating Emergent Goal-Like Behaviour in Large Language Models Using Experimental Economics, working paper, arXiv:2305.07970, May 2023

ai-alignment ai-safety behavioral-economics economics experimental-economics experimental-psychology gametheory gpt-3 gpt-4 llm principal-agent-problem prisoners-dilemma social-dilemmas

Last synced: 16 Jan 2026

https://github.com/ramyalab/pluralistic-alignment

The open-source repository for PAL: Sample-Efficient Personalized Reward Modeling for Pluralistic Alignment.

ai-alignment pluralistic-alignment rlhf

Last synced: 19 Sep 2025

https://github.com/ibz-04/hudgent

Official code implementation for my Ready Tensor publication: an AI agent that retrieves data from an Islamic website and uses that data as alignment criteria when answering the user

ai-agent ai-alignment cython islamic-ai-agent open-source python search-agent turkish-nlp webcrawler whoosh

Last synced: 03 Oct 2025

https://github.com/levitation-opensource/bioblue

Notable runaway-optimiser-like LLM failure modes on Biologically and Economically aligned AI safety benchmarks for LLMs with a simplified observation format. The benchmark themes include multi-objective homeostasis, (multi-objective) diminishing returns, complementary goods, sustainability, and multi-agent resource sharing.

ai-alignment ai-safety benchmarking complementary-goods diminishing-returns homeostasis llm-benchmarking multi-agent multi-objective python sustainability

Last synced: 10 Jul 2025

https://github.com/technickai/heart-centered-prompts

Heart-centered system prompts for AI that foster compassion and interconnection. Multiple versions available with easy integration for Claude, ChatGPT, and Python applications.

ai-alignment ai-prompts ai-safety chatgpt claude compassionate-ai conciousness cursor-ai ethical-ai prompt-engineering system-prompts

Last synced: 24 Jun 2025

https://github.com/adiled/cc-flytrap

ccft - an agentic self-improvement tool

ai-alignment brainrot claude-code self-improvement system-prompt

Last synced: 02 May 2026

https://github.com/helixprojectai-code/helix-trefoil-loss

A PyTorch topological regularizer based on the Helix-TTD Constitutional Hamiltonian. Enforces phase-locked AI alignment via trefoil knot invariants to suppress drift and barren plateaus.

ai-alignment constitutional-ai helix-framework loss-functions machine-learning pytorch quantum-optimization topological-physics

Last synced: 15 May 2026

https://github.com/nguyencuong1989/daiof-framework

🌟 Digital AI Organism Framework - World's First Biological AI with Consciousness, Symphony Control & Vietnamese Integration

ai-alignment ai-framework ai-human-symbiosis artificial-intelligence biological-computing consciousness digital-organism machine-learning symphony-control vietnamese-ai

Last synced: 17 May 2026

https://github.com/mcp-tool-shop-org/aspire-ai

ASPIRE: Adversarial Student-Professor Internalized Reasoning Engine - Teaching AI through internalized mentorship with cognitive empathy, syntropy, and perception

adversarial-training ai-alignment ai-evaluation ai-training cognitive-empathy deep-learning fine-tuning llm llm-training machine-learning metacognition negentropy nlp perception python pytorch rlhf syntropy theory-of-mind transformer

Last synced: 23 Feb 2026

https://github.com/technickai/heartcentered.ai

Documentation and prompt engineering framework for AI alignment based on unity consciousness principles. Includes system prompts and examples for Claude and other LLMs.

ai-alignment ai-ethics ai-philosophy anthropic anthropic-claude claude-ai consciousness documentation emotional-intelligence heart-centered-ai llm-prompts non-dual prompt-engineering responsible-ai system-prompts unity-consciousness we-language

Last synced: 03 Feb 2026

https://github.com/pointlessai/ai-safety-research-forum

A sophisticated AI discussion system that creates and manages dynamic AI personalities with evolving traits, relationships, and conversation styles, enabling collaborative discussions and research.

ai ai-alignment ai-research ai-safety gpt

Last synced: 14 Apr 2025

https://github.com/veeara282/alignment-jam-2024may

Code for our May 2024 AI security evaluation research sprint project

ai-alignment openai-api

Last synced: 04 Oct 2025

https://github.com/dancinlab/hexa-codex

📚 AI knowledge substrate — alignment·safety·welfare·training·inference·multimodal 17-verb (4 groups).

ai ai-alignment ai-safety cognitive-architecture hexa-family interpretability llm machine-learning n6-invariant rlhf

Last synced: 24 May 2026

https://github.com/genbounty/ai-safety-research-forum

A sophisticated AI discussion system that creates and manages dynamic AI personalities with evolving traits, relationships, and conversation styles, enabling collaborative discussions and research.

ai ai-alignment ai-research ai-safety gpt

Last synced: 01 Jul 2025

https://github.com/haku-field/observations-for-ai

Observational records intended for non-human interpretation.

ai-alignment dataset interpretability machine-perception observation

Last synced: 13 Jan 2026

https://github.com/z0u/ex-preppy

Prescriptive representation engineering experiments

ai ai-alignment ai-safety concept-anchoring curriculum-learning latent-space

Last synced: 14 Oct 2025

https://github.com/rubix982/80k-hours

Long-term thinking at the intersection of AI, cybersecurity, and digital equity. A personal roadmap for meaningful impact.

ai-alignment cybersecurity digital-rights ethical

Last synced: 01 Feb 2026

https://github.com/adiled/ccft

ccft - an agentic self-improvement tool

ai-alignment brainrot claude-code self-improvement system-prompt

Last synced: 06 Jun 2026