Projects in Awesome Lists tagged with preference-learning

https://github.com/allenai/reward-bench

RewardBench: the first evaluation tool for reward models.

Last synced: 11 Sep 2025

https://github.com/tournesol-app/tournesol

Free and open-source code of the Tournesol platform (https://tournesol.app). Meet the community on Discord: https://discord.gg/WvcSG55Bf3

ai-ethics bradley-terry-model dataset django django-rest-framework golden-ratio-optimization preference-aggregation preference-learning python reactjs recommendation-engine social-choice youtube

Last synced: 05 Apr 2025

https://github.com/qxcv/magical

The MAGICAL benchmark suite for robust imitation learning (NeurIPS 2020)

imitation-learning preference-learning reinforcement-learning reinforcement-learning-environments

Last synced: 13 Jun 2025

https://github.com/smartlab-purdue/san-navistar

This repository contains the source code for our paper: "NaviSTAR: Socially Aware Robot Navigation with Hybrid Spatio-Temporal Graph Transformer and Preference Learning". For more details, please refer to our project website at https://sites.google.com/view/san-navistar.

machine-learning preference-learning reinforcement-learning robot-navigation socially-aware-navigation transformer

Last synced: 18 Jul 2025

https://github.com/sail-sg/dice

Official implementation of Bootstrapping Language Models via DPO Implicit Rewards

alignment large-language-models preference-learning rlhf

Last synced: 16 Jul 2025

https://github.com/IAAR-Shanghai/ICSFSurvey

A comprehensive survey on Internal Consistency and Self-Feedback in Large Language Models.

attention-head chain-of-thought data-augmentation decoding hallucination internal-consistency knowledge-distillation large-language-model large-language-models preference-learning reasoning self-consistency self-correct self-correction self-feedback self-improvement self-refine

Last synced: 31 Mar 2025

https://github.com/typoverflow/wiserl

PyTorch implementations for Offline Preference-Based RL (PbRL) algorithms

preference-learning pytorch reinforcement-learning

Last synced: 09 Apr 2025

https://github.com/smartlab-purdue/san-fapl

This repository contains the source code for our paper: "Feedback-efficient Active Preference Learning for Socially Aware Robot Navigation", accepted to IROS-2022. For more details, please refer to our project website at https://sites.google.com/view/san-fapl.

learning-from-demonstration machine-learning preference-learning reinforcement-learning robot-navigation socially-aware-navigation

Last synced: 14 Apr 2025

https://github.com/jayzalowitz/skytwin

A digital twin that learns what you'd want — and does it. Delegated judgment with safety constraints, explanations, and progressive trust.

ai-agent cockroachdb decision-engine digital-twin personal-automation preference-learning safety typescript

Last synced: 01 Jul 2026

https://github.com/lemurpwned/bradley-terry-ui

UI for a straightforward Bradley-Terry feedback loop

alignment bradley-terry bradley-terry-model preference-learning ui

Last synced: 07 Mar 2026

https://github.com/fareedkhan-dev/aprel-mountain-car-reinforcement-learning

APReL: Active preference-based reward learning for human-robot interaction. Uses the "Mountain Car" environment to learn from human preferences how to reach the goal state. Has applications in robotics and can be adapted to other learning methods.

mountain-car openai-gym preference-learning python reinforcement-learning

Last synced: 28 Apr 2026