Projects in Awesome Lists tagged with preference-learning
A curated list of projects in awesome lists tagged with preference-learning.
https://github.com/allenai/reward-bench
RewardBench: the first evaluation tool for reward models.
Last synced: 11 Sep 2025
https://github.com/tournesol-app/tournesol
Free and open-source code for the https://tournesol.app platform. Meet the community on Discord: https://discord.gg/WvcSG55Bf3
ai-ethics bradley-terry-model dataset django django-rest-framework golden-ratio-optimization preference-aggregation preference-learning python reactjs recommendation-engine social-choice youtube
Last synced: 05 Apr 2025
https://github.com/qxcv/magical
The MAGICAL benchmark suite for robust imitation learning (NeurIPS 2020)
imitation-learning preference-learning reinforcement-learning reinforcement-learning-environments
Last synced: 13 Jun 2025
https://github.com/smartlab-purdue/san-navistar
This repository contains the source code for our paper: "NaviSTAR: Socially Aware Robot Navigation with Hybrid Spatio-Temporal Graph Transformer and Preference Learning". For more details, please refer to our project website at https://sites.google.com/view/san-navistar.
machine-learning preference-learning reinforcement-learning robot-navigation socially-aware-navigation transformer
Last synced: 18 Jul 2025
https://github.com/sail-sg/dice
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
alignment large-language-models preference-learning rlhf
Last synced: 16 Jul 2025
https://github.com/IAAR-Shanghai/ICSFSurvey
A comprehensive survey on Internal Consistency and Self-Feedback in Large Language Models.
attention-head chain-of-thought data-augmentation decoding hallucination internal-consistency knowledge-distillation large-language-model large-language-models preference-learning reasoning self-consistency self-correct self-correction self-feedback self-improvement self-refine
Last synced: 31 Mar 2025
https://github.com/typoverflow/wiserl
PyTorch implementations for Offline Preference-Based RL (PbRL) algorithms
preference-learning pytorch reinforcement-learning
Last synced: 09 Apr 2025
https://github.com/smartlab-purdue/san-fapl
This repository contains the source code for our paper: "Feedback-efficient Active Preference Learning for Socially Aware Robot Navigation", accepted to IROS-2022. For more details, please refer to our project website at https://sites.google.com/view/san-fapl.
learning-from-demonstration machine-learning preference-learning reinforcement-learning robot-navigation socially-aware-navigation
Last synced: 14 Apr 2025
https://github.com/lemurpwned/bradley-terry-ui
UI for straightforward Bradley-Terry feedback loop
alignment bradley-terry bradley-terry-model preference-learning ui
Last synced: 07 Mar 2026
https://github.com/fareedkhan-dev/aprel-mountain-car-reinforcement-learning
APReL: Active preference-based reward learning for human-robot interaction. Uses the "Mountain Car" environment to learn from human preferences how to reach the goal state. Applicable to robotics and adaptable to other learning methods.
mountain-car openai-gym preference-learning python reinforcement-learning
Last synced: 28 Apr 2026
https://github.com/aleksa-sukovic/iclr2024-reward-design-for-justifiable-rl
Code for the paper "Reward Design for Justifiable Sequential Decision-Making"; ICLR 2024
alignment preference-based-reinforcement-learning preference-learning reinforcement-learning reward-design
Last synced: 17 Jan 2026
https://github.com/rowlandseymour/bsbt
Bayesian Spatial Bradley-Terry
bayesian-inference bradley-terry comparative-judgement preference-learning
Last synced: 22 Oct 2025
https://github.com/martimfasantos/custompos-for-slms
Novel Preference Optimization Algorithms for state-of-the-art small LMs, enhancing performance in GenAI and NLP tasks
evaluation gen-ai human-preferences llms nlp preference-learning preference-optimization
Last synced: 29 Oct 2025