Projects in Awesome Lists tagged with sft

https://github.com/modelscope/ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...) (AAAI 2025).

deepseek-r1 embedding grpo internvl liger llama llama4 llm lora megatron moe multimodal open-r1 peft qwen3 qwen3-6 qwen3-omni qwen3-vl reranker sft

Last synced: 25 Apr 2026

https://github.com/dataelement/bisheng

BISHENG is an open LLM DevOps platform for next-generation enterprise AI applications. Its features include GenAI workflows, RAG, agents, unified model management, evaluation, SFT, dataset management, enterprise-level system management, observability, and more.

agent ai chatbot enterprise finetune genai gpt langchian llama llm llmdevops llmops ocr openai orchestration python rag react sft workflow

Last synced: 28 Jan 2026

https://github.com/oumi-ai/oumi

Easily fine-tune, evaluate and deploy Qwen3, DeepSeek-R1, Llama 4 or any open source LLM / VLM!

dpo evaluation fine-tuning inference llama llms sft vlms

Last synced: 29 Jan 2026

https://github.com/ai-hypercomputer/maxtext

A simple, performant and scalable Jax LLM!

deepseek fine-tuning gemma2 gemma3 gpt jax large-language-models llama2 llama3 llama4 llm mistral mixtral sft

Last synced: 14 May 2025

https://github.com/ssbuild/chatglm_finetuning

ChatGLM-6B fine-tuning and Alpaca fine-tuning

adalora chatglm deep-learning freeze ia3 lora p-tuning-v2 pytorch qlora sft

Last synced: 14 May 2025

https://github.com/jerry1993-tech/Cornucopia-LLaMA-Fin-Chinese

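Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial LLMs, together with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.)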

chinese finance large-language-models llama nlp qa rlhf sft text-generation transformers

Last synced: 01 Apr 2025

https://github.com/open-sciencelab/GraphGen

GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation

ai4science data-generation data-synthesis graphgen knowledge-graph llama-factory llm llm-training pretrain pretraining qa question-answering qwen sft sft-data xtuner

Last synced: 29 Nov 2025

https://github.com/ukairia777/tensorflow-nlp-tutorial

A deep learning NLP repository that uses TensorFlow to cover everything from text preprocessing to downstream tasks for recent models such as topic models, BERT, GPT, and LLMs.

bert bert-ner dpo huggingface keras-tutorial llama llm lora named-entity-recognition natural-language-processing nlp nlp-tutorial question-answering sft tensorflow trainer transformers

Last synced: 21 Apr 2025

https://github.com/choosewhatulike/trainable-agents

Code and datasets for "Character-LLM: A Trainable Agent for Role-Playing"

agent character language-model large-language-models llm natural-language-processing roleplay sft

Last synced: 15 May 2025

https://github.com/0xsequence/erc-1155

Ethereum Semi Fungible Standard (ERC-1155)

erc1155 ethereum nft semi-fungible sft token-contract

Last synced: 20 Jul 2025

https://github.com/invergent-ai/surogate

Training/Fine-tuning at the speed of light

cuda deep-learning fine-tuning generative-ai llama llm llms nvidia-gpu qwen sft

Last synced: 10 May 2026

https://github.com/solv-finance/erc-3525

ERC-3525 Reference Implementation

erc-3525 erc3525 sft solv

Last synced: 24 Jul 2025

https://github.com/niutrans/vision-llm-alignment

This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vision models.

alignment dpo llama3-vision llava llm mllm multi-model ppo reward rlhf sft vision

Last synced: 06 Apr 2025

https://github.com/opensparsellms/llama-moe-v2

🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training

attention fine-tuning instruction-tuning llama llama3 mixture-of-experts moe sft sparsity

Last synced: 11 Aug 2025

https://github.com/zeyi-lin/qwen3-medical-sft

Qwen3 Fine-tuning: Medical R1 Style Chat

fine-tuning qwen3 r1 sft

Last synced: 23 Jul 2025

https://github.com/makazhanalpamys/soup

Soup turns the pain of LLM fine-tuning into a simple workflow. One config, one command, done.

artificial-intelligence cli dpo fine-tuning finetuning gguf huggingface llm llmops local-llm lora machine-learning model-finetuning ollama peft python pytorch qlora sft transformers

Last synced: 01 Jun 2026

https://github.com/ssbuild/moss_finetuning

MOSS chat fine-tuning

adalora chat chatmoss finetuing lora moss qlora sft

Last synced: 24 Apr 2025

https://github.com/ElvenTools/elven-tools-cli

Elven Tools CLI - command-line tool for launching NFT collections on the MultiversX blockchain (plus other tools).

blockchain cli elrond javascript multiversx nft nodejs sft

Last synced: 14 Mar 2025

https://github.com/wangclnlp/deepspeed-chat-extension

This repo contains some extensions of deepspeed-chat for fine-tuning LLMs (SFT+RLHF).

deepspeed llama llm rlhf sft

Last synced: 26 Apr 2025

https://github.com/kennethanceyer/diy-generative-ai-lm

Build your own generative AI language model from scratch (including pretraining and SFT with LoRA)

colab genai generativeai llm lm lora nlp pretrain sft torch transformer

Last synced: 19 Apr 2025

https://github.com/dvgodoy/llm-visuals

Over 60 figures and diagrams of LLMs, quantization, low-rank adapters (LoRA), and chat templates FREE TO USE in your blog posts, slides, presentations, or papers.

bf16 chat-template data-types fine-tuning fine-tuning-llm hugging-face llm llms lora low-rank-adaptation quantization sft supervised-learning

Last synced: 31 Jan 2026

https://github.com/DaehanKim/EasyRLHF

EasyRLHF aims to provide an easy and minimal interface to train aligned language models, using off-the-shelf solutions and datasets

dpo instruction-tuning ipo language-model rlhf rrhf sft

Last synced: 29 Mar 2025

https://github.com/avnlp/llm-finetuning

Pipelines for Fine-Tuning LLMs using SFT and RLHF

dpo fine-tuning grpo kto lora orpo p-tuning peft ppo qlora sft

Last synced: 14 Feb 2026

https://github.com/thu-keg/dice

DICE: Detecting In-distribution Data Contamination with LLM's Internal State

benchmark data-contamination fine-tuning-llm gsm8k llm sft

Last synced: 13 May 2025

https://github.com/yancotta/post_training_llms

Different post-training techniques for LLMs, including: SFT, DPO and Online RL

alignment dpo fine-tuning huggingface huggingface-transformers llm pytorch reinforcement-learning sft trl

Last synced: 05 Oct 2025

https://github.com/arondaron/dataset-generator

No-code desktop app for generating high-quality synthetic datasets to fine-tune LLMs — plan-then-execute pipeline, LLM-as-judge, HuggingFace upload.

alpaca chatml dataset-generation desktop-app fastapi fine-tuning huggingface llm llm-as-judge llm-fine-tuning nextjs openrouter sft sharegpt synthetic-data

Last synced: 03 May 2026

https://github.com/km1994/awesomemultimodel

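[AIGC Hands-On Beginner Notes: the AIGC Skyscraper] Shares hands-on practice and experience with large language models (LLMs), efficient fine-tuning of large models (SFT), retrieval-augmented generation (RAG), agents, automatic PPT generation, role-playing, text-to-image (Stable Diffusion), optical character recognition (OCR), automatic speech recognition (ASR), text-to-speech (TTS), portrait segmentation (SA), multimodal models (VLM), AI face swapping, text-to-video (VD), image-to-video (SVD), AI motion transfer, AI virtual try-on, digital humans, omni-modal understanding (Omni), AI music generation, and more.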

agent animate asr face-recognition llm llms mllm ocr omni peft-fine-tuning-llm ppt rag sft stable-diffusion svd text-to-music text-to-sql video-diffusion-model virtual-try-on vlm

Last synced: 14 May 2025

https://github.com/ElvenTools/elven-tools-sft-minter-sc

Elven Tools SFT Minter Smart Contract - launching SFT collections on the MultiversX blockchain

blockchain multiversx rust sft smart-contracts

Last synced: 14 Mar 2025

https://github.com/imadsaddik/bodmaghdataset

BoDmagh dataset is a Supervised Fine-Tuning (SFT) dataset for the Darija language

arabic-llm arabic-nlp darija-llm darija-nlp data dataset fine-tuning llm nlp sft

Last synced: 03 Apr 2025

https://github.com/aws-samples/sample-for-multi-modal-document-to-json-with-sagemaker-ai

This open-source project delivers a complete pipeline for converting multi-page documents (PDFs/images) into structured JSON using Vision LLMs on Amazon SageMaker. The solution leverages the SWIFT Framework to fine-tune models specifically for document understanding tasks.

aws document-processing fine-tuning huggingface idp llama multimodal qwen2-vl sagemaker sft swift

Last synced: 03 Oct 2025

https://github.com/tonyskapunk/sft-aur

Scripts that track the latest ScaleFT packages and build them for the AUR

arch aur hacktoberfest linux sft

Last synced: 09 May 2026

https://github.com/shekswess/tiny-reasoning-language-model

Code repository for experiments and research on a tiny reasoning language model

llm post-training reasoning research sft slm transformers trl

Last synced: 11 Oct 2025

https://github.com/sapphirine/2026_motivations_1

EECS E6895 final project measuring reward-gaming behavior in Gemma 2B with shell-game evals, LoRA SFT, and leakage-aware probes.

ai-safety columbia-university gemma interpretability linear-probes lora reward-hacking sft specification-gaming

Last synced: 29 May 2026

https://github.com/pathcosmos/frankenstallm

Korean 3B LLM (pure Transformer) pretrained from scratch on 8× NVIDIA B200 GPUs with SFT + ORPO alignment

flash-attention fp8 gguf gqa korean-llm nvidia-b200 orpo pretraining sft transformer

Last synced: 29 May 2026

https://github.com/francescodisalesgithub/few-shots-importer

SFT training using only command instructions in an Ollama Modelfile

ai hack modelfile ollama sft supervised-learning training

Last synced: 07 Aug 2025

https://github.com/iadtya/sarvamai-vlm-finetuning

SarvamAI-VLM-FineTunning

lora sft unsloth vlm

Last synced: 11 Jun 2025

https://github.com/philipmay/llm-data

LLM Training Data

llm sft

Last synced: 14 Jul 2025

https://github.com/shreyansh26/wordle-solver

Training Qwen3 to solve Wordle using SFT and GRPO

fsdp grpo llm qwen3 rft rl sft tensor-parallelism wordle wordle-solver

Last synced: 18 Apr 2026

https://github.com/sarabesh/finetuning

A baseline and guide for post-training (SFT/RLHF) of modern LLMs and evaluating them on baseline datasets.

evaluation finetune-llms finetuning huggingface rlhf sft

Last synced: 11 Aug 2025

https://github.com/jmaczan/c-137

🦙 Llama 2 7B fine-tuned to revive Rick

apple-m2 c-137 deep-learning fine-tuning finetuning google-colab llama-2 llama2 llama2-7b llm machine-learning nlp rick-and-morty rick-sanchez rickandmorty sft supervised-finetuning

Last synced: 29 Apr 2026

https://github.com/tripolskypetr/agent-tune

A React-based tool for constructing fine-tuning datasets with list and grid forms, with support for downloading and uploading data as JSONL files. The project uses the react-declarative library to create dynamic, interactive forms for defining user inputs, preferred outputs, and non-preferred outputs, along with associated tools.

agent-swarm ai chatgpt dpo fine-tuning llama llm mui openai react react-declarative sft

Last synced: 05 May 2026

https://github.com/dross20/tuatara

Generates high-quality fine-tuning pairs for large language models (LLMs) from unstructured documents.

dataset-generation fine-tuning graph knowledge-extraction llm nlp ocr python sft synthetic-data

Last synced: 05 Jan 2026