An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with pretraining

A curated list of projects in awesome lists tagged with pretraining.

https://github.com/LlamaChinese/Llama-Chinese

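Llama Chinese community: a real-time roundup of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models; fully open source and available for commercial use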

agent llama llama4 llm pretraining rl

Last synced: 06 Apr 2026

https://github.com/llamafamily/llama-chinese

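Llama Chinese community: a real-time roundup of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models; fully open source and available for commercial use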

agent llama llama4 llm pretraining rl

Last synced: 14 May 2025

https://github.com/LlamaFamily/Llama-Chinese

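Llama Chinese community: Llama3 online demo and fine-tuned models are now available; a real-time roundup of the latest Llama3 learning resources, with all code updated for Llama3; building the best Chinese Llama large model; fully open source and available for commercial use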

finetune-llm llama llama3 llm pretraining

Last synced: 14 Mar 2025

https://github.com/microsoft/lmops

General technology for enabling AI capabilities w/ LLMs and MLLMs

agi gpt language-model llm lm lmops nlp pretraining prompt promptist x-prompt

Last synced: 13 May 2025

https://github.com/microsoft/LMOps

General technology for enabling AI capabilities w/ LLMs and MLLMs

agi gpt language-model llm lm lmops nlp pretraining prompt promptist x-prompt

Last synced: 13 Mar 2025

https://github.com/ofa-sys/ofa

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

chinese image-captioning multimodal pretrained-models pretraining prompt prompt-tuning referring-expression-comprehension text-to-image-synthesis vision-language visual-question-answering

Last synced: 15 May 2025

https://github.com/OFA-Sys/OFA

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

chinese image-captioning multimodal pretrained-models pretraining prompt prompt-tuning referring-expression-comprehension text-to-image-synthesis vision-language visual-question-answering

Last synced: 02 Apr 2025

https://github.com/keyu-tian/SparK

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"

bert cnn convnet convolutional-neural-networks deep-learning iclr iclr2023 instance-segmentation mae mask-rcnn masked-autoencoder masked-image-modeling object-detection pre-trained-model pretrain pretraining pytorch self-supervised-learning sparse-convolution ssl

Last synced: 20 Mar 2025

https://github.com/keyu-tian/spark

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"

bert cnn convnet convolutional-neural-networks deep-learning iclr iclr2023 instance-segmentation mae mask-rcnn masked-autoencoder masked-image-modeling object-detection pre-trained-model pretrain pretraining pytorch self-supervised-learning sparse-convolution ssl

Last synced: 16 May 2025

https://github.com/yehli/xmodaler

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

cross-modal-retrieval image-captioning pretraining tden video-captioning vision-and-language visual-question-answering

Last synced: 12 Apr 2025

https://github.com/YehLi/xmodaler

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

cross-modal-retrieval image-captioning pretraining tden video-captioning vision-and-language visual-question-answering

Last synced: 02 Apr 2025

https://github.com/deepmodeling/Uni-Mol

Official Repository for the Uni-Mol Series Methods

deep-learning molecular-modeling pre-trained-model pretraining

Last synced: 04 May 2025

https://github.com/pku-yuangroup/languagebind

【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

language-central multi-modal pretraining zero-shot

Last synced: 12 Apr 2025

https://github.com/PKU-YuanGroup/LanguageBind

【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

language-central multi-modal pretraining zero-shot

Last synced: 24 Jul 2025

https://github.com/Alibaba-MIIL/ImageNet21K

Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)

downstream imagenet21k mixer multi-label-classification pretraining semantic-softmax single-label vision-transformer

Last synced: 15 Mar 2025

https://github.com/alibaba/Megatron-LLaMA

Best practice for training LLaMA models in Megatron-LM

deepspeed distributed-training llama llm megatron-lm pretraining pytorch

Last synced: 27 Mar 2025

https://github.com/deepmodeling/uni-mol

Official Repository for the Uni-Mol Series Methods

deep-learning molecular-modeling pre-trained-model pretraining

Last synced: 21 Oct 2025

https://github.com/open-sciencelab/GraphGen

GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation

ai4science data-generation data-synthesis graphgen knowledge-graph llama-factory llm llm-training pretrain pretraining qa question-answering qwen sft sft-data xtuner

Last synced: 29 Nov 2025

https://github.com/paddlepaddle/paddlefleetx

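PaddlePaddle large-model development suite, providing an end-to-end development toolchain for large language models, cross-modal large models, biocomputing large models, and related domains.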

benchmark cloud data-parallelism distributed-algorithm elastic fleet-api large-scale lightning model-parallelism paddlecloud paddlepaddle pipeline-parallelism pretraining self-supervised-learning unsupervised-learning

Last synced: 13 Apr 2025

https://github.com/michiyasunaga/linkbert

[ACL 2022] LinkBERT: A Knowledgeable Language Model 😎 Pretrained with Document Links

biomedical-applications graph-machine-learning knowledge language-model pretrained-models pretraining question-answering transformer

Last synced: 06 Apr 2025

https://github.com/microsoft/azureml-bert

End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service

azure-machine-learning azureml-bert bert bert-model finetuning language-model nlp pretrained-models pretraining pytorch tuning

Last synced: 05 Apr 2025

https://github.com/microsoft/AzureML-BERT

End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service

azure-machine-learning azureml-bert bert bert-model finetuning language-model nlp pretrained-models pretraining pytorch tuning

Last synced: 19 Jul 2025

https://github.com/Microsoft/AzureML-BERT

End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service

azure-machine-learning azureml-bert bert bert-model finetuning language-model nlp pretrained-models pretraining pytorch tuning

Last synced: 02 Apr 2025

https://github.com/amazon-science/bigdetection

BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training

computer-vision few-shot object-detection pretraining

Last synced: 16 May 2025

https://github.com/j-min/VL-T5

PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)

pretraining transformers vision-and-language vl-bart vl-t5

Last synced: 21 Jul 2025

https://github.com/j-min/vl-t5

PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)

pretraining transformers vision-and-language vl-bart vl-t5

Last synced: 05 Apr 2025

https://github.com/showlab/UniVTG

[ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding

highlight-detection moment-retrieval pretraining video-grounding video-language video-summarization

Last synced: 22 Jul 2025

https://github.com/microsoft/univl

An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"

alignment caption caption-task coin joint localization msrvtt multimodal-sentiment-analysis multimodality pretrain pretraining retrieval-task segmentation video video-language video-text video-text-retrieval youcookii

Last synced: 05 Apr 2025

https://github.com/showlab/univtg

[ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding

highlight-detection moment-retrieval pretraining video-grounding video-language video-summarization

Last synced: 05 Apr 2025

https://github.com/michiyasunaga/dragon

[NeurIPS 2022] DRAGON 🐲: Deep Bidirectional Language-Knowledge Graph Pretraining

graph-neural-networks knowledge-graph language-model pretraining question-answering reasoning transformer

Last synced: 17 Jun 2025

https://github.com/Coobiw/MPP-LLaVA

Personal Project: MPP-Qwen14B & MPP-Qwen-Next(Multimodal Pipeline Parallel based on Qwen-LM). Support [video/image/multi-image] {sft/conversations}. Don't let the poverty limit your imagination! Train your own 8B/14B LLaVA-training-like MLLM on RTX3090/4090 24GB.

deepspeed fine-tuning mllm model-parallel multimodal-large-language-models pipeline-parallelism pretraining qwen video-language-model video-large-language-models

Last synced: 27 Feb 2025

https://github.com/a-r-j/ProteinWorkshop

Benchmarking framework for protein representation learning. Includes a large number of pre-training and downstream task datasets, models and training/task utilities. (ICLR 2024)

benchmark dataset deep-learning lightning pretraining protein protein-structure pytorch

Last synced: 29 Jun 2026

https://github.com/akanyaani/gpt-2-tensorflow2.0

OpenAI GPT2 pre-training and sequence prediction implementation in Tensorflow 2.0

gpt gpt-2 gpt2 implementation nlp openai pre-training pretraining tensorflow tensorflow2 text-generation transformer

Last synced: 16 Jan 2026

https://github.com/guolinke/tupe

Transformer with Untied Positional Encoding (TUPE). Code of paper "Rethinking Positional Encoding in Language Pre-training". Improve existing models like BERT.

bert language-model pretraining transformer

Last synced: 09 Jul 2025

https://github.com/showlab/egovlp

[NeurIPS 2022] Egocentric Video-Language Pretraining

egocentric-vision pretraining pytorch video-language

Last synced: 16 Jul 2025

https://github.com/linjieli222/HERO

Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"

pretraining pytorch transformers tvr vision-and-language

Last synced: 21 Jul 2025

https://github.com/chao1224/moleculestm

Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)

clip computation-chemistry drug-discovery editing foundation-model molecule-editing moleculeclip moleculestm pretraining retrieval

Last synced: 13 Apr 2025

https://github.com/chao1224/MoleculeSTM

Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)

clip computation-chemistry drug-discovery editing foundation-model molecule-editing moleculeclip moleculestm pretraining retrieval

Last synced: 09 May 2025

https://github.com/a-r-j/proteinworkshop

Benchmarking framework for protein representation learning. Includes a large number of pre-training and downstream task datasets, models and training/task utilities. (ICLR 2024)

benchmark dataset deep-learning lightning pretraining protein protein-structure pytorch

Last synced: 05 Apr 2025

https://github.com/chao1224/GraphMVP

Pre-training Molecular Graph Representation with 3D Geometry, ICLR'22 (https://openreview.net/forum?id=xQUe1pOKPam)

contrastive-learning generative-model geometry graph molecule pretraining self-supervised self-supervised-learning

Last synced: 09 May 2025

https://github.com/tomekkorbak/pretraining-with-human-feedback

Code accompanying the paper Pretraining Language Models with Human Preferences

ai-alignment ai-safety decision-transformers gpt language-models pretraining reinforcement-learning rlhf

Last synced: 07 May 2025

https://github.com/chao1224/graphmvp

Pre-training Molecular Graph Representation with 3D Geometry, ICLR'22 (https://openreview.net/forum?id=xQUe1pOKPam)

contrastive-learning generative-model geometry graph molecule pretraining self-supervised self-supervised-learning

Last synced: 23 Oct 2025

https://github.com/zinengtang/tvlt

PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2022 Oral)

audio pretraining textless transformers tvlt vision-and-audio vision-and-language

Last synced: 16 Oct 2025

https://github.com/zhegan27/VILLA

Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": UNITER adversarial training part

adversarial-training neurips-2020 pretraining vision-and-language visual-question-answering

Last synced: 21 Jul 2025

https://github.com/bytedance/twist

Official codes: Self-Supervised Learning by Estimating Twin Class Distribution

computer-vision deep-learning pretraining research self-supervised-learning twist

Last synced: 13 Apr 2025

https://github.com/invictus717/mico

Explore the Limits of Omni-modal Pretraining at Scale

deep-learning multimodal multimodal-large-language-models omnimodal pretraining scale-up

Last synced: 15 Mar 2025

https://github.com/epfml/llm-baselines

nanoGPT-like codebase for LLM training

llms pretraining

Last synced: 28 Apr 2025

https://github.com/x-plug/mplug

mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)

image-captioning image-text image-text-retrieval multimodal pretraining pytorch transformer visual-language vqa

Last synced: 26 Jun 2025

https://github.com/invictus717/MiCo

Explore the Limits of Omni-modal Pretraining at Scale

deep-learning multimodal multimodal-large-language-models omnimodal pretraining scale-up

Last synced: 20 Mar 2025

https://github.com/ryoungj/BoLT

Code for "Reasoning to Learn from Latent Thoughts"

language-model latent-variable-models pretraining self-improvement synthetic-data

Last synced: 04 Oct 2025

https://github.com/joyehuang/minimind-notes

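🚀 [Build an LLM from scratch] A minimalist guide to the principles and practice of large-model training, including core code and controlled experiments for Transformer, Pretraining, and SFT. | A minimal, principle-first guide to understanding and building LLMs from scratch.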

ai deep-learning llm machine-learning minimind notes pretraining pytorch transformer tutorial

Last synced: 03 Mar 2026

https://github.com/guochengqian/pix4point

Official implementation for [3DV 2024] `Pix4Point: Image Pretrained Standard Transformers for 3D Point Cloud Understanding`

image point-cloud pretraining

Last synced: 27 Feb 2026

https://github.com/guochengqian/Pix4Point

Official implementation for [3DV 2024] `Pix4Point: Image Pretrained Standard Transformers for 3D Point Cloud Understanding`

image point-cloud pretraining

Last synced: 20 Mar 2025

https://github.com/chao1224/geossl

GeoSSL: Molecular Geometry Pretraining with SE(3)-Invariant Denoising Distance Matching, ICLR'23 (https://openreview.net/forum?id=CjTHVo1dvR)

denoising-diffusion diffusion-models geometry molecular-geometries molecule pretraining self-supervised self-supervised-learning

Last synced: 23 Oct 2025

https://github.com/chao1224/n_gram_graph

N-Gram Graph: Simple Unsupervised Representation for Graphs, NeurIPS'19 (https://arxiv.org/abs/1806.09206)

drug drug-discovery molecular-graph molecule n-gram n-gram-graph pretraining

Last synced: 26 Jul 2025

https://github.com/ssbuild/aigc_data

Shared data, prompt data, pretraining data

aigc-data data instruct llm open open-data pretraining prompt

Last synced: 24 Apr 2025

https://github.com/justchenhao/SaDL_CD

Semantic-Aware Dense Representation Learning for Remote Sensing Image Change Detection

change-detection pretraining representation-learning

Last synced: 11 May 2025

https://github.com/cosmoquester/transformers-bart-pretrain

Script to pre-train Hugging Face Transformers BART with TensorFlow 2

bart gpu huggingface-transformers pretraining tensorflow tpu

Last synced: 22 Jan 2026

https://github.com/chao1224/moleculesde

A Group Symmetric Stochastic Differential Equation Model for Molecule Multi-modal Pretraining, ICML'23

conformation diffusion generation geometry group-equivariant-neural-network molecule pretraining reflection-antisymmetric representation sde stochastic-differential-equation

Last synced: 03 Jul 2025

https://github.com/buaadreamer/mllm-finetuning-demo

Example code for fine-tuning multimodal large language models with LLaMA-Factory

finetune-llm huggingface-datasets llama-factory llava lora mllm paligemma pretraining supervised-finetuning transformers yi-vl

Last synced: 11 Apr 2025

https://github.com/wxl1999/unicrs

[KDD22] Official PyTorch implementation for "Towards Unified Conversational Recommender Systems via Knowledge-Enhanced Prompt Learning".

conversation conversational-ai conversational-bots dialog dialogue dialogue-systems pretrained-language-model pretrained-models pretraining prompt prompt-tuning prompts recommendation recommendation-system recommender-system

Last synced: 26 Oct 2025

https://github.com/marian-nmt/sotastream

A library for data streaming and augmentation

data-augmentation data-streaming machine-learning pretraining

Last synced: 29 Jul 2025

https://github.com/mbosc/twf

Official codebase for the ECCV 2022 paper Transfer without Forgetting

continual-learning pretraining

Last synced: 08 May 2025

https://github.com/jacobmarks/awesome-clip-papers

The most impactful papers related to contrastive pretraining for multimodal models!

awesome awesome-list awesome-readme clip clip-model contrastive-learning multimodal pretraining

Last synced: 31 Oct 2025

https://github.com/amazon-science/wqa-multi-sentence-inference

This repository contains code used for our Multi Sentence Inference NAACL'22 paper.

answer-sentence-selection nlp pretraining question-answering transformer

Last synced: 19 Jun 2025

https://github.com/vita-group/double-win-lth

[ICML 2022] "Data-Efficient Double-Win Lottery Tickets from Robust Pre-training" by Tianlong Chen, Zhenyu Zhang, Sijia Liu, Yang Zhang, Shiyu Chang, Zhangyang Wang

adversarial-robustness data-efficient generalization lottery-ticket-hypothesis pretraining robust-pretraining sparsity transfer-learning

Last synced: 19 Apr 2025

https://github.com/dangnh0611/kaggle_leash_belka

11th place solution of NeurIPS 2024 - Predict New Medicines with BELKA competition on Kaggle: https://www.kaggle.com/competitions/leash-BELKA

belka binding bio chemistry drug-discovery fingerprint graph-neural-network kaggle leash masked-language-modeling molecule pretraining protein-ligand-interactions qsar self-supervised-learning smiles-strings

Last synced: 23 Oct 2025

https://github.com/arrrrrmin/albert-guide

Understanding "A Lite BERT". A Transformer approach for learning self-supervised Language Models.

albert-guide albert-models guide language-modeling nlp pretrain pretraining

Last synced: 05 Oct 2025

https://github.com/andron00e/learning-at-scale

A codebase for training models of different scales

llms pretraining

Last synced: 07 Mar 2026

https://github.com/bojarlab/gifflar

Glycan Informed Foundational Framework for Learning Abstract Representations, based on Combinatorial Complexes and Heterogeneous GNNs

combinatorial-complex foundational-models glycan glycobiology graph-neural-network heterogeneous-graph-neural-network pretraining

Last synced: 04 Apr 2026

https://github.com/coincheung/selfsup

ssl method pretrain experiments and weights: mocov2 + fast-moco + regioncl + mixup + densecl

densecl fast-moco mixup moco pretraining regioncl self-supervised-learning

Last synced: 13 Jul 2025

https://github.com/thomasgust/molecumixer

Very incomplete right now, pretrained ARGVAET system for generating, classifying, and predicting the properties of molecules. I couldn't upload the dataset or checkpoints due to size constraints.

argvaet bioinformatics foundation-model generative-ai generative-pretraining gnn molecule neural-network pretraining pytorch rdkit

Last synced: 23 Oct 2025

https://github.com/anto18671/pretraining

This repository contains a training script for a custom computer vision model using PyTorch Lightning. It uses MLflow for robust experiment tracking.

computer-vision image-classification mlflow pretraining

Last synced: 05 Oct 2025

https://github.com/anto18671/lumenspark

Lumenspark is a lightweight Linformer-based Language Model Trained from Scratch

layer-scale linformer llm low-rank-approximation prenormalization pretraining runpod text-classification text-generation training transformer

Last synced: 04 Jul 2026

https://github.com/anto18671/pretraining-custom-timm

A flexible and extensible PyTorch pretraining script built atop the timm library.

computer-vision pretraining pytorch timm

Last synced: 19 May 2026

https://github.com/ukplab/starsem2023-arithmetic-based-pretraining

Code and data for the StarSem 2023 paper "Arithmetic-Based Pretraining -- Improving Numeracy of Pretrained Language Models"

bart contrastive-learning flan-t5 language-model numerical-reasoning pretraining t5 transformers

Last synced: 31 Jan 2026

https://github.com/alea-institute/alea-preprocess

Accessible, efficient data preprocessing library for pretrain and SFT datasets, including KL3M

ai alea kl3m preprocessing pretraining

Last synced: 15 Feb 2026

https://github.com/anto18671/efficientvit-b4.r256

Pretraining the EfficientViT-B4 model on the ImageNet-1k dataset

computer-vision efficientvit imagenet-1k pretraining vision-transformer

Last synced: 29 Jun 2026

https://github.com/theodoreioannidis/catvsdog_weights

Cat and Dog detection with YOLO-like CNN. Computer Vision assignment from my AI master's studies at UU.

cats-vs-dogs computer-vision object-detection pretraining pytorch yolo

Last synced: 13 Apr 2025

https://github.com/huggon1/llm-from-scratch

Small, readable experiments from tokenizer training to LoRA and DPO.

dpo llm lora pretraining tokenizer

Last synced: 28 Jun 2026

https://github.com/denizetkar/chess-rl-test

Training a PPO agent to play chess with pretraining and self-learning using PyTorch Lightning and TorchRL

chess-ai ppo2 pretraining pytorch pytorch-lightning reinforcement-learning torchrl

Last synced: 03 Feb 2026

https://github.com/pathcosmos/frankenstallm

Korean 3B LLM (pure Transformer) pretrained from scratch on 8× NVIDIA B200 GPUs with SFT + ORPO alignment

flash-attention fp8 gguf gqa korean-llm nvidia-b200 orpo pretraining sft transformer

Last synced: 29 May 2026

https://github.com/sbartlett97/torch-electra

A Custom implementation of the ELECTRA training method using PyTorch and HuggingFace Transformers

machine-learning machine-learning-algorithms masked-image-modeling nlp nlp-machine-learning pretraining pretraining-bert python

Last synced: 06 Jul 2025

https://github.com/mydarapy/gpt-1-from-scratch

Rewriting and pretraining GPT-1 from scratch. Implementing Multihead Attention (MHA) in PyTorch from the original paper Improving Language Understanding by Generative Pre-Training (https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)

deep-learning language-modeling llms machine-learning natural-language-processing pretraining

Last synced: 10 Jun 2025

https://github.com/aritrodium/hrmlm

A PyTorch implementation of a hierarchical recurrent neural network for language modeling with multi-timescale processing.

ai finetuning first llm pretraining project python pytorch torch

Last synced: 13 Jan 2026

https://github.com/michaelellis003/lmt

PyTorch implementation of transformer-based language models (GPT) for pretraining and fine-tuning

deep-learning fine-tuning gpt language-model machine-learning pretraining python pytorch transformer

Last synced: 07 Mar 2026