Projects in Awesome Lists tagged with pretraining
A curated list of projects in awesome lists tagged with pretraining.
https://github.com/LlamaChinese/Llama-Chinese
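Llama Chinese community: a real-time roundup of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models; fully open source and commercially usable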
agent llama llama4 llm pretraining rl
Last synced: 06 Apr 2026
https://github.com/llamafamily/llama-chinese
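Llama Chinese community: a real-time roundup of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models; fully open source and commercially usable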
agent llama llama4 llm pretraining rl
Last synced: 14 May 2025
https://github.com/LlamaFamily/Llama-Chinese
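Llama Chinese community: the Llama3 online demo and fine-tuned models are now available; a real-time roundup of the latest Llama3 learning resources; all code has been updated for Llama3; building the best Chinese Llama large model; fully open source and commercially usable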
finetune-llm llama llama3 llm pretraining
Last synced: 14 Mar 2025
https://github.com/microsoft/lmops
General technology for enabling AI capabilities w/ LLMs and MLLMs
agi gpt language-model llm lm lmops nlp pretraining prompt promptist x-prompt
Last synced: 13 May 2025
https://github.com/microsoft/LMOps
General technology for enabling AI capabilities w/ LLMs and MLLMs
agi gpt language-model llm lm lmops nlp pretraining prompt promptist x-prompt
Last synced: 13 Mar 2025
https://github.com/ofa-sys/ofa
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
chinese image-captioning multimodal pretrained-models pretraining prompt prompt-tuning referring-expression-comprehension text-to-image-synthesis vision-language visual-question-answering
Last synced: 15 May 2025
https://github.com/OFA-Sys/OFA
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
chinese image-captioning multimodal pretrained-models pretraining prompt prompt-tuning referring-expression-comprehension text-to-image-synthesis vision-language visual-question-answering
Last synced: 02 Apr 2025
https://github.com/x-plug/mplug-owl
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
alpaca chatbot chatgpt damo dialogue gpt gpt4 gpt4-api huggingface instruction-tuning large-language-models llama mplug mplug-owl multimodal pretraining pytorch transformer video visual-recognition
Last synced: 10 Apr 2025
https://github.com/X-PLUG/mPLUG-Owl
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
alpaca chatbot chatgpt damo dialogue gpt gpt4 gpt4-api huggingface instruction-tuning large-language-models llama mplug mplug-owl multimodal pretraining pytorch transformer video visual-recognition
Last synced: 19 Apr 2025
https://github.com/keyu-tian/SparK
[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; Pytorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
bert cnn convnet convolutional-neural-networks deep-learning iclr iclr2023 instance-segmentation mae mask-rcnn masked-autoencoder masked-image-modeling object-detection pre-trained-model pretrain pretraining pytorch self-supervised-learning sparse-convolution ssl
Last synced: 20 Mar 2025
https://github.com/keyu-tian/spark
[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; Pytorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
bert cnn convnet convolutional-neural-networks deep-learning iclr iclr2023 instance-segmentation mae mask-rcnn masked-autoencoder masked-image-modeling object-detection pre-trained-model pretrain pretraining pytorch self-supervised-learning sparse-convolution ssl
Last synced: 16 May 2025
https://github.com/yehli/xmodaler
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
cross-modal-retrieval image-captioning pretraining tden video-captioning vision-and-language visual-question-answering
Last synced: 12 Apr 2025
https://github.com/YehLi/xmodaler
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
cross-modal-retrieval image-captioning pretraining tden video-captioning vision-and-language visual-question-answering
Last synced: 02 Apr 2025
https://github.com/deepmodeling/Uni-Mol
Official Repository for the Uni-Mol Series Methods
deep-learning molecular-modeling pre-trained-model pretraining
Last synced: 04 May 2025
https://github.com/pku-yuangroup/languagebind
【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
language-central multi-modal pretraining zero-shot
Last synced: 12 Apr 2025
https://github.com/PKU-YuanGroup/LanguageBind
【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
language-central multi-modal pretraining zero-shot
Last synced: 24 Jul 2025
https://github.com/Alibaba-MIIL/ImageNet21K
Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)
downstream imagenet21k mixer multi-label-classification pretraining semantic-softmax single-label vision-transformer
Last synced: 15 Mar 2025
https://github.com/qqlu/Entity
EntitySeg Toolbox: Towards Open-World and High-Quality Image Segmentation
cnn computer-vision condinst deep-learning detectron2 fcos image-segmentation instance-segmentation object-detection panoptic-segmentation pretrained-models pretrained-weights pretraining pytorch segmentation semantic-segmentation
Last synced: 28 Mar 2025
https://github.com/alibaba/Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
deepspeed distributed-training llama llm megatron-lm pretraining pytorch
Last synced: 27 Mar 2025
https://github.com/deepmodeling/uni-mol
Official Repository for the Uni-Mol Series Methods
deep-learning molecular-modeling pre-trained-model pretraining
Last synced: 21 Oct 2025
https://github.com/open-sciencelab/GraphGen
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
ai4science data-generation data-synthesis graphgen knowledge-graph llama-factory llm llm-training pretrain pretraining qa question-answering qwen sft sft-data xtuner
Last synced: 29 Nov 2025
https://github.com/paddlepaddle/paddlefleetx
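PaddlePaddle large-model development suite, providing an end-to-end development toolchain for large language models, cross-modal large models, biocomputing large models, and other domains.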
benchmark cloud data-parallelism distributed-algorithm elastic fleet-api large-scale lightning model-parallelism paddlecloud paddlepaddle pipeline-parallelism pretraining self-supervised-learning unsupervised-learning
Last synced: 13 Apr 2025
https://github.com/michiyasunaga/linkbert
[ACL 2022] LinkBERT: A Knowledgeable Language Model 😎 Pretrained with Document Links
biomedical-applications graph-machine-learning knowledge language-model pretrained-models pretraining question-answering transformer
Last synced: 06 Apr 2025
https://github.com/microsoft/azureml-bert
End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service
azure-machine-learning azureml-bert bert bert-model finetuning language-model nlp pretrained-models pretraining pytorch tuning
Last synced: 05 Apr 2025
https://github.com/microsoft/AzureML-BERT
End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service
azure-machine-learning azureml-bert bert bert-model finetuning language-model nlp pretrained-models pretraining pytorch tuning
Last synced: 19 Jul 2025
https://github.com/Microsoft/AzureML-BERT
End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service
azure-machine-learning azureml-bert bert bert-model finetuning language-model nlp pretrained-models pretraining pytorch tuning
Last synced: 02 Apr 2025
https://github.com/amazon-science/bigdetection
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training
computer-vision few-shot object-detection pretraining
Last synced: 16 May 2025
https://github.com/j-min/VL-T5
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
pretraining transformers vision-and-language vl-bart vl-t5
Last synced: 21 Jul 2025
https://github.com/j-min/vl-t5
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
pretraining transformers vision-and-language vl-bart vl-t5
Last synced: 05 Apr 2025
https://github.com/showlab/UniVTG
[ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding
highlight-detection moment-retrieval pretraining video-grounding video-language video-summarization
Last synced: 22 Jul 2025
https://github.com/microsoft/univl
An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
alignment caption caption-task coin joint localization msrvtt multimodal-sentiment-analysis multimodality pretrain pretraining retrieval-task segmentation video video-language video-text video-text-retrieval youcookii
Last synced: 05 Apr 2025
https://github.com/showlab/univtg
[ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding
highlight-detection moment-retrieval pretraining video-grounding video-language video-summarization
Last synced: 05 Apr 2025
https://github.com/michiyasunaga/dragon
[NeurIPS 2022] DRAGON 🐲: Deep Bidirectional Language-Knowledge Graph Pretraining
graph-neural-networks knowledge-graph language-model pretraining question-answering reasoning transformer
Last synced: 17 Jun 2025
https://github.com/x-plug/chatplug
A Chinese Open-Domain Dialogue System
chat chatbot chatgpt chinese dialogue encoder-decoder instruction-finetuning knowledge-augment large-language-models open-domain-dialogue-system personality pretraining
Last synced: 26 Jun 2025
https://github.com/Coobiw/MPP-LLaVA
Personal Project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conversations}. Don't let poverty limit your imagination! Train your own 8B/14B LLaVA-training-like MLLM on RTX3090/4090 24GB.
deepspeed fine-tuning mllm model-parallel multimodal-large-language-models pipeline-parallelism pretraining qwen video-language-model video-large-language-models
Last synced: 27 Feb 2025
https://github.com/X-PLUG/ChatPLUG
A Chinese Open-Domain Dialogue System
chat chatbot chatgpt chinese dialogue encoder-decoder instruction-finetuning knowledge-augment large-language-models open-domain-dialogue-system personality pretraining
Last synced: 09 May 2025
https://github.com/a-r-j/ProteinWorkshop
Benchmarking framework for protein representation learning. Includes a large number of pre-training and downstream task datasets, models and training/task utilities. (ICLR 2024)
benchmark dataset deep-learning lightning pretraining protein protein-structure pytorch
Last synced: 29 Jun 2026
https://github.com/akanyaani/gpt-2-tensorflow2.0
OpenAI GPT2 pre-training and sequence prediction implementation in TensorFlow 2.0
gpt gpt-2 gpt2 implementation nlp openai pre-training pretraining tensorflow tensorflow2 text-generation transformer
Last synced: 16 Jan 2026
https://github.com/guolinke/tupe
Transformer with Untied Positional Encoding (TUPE). Code of paper "Rethinking Positional Encoding in Language Pre-training". Improve existing models like BERT.
bert language-model pretraining transformer
Last synced: 09 Jul 2025
https://github.com/showlab/egovlp
[NeurIPS 2022] Egocentric Video-Language Pretraining
egocentric-vision pretraining pytorch video-language
Last synced: 16 Jul 2025
https://github.com/linjieli222/HERO
Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
pretraining pytorch transformers tvr vision-and-language
Last synced: 21 Jul 2025
https://github.com/chao1224/moleculestm
Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)
clip computation-chemistry drug-discovery editing foundation-model molecule-editing moleculeclip moleculestm pretraining retrieval
Last synced: 13 Apr 2025
https://github.com/chao1224/MoleculeSTM
Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)
clip computation-chemistry drug-discovery editing foundation-model molecule-editing moleculeclip moleculestm pretraining retrieval
Last synced: 09 May 2025
https://github.com/a-r-j/proteinworkshop
Benchmarking framework for protein representation learning. Includes a large number of pre-training and downstream task datasets, models and training/task utilities. (ICLR 2024)
benchmark dataset deep-learning lightning pretraining protein protein-structure pytorch
Last synced: 05 Apr 2025
https://github.com/chao1224/GraphMVP
Pre-training Molecular Graph Representation with 3D Geometry, ICLR'22 (https://openreview.net/forum?id=xQUe1pOKPam)
contrastive-learning generative-model geometry graph molecule pretraining self-supervised self-supervised-learning
Last synced: 09 May 2025
https://github.com/tomekkorbak/pretraining-with-human-feedback
Code accompanying the paper Pretraining Language Models with Human Preferences
ai-alignment ai-safety decision-transformers gpt language-models pretraining reinforcement-learning rlhf
Last synced: 07 May 2025
https://github.com/chao1224/graphmvp
Pre-training Molecular Graph Representation with 3D Geometry, ICLR'22 (https://openreview.net/forum?id=xQUe1pOKPam)
contrastive-learning generative-model geometry graph molecule pretraining self-supervised self-supervised-learning
Last synced: 23 Oct 2025
https://github.com/zjunlp/OntoProtein
[ICLR 2022] OntoProtein: Protein Pretraining With Gene Ontology Embedding
bert gene-ontology iclr iclr2022 knowledge-graph nlp ontoprotein pretrained-models pretraining protein protein-function-prediction protein-pretraining protein-protein-interaction protein-structure-prediction pytorch
Last synced: 21 Jul 2025
https://github.com/zjunlp/ontoprotein
[ICLR 2022] OntoProtein: Protein Pretraining With Gene Ontology Embedding
bert gene-ontology iclr iclr2022 knowledge-graph nlp ontoprotein pretrained-models pretraining protein protein-function-prediction protein-pretraining protein-protein-interaction protein-structure-prediction pytorch
Last synced: 13 Jun 2025
https://github.com/zinengtang/tvlt
PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2022 Oral)
audio pretraining textless transformers tvlt vision-and-audio vision-and-language
Last synced: 16 Oct 2025
https://github.com/amazon-science/mix-generation
MixGen: A New Multi-Modal Data Augmentation
data-augmentation data-efficiency multimodal pretraining vision-language
Last synced: 03 Jul 2025
https://github.com/zhegan27/VILLA
Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": UNITER adversarial training part
adversarial-training neurips-2020 pretraining vision-and-language visual-question-answering
Last synced: 21 Jul 2025
https://github.com/bytedance/twist
Official codes: Self-Supervised Learning by Estimating Twin Class Distribution
computer-vision deep-learning pretraining research self-supervised-learning twist
Last synced: 13 Apr 2025
https://github.com/invictus717/mico
Explore the Limits of Omni-modal Pretraining at Scale
deep-learning multimodal multimodal-large-language-models omnimodal pretraining scale-up
Last synced: 15 Mar 2025
https://github.com/epfml/llm-baselines
nanoGPT-like codebase for LLM training
Last synced: 28 Apr 2025
https://github.com/x-plug/mplug
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)
image-captioning image-text image-text-retrieval multimodal pretraining pytorch transformer visual-language vqa
Last synced: 26 Jun 2025
https://github.com/wvangansbeke/revisiting-contrastive-ssl
Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [NeurIPS 2021]
clustering contrastive-learning moco neurips neurips2021 pretraining representation-learning self-supervised-learning transfer-learning unsupervised-learning
Last synced: 12 Apr 2025
https://github.com/invictus717/MiCo
Explore the Limits of Omni-modal Pretraining at Scale
deep-learning multimodal multimodal-large-language-models omnimodal pretraining scale-up
Last synced: 20 Mar 2025
https://github.com/ryoungj/BoLT
Code for "Reasoning to Learn from Latent Thoughts"
language-model latent-variable-models pretraining self-improvement synthetic-data
Last synced: 04 Oct 2025
https://github.com/joyehuang/minimind-notes
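🚀 [Building an LLM from scratch] A minimalist guide to the principles and practice of large-model training, including core code and controlled experiments for Transformer, Pretraining, and SFT. | A minimal, principle-first guide to understanding and building LLMs from scratch.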
ai deep-learning llm machine-learning minimind notes pretraining pytorch transformer tutorial
Last synced: 03 Mar 2026
https://github.com/guochengqian/pix4point
Official implementation for [3DV 2024] `Pix4Point: Image Pretrained Standard Transformers for 3D Point Cloud Understanding`
Last synced: 27 Feb 2026
https://github.com/guochengqian/Pix4Point
Official implementation for [3DV 2024] `Pix4Point: Image Pretrained Standard Transformers for 3D Point Cloud Understanding`
Last synced: 20 Mar 2025
https://github.com/chao1224/geossl
GeoSSL: Molecular Geometry Pretraining with SE(3)-Invariant Denoising Distance Matching, ICLR'23 (https://openreview.net/forum?id=CjTHVo1dvR)
denoising-diffusion diffusion-models geometry molecular-geometries molecule pretraining self-supervised self-supervised-learning
Last synced: 23 Oct 2025
https://github.com/chao1224/n_gram_graph
N-Gram Graph: Simple Unsupervised Representation for Graphs, NeurIPS'19 (https://arxiv.org/abs/1806.09206)
drug drug-discovery molecular-graph molecule n-gram n-gram-graph pretraining
Last synced: 26 Jul 2025
https://github.com/ssbuild/aigc_data
Share data, prompt data, pretraining data
aigc-data data instruct llm open open-data pretraining prompt
Last synced: 24 Apr 2025
https://github.com/justchenhao/SaDL_CD
Semantic-Aware Dense Representation Learning for Remote Sensing Image Change Detection
change-detection pretraining representation-learning
Last synced: 11 May 2025
https://github.com/cosmoquester/transformers-bart-pretrain
Script to pre-train Hugging Face Transformers BART with TensorFlow 2
bart gpu huggingface-transformers pretraining tensorflow tpu
Last synced: 22 Jan 2026
https://github.com/chao1224/moleculesde
A Group Symmetric Stochastic Differential Equation Model for Molecule Multi-modal Pretraining, ICML'23
conformation diffusion generation geometry group-equivariant-neural-network molecule pretraining reflection-antisymmetric representation sde stochastic-differential-equation
Last synced: 03 Jul 2025
https://github.com/buaadreamer/mllm-finetuning-demo
Example code for fine-tuning multimodal large language models with LLaMA-Factory
finetune-llm huggingface-datasets llama-factory llava lora mllm paligemma pretraining supervised-finetuning transformers yi-vl
Last synced: 11 Apr 2025
https://github.com/wxl1999/unicrs
[KDD22] Official PyTorch implementation for "Towards Unified Conversational Recommender Systems via Knowledge-Enhanced Prompt Learning".
conversation conversational-ai conversational-bots dialog dialogue dialogue-systems pretrained-language-model pretrained-models pretraining prompt prompt-tuning prompts recommendation recommendation-system recommender-system
Last synced: 26 Oct 2025
https://github.com/marian-nmt/sotastream
A library for data streaming and augmentation
data-augmentation data-streaming machine-learning pretraining
Last synced: 29 Jul 2025
https://github.com/mbosc/twf
Official codebase for the ECCV 2022 paper Transfer without Forgetting
continual-learning pretraining
Last synced: 08 May 2025
https://github.com/ai-sandbox/iltm
iLTM: Integrated Large Tabular Model
deep-learning machine-learning meta-learning pretraining python pytorch tabular-data
Last synced: 16 Jan 2026
https://github.com/wxl1999/cfcrs
[KDD23] Official PyTorch implementation for "Improving Conversational Recommendation Systems via Counterfactual Data Simulation".
conversation conversational-ai conversational-bots conversational-recommendation conversational-recommender-system data-augmentation data-augmentation-strategies data-augmentations dialog dialogue dialogue-systems pretrained-language-model pretrained-models pretraining recommendation recommendation-system recommender-system
Last synced: 10 Oct 2025
https://github.com/jacobmarks/awesome-clip-papers
The most impactful papers related to contrastive pretraining for multimodal models!
awesome awesome-list awesome-readme clip clip-model contrastive-learning multimodal pretraining
Last synced: 31 Oct 2025
https://github.com/amazon-science/wqa-multi-sentence-inference
This repository contains code used for our Multi Sentence Inference NAACL'22 paper.
answer-sentence-selection nlp pretraining question-answering transformer
Last synced: 19 Jun 2025
https://github.com/vita-group/double-win-lth
[ICML 2022] "Data-Efficient Double-Win Lottery Tickets from Robust Pre-training" by Tianlong Chen, Zhenyu Zhang, Sijia Liu, Yang Zhang, Shiyu Chang, Zhangyang Wang
adversarial-robustness data-efficient generalization lottery-ticket-hypothesis pretraining robust-pretraining sparsity transfer-learning
Last synced: 19 Apr 2025
https://github.com/dangnh0611/kaggle_leash_belka
11th place solution of NeurIPS 2024 - Predict New Medicines with BELKA competition on Kaggle: https://www.kaggle.com/competitions/leash-BELKA
belka binding bio chemistry drug-discovery fingerprint graph-neural-network kaggle leash masked-language-modeling molecule pretraining protein-ligand-interactions qsar self-supervised-learning smiles-strings
Last synced: 23 Oct 2025
https://github.com/arrrrrmin/albert-guide
Understanding "A Lite BERT". A Transformer approach for learning self-supervised language models.
albert-guide albert-models guide language-modeling nlp pretrain pretraining
Last synced: 05 Oct 2025
https://github.com/andron00e/learning-at-scale
A codebase for training models of different scales
Last synced: 07 Mar 2026
https://github.com/bojarlab/gifflar
Glycan Informed Foundational Framework for Learning Abstract Representations, based on Combinatorial Complexes and Heterogeneous GNNs
combinatorial-complex foundational-models glycan glycobiology graph-neural-network heterogeneous-graph-neural-network pretraining
Last synced: 04 Apr 2026
https://github.com/coincheung/selfsup
ssl method pretrain experiments and weights: mocov2 + fast-moco + regioncl + mixup + densecl
densecl fast-moco mixup moco pretraining regioncl self-supervised-learning
Last synced: 13 Jul 2025
https://github.com/thomasgust/molecumixer
Very incomplete right now, pretrained ARGVAET system for generating, classifying, and predicting the properties of molecules. I couldn't upload the dataset or checkpoints due to size constraints.
argvaet bioinformatics foundation-model generative-ai generative-pretraining gnn molecule neural-network pretraining pytorch rdkit
Last synced: 23 Oct 2025
https://github.com/aberration-technology/burn_dragon
burn inference and training of dragon models 🔥🐉
burn continual-learning dragon libp2p linear-attention machine-learning mamba nca pretraining ttt
Last synced: 29 May 2026
https://github.com/nvlabs/rlp
RLP: Reinforcement as a Pretraining Objective
grpo language-modeling large-language-models policy-gradient pretraining reasoning reinforcement-learning
Last synced: 09 Oct 2025
https://github.com/anto18671/pretraining
This repository contains a training script for a custom computer vision model using PyTorch Lightning. It uses MLflow for robust experiment tracking.
computer-vision image-classification mlflow pretraining
Last synced: 05 Oct 2025
https://github.com/vincentzed/decon
`decon`, but with Python API bindings.
benchmark data-pipeline data-processing data-science datacomp decontaminate deduplication evaluation instruction-tuning llm llm-eval llm-evaluation llms lm-evaluation nlp pretraining synthetic-data
Last synced: 14 Jan 2026
https://github.com/anto18671/lumenspark
Lumenspark is a lightweight Linformer-based Language Model Trained from Scratch
layer-scale linformer llm low-rank-approximation prenormalization pretraining runpod text-classification text-generation training transformer
Last synced: 04 Jul 2026
https://github.com/anto18671/pretraining-custom-timm
A flexible and extensible PyTorch pretraining script built atop the timm library.
computer-vision pretraining pytorch timm
Last synced: 19 May 2026
https://github.com/ukplab/starsem2023-arithmetic-based-pretraining
Code and data for the StarSem 2023 paper "Arithmetic-Based Pretraining -- Improving Numeracy of Pretrained Language Models"
bart contrastive-learning flan-t5 language-model numerical-reasoning pretraining t5 transformers
Last synced: 31 Jan 2026
https://github.com/alea-institute/alea-preprocess
Accessible, efficient data preprocessing library for pretrain and SFT datasets, including KL3M
ai alea kl3m preprocessing pretraining
Last synced: 15 Feb 2026
https://github.com/anto18671/efficientvit-b4.r256
Pretraining the EfficientViT-B4 model on the ImageNet-1k dataset
computer-vision efficientvit imagenet-1k pretraining vision-transformer
Last synced: 29 Jun 2026
https://github.com/theodoreioannidis/catvsdog_weights
Cat and dog detection with a YOLO-like CNN. Computer Vision assignment from my AI master's studies at UU.
cats-vs-dogs computer-vision object-detection pretraining pytorch yolo
Last synced: 13 Apr 2025
https://github.com/huggon1/llm-from-scratch
Small, readable experiments from tokenizer training to LoRA and DPO.
dpo llm lora pretraining tokenizer
Last synced: 28 Jun 2026
https://github.com/denizetkar/chess-rl-test
Training a PPO agent to play chess with pretraining and self-learning using PyTorch Lightning and TorchRL
chess-ai ppo2 pretraining pytorch pytorch-lightning reinforcement-learning torchrl
Last synced: 03 Feb 2026
https://github.com/pathcosmos/frankenstallm
Korean 3B LLM (pure Transformer) pretrained from scratch on 8× NVIDIA B200 GPUs with SFT + ORPO alignment
flash-attention fp8 gguf gqa korean-llm nvidia-b200 orpo pretraining sft transformer
Last synced: 29 May 2026
https://github.com/sbartlett97/torch-electra
A custom implementation of the ELECTRA training method using PyTorch and HuggingFace Transformers
machine-learning machine-learning-algorithms masked-image-modeling nlp nlp-machine-learning pretraining pretraining-bert python
Last synced: 06 Jul 2025
https://github.com/mydarapy/gpt-1-from-scratch
Rewriting and pretraining GPT-1 from scratch. Implementing Multihead Attention (MHA) in PyTorch, following the original paper "Improving Language Understanding by Generative Pre-Training" (https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)
deep-learning language-modeling llms machine-learning natural-language-processing pretraining
Last synced: 10 Jun 2025
https://github.com/aritrodium/hrmlm
A PyTorch implementation of a hierarchical recurrent neural network for language modeling with multi-timescale processing.
ai finetuning first llm pretraining project python pytorch torch
Last synced: 13 Jan 2026
https://github.com/michaelellis003/lmt
PyTorch implementation of transformer-based language models (GPT) for pretraining and fine-tuning
deep-learning fine-tuning gpt language-model machine-learning pretraining python pytorch transformer
Last synced: 07 Mar 2026