Projects in Awesome Lists tagged with pre-training
A curated list of projects in awesome lists tagged with pre-training.
https://github.com/rucaibox/llmsurvey
The official GitHub page for the survey paper "A Survey of Large Language Models".
chain-of-thought chatgpt in-context-learning instruction-tuning large-language-models llm llms natural-language-processing pre-trained-language-models pre-training rlhf
Last synced: 11 May 2025
https://github.com/RUCAIBox/LLMSurvey
The official GitHub page for the survey paper "A Survey of Large Language Models".
chain-of-thought chatgpt in-context-learning instruction-tuning large-language-models llm llms natural-language-processing pre-trained-language-models pre-training rlhf
Last synced: 14 Mar 2025
https://github.com/datajuicer/data-juicer
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
data data-analysis data-pipeline data-processing data-science data-visualization foundation-models instruction-tuning large-language-models llm llms multi-modal pre-training synthetic-data
Last synced: 08 Nov 2025
https://github.com/modelscope/data-juicer
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
chinese data-analysis data-science data-visualization dataset gpt gpt-4 instruction-tuning large-language-models llama llava llm llms multi-modal nlp opendata pre-training pytorch streamlit synthetic-data
Last synced: 13 May 2025
https://github.com/dbiir/uer-py
Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo
albert bart bert chinese classification clue elmo fine-tuning gpt gpt-2 model-zoo natural-language-processing ner pegasus pre-training pytorch roberta t5 unilm xlm-roberta
Last synced: 14 May 2025
https://github.com/dbiir/UER-py
Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo
albert bart bert chinese classification clue elmo fine-tuning gpt gpt-2 model-zoo natural-language-processing ner pegasus pre-training pytorch roberta t5 unilm xlm-roberta
Last synced: 02 Apr 2025
https://github.com/egoalpha/prompt-in-context-learning
Awesome resources for in-context learning and prompt engineering: Mastery of the LLMs such as ChatGPT, GPT-3, and FlanT5, with up-to-date and cutting-edge updates.
chain-of-thought chatbot chatgpt chatgpt-api cot in-context-learning language-modeling language-understanding large-language-model llm pre-training prompt prompt-based-learning prompt-design prompt-engineering prompt-learning prompt-toolkit prompt-tuning
Last synced: 14 May 2025
https://github.com/EgoAlpha/prompt-in-context-learning
Awesome resources for in-context learning and prompt engineering: Mastery of the LLMs such as ChatGPT, GPT-3, and FlanT5, with up-to-date and cutting-edge updates.
chain-of-thought chatbot chatgpt chatgpt-api cot in-context-learning language-modeling language-understanding large-language-model llm pre-training prompt prompt-based-learning prompt-design prompt-engineering prompt-learning prompt-toolkit prompt-tuning
Last synced: 12 Mar 2025
https://github.com/zjunlp/knowlm
An Open-sourced Knowledgeable Large Language Model Framework.
bilingual chinese deep-learning deepspeed english gpt-3 instructie instruction-following instruction-tuning instructions knowlm language-model large-language-models llama lora models pre-trained-language-models pre-trained-model pre-training reasoning
Last synced: 08 Apr 2025
https://github.com/zjunlp/KnowLM
An Open-sourced Knowledgeable Large Language Model Framework.
bilingual chinese deep-learning deepspeed english gpt-3 instructie instruction-following instruction-tuning instructions knowlm language-model large-language-models llama lora models pre-trained-language-models pre-trained-model pre-training reasoning
Last synced: 04 Apr 2025
https://github.com/tencent/tencentpretrain
Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo
albert bart bert chinese classification clue elmo fine-tuning gpt gpt-2 model-zoo natural-language-processing ner pegasus pre-training pytorch roberta t5 unilm xlm-roberta
Last synced: 16 May 2025
https://github.com/microsoft/Oscar
Oscar and VinVL
image-captioning image-text-search oscar pre-training vinvl vision-and-language vqa
Last synced: 21 Jul 2025
https://github.com/microsoft/oscar
Oscar and VinVL
image-captioning image-text-search oscar pre-training vinvl vision-and-language vqa
Last synced: 28 Sep 2025
https://github.com/brightmart/bert_language_understanding
Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN
attention-is-all-you-need bert-model document-classification fasttext language-model language-understanding nlp pre-training question-answering self-attention text-classification textcnn transfer-learning transformer-encoder
Last synced: 13 Apr 2025
https://github.com/ChenRocks/UNITER
Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
pre-training pytorch transformers vision-and-language
Last synced: 21 Jul 2025
https://github.com/jackroos/VL-BERT
Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".
bert iclr2020 pre-training pytorch representation-learning self-supervised-learning vision-and-language vl-bert
Last synced: 02 Apr 2025
https://github.com/Shen-Lab/GraphCL
[NeurIPS 2020] "Graph Contrastive Learning with Augmentations" by Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, Yang Shen
contrastive-learning graph-neural-network pre-training self-supervised-learning
Last synced: 21 Jul 2025
https://github.com/princeton-nlp/LLM-Shearing
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
efficiency llama llama2 llm nlp pre-training pruning
Last synced: 16 Apr 2025
https://github.com/princeton-nlp/llm-shearing
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
efficiency llama llama2 llm nlp pre-training pruning
Last synced: 04 Apr 2025
https://github.com/microsoft/xpretrain
Multi-modality pre-training
computer-vision multimedia multimodal-learning nlp pre-training
Last synced: 04 Apr 2025
https://github.com/microsoft/XPretrain
Multi-modality pre-training
computer-vision multimedia multimodal-learning nlp pre-training
Last synced: 03 Apr 2025
https://github.com/acbull/GPT-GNN
Code for KDD'20 "Generative Pre-Training of Graph Neural Networks"
graph-neural-networks graph-representation-learning pre-training self-supervised-learning
Last synced: 21 Jul 2025
https://github.com/gair-nlp/mathpile
[NeurIPS D&B 2024] Generative AI for Math: MathPile
corpus language-model large-language-models math pre-training
Last synced: 16 May 2025
https://github.com/google-research-datasets/conceptual-12m
Conceptual 12M is a dataset containing (image-URL, caption) pairs collected for vision-and-language pre-training.
multimodal-dataset pre-training vision-and-language
Last synced: 16 Feb 2026
https://github.com/GAIR-NLP/MathPile
[NeurIPS D&B 2024] Generative AI for Math: MathPile
corpus language-model large-language-models math pre-training
Last synced: 22 Jul 2025
https://github.com/THUDM/GCC
GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training @ KDD 2020
contrastive-learning graph-neural-networks pre-training
Last synced: 21 Jul 2025
https://github.com/thudm/gcc
GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training @ KDD 2020
contrastive-learning graph-neural-networks pre-training
Last synced: 06 Apr 2025
https://github.com/sayakpaul/probing-vits
Probing the representations of Vision Transformers.
attention explaining-vits image-recognition keras pre-training self-supervision tensorflow transformers vits
Last synced: 16 Mar 2026
https://github.com/vitae-transformer/samrs
The official repo for [NeurIPS'23] "SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model"
dataset deep-learning pre-training remote-sensing sam segment-anything-model semantic-segmentation transfer-learning
Last synced: 06 Apr 2025
https://github.com/westlake-repl/recommendation-systems-without-explicit-id-features-a-literature-review
Paper List of Pre-trained Foundation Recommender Models
chatgpt chatgpt3 chatgpt4rec cross-domain-recommendation cross-domainrecommendation foundation-model gpt4rec language-model large-language-model llm llm-recommendation llm4rec multimodal multimodal-deep-learning multimodalrecommendation pre-training recommendation-system recommender-system transfer-learning transferable
Last synced: 21 Apr 2025
https://github.com/opendrivelab/vidar
[CVPR 2024 Highlight] Visual Point Cloud Forecasting
autonomous-driving point-cloud-forecasting pre-training world-model
Last synced: 06 Apr 2025
https://github.com/deepgraphlearning/gearnet
GearNet and Geometric Pretraining Methods for Protein Structure Representation Learning, ICLR'2023 (https://arxiv.org/abs/2203.06125)
graph-neural-networks pre-training protein-representation-learning
Last synced: 14 Jun 2025
https://github.com/wangxiao5791509/MultiModal_BigModels_Survey
[MIR-2023-Survey] A continuously updated paper list for multi-modal pre-trained big models
anhui-university audio big-models depth event-camera multi-modal natural-language pengchenglab point-cloud pre-training radar review rgb-text-audio self-attention survey thermal-infrared transformers
Last synced: 02 Apr 2025
https://github.com/showlab/all-in-one
[CVPR2023] All in One: Exploring Unified Video-Language Pre-training
codebase pre-training pytorch video-language
Last synced: 09 Apr 2025
https://github.com/metauto-ai/Kaleido-BERT
💐Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
bert e-commerce fashion multimodal pre-training vision-language
Last synced: 21 Jul 2025
https://github.com/akanyaani/gpt-2-tensorflow2.0
OpenAI GPT2 pre-training and sequence prediction implementation in Tensorflow 2.0
gpt gpt-2 gpt2 implementation nlp openai pre-training pretraining tensorflow tensorflow2 text-generation transformer
Last synced: 16 Jan 2026
https://github.com/DeepGraphLearning/GearNet
GearNet and Geometric Pretraining Methods for Protein Structure Representation Learning, ICLR'2023 (https://arxiv.org/abs/2203.06125)
graph-neural-networks pre-training protein-representation-learning
Last synced: 09 May 2025
https://github.com/ViTAE-Transformer/SAMRS
The official repo for [NeurIPS'23] "SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model"
dataset deep-learning pre-training remote-sensing sam segment-anything-model semantic-segmentation transfer-learning
Last synced: 16 Mar 2025
https://github.com/OpenDriveLab/ViDAR
[CVPR 2024 Highlight] Visual Point Cloud Forecasting
autonomous-driving point-cloud-forecasting pre-training world-model
Last synced: 20 Mar 2025
https://github.com/lucidrains/electra-pytorch
A simple and working implementation of Electra, the fastest way to pretrain language models from scratch, in Pytorch
artificial-intelligence deep-learning pre-training transformer
Last synced: 06 Apr 2025
https://github.com/helicalAI/helical
A framework for state-of-the-art pre-trained bio foundation models on genomics and transcriptomics modalities.
artificial-intelligence bioinformatics biology deep-learning dna-sequences evo2 foundation-models gene-expression geneformer helixmrna pre-trained-model pre-training rna rna-seq rnaseq scgpt transcriptformer transformer uce vcf
Last synced: 27 May 2026
https://github.com/zhanghm1995/Forge_VFM4AD
A comprehensive survey of forging vision foundation models for autonomous driving, including challenges, methodologies, and opportunities.
3dgs adaptation autonomous-driving diffusion end-to-end-autonomous-driving foundation-model large-language-models nerf pre-training survey world-models
Last synced: 24 Jul 2025
https://github.com/michiyasunaga/DrRepair
[ICML 2020] DrRepair: Learning to Repair Programs from Error Messages
code-generation deep-learning graph-neural-networks pre-training program-repair
Last synced: 22 Jul 2025
https://github.com/michiyasunaga/drrepair
[ICML 2020] DrRepair: Learning to Repair Programs from Error Messages
code-generation deep-learning graph-neural-networks pre-training program-repair
Last synced: 13 Sep 2025
https://github.com/balavenkatesh3322/audio-pretrained-model
A collection of Audio and Speech pre-trained models.
audio audio-processing caffe keras keras-models keras-tensorflow machine-learning mxnet neural-network pre-trained pre-trained-model pre-training python3 pytorch pytorch-models speech-recognition speech-to-text tensorflow tensorflow-models
Last synced: 10 Apr 2025
https://github.com/iamyuanchung/Autoregressive-Predictive-Coding
Autoregressive Predictive Coding: An unsupervised autoregressive model for speech representation learning
pre-training pytorch representation-learning self-supervised-learning unsupervised-learning
Last synced: 19 Jul 2025
https://github.com/zhangyuanhan-ai/bamboo
[IJCV] Bamboo: 4 times larger than ImageNet; 2 times larger than Object365; Built by active learning.
active-learning dataset-generation pre-training
Last synced: 18 Aug 2025
https://github.com/lucidrains/mlm-pytorch
An implementation of masked language modeling for Pytorch, made as concise and simple as possible
artificial-intelligence deep-learning pre-training transformers unsupervised-learning
Last synced: 20 Jul 2025
https://github.com/ZhangYuanhan-AI/Bamboo
Bamboo: 4 times larger than ImageNet; 2 times larger than Object365; Built by active learning.
active-learning dataset-generation pre-training
Last synced: 08 May 2025
https://github.com/gair-nlp/prox
Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"
continual continual-pre-training data-centric-ai data-quality llama llm mistral neural-symbolic pre-training
Last synced: 05 Apr 2025
https://github.com/laion-ai/scaling-laws-openclip
Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)
clip deep-learning few-shot-learning fine-tuning laion openclip pre-training pytorch scaling-laws transfer-learning zero-shot-classification zero-shot-retrieval
Last synced: 07 May 2025
https://github.com/zjunlp/molgen
[ICLR 2024] Domain-Agnostic Molecular Generation with Chemical Feedback
generation huggingface iclr2024 language-model molecular-generation molecular-optimization molecule molgen multitask pre-trained-language-models pre-trained-model pre-training pytorch selfies targeted-molecular-generation
Last synced: 05 Apr 2025
https://github.com/wenhuchen/kgpt
Code and Data for EMNLP2020 Paper "KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation"
Last synced: 08 Jul 2025
https://github.com/vitae-transformer/rsp
The official repo for [TGRS'22] "An Empirical Study of Remote Sensing Pretraining"
change-detection classification deep-learning foundation-models imagenet object-detection pre-training remote-sensing semantic-segmentation transfer-learning
Last synced: 05 Apr 2025
https://github.com/vita-group/bert-tickets
[NeurIPS 2020] "The Lottery Ticket Hypothesis for Pre-trained BERT Networks", Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin
bert lottery-ticket-hypothesis lottery-tickets pre-training universal-embeddings
Last synced: 19 Apr 2025
https://github.com/hongfz16/hcmoco
[CVPR 2022 Oral] Versatile Multi-Modal Pre-Training for Human-Centric Perception
Last synced: 03 Aug 2025
https://github.com/ViTAE-Transformer/MTP
The official repo for "MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining"
change-detection classification deep-learning foundation-models object-detection pre-training remote-sensing semantic-segmentation transfer-learning
Last synced: 05 Apr 2025
https://github.com/vitae-transformer/mtp
The official repo for "MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining"
change-detection classification deep-learning foundation-models object-detection pre-training remote-sensing semantic-segmentation transfer-learning
Last synced: 10 Apr 2025
https://github.com/opendrivelab/mpi
[RSS 2024] Learning Manipulation by Predicting Interaction
policy-learning pre-training robot-manipulation
Last synced: 29 Oct 2025
https://github.com/kakaobrain/helo-word
Team Kakao&Brain's Grammatical Error Correction System for the ACL 2019 BEA Shared Task
deep-learning fairseq grammatical-error-correction nlp pre-training transfer-learning transformer
Last synced: 24 Apr 2025
https://github.com/vita-group/adv-ss-pretraining
[CVPR 2020] Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning
adversarial-robustness ensemble-pretrain jigsaw pre-training rotation self-supervised-learning selfie
Last synced: 18 Oct 2025
https://github.com/wxl1999/plmpapers
A paper list of pre-trained language models (PLMs).
bert deep-learning gpt machine-learning multilingual multimodal-learning natural-language-processing pre-training representation-learning
Last synced: 18 Feb 2026
https://github.com/lucidrains/marge-pytorch
Implementation of Marge, Pre-training via Paraphrasing, in Pytorch
artificial-intelligence deep-learning pre-training retrieval transformers
Last synced: 22 Jun 2025
https://github.com/VITA-Group/CV_LTH_Pre-training
[CVPR 2021] "The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models" Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang
imagenet-pr lottery-ticket-hypothesis moco pre-training simclr simclrv2 transfer transfer-learning
Last synced: 08 May 2025
https://github.com/vita-group/cv_lth_pre-training
[CVPR 2021] "The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models" Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang
imagenet-pr lottery-ticket-hypothesis moco pre-training simclr simclrv2 transfer transfer-learning
Last synced: 19 Apr 2025
https://github.com/gair-nlp/octothinker
Revisiting Mid-training in the Era of RL Scaling
llama llm mid-training post-training pre-training qwen reasoning rl verl
Last synced: 30 Jun 2025
https://github.com/ganjinzero/kebiolm
Improving Biomedical Pretrained Language Models with Knowledge [BioNLP 2021]
biomedical language-model nlp pre-training
Last synced: 03 Aug 2025
https://github.com/lucidrains/coco-lm-pytorch
Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in Pytorch
artificial-intelligence deep-learning pre-training transformers
Last synced: 30 Apr 2025
https://github.com/fajieyuan/SIGIR2021_Conure
Pre-training and Lifelong learning for User Embedding and Recommender System
catastrophic-forgetting continual-learning foundation-model general-purpose lifelong-learning lifelong-machine-learning one-for-all pre-training self-supervised-learning transfer-learning universal-recommender universal-representation user-profile user-representation
Last synced: 29 Apr 2025
https://github.com/deepgraphlearning/siamdiff
Code for Pre-training Protein Encoder via Siamese Sequence-Structure Diffusion Trajectory Prediction (https://arxiv.org/abs/2301.12068)
pre-training protein protein-protein-interaction protein-representation-learning
Last synced: 28 Jul 2025
https://github.com/squareslab/varclr
VarCLR: Variable Semantic Representation Pre-training via Contrastive Learning
contrastive-learning embeddings icse2022 pre-training source-code
Last synced: 30 Apr 2025
https://github.com/sileod/reasoning-core
Procedural symbolic reasoning data generators suite for synthetic pretraining
data-generators dataset dataset-generation grpo llm logic pre-pre-training pre-training procedural procedural-dataset procedural-generation reasoning rlvr symbolic verifiers
Last synced: 02 Apr 2026
https://github.com/FudanDISC/ReForm-Eval
A benchmark for evaluating the capabilities of large vision-language models (LVLMs)
benchmark embodied-ai gpt4 in-context-learning instruction-following instruction-tuning large-language-models large-vision-language-models llm multimodal multimodal-chain-of-thought pre-training reformulation visual-chain-of-thought
Last synced: 23 Apr 2025
https://github.com/haofanwang/awesome-vision-language-modeling
Recent Advances in Vision-Language Pre-training!
masked-image-modeling masked-language-models pre-training vision-language
Last synced: 30 Jan 2026
https://github.com/stefanheng/ecg-representation-learning
Self-supervised pre-training for ECG representation with inspiration from transformers & computer vision
12-lead-ecg attention bert clustering dino ecg nlp pre-training representation-learning self-supervised-learning symbolic-representation transformer vision-transformer vit word2vec
Last synced: 13 Apr 2025
https://github.com/hkuds/flashst
[ICML'2024] "FlashST: A Simple and Universal Prompt-Tuning Framework for Traffic Prediction"
pre-training prompt-tuning smart-cities spatio-temporal-prediction traffic-flow-prediction urban-computing
Last synced: 04 Jul 2025
https://github.com/fajieyuan/universal_user_representation
papers of universal user representation learning for recommendation
bert catastrophic-forgetting cold-start continual-learning cross-domain-recommendation general-purpose item-recommendation lifelong-learning multi-domain-recommendation pre-training pruning recommender-system representation-learning transfer transfer-learning transformer trimming user-modeling user-profiling user-representation
Last synced: 03 Oct 2025
https://github.com/gentlezhu/egi
Transfer Learning of Graph Neural Networks with Ego-graph Information Maximization (NeurIPS '21)
domain-adaptation graph-neural-netowrks pre-training transfer-learning
Last synced: 08 Sep 2025
https://github.com/SLAMPAI/large-scale-pretraining-transfer
Code for reproducing the experiments on large-scale pre-training and transfer learning for the paper "Effect of large-scale pre-training on full and few-shot transfer learning for natural and medical images" (https://arxiv.org/abs/2106.00116)
big-transfer chest-x-ray14 chest-xray-images chexpert-dataset covidx-dataset deep-learning distributed-training few-shot-learning fine-tuning imagenet large-scale-learning medical-imaging mimic-cxr padchest-dataset pre-trained-model pre-training pytorch scaling-laws supercomputing transfer-learning
Last synced: 08 May 2025
https://github.com/hicai-zju/openprotein
Open-Protein is an open source pre-training platform that supports multiple protein pre-training models and downstream tasks.
deep-learning pre-training protein pytorch
Last synced: 10 Apr 2025
https://github.com/vitae-transformer/aptv2
The official repo for the extension of [NeurIPS'22] "APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking": https://github.com/pandorgan/APT-36K
animal-pose-estimation benchmark dataset deep-learning few-shot-learning pose-estimation pose-tracking pre-training transfer-learning vision-transformer
Last synced: 11 Apr 2025
https://github.com/sanketvmehta/lifelong-learning-pretraining-and-sam
Code for the paper "Mehta, S. V., Patil, D., Chandar, S., & Strubell, E. (2023). An Empirical Investigation of the Role of Pre-training in Lifelong Learning. The Journal of Machine Learning Research 24 (2023)"
continual-learning flat-minima lifelong-learning pre-training sharpness-aware-minimization
Last synced: 08 May 2025
https://github.com/wxjiao/pre-code
Implementation of our paper "Exploiting Unsupervised Data for Emotion Recognition in Conversations" in the Findings of EMNLP-2020.
conversational-emotion-recognition noise-contrastive-estimation pre-training
Last synced: 22 Jul 2025
https://github.com/playerony/tensorflowtts-ts
This project implements TensorFlowTTS in TensorFlow.js using TypeScript, enabling real-time text-to-speech in the browser. With a pre-trained model for English, you can generate high-quality speech from text input.
pre-trained-model pre-training real-time real-time-rendering speech-synthesis tensorflowjs tensorflowtts text-to-speech typescript web-based-apps
Last synced: 13 Apr 2025
https://github.com/zehong-wang/awesome-foundation-models-on-graphs
A collection of graph foundation models including papers, codes, and datasets.
data-augmentation deep-learning few-shot-learning foundation-models graph-classification graph-foundation-model graph-neural-networks graph-representation-learning knowlege-graph link-prediction machine-learning molecule node-classification node-embedding pre-training representation-learning self-supervised-learning transfer-learning unsupervised-learning zero-shot-learning
Last synced: 17 Apr 2025
https://github.com/yao8839836/edge_mae
Friend Ranking in Online Games via Pre-training Edge Transformers. SIGIR 2023
graph link-prediction pre-training social-network transformers
Last synced: 25 Jun 2025
https://github.com/louisbrulenaudet/tsdae
Transformer-based Denoising AutoEncoder for Sentence Transformers Unsupervised pre-training.
bert bert-embeddings lemone lemone-io machine-learning nltk pre-training python sentence-transformers transformers tsdae unsupervised-learning
Last synced: 14 Jul 2025
https://github.com/thuml/trajworld
Official repository for "Trajectory World Models for Heterogeneous Environments" (ICML 2025), https://arxiv.org/abs/2502.01366
model-predictive-control off-policy-evaluation pre-training world-model
Last synced: 18 Jul 2025
https://github.com/mwasifanwar/llm-mastery
The most comprehensive educational resource on Large Language Models ever created. A guide systematically building understanding from absolute fundamentals to cutting-edge research.
ai-education artificial-intelligence attention-mechanism deep-learning fine-tuning huggingface large-language-models llm machine-learning model-deployment model-training neural-networks nlp pre-training pytorch research-paper transformer transformer-architecture
Last synced: 16 Apr 2026
https://github.com/707/ml-workbench
Compare multilingual tokenizers and models for cost, context, and deployment decisions.
llm llms ml pre-training tokenization tokenizers
Last synced: 04 Apr 2026
https://github.com/ethicalabs-ai/blossomtune-orchestrator
BlossomTune 🌸 Flower Superlink & Runner
federated-learning federated-learning-framework fine-tuning flower gradio gradio-interface huggingface machine-learning pre-training pytorch
Last synced: 15 May 2026
https://github.com/xiangao0904/hst-minimind
This repository is a compact research prototype for faster language-model pretraining with Token Superposition Training (TST)
Last synced: 27 May 2026
https://github.com/hrolive/computer-vision-for-industrial-inspection
How to create an end-to-end hardware-accelerated industrial inspection pipeline to automate defect detection.
computer-vision dali deep-learning jupyter-notebook nvidia-tao-toolkit pandas pre-training python tensorrt triton-inference-server visual-inspection
Last synced: 09 May 2026
https://github.com/labackdoor/rope-t5
A from-scratch implementation of a T5 model modified with Rotary Position Embeddings (RoPE). This project includes the code for pre-training on the C4 dataset in streaming mode with Flash Attention 2.
c4-dataset evaluation-benchmark flash-attention from-scratch huggingface language-model llm nlp pre-training pytorch rope rotary-position-embedding sequence-to-sequence span-corruption t5
Last synced: 13 Aug 2025