Projects in Awesome Lists tagged with llm-training
A curated list of projects in awesome lists tagged with llm-training.
https://github.com/gitleaks/gitleaks
Find secrets with Gitleaks 🔑
ai-powered ci-cd cicd cli data-loss-prevention devsecops dlp git gitleaks go golang hacktoberfest llm llm-inference llm-training open-source secret security security-tools
Last synced: 13 May 2025
https://github.com/liguodongiot/llm-action
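This project aims to share the technical principles behind large models and hands-on experience with them (large-model engineering and deploying large-model applications).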
llm llm-inference llm-serving llm-training llmops
Last synced: 15 May 2025
https://github.com/ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
computer-vision data-centric data-science deep deep-learning deeplearning fine-tuning learning llama llama2 llm llm-training machine-learning machinelearning mistral ml natural-language natural-language-processing neural-network pytorch
Last synced: 13 May 2025
https://github.com/uber/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
computer-vision data-centric data-science deep deep-learning deeplearning fine-tuning learning llama llama2 llm llm-training machine-learning machinelearning mistral ml natural-language natural-language-processing neural-network pytorch
Last synced: 24 Apr 2025
https://github.com/skypilot-org/skypilot
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 16+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
cloud-computing cloud-management cost-management cost-optimization data-science deep-learning distributed-training finops gpu hyperparameter-tuning job-queue job-scheduler llm-serving llm-training machine-learning ml-infrastructure ml-platform multicloud spot-instances tpu
Last synced: 12 May 2025
https://github.com/linkedin/liger-kernel
Efficient Triton Kernels for LLM Training
finetuning gemma2 llama llama3 llm-training llms mistral phi3 triton triton-kernels
Last synced: 13 May 2025
https://github.com/internlm/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
agent baichuan chatbot chatglm2 chatglm3 conversational-ai internlm large-language-models llama2 llama3 llava llm llm-training mixtral msagent peft phi3 qwen supervised-finetuning
Last synced: 11 May 2025
https://github.com/InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
agent baichuan chatbot chatglm2 chatglm3 conversational-ai internlm large-language-models llama2 llama3 llava llm llm-training mixtral msagent peft phi3 qwen supervised-finetuning
Last synced: 20 Mar 2025
https://github.com/h2oai/h2o-llmstudio
H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/
ai chatbot chatgpt fine-tuning finetuning generative generative-ai gpt llama llama2 llm llm-training
Last synced: 13 May 2025
https://github.com/linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
finetuning gemma2 llama llama3 llm-training llms mistral phi3 triton triton-kernels
Last synced: 20 Dec 2024
https://github.com/databricks/dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
databricks gen-ai generative-ai llm llm-inference llm-training mosaic-ai
Last synced: 15 May 2025
https://github.com/moonshotai/moba
MoBA: Mixture of Block Attention for Long-Context LLMs
flash-attention llm llm-serving llm-training moe pytorch transformer
Last synced: 14 May 2025
https://github.com/MoonshotAI/MoBA
MoBA: Mixture of Block Attention for Long-Context LLMs
flash-attention llm llm-serving llm-training moe pytorch transformer
Last synced: 31 Mar 2025
https://github.com/intelligent-machine-learning/dlrover
DLRover: An Automatic Distributed Deep Learning System
distributed-training hacktoberfest k8s llm-training
Last synced: 13 May 2025
https://github.com/utkuozdemir/nvidia_gpu_exporter
Nvidia GPU exporter for prometheus using nvidia-smi binary
ai cryptocurrency gaming llm llm-training monitoring nvidia nvidia-gpu nvidia-smi prometheus prometheus-exporter
Last synced: 12 Apr 2025
https://github.com/volcengine/vescale
A PyTorch Native LLM Training Framework
Last synced: 14 May 2025
https://github.com/sail-sg/Adan
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
adan artificial-intelligence bert-model convnext cuda-programming deep-learning diffusion dreamfusion fairseq gpt2 llm-training llms mae moe optimizer pytorch resnet timm transformer-xl vit
Last synced: 05 Apr 2025
https://github.com/volcengine/veScale
A PyTorch Native LLM Training Framework
Last synced: 04 Dec 2024
https://github.com/rohan-paul/llm-finetuning-large-language-models
LLM (Large Language Model) FineTuning
gpt-3 gpt3-turbo large-language-models llama2 llm llm-finetuning llm-inference llm-serving llm-training mistral-7b open-source-llm pytorch
Last synced: 04 Apr 2025
https://github.com/rohan-paul/LLM-FineTuning-Large-Language-Models
LLM (Large Language Model) FineTuning
gpt-3 gpt3-turbo large-language-models llama2 llm llm-finetuning llm-inference llm-serving llm-training mistral-7b open-source-llm pytorch
Last synced: 02 Feb 2025
https://github.com/feifeibear/long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
attention-is-all-you-need deepspeed-ulysses llm-inference llm-training pytorch ring-attention
Last synced: 14 May 2025
https://github.com/mallorbc/finetune_llms
Repo for fine-tuning Causal LLMs
docker falcon gpt gpt-3 gpt-35-turbo gpt-4 gpt-j-6b llama llama2 llm llm-training mpt
Last synced: 05 Apr 2025
https://github.com/mallorbc/Finetune_LLMs
Repo for fine-tuning Causal LLMs
docker falcon gpt gpt-3 gpt-35-turbo gpt-4 gpt-j-6b llama llama2 llm llm-training mpt
Last synced: 12 Apr 2025
https://github.com/flagai-open/aquila2
The official repo of Aquila2 series proposed by BAAI, including pretrained & chat large language models.
llm llm-inference llm-training
Last synced: 15 May 2025
https://github.com/FlagAI-Open/Aquila2
The official repo of Aquila2 series proposed by BAAI, including pretrained & chat large language models.
llm llm-inference llm-training
Last synced: 08 Apr 2025
https://github.com/internlm/internevo
InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.
910b deepspeed-ulysses flash-attention gemma internlm internlm2 llama3 llava llm-framework llm-training multi-modal pipeline-parallelism pytorch ring-attention sequence-parallelism tensor-parallelism transformers-models zero3
Last synced: 14 Apr 2025
https://github.com/InternLM/InternEvo
InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.
910b deepspeed-ulysses flash-attention gemma internlm internlm2 llama3 llava llm-framework llm-training multi-modal pipeline-parallelism pytorch ring-attention sequence-parallelism tensor-parallelism transformers-models zero3
Last synced: 27 Mar 2025
https://github.com/promptslab/llmtuner
FineTune LLMs in few lines of code (Text2Text, Text2Speech, Speech2Text)
fine-tuning fine-tuning-llm finetune finetune-gpt finetune-llama finetune-llm finetune-llms finetune-whisper finetunechatgpt finetuning finetuning-large-language-models finetuning-rl llm llm-framework llm-inference llm-training llmops llmtuner whisper whisper-finetune
Last synced: 09 Apr 2025
https://github.com/yinizhilian/iclr2025-papers-with-code
A collection of ICLR papers and open-source projects from past years, covering ICLR 2021, ICLR 2022, ICLR 2023, ICLR 2024, and ICLR 2025.
deep-learning-paper gemmini gpt iclr2021 iclr2022 iclr2023 iclr2024 llama3 llm-agent llm-framework llm-reasoning llm-training llms machine-learning nlp-keywords-extraction nlp-machine-learning paperwithcode python transformer
Last synced: 05 Apr 2025
https://github.com/yinizhilian/iclr2024-papers-with-code
A collection of ICLR 2024 papers and open-source projects
deep-learning-paper gemmini gpt iclr2021 iclr2022 iclr2023 iclr2024 llama3 llm-agent llm-framework llm-reasoning llm-training llms machine-learning nlp-keywords-extraction nlp-machine-learning paperwithcode python transformer
Last synced: 24 Jan 2025
https://github.com/armbues/SiLLM
SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.
apple-silicon dpo large-language-models llm llm-inference llm-training lora mlx
Last synced: 25 Nov 2024
https://github.com/shivendrra/smalllanguagemodel
An LLM cookbook for building your own model from scratch, all the way from gathering data to training a model
bert-model decoder-model gpt llm-cookbook llm-training llms machine-learning neural-networks transformer
Last synced: 12 Apr 2025
https://github.com/shivendrra/SmallLanguageModel
An LLM cookbook for building your own model from scratch, all the way from gathering data to training a model
bert-model decoder-model gpt llm-cookbook llm-training llms machine-learning neural-networks transformer
Last synced: 14 Mar 2025
https://github.com/itachi-uchiha581/auto-data
Auto Data is a library designed for quick and effortless creation of datasets tailored for fine-tuning Large Language Models (LLMs).
ai data finetuning-large-language-models finetuning-llms generative-ai llm llm-training python python3
Last synced: 06 Apr 2025
https://github.com/dsdanielpark/open-llm-datasets
Repository for organizing datasets and papers used in Open LLM.
datasets large-language-models llm llm-datasets llm-training natural-language-processing
Last synced: 04 Mar 2025
https://github.com/slai-labs/get-beam
Run GPU inference and training jobs on serverless infrastructure that scales with you.
artificial-intelligence cloud-computing cost-optimization data-science deep-learning distributed-computing gpu-acceleration gpu-computing hpc llm-serving llm-training machine-learning ml-infrastructure mlops python serverless serverless-architectures
Last synced: 18 Apr 2025
https://github.com/simplifine-llm/simplifine
🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨
ai cloud fine-tuning fine-tuning-llm finetuning-llms gpt instruction-tuning large-language-models llama llama3 llm llm-training lora mistral moe open-source peft phi qwen
Last synced: 16 Feb 2025
https://github.com/simplifine-llm/Simplifine
🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨
ai cloud fine-tuning fine-tuning-llm finetuning-llms gpt instruction-tuning large-language-models llama llama3 llm llm-training lora mistral moe open-source peft phi qwen
Last synced: 04 Dec 2024
https://github.com/tatevkaren/babygpt-build_gpt_from_scratch
BabyGPT: Build Your Own GPT Large Language Model from Scratch. Pre-Training Generative Transformer Models: Building GPT from Scratch with a Step-by-Step Guide to Generative AI in PyTorch and Python
attention-is-all-you-need dropout-layers gpt gpt-3 large-language-models layer-normalization llm-training llms multi-head-self-attention neural-networks python pytorch residual-connections transformers
Last synced: 10 Apr 2025
https://github.com/microsoft/llf-bench
A benchmark for evaluating learning agents based on just language feedback
large-language-models llm llm-training llms machine-learning natural-language-processing reinforcement-learning
Last synced: 07 Apr 2025
https://github.com/smerkyg/gptcore
Fast modular code to create and train cutting edge LLMs
gpt llm-training llms machine-learning
Last synced: 12 May 2025
https://github.com/lamm-mit/Graph-Aware-Transformers
Graph-Aware Attention for Adaptive Dynamics in Transformers
ai4science attention-mechanism graph graph-aware huggingface-transformers language llm-training llms materials-informatics materials-science materiomics
Last synced: 11 Apr 2025
https://github.com/hkust-nlp/dart-math
Official implementation for the paper "🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving"
deep-learning llm llm-evaluation llm-inference llm-training mathematics nlp
Last synced: 20 Nov 2024
https://github.com/microsoft/LLF-Bench
A benchmark for evaluating learning agents based on just language feedback
large-language-models llm llm-training llms machine-learning natural-language-processing reinforcement-learning
Last synced: 18 Apr 2025
https://github.com/microsoft/mathoctopus
This repository contains resources for accessing the official benchmarks, code, and checkpoints of the paper "Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and Observations".
Last synced: 05 Feb 2025
https://github.com/microsoft/MathOctopus
This repository contains resources for accessing the official benchmarks, code, and checkpoints of the paper "Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and Observations".
Last synced: 02 Feb 2025
https://github.com/adithya-s-k/companionllm
CompanionLLM - A framework to finetune LLMs to be your own sentient conversational companion
fine-tuning finetuning hacktoberfest hacktoberfest-accepted hacktoberfest2023 huggingface llama llama2 llamacpp llm llm-inference llm-training lora mit-license open-source peft
Last synced: 03 Dec 2024
https://github.com/prismadic/magnet
the small distributed language model toolkit; fine-tune state-of-the-art LLMs anywhere, rapidly
apple-silicon claude distributed-computing distributed-systems embeddings fine-tuning finetuning-llms gemini huggingface inference-api langchain llm-training milvus mistral mlx nats nats-messaging nats-streaming sentence-splitting tokenizers
Last synced: 13 Apr 2025
https://github.com/ziming/laravel-scrapingbee
A PHP Laravel Library for Scrapingbee Web Scraping API
ai artificial-intelligence hacktoberfest laravel llm llm-training llms php scraping scraping-bee scrapingbee scrapingbee-api web-scraping webscraping
Last synced: 07 Apr 2025
https://github.com/mofheka/llama-megatron
A LLaMA1/LLaMA2 Megatron implementation.
llama llama2 llm llm-training megatron megatron-lm pytorch
Last synced: 23 Apr 2025
https://github.com/wassemgtk/llm.scala
Extensible implementation of a Language Model (LLM) training framework in Scala.
gpt llm llm-training transformer transformers-library
Last synced: 19 Apr 2025
https://github.com/arcee-ai/arcee-python
The Arcee client for executing domain-adapted language model routines https://pypi.org/project/arcee-py/
ai llm llm-inference llm-training llmops
Last synced: 21 Apr 2025
https://github.com/Prismadic/magnet
the small distributed language model toolkit; fine-tune state-of-the-art LLMs anywhere, rapidly
apple-silicon claude distributed-computing distributed-systems embeddings fine-tuning finetuning-llms gemini huggingface inference-api langchain llm-training milvus mistral mlx nats nats-messaging nats-streaming sentence-splitting tokenizers
Last synced: 12 Dec 2024
https://github.com/armbues/SiLLM-examples
Examples for using the SiLLM framework for training and running Large Language Models (LLMs) on Apple Silicon
apple-silicon dpo large-language-models llm llm-inference llm-training lora mlx
Last synced: 10 Apr 2025
https://github.com/ubos-tech/ai-chatbot-starter-kit
AI Chatbot Starter Kit: An open-source, extensible framework for rapidly developing custom AI chatbots with integrations for popular data sources, messaging platforms, LLM models, and CRM systems. Ideal for developers looking for a minimal boilerplate solution.
chatgpt chatgpt-api chatgpt-bot facebook-bot instagram-chatbot llm llm-agent llm-training low-code lowcode-editor node-red openai openai-api pinecone support-bot telegram-chat-bot ubos-tech whatsapp-bot
Last synced: 15 May 2025
https://github.com/aklinker1/vitepress-knowledge
Free, self-hosted LLM chatbot trained on your VitePress website.
ai llm-training vitepress vitepress-plugin
Last synced: 16 Mar 2025
https://github.com/erogol/blagpt
Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible experimentation and exploration.
attention-mechanisms deep-learning gpt gpt-2 hymba large-language-models llm llm-training machine-learning position-embedding pytorch transformers
Last synced: 06 Jan 2025
https://github.com/levitation-opensource/manipulative-expression-recognition
MER is software that identifies and highlights manipulative communication in text from human conversations and AI-generated responses. MER benchmarks language models for manipulative expressions, fostering the development of transparency and safety in AI. It also supports manipulation victims by detecting manipulative patterns in human communication.
benchmarking conversation-analysis conversation-analytics expression-recognition fraud-detection fraud-prevention human-computer-interaction human-robot-interaction llm llm-security llm-test llm-training manipulation misinformation prompt-engineering prompt-injection psychometrics sentiment-analysis sentiment-classification transparency
Last synced: 16 Mar 2025
https://github.com/moinulmoin/free-llmstxt-generator
converts webpage content into Markdown format, optimized for LLM training and context
crawling llm-context llm-training llmstxt markdown
Last synced: 23 Mar 2025
https://github.com/raumberg/myllm
LLM Training Framework
deep-neural-networks deepspeed framework huggingface huggingface-transformers llm llm-training python reinforcement-learning torch
Last synced: 11 Apr 2025
https://github.com/riccorl/llama-trainer
Llama Trainer Utility
huggingface llama llm llm-inference llm-training llms transformer
Last synced: 08 Mar 2025
https://github.com/rs-py/howtofinetunellama3.1
Quick tutorial showing how to fine-tune Llama3.1 with nothing but free tools and text data. All code is included in an ipynb notebook. For a step-by-step walkthrough, take a look at the tutorial below on Medium.
fine-tuning finetuning huggingface llama3 llm llm-training
Last synced: 24 Apr 2025
https://github.com/mewmix/gh_llm_loader
clone GitHub repositories and prepare their data for ingestion for LLMs.
context data data-structures github llm llm-training python
Last synced: 12 Jan 2025
https://github.com/amazon-science/llm-code-preference
Training and Benchmarking LLMs for Code Preference.
code-generation llm-evaluation llm-training llms-benchmarking
Last synced: 03 May 2025
https://github.com/endevsols/long-trainer
Introducing LongTrainer, a sophisticated extension of the LangChain framework designed specifically for managing multiple bots and providing isolated, context-aware chat sessions. Ideal for developers and businesses looking to integrate complex conversational AI into their systems, LongTrainer simplifies the deployment and customization of LLMs.
gpt langchain langchain-python llm-training longtrainer openai rag
Last synced: 01 May 2025
https://github.com/prismadic/tractor-beam
high-efficiency text & file scraper with smart tracking, client/server networking for building language model datasets fast
botnet cluster data file-downloader llm llm-finetuning llm-training mass-downloader scraping
Last synced: 07 May 2025
https://github.com/altunenes/rustysozluk
Efficiently fetch and perform sentiment analysis (Turkish Only) on eksisozluk.com entries using Rust
duyguanalizi eksi-sozluk eksisozluk llm-datasets llm-training reqwest rust rust-lang rust-scraping scraper sentiment-analysis turkish webscraping
Last synced: 17 Jan 2025
https://github.com/spider-rs/readability
The readability library for LLMs
clean-data data-cleaning llm-training readability rust safari-reader
Last synced: 05 Apr 2025
https://github.com/stoyan-stoyanov/transformers-calculator
Transformer Calculator - Estimate training time for transformer models.
Last synced: 23 Mar 2025
https://github.com/puneetkakkar/bitnet-1.58b
Bitnet 1.58b: This project implements the innovative 1-bit LLM architecture described in recent whitepapers, focusing on efficient training, inference, and open-source collaboration.
1-bit-quantization deep-learning large-language-models llm-training llms machine-learning nlp pytorch research-and-development
Last synced: 29 Dec 2024
https://github.com/Rs-py/HowToFineTuneLlama3.1
Quick tutorial showing how to fine-tune Llama3.1 with nothing but free tools and text data. All code is included in an ipynb notebook. For a step-by-step walkthrough, take a look at the tutorial below on Medium.
fine-tuning finetuning huggingface llama3 llm llm-training
Last synced: 06 Jan 2025
https://github.com/maris205/llama-gene
A General-purpose Gene Task Large Language Model Based on Instruction Fine-tuning
genetic-algorithm llama llm-training
Last synced: 10 Apr 2025
https://github.com/simranjeet97/learn_rag_from_scratch_llm
Learn Retrieval-Augmented Generation (RAG) from Scratch using LLMs from Hugging Face and Langchain or Python
artificial-intelligence datascience-machinelearning genai-domain genai-usecase generative-ai llm-apps llm-evaluation llm-framework llm-training rag rag-application rag-chatbot rag-embeddings rag-evaluation rag-implementation rag-llm rag-model rag-pipeline retrieval-augmented-generation
Last synced: 15 Apr 2025
https://github.com/amanpriyanshu/generatorpromptkit
GeneratorPromptKit: A Python Library and Framework for Automated Generator Prompting and Dataset Generation
augmentation automated-prompt-engineering data data-augmentation data-science dataset dataset-generation diverse-data llm llm-training llms prompt-engineering synthetic-data synthetic-dataset-generation
Last synced: 22 Mar 2025
https://github.com/amazon-science/mezo_svrg
Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models"
deep-learning fine-tuning language-model large-language-models llm-training llms machine-learning machine-learning-algorithms optimization optimization-algorithms svrg variance-reduction zero-order-methods
Last synced: 03 May 2025
https://github.com/ethicalabs-ai/kurtis
Kurtis is a fine-tuning, inference, and evaluation tool built for SLMs (Small Language Models), such as Hugging Face's SmolLM2.
ai huggingface-peft huggingface-smollm2 huggingface-transformers llm llm-inference llm-training machine-learning mental-health mental-health-awareness ml question-answering question-answering-model slms small-language-models smollm smollm2 transformers
Last synced: 04 Apr 2025
https://github.com/rahulunair/simpsons_llm_xpu
Fine-tune an LLM on Intel discrete GPUs to generate dialogues based on the Simpsons dataset
huggingface-transformers intel-arc intel-gpu intel-gpu-max ipex llm-inference llm-training lora pytorch
Last synced: 03 Mar 2025
https://github.com/anonym0uswork1221/python-code-docstring-scraper
A multi-threaded GitHub scraper to collect Python code with docstrings from public repositories, creating a well-documented dataset for the JaraConverse LLM model.
causal-language-modeling data-scraping dataset dataset-generation dataset-scripts docst docstring-generator github-scraper llm llm-training nlp nlp-machine-learning python-code python-dataset python3 scraper script
Last synced: 11 Jan 2025
https://github.com/tousif47/mini-llm
Training and testing a few types of small LLMs from scratch for practice
keras-neural-networks llm llm-inference llm-training python3
Last synced: 25 Feb 2025
https://github.com/aravinda-1402/legal-query-ai-assistant
Legal Query AI Assistant is a chatbot that leverages LLMs like OpenAI GPT and LLaMA, with RAG to retrieve and summarize legal documents efficiently.
chatbot gpt llm llm-training llma2 natural-language-processing natural-language-understanding openai-api python
Last synced: 11 Apr 2025
https://github.com/fork123aniket/llm-rag-powered-qa-app
A Production-Ready, Scalable RAG-powered LLM-based Context-Aware QA App
context-aware-system eleutherai fine-tuning large-language-models llm-inference llm-serving llm-training llmops parameter-efficient-fine-tuning question-answering ray ray-serve retrieval-augmented-generation
Last synced: 16 Jan 2025
https://github.com/visheshc14/llm-fastapi
NimbleBox Apprenticeship ML Engineer Task - 1. This project demonstrates the implementation of a Language Model Server using FastAPI and gRPC. It leverages a large language model to generate coherent text based on user input.
fastapi grpc llm-training multithreading
Last synced: 12 Mar 2025
https://github.com/leo848/deversai
Source code for the Jugend forscht project "DEversAI: Training and Visualization of German-Localized, Directionally Complementary LLMs"
ai german-language gpt-2 jugend-forscht llm llm-training pytorch
Last synced: 11 Apr 2025
https://github.com/aleefbilal/finetuning-in-docker
This project provides a Docker setup for fine-tuning the Llama 3.1 model. The Dockerfile installs the necessary dependencies and sets up the environment for training. This guide will help you understand how to build and run the Docker container for your fine-tuning tasks using an interactive jupyter notebook.
docker finetuning jupyter-notebook llm-training
Last synced: 30 Apr 2025
https://github.com/anshulranjan2004/microrwkv
Implementation of a custom architecture on nanoRWKV: A nanoGPT-style adaptation of the RWKV Language Model, which combines the simplicity of RNNs with GPT-level performance for large language models (LLMs).
Last synced: 19 Apr 2025
https://github.com/0xnu/multicollinearity_llm
A multicollinearity-based compression program written in C that identifies and removes highly correlated weights in neural networks, thereby reducing redundancy.
compression compression-algorithm llm llm-training multicollinearity
Last synced: 08 Feb 2025
https://github.com/gurpreetkaurjethra/quantize-llm-using-awq
Quantize LLM using AWQ
awq generative-ai large-language-models llm-training llms quantize
Last synced: 22 Nov 2024
https://github.com/gurpreetkaurjethra/llms-inference-and-fine-tuning
Estimate Memory Consumption of LLMs Inference and Fine Tuning
fine-tuning generative-ai large-language-models llm-inference llm-training llms memory-allocation
Last synced: 22 Nov 2024
https://github.com/sreeeswaran/train-your-llm
This repository contains code and resources for training, fine-tuning, and deploying large language models using Hugging Face's Transformers library.
artificial-intelligence deep-learning language-model large-language-model large-language-models llm llm-training llms machine-learning model-training nlp pretrained-language-model pretrained-models training
Last synced: 22 Nov 2024
https://github.com/shreyanmitra/candyllm
A simple, easy-to-use framework for HuggingFace and OpenAI text-generation models focused on explainability.
alpaca falcon gpt-4 huggingface llama3 llm-training llms mistral openai transformers vicuna
Last synced: 22 Dec 2024
https://github.com/firojalam/llamalens
This repository contains the resources, code, and documentation for LlamaLens, a specialized multilingual large language model (LLM) designed to analyze news and social media content effectively. LlamaLens supports multiple languages, including Arabic, English, and Hindi, and is tailored for diverse tasks such as sentiment analysis, misinformation.
arabic downstream-tasks emotion-detection english hindi llm llm-inference llm-training newsmedia sentiment-classification social-media
Last synced: 28 Dec 2024
https://github.com/hrolive/poland-end-to-end-llm-bootcamp
This bootcamp is designed to give NLP researchers an end-to-end overview of the fundamentals of the NVIDIA NeMo framework, a complete solution for building large language models. It also includes hands-on exercises complemented by tutorials, code snippets, and presentations to help researchers get started with NeMo LLM Service and Guardrails.
gpt llama2 llm llm-inference llm-training nemo-guardrails nvidia nvidia-nemo p-tuning prompt-tuning tensorrt triton
Last synced: 20 Apr 2025
https://github.com/ethicalabs-ai/flowertune-qwen2.5-coder-0.5b-instruct
FlowerTune LLM on Coding Dataset
ai federated-learning federated-learning-framework llm-fine-tuning llm-finetuning llm-training machine-learning ml qwen2 qwen2-5 transformers transformers-models
Last synced: 04 Apr 2025
https://github.com/shreyanmitra/CandyLLM
A simple, easy-to-use framework for HuggingFace and OpenAI text-generation models focused on explainability.
alpaca falcon gpt-4 huggingface llama3 llm-training llms mistral openai transformers vicuna
Last synced: 19 Feb 2025
https://github.com/lamm-mit/graph-aware-transformers
Graph-Aware Attention for Adaptive Dynamics in Transformers
ai4science attention-mechanism graph graph-aware huggingface-transformers language llm-training llms materials-informatics materials-science materiomics
Last synced: 02 Mar 2025
https://github.com/agents4good/masterchef-ai
Visit: https://agents4good.github.io/MasterChef-AI/
artificial-intelligence distillation distillation-model llm llm-inference llm-training open-source ufcg
Last synced: 11 Mar 2025
https://github.com/dev-d-gr8/storyscape
A generative-AI storytelling application for iOS that generates stories with pictures, built on a custom LLaMA 3.2 3B-Instruct model fine-tuned on Hindi stories (with a provision to generate English stories via a call to OpenAI GPT-4o).
app aws django docker generative-ai generative-art ios jenkins llm llm-inference llm-training llmops mobile-development python sagemaker swift swiftui
Last synced: 03 Mar 2025
https://github.com/bhaskarr103/legaldoc_ai
A research-driven approach to building an AI-powered legal assistant, leveraging FAISS for efficient legal document retrieval and a Hugging Face model for intelligent response generation.
faiss-vector-database langchain llm-training
Last synced: 22 Mar 2025
https://github.com/jianzhnie/scaletorch
A PyTorch toolkit for large model training
distributed llm-training torch
Last synced: 28 Mar 2025
https://github.com/dkimjpg/ai-large-language-tagger-using-hidden-markov-models
An implementation of a language part-of-speech (POS) tagger using Hidden Markov Models. Basically, it takes English text as input and tries to tag each word as a noun, verb, adjective, etc. based on the input's sequence.
ai hmm-viterbi-algorithm llm-training
Last synced: 09 Mar 2025