Projects in Awesome Lists tagged with llm-training

https://github.com/gitleaks/gitleaks

Find secrets with Gitleaks 🔑

ai-powered ci-cd cicd cli data-loss-prevention devsecops dlp git gitleaks go golang hacktoberfest llm llm-inference llm-training open-source secret security security-tools

Last synced: 13 May 2025

https://github.com/liguodongiot/llm-action

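This project aims to share the technical principles behind large models and hands-on experience with them (large-model engineering and real-world deployment of large-model applications)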

llm llm-inference llm-serving llm-training llmops

Last synced: 15 May 2025

https://github.com/ludwig-ai/ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models

computer-vision data-centric data-science deep deep-learning deeplearning fine-tuning learning llama llama2 llm llm-training machine-learning machinelearning mistral ml natural-language natural-language-processing neural-network pytorch

Last synced: 13 May 2025

https://github.com/uber/ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models

computer-vision data-centric data-science deep deep-learning deeplearning fine-tuning learning llama llama2 llm llm-training machine-learning machinelearning mistral ml natural-language natural-language-processing neural-network pytorch

Last synced: 24 Apr 2025

https://github.com/skypilot-org/skypilot

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 16+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.

cloud-computing cloud-management cost-management cost-optimization data-science deep-learning distributed-training finops gpu hyperparameter-tuning job-queue job-scheduler llm-serving llm-training machine-learning ml-infrastructure ml-platform multicloud spot-instances tpu

Last synced: 12 May 2025

https://github.com/linkedin/liger-kernel

Efficient Triton Kernels for LLM Training

finetuning gemma2 llama llama3 llm-training llms mistral phi3 triton triton-kernels

Last synced: 13 May 2025

https://github.com/internlm/xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

agent baichuan chatbot chatglm2 chatglm3 conversational-ai internlm large-language-models llama2 llama3 llava llm llm-training mixtral msagent peft phi3 qwen supervised-finetuning

Last synced: 11 May 2025

https://github.com/InternLM/xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

agent baichuan chatbot chatglm2 chatglm3 conversational-ai internlm large-language-models llama2 llama3 llava llm llm-training mixtral msagent peft phi3 qwen supervised-finetuning

Last synced: 20 Mar 2025

https://github.com/h2oai/h2o-llmstudio

H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/

ai chatbot chatgpt fine-tuning finetuning generative generative-ai gpt llama llama2 llm llm-training

Last synced: 13 May 2025

https://github.com/linkedin/Liger-Kernel

Efficient Triton Kernels for LLM Training

finetuning gemma2 llama llama3 llm-training llms mistral phi3 triton triton-kernels

Last synced: 20 Dec 2024

https://github.com/databricks/dbrx

Code examples and resources for DBRX, a large language model developed by Databricks

databricks gen-ai generative-ai llm llm-inference llm-training mosaic-ai

Last synced: 15 May 2025

https://github.com/moonshotai/moba

MoBA: Mixture of Block Attention for Long-Context LLMs

flash-attention llm llm-serving llm-training moe pytorch transformer

Last synced: 14 May 2025

https://github.com/MoonshotAI/MoBA

MoBA: Mixture of Block Attention for Long-Context LLMs

flash-attention llm llm-serving llm-training moe pytorch transformer

Last synced: 31 Mar 2025

https://github.com/intelligent-machine-learning/dlrover

DLRover: An Automatic Distributed Deep Learning System

distributed-training hacktoberfest k8s llm-training

Last synced: 13 May 2025

https://github.com/utkuozdemir/nvidia_gpu_exporter

Nvidia GPU exporter for prometheus using nvidia-smi binary

ai cryptocurrency gaming llm llm-training monitoring nvidia nvidia-gpu nvidia-smi prometheus prometheus-exporter

Last synced: 12 Apr 2025

https://github.com/volcengine/vescale

A PyTorch Native LLM Training Framework

llm-training pytorch

Last synced: 14 May 2025

https://github.com/sail-sg/Adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

adan artificial-intelligence bert-model convnext cuda-programming deep-learning diffusion dreamfusion fairseq gpt2 llm-training llms mae moe optimizer pytorch resnet timm transformer-xl vit

Last synced: 05 Apr 2025

https://github.com/volcengine/veScale

A PyTorch Native LLM Training Framework

llm-training pytorch

Last synced: 04 Dec 2024

https://github.com/rohan-paul/llm-finetuning-large-language-models

LLM (Large Language Model) FineTuning

gpt-3 gpt3-turbo large-language-models llama2 llm llm-finetuning llm-inference llm-serving llm-training mistral-7b open-source-llm pytorch

Last synced: 04 Apr 2025

https://github.com/rohan-paul/LLM-FineTuning-Large-Language-Models

LLM (Large Language Model) FineTuning

gpt-3 gpt3-turbo large-language-models llama2 llm llm-finetuning llm-inference llm-serving llm-training mistral-7b open-source-llm pytorch

Last synced: 02 Feb 2025

https://github.com/feifeibear/long-context-attention

USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference

attention-is-all-you-need deepspeed-ulysses llm-inference llm-training pytorch ring-attention

Last synced: 14 May 2025

https://github.com/mallorbc/finetune_llms

Repo for fine-tuning causal LLMs

docker falcon gpt gpt-3 gpt-35-turbo gpt-4 gpt-j-6b llama llama2 llm llm-training mpt

Last synced: 05 Apr 2025

https://github.com/mallorbc/Finetune_LLMs

Repo for fine-tuning causal LLMs

docker falcon gpt gpt-3 gpt-35-turbo gpt-4 gpt-j-6b llama llama2 llm llm-training mpt

Last synced: 12 Apr 2025

https://github.com/flagai-open/aquila2

The official repo of Aquila2 series proposed by BAAI, including pretrained & chat large language models.

llm llm-inference llm-training

Last synced: 15 May 2025

https://github.com/FlagAI-Open/Aquila2

The official repo of Aquila2 series proposed by BAAI, including pretrained & chat large language models.

llm llm-inference llm-training

Last synced: 08 Apr 2025

https://github.com/internlm/internevo

InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.

910b deepspeed-ulysses flash-attention gemma internlm internlm2 llama3 llava llm-framework llm-training multi-modal pipeline-parallelism pytorch ring-attention sequence-parallelism tensor-parallelism transformers-models zero3

Last synced: 14 Apr 2025

https://github.com/InternLM/InternEvo

InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.

910b deepspeed-ulysses flash-attention gemma internlm internlm2 llama3 llava llm-framework llm-training multi-modal pipeline-parallelism pytorch ring-attention sequence-parallelism tensor-parallelism transformers-models zero3

Last synced: 27 Mar 2025

https://github.com/promptslab/llmtuner

Fine-tune LLMs in a few lines of code (Text2Text, Text2Speech, Speech2Text)

fine-tuning fine-tuning-llm finetune finetune-gpt finetune-llama finetune-llm finetune-llms finetune-whisper finetunechatgpt finetuning finetuning-large-language-models finetuning-rl llm llm-framework llm-inference llm-training llmops llmtuner whisper whisper-finetune

Last synced: 09 Apr 2025

https://github.com/yinizhilian/iclr2025-papers-with-code

A collection of ICLR papers and open-source projects from past years, covering ICLR2021, ICLR2022, ICLR2023, ICLR2024, and ICLR2025.

deep-learning-paper gemmini gpt iclr2021 iclr2022 iclr2023 iclr2024 llama3 llm-agent llm-framework llm-reasoning llm-training llms machine-learning nlp-keywords-extraction nlp-machine-learning paperwithcode python transformer

Last synced: 05 Apr 2025

https://github.com/yinizhilian/iclr2024-papers-with-code

A collection of ICLR 2024 papers and open-source projects

deep-learning-paper gemmini gpt iclr2021 iclr2022 iclr2023 iclr2024 llama3 llm-agent llm-framework llm-reasoning llm-training llms machine-learning nlp-keywords-extraction nlp-machine-learning paperwithcode python transformer

Last synced: 24 Jan 2025

https://github.com/armbues/SiLLM

SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.

apple-silicon dpo large-language-models llm llm-inference llm-training lora mlx

Last synced: 25 Nov 2024

https://github.com/shivendrra/smalllanguagemodel

An LLM cookbook for building your own from scratch, all the way from gathering data to training a model

bert-model decoder-model gpt llm-cookbook llm-training llms machine-learning neural-networks transformer

Last synced: 12 Apr 2025

https://github.com/shivendrra/SmallLanguageModel

An LLM cookbook for building your own from scratch, all the way from gathering data to training a model

bert-model decoder-model gpt llm-cookbook llm-training llms machine-learning neural-networks transformer

Last synced: 14 Mar 2025

https://github.com/itachi-uchiha581/auto-data

Auto Data is a library designed for quick and effortless creation of datasets tailored for fine-tuning Large Language Models (LLMs).

ai data finetuning-large-language-models finetuning-llms generative-ai llm llm-training python python3

Last synced: 06 Apr 2025

https://github.com/dsdanielpark/open-llm-datasets

Repository for organizing datasets and papers used in Open LLM.

datasets large-language-models llm llm-datasets llm-training natural-language-processing

Last synced: 04 Mar 2025

https://github.com/slai-labs/get-beam

Run GPU inference and training jobs on serverless infrastructure that scales with you.

artificial-intelligence cloud-computing cost-optimization data-science deep-learning distributed-computing gpu-acceleration gpu-computing hpc llm-serving llm-training machine-learning ml-infrastructure mlops python serverless serverless-architectures

Last synced: 18 Apr 2025

https://github.com/simplifine-llm/simplifine

🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨

ai cloud fine-tuning fine-tuning-llm finetuning-llms gpt instruction-tuning large-language-models llama llama3 llm llm-training lora mistral moe open-source peft phi qwen

Last synced: 16 Feb 2025

https://github.com/simplifine-llm/Simplifine

🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨

ai cloud fine-tuning fine-tuning-llm finetuning-llms gpt instruction-tuning large-language-models llama llama3 llm llm-training lora mistral moe open-source peft phi qwen

Last synced: 04 Dec 2024

https://github.com/tatevkaren/babygpt-build_gpt_from_scratch

BabyGPT: Build Your Own GPT Large Language Model from Scratch Pre-Training Generative Transformer Models: Building GPT from Scratch with a Step-by-Step Guide to Generative AI in PyTorch and Python

attention-is-all-you-need dropout-layers gpt gpt-3 large-language-models layer-normalization llm-training llms multi-head-self-attention neural-networks python pytorch residual-connections transformers

Last synced: 10 Apr 2025

https://github.com/microsoft/llf-bench

A benchmark for evaluating learning agents based on just language feedback

large-language-models llm llm-training llms machine-learning natural-language-processing reinforcement-learning

Last synced: 07 Apr 2025

https://github.com/smerkyg/gptcore

Fast modular code to create and train cutting edge LLMs

gpt llm-training llms machine-learning

Last synced: 12 May 2025

https://github.com/lamm-mit/Graph-Aware-Transformers

Graph-Aware Attention for Adaptive Dynamics in Transformers

ai4science attention-mechanism graph graph-aware huggingface-transformers language llm-training llms materials-informatics materials-science materiomics

Last synced: 11 Apr 2025

https://github.com/hkust-nlp/dart-math

Official implementation for the paper *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*

deep-learning llm llm-evaluation llm-inference llm-training mathematics nlp

Last synced: 20 Nov 2024

https://github.com/microsoft/LLF-Bench

A benchmark for evaluating learning agents based on just language feedback

large-language-models llm llm-training llms machine-learning natural-language-processing reinforcement-learning

Last synced: 18 Apr 2025

https://github.com/microsoft/mathoctopus

This repository contains resources for accessing the official benchmarks, code, and checkpoints of the paper "Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and Observations".

llm-training

Last synced: 05 Feb 2025

https://github.com/microsoft/MathOctopus

This repository contains resources for accessing the official benchmarks, code, and checkpoints of the paper "Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and Observations".

llm-training

Last synced: 02 Feb 2025

https://github.com/adithya-s-k/companionllm

CompanionLLM - A framework to finetune LLMs to be your own sentient conversational companion

fine-tuning finetuning hacktoberfest hacktoberfest-accepted hacktoberfest2023 huggingface llama llama2 llamacpp llm llm-inference llm-training lora mit-license open-source peft

Last synced: 03 Dec 2024

https://github.com/prismadic/magnet

the small distributed language model toolkit; fine-tune state-of-the-art LLMs anywhere, rapidly

apple-silicon claude distributed-computing distributed-systems embeddings fine-tuning finetuning-llms gemini huggingface inference-api langchain llm-training milvus mistral mlx nats nats-messaging nats-streaming sentence-splitting tokenizers

Last synced: 13 Apr 2025

https://github.com/ziming/laravel-scrapingbee

A PHP Laravel Library for Scrapingbee Web Scraping API

ai artificial-intelligence hacktoberfest laravel llm llm-training llms php scraping scraping-bee scrapingbee scrapingbee-api web-scraping webscraping

Last synced: 07 Apr 2025

https://github.com/mofheka/llama-megatron

A LLaMA1/LLaMA2 Megatron implementation.

llama llama2 llm llm-training megatron megatron-lm pytorch

Last synced: 23 Apr 2025

https://github.com/wassemgtk/llm.scala

Extensible implementation of a Language Model (LLM) training framework in Scala.

gpt llm llm-training transformer transformers-library

Last synced: 19 Apr 2025

https://github.com/arcee-ai/arcee-python

The Arcee client for executing domain-adapted language model routines https://pypi.org/project/arcee-py/

ai llm llm-inference llm-training llmops

Last synced: 21 Apr 2025

https://github.com/Prismadic/magnet

the small distributed language model toolkit; fine-tune state-of-the-art LLMs anywhere, rapidly

apple-silicon claude distributed-computing distributed-systems embeddings fine-tuning finetuning-llms gemini huggingface inference-api langchain llm-training milvus mistral mlx nats nats-messaging nats-streaming sentence-splitting tokenizers

Last synced: 12 Dec 2024

https://github.com/armbues/SiLLM-examples

Examples for using the SiLLM framework for training and running Large Language Models (LLMs) on Apple Silicon

apple-silicon dpo large-language-models llm llm-inference llm-training lora mlx

Last synced: 10 Apr 2025

https://github.com/ubos-tech/ai-chatbot-starter-kit

AI Chatbot Starter Kit: An open-source, extensible framework for rapidly developing custom AI chatbots with integrations for popular data sources, messaging platforms, LLM models, and CRM systems. Ideal for developers looking for a minimal boilerplate solution.

chatgpt chatgpt-api chatgpt-bot facebook-bot instagram-chatbot llm llm-agent llm-training low-code lowcode-editor node-red openai openai-api pinecone support-bot telegram-chat-bot ubos-tech whatsapp-bot

Last synced: 15 May 2025

https://github.com/aklinker1/vitepress-knowledge

Free, self-hosted LLM chatbot trained on your VitePress website.

ai llm-training vitepress vitepress-plugin

Last synced: 16 Mar 2025

https://github.com/erogol/blagpt

Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible experimentation and exploration.

attention-mechanisms deep-learning gpt gpt-2 hymba large-language-models llm llm-training machine-learning position-embedding pytorch transformers

Last synced: 06 Jan 2025

https://github.com/levitation-opensource/manipulative-expression-recognition

MER is software that identifies and highlights manipulative communication in text from human conversations and AI-generated responses. MER benchmarks language models for manipulative expressions, fostering the development of transparency and safety in AI. It also supports manipulation victims by detecting manipulative patterns in human communication.

benchmarking conversation-analysis conversation-analytics expression-recognition fraud-detection fraud-prevention human-computer-interaction human-robot-interaction llm llm-security llm-test llm-training manipulation misinformation prompt-engineering prompt-injection psychometrics sentiment-analysis sentiment-classification transparency

Last synced: 16 Mar 2025

https://github.com/moinulmoin/free-llmstxt-generator

converts webpage content into Markdown format, optimized for LLM training and context

crawling llm-context llm-training llmstxt markdown

Last synced: 23 Mar 2025

https://github.com/raumberg/myllm

LLM Training Framework

deep-neural-networks deepspeed framework huggingface huggingface-transformers llm llm-training python reinforcement-learning torch

Last synced: 11 Apr 2025

https://github.com/riccorl/llama-trainer

Llama Trainer Utility

huggingface llama llm llm-inference llm-training llms transformer

Last synced: 08 Mar 2025

https://github.com/rs-py/howtofinetunellama3.1

Quick tutorial showing how to fine-tune Llama3.1 with nothing but free tools and text data. All code is included in an ipynb notebook. For a step-by-step walkthrough, take a look at the tutorial below on Medium.

fine-tuning finetuning huggingface llama3 llm llm-training

Last synced: 24 Apr 2025

https://github.com/mewmix/gh_llm_loader

clone GitHub repositories and prepare their data for ingestion for LLMs.

context data data-structures github llm llm-training python

Last synced: 12 Jan 2025

https://github.com/amazon-science/llm-code-preference

Training and Benchmarking LLMs for Code Preference.

code-generation llm-evaluation llm-training llms-benchmarking

Last synced: 03 May 2025

https://github.com/endevsols/long-trainer

Introducing LongTrainer, a sophisticated extension of the LangChain framework designed specifically for managing multiple bots and providing isolated, context-aware chat sessions. Ideal for developers and businesses looking to integrate complex conversational AI into their systems, LongTrainer simplifies the deployment and customization of LLMs.

gpt langchain langchain-python llm-training longtrainer openai rag

Last synced: 01 May 2025

https://github.com/prismadic/tractor-beam

high-efficiency text & file scraper with smart tracking, client/server networking for building language model datasets fast

botnet cluster data file-downloader llm llm-finetuning llm-training mass-downloader scraping

Last synced: 07 May 2025

https://github.com/altunenes/rustysozluk

Efficiently fetch and perform sentiment analysis (Turkish Only) on eksisozluk.com entries using Rust

duyguanalizi eksi-sozluk eksisozluk llm-datasets llm-training reqwest rust rust-lang rust-scraping scraper sentiment-analysis turkish webscraping

Last synced: 17 Jan 2025

https://github.com/spider-rs/readability

The readability library for LLMs

clean-data data-cleaning llm-training readability rust safari-reader

Last synced: 05 Apr 2025

https://github.com/stoyan-stoyanov/transformers-calculator

Transformer Calculator - Estimate training time for transformer models.

llm llm-training transformers

Last synced: 23 Mar 2025

https://github.com/puneetkakkar/bitnet-1.58b

Bitnet 1.58b: This project implements the innovative 1-bit LLM architecture described in recent whitepapers, focusing on efficient training, inference, and open-source collaboration.

1-bit-quantization deep-learning large-language-models llm-training llms machine-learning nlp pytorch research-and-development

Last synced: 29 Dec 2024

https://github.com/Rs-py/HowToFineTuneLlama3.1

Quick tutorial showing how to fine-tune Llama3.1 with nothing but free tools and text data. All code is included in an ipynb notebook. For a step-by-step walkthrough, take a look at the tutorial below on Medium.

fine-tuning finetuning huggingface llama3 llm llm-training

Last synced: 06 Jan 2025

https://github.com/maris205/llama-gene

A General-purpose Gene Task Large Language Model Based on Instruction Fine-tuning

genetic-algorithm llama llm-training

Last synced: 10 Apr 2025

https://github.com/simranjeet97/learn_rag_from_scratch_llm

Learn Retrieval-Augmented Generation (RAG) from Scratch using LLMs from Hugging Face and Langchain or Python

artificial-intelligence datascience-machinelearning genai-domain genai-usecase generative-ai llm-apps llm-evaluation llm-framework llm-training rag rag-application rag-chatbot rag-embeddings rag-evaluation rag-implementation rag-llm rag-model rag-pipeline retrieval-augmented-generation

Last synced: 15 Apr 2025

https://github.com/amanpriyanshu/generatorpromptkit

GeneratorPromptKit: A Python Library and Framework for Automated Generator Prompting and Dataset Generation

augmentation automated-prompt-engineering data data-augmentation data-science dataset dataset-generation diverse-data llm llm-training llms prompt-engineering synthetic-data synthetic-dataset-generation

Last synced: 22 Mar 2025

https://github.com/amazon-science/mezo_svrg

Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models"

deep-learning fine-tuning language-model large-language-models llm-training llms machine-learning machine-learning-algorithms optimization optimization-algorithms svrg variance-reduction zero-order-methods

Last synced: 03 May 2025

https://github.com/ethicalabs-ai/kurtis

Kurtis is a fine-tuning, inference and evaluation tool built for SLMs (Small Language Models), such as Huggingface's SmolLM2.

ai huggingface-peft huggingface-smollm2 huggingface-transformers llm llm-inference llm-training machine-learning mental-health mental-health-awareness ml question-answering question-answering-model slms small-language-models smollm smollm2 transformers

Last synced: 04 Apr 2025

https://github.com/rahulunair/simpsons_llm_xpu

Fine-tune an LLM on Intel discrete GPUs to generate dialogues based on the Simpsons dataset

huggingface-transformers intel-arc intel-gpu intel-gpu-max ipex llm-inference llm-training lora pytorch

Last synced: 03 Mar 2025

https://github.com/anonym0uswork1221/python-code-docstring-scraper

A multi-threaded GitHub scraper to collect Python code with docstrings from public repositories, creating a well-documented dataset for the JaraConverse LLM model.

causal-language-modeling data-scraping dataset dataset-generation dataset-scripts docst docstring-generator github-scraper llm llm-training nlp nlp-machine-learning python-code python-dataset python3 scraper script

Last synced: 11 Jan 2025

https://github.com/tousif47/mini-llm

Training and testing a few types of small LLM from scratch for practice

keras-neural-networks llm llm-inference llm-training python3

Last synced: 25 Feb 2025

https://github.com/aravinda-1402/legal-query-ai-assistant

Legal Query AI Assistant is a chatbot that leverages LLMs like OpenAI GPT and LLaMA, with RAG to retrieve and summarize legal documents efficiently.

chatbot gpt llm llm-training llma2 natural-language-processing natural-language-understanding openai-api python

Last synced: 11 Apr 2025

https://github.com/fork123aniket/llm-rag-powered-qa-app

A Production-Ready, Scalable RAG-powered LLM-based Context-Aware QA App

context-aware-system eleutherai fine-tuning large-language-models llm-inference llm-serving llm-training llmops parameter-efficient-fine-tuning question-answering ray ray-serve retrieval-augmented-generation

Last synced: 16 Jan 2025

https://github.com/visheshc14/llm-fastapi

NimbleBox Apprenticeship ML Engineer Task - 1. This project demonstrates the implementation of a Language Model Server using FastAPI and gRPC. It leverages a large language model to generate coherent text based on user input.

fastapi grpc llm-training multithreading

Last synced: 12 Mar 2025

https://github.com/leo848/deversai

Source code for the Jugend forscht project "DEversAI: Training and Visualization of German-Localized Directionally Complementary LLMs"

ai german-language gpt-2 jugend-forscht llm llm-training pytorch

Last synced: 11 Apr 2025

https://github.com/aleefbilal/finetuning-in-docker

This project provides a Docker setup for fine-tuning the Llama 3.1 model. The Dockerfile installs the necessary dependencies and sets up the environment for training. This guide will help you understand how to build and run the Docker container for your fine-tuning tasks using an interactive jupyter notebook.

docker finetuning jupyter-notebook llm-training

Last synced: 30 Apr 2025

https://github.com/anshulranjan2004/microrwkv

Implementation of a custom architecture on nanoRWKV: A nanoGPT-style adaptation of the RWKV Language Model, which combines the simplicity of RNNs with GPT-level performance for large language models (LLMs).

llm llm-training pytorch rwkv

Last synced: 19 Apr 2025

https://github.com/0xnu/multicollinearity_llm

A multicollinearity-based compression program written in C that identifies and removes highly correlated weights in neural networks, thereby reducing redundancy.

compression compression-algorithm llm llm-training multicollinearity

Last synced: 08 Feb 2025

https://github.com/gurpreetkaurjethra/quantize-llm-using-awq

Quantize LLM using AWQ

awq generative-ai large-language-models llm-training llms quantize

Last synced: 22 Nov 2024

https://github.com/gurpreetkaurjethra/llms-inference-and-fine-tuning

Estimate Memory Consumption of LLMs Inference and Fine Tuning

fine-tuning generative-ai large-language-models llm-inference llm-training llms memory-allocation

Last synced: 22 Nov 2024

https://github.com/sreeeswaran/train-your-llm

This repository contains code and resources for training, fine-tuning, and deploying large language models using Hugging Face's Transformers library.

artificial-intelligence deep-learning language-model large-language-model large-language-models llm llm-training llms machine-learning model-training nlp pretrained-language-model pretrained-models training

Last synced: 22 Nov 2024

https://github.com/shreyanmitra/candyllm

A simple, easy-to-use framework for HuggingFace and OpenAI text-generation models focused on explainability.

alpaca falcon gpt-4 huggingface llama3 llm-training llms mistral openai transformers vicuna

Last synced: 22 Dec 2024

https://github.com/firojalam/llamalens

This repository contains the resources, code, and documentation for LlamaLens, a specialized multilingual large language model (LLM) designed to analyze news and social media content effectively. LlamaLens supports multiple languages, including Arabic, English, and Hindi, and is tailored for diverse tasks such as sentiment analysis, misinformation.

arabic downstream-tasks emotion-detection english hindi llm llm-inference llm-training newsmedia sentiment-classification social-media

Last synced: 28 Dec 2024

https://github.com/hrolive/poland-end-to-end-llm-bootcamp

This bootcamp is designed to give NLP researchers an end-to-end overview of the fundamentals of the NVIDIA NeMo framework, a complete solution for building large language models. It also includes hands-on exercises complemented by tutorials, code snippets, and presentations to help researchers get started with NeMo LLM Service and Guardrails.

gpt llama2 llm llm-inference llm-training nemo-guardrails nvidia nvidia-nemo p-tuning prompt-tuning tensorrt triton

Last synced: 20 Apr 2025

https://github.com/ethicalabs-ai/flowertune-qwen2.5-coder-0.5b-instruct

FlowerTune LLM on Coding Dataset

ai federated-learning federated-learning-framework llm-fine-tuning llm-finetuning llm-training machine-learning ml qwen2 qwen2-5 transformers transformers-models

Last synced: 04 Apr 2025

https://github.com/shreyanmitra/CandyLLM

A simple, easy-to-use framework for HuggingFace and OpenAI text-generation models focused on explainability.

alpaca falcon gpt-4 huggingface llama3 llm-training llms mistral openai transformers vicuna

Last synced: 19 Feb 2025

https://github.com/lamm-mit/graph-aware-transformers

Graph-Aware Attention for Adaptive Dynamics in Transformers

ai4science attention-mechanism graph graph-aware huggingface-transformers language llm-training llms materials-informatics materials-science materiomics

Last synced: 02 Mar 2025

https://github.com/agents4good/masterchef-ai

Visit: https://agents4good.github.io/MasterChef-AI/

artificial-intelligence distillation distillation-model llm llm-inference llm-training open-source ufcg

Last synced: 11 Mar 2025

https://github.com/dev-d-gr8/storyscape

A generative-AI storytelling iOS application (it generates stories with pictures) built on a custom LLaMA 3.2 3B-Instruct model fine-tuned on Hindi stories, with a provision to generate English stories via a call to OpenAI GPT-4o.

app aws django docker generative-ai generative-art ios jenkins llm llm-inference llm-training llmops mobile-development python sagemaker swift swiftui

Last synced: 03 Mar 2025

https://github.com/bhaskarr103/legaldoc_ai

A research-driven approach to building an AI-powered legal assistant that leverages FAISS for efficient legal document retrieval and a Hugging Face model for intelligent response generation.

faiss-vector-database langchain llm-training

Last synced: 22 Mar 2025

https://github.com/jianzhnie/scaletorch

A PyTorch toolkit for large model training

distributed llm-training torch

Last synced: 28 Mar 2025

https://github.com/dkimjpg/ai-large-language-tagger-using-hidden-markov-models

An implementation of a language part-of-speech (POS) tagger using Hidden Markov Models. Basically, it takes English text as input and tries to tag each word as a noun, verb, adjective, etc. based on the input's sequence.

ai hmm-viterbi-algorithm llm-training

Last synced: 09 Mar 2025