Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with llm-inference

A curated list of projects in awesome lists tagged with llm-inference.

https://github.com/thefcraft/localgpt

Clone of ChatGPT using HTML, CSS, JS, and Flask

chatgpt chatgpt-clone chatgpt-gui flask llm-inference python

Last synced: 12 Jan 2025

https://github.com/firojalam/llamalens

This repository contains the resources, code, and documentation for LlamaLens, a specialized multilingual large language model (LLM) designed to analyze news and social media content effectively. LlamaLens supports multiple languages, including Arabic, English, and Hindi, and is tailored for diverse tasks such as sentiment analysis, misinformation.

arabic downstream-tasks emotion-detection english hindi llm llm-inference llm-training newsmedia sentiment-classification social-media

Last synced: 28 Dec 2024

https://github.com/nickpotafiy/illama

A fast, lightweight, parallel inference server for Llama LLMs.

exllama exllamav2 flash-attention-2 inference llama llama2 llama3 llm-inference paged-attention server

Last synced: 10 Oct 2024

https://github.com/giuseppebellamacina/vulnerabilitybot

Vulnerability Bot with Database

cybersecurity-tool llm llm-inference

Last synced: 05 Dec 2024

https://github.com/niansa/libjustlm

Super easy to use library for doing LLaMA/GPT-J stuff! - Mirror of: https://gitlab.com/niansa/libjustlm

ai cpp17 cpp20 gpt-j llama llama2 llm llm-inference mpt python wrapper-library

Last synced: 13 Jan 2025

https://github.com/williamzebrowski/assistant-api

OpenAI Assistant API integrated with Elasticsearch, Logstash & Kibana

ai chatapp chatgpt conversational-ai data elasticsearch kibana llm-inference llms openai rag

Last synced: 11 Oct 2024

https://github.com/aadit3003/llm-rhyme-eval

English and Dutch rhyming datasets (5k word pairs each) for five types of rhymes. Three open-source LLMs (Llama2, Llama3, CrystalChat) are tested on these datasets, with prompt variation.

llama2 llama3 llm llm-evaluation llm-inference nlp orthography phonology rhyme rhyme-analysis

Last synced: 22 Dec 2024

https://github.com/aadit3003/llm-medical-personas

An examination of whether LLMs can maintain consistency across extended generation of multiple texts for 10 medical personas. Proposes 5 novel plausibility metrics and an ontology of common LLM errors.

ai bart flant5 gen llama2 llama2-7b llm llm-inference maximal-marginal-relevance medical mmr nlp nlp-medical-records question-answering

Last synced: 22 Dec 2024

https://github.com/sureshbeekhani/ai-quick-summaries

Developed an AI-powered web app using Streamlit and Google Gemini AI for generating concise summaries from PDFs, images, and text files. The app features real-time text summarization, file upload support, and a user-friendly interface.

chatbot gemini gpt image-and-pdf llm llm-inference python streamlit

Last synced: 07 Dec 2024

https://github.com/abhaskumarsinha/corpus2gpt

Corpus2GPT: A project enabling users to train their own GPT models on diverse datasets, including local languages and various corpus types, using Keras and compatible with TensorFlow, PyTorch, or JAX backends for subsequent storage or sharing.

attention-mechanism jax keras large-language-models llm llm-inference llm-training python3 pytorch tensorflow

Last synced: 21 Nov 2024

https://github.com/rahulunair/simple_llm_inference

A simple example of LLM inference on Intel GPUs (XPUs)

intel-arc intel-gpu-max intelgpu ipex llm-inference transformers

Last synced: 13 Jan 2025

https://github.com/wtlow003/speculative-sampling

Implementation of Speculative Sampling in "Accelerating Large Language Model Decoding with Speculative Sampling"

deepmind llm-inference speculative-decoding speculative-sampling

Last synced: 16 Jan 2025

https://github.com/paulpierre/vllm-docker

Quickly test a 4-bit quantized Llama-3.2-11B-Vision-Instruct on an A100 40GB

docker docker-compose llama llama3 llm llm-inference llms vllm

Last synced: 20 Jan 2025

https://github.com/aaashrafhabib/advanced-rag-system-

End-to-end advanced RAG project using open-source LLMs and Groq inference

generative-ai langchain llm-inference python rag

Last synced: 15 Jan 2025

https://github.com/mukeshmithrakumar/llm-poc-2024

Popular Large Language Models from scratch - 2024

gpt llama llm llm-inference llm-training transformer

Last synced: 17 Jan 2025

https://github.com/dawid-szewc/perplexity-cli

🧠 A simple command-line client for the Perplexity API. Ask questions and receive answers directly from the terminal! 🚀🚀🚀

ai bash bash-script llm-inference perplexity perplexity-ai perplexity-api python python3 zsh

Last synced: 02 Dec 2024

https://github.com/drake9098/vulnerabilitybot

A client-server structure for making queries and sending them to an AI model

cybersecurity llm llm-inference

Last synced: 28 Nov 2024

https://github.com/dev-d-gr8/storyscape

A generative AI storytelling application for iOS that generates stories with pictures, built on a custom LLaMA 3.2 3B-Instruct model fine-tuned on Hindi stories, with a provision to generate English stories via a call to OpenAI GPT-4o.

app aws django docker generative-ai generative-art ios jenkins llm llm-inference llm-training llmops mobile-development python sagemaker swift swiftui

Last synced: 14 Jan 2025

https://github.com/eternalflame02/single-node-finetuning-of-tiny-llama-using-intel-xeon-spr

The project was undertaken as part of the Intel Unnati Industrial Training program for the year 2024. The primary objective of this project aligns with Problem Statement PS-04: Introduction to GenAI LLM Inference on CPUs and subsequent LLM Model Finetuning for the development of a Custom Chatbot.

intel-unnati llm-finetuning llm-inference python tinyllama

Last synced: 12 Dec 2024

https://github.com/collab-uniba/irc-setfit-ollama-demo

Issue report classification demo with SetFit and Ollama for NASA's Flight System software repositories

docker issue-management llm-inference ollama-python setfit

Last synced: 10 Jan 2025

https://github.com/sergio11/streamlit_llm_langchain_applications

Explore innovative Language Model applications (LLMs) with Streamlit-based Proof of Concepts (POCs) 🚀. These demos showcase open-source models using Groq for cloud-based inference and LangChain for efficient orchestration. From writing assistants to blog post generators, experience AI-driven tools enhancing productivity and creativity 📚💡.

chromadb faiss faiss-vector-database groq-ai groq-api langchain langchain-python llama3 llm llm-framework llm-inference llms mistral-7b streamlit tavily

Last synced: 14 Dec 2024

https://github.com/abhaskumarsinha/Corpus2GPT

Corpus2GPT: A project enabling users to train their own GPT models on diverse datasets, including local languages and various corpus types, using Keras and compatible with TensorFlow, PyTorch, or JAX backends for subsequent storage or sharing.

attention-mechanism jax keras large-language-models llm llm-inference llm-training python3 pytorch tensorflow

Last synced: 20 Oct 2024

https://github.com/es7/introduction-to-llms

In this repository I explain applications of Large Language Models (LLMs), from using LLMs in your own application to building an LLM.

computer-vision deep-learning huggingface llm llm-framework llm-inference llm-training machine-learning natural-language-processing prompt-engineering prompt-learning

Last synced: 11 Jan 2025

https://github.com/niansa/discord_llama

Multi-Model and multi-tasking llama Discord Bot - Mirror of: https://gitlab.com/niansa/discord_llama

ai cpp20 discord-bot llama llama2 llamacpp llm llm-inference

Last synced: 13 Jan 2025

https://github.com/projects-mk/chat-with-documents-quickstart

Repository containing code for setting up RAG on your machine. Supports OpenAI as well as Hugging Face LLMs and embedding models.

huggingface langchain-python langfuse llm llm-inference ollama open-source openai rag

Last synced: 13 Nov 2024

https://github.com/ashmadev/react-ollama-ui

Awesome UI for interacting with your local LLMs

ai chatbot llm-inference ollama

Last synced: 19 Nov 2024

https://github.com/richardsonlima/synapsense

SynapSense: Python In-Context Learning for Large Language Models. SynapSense is a cutting-edge Python library designed to streamline the implementation of In-Context Learning (ICL) with Large Language Models (LLMs).

ai genai llm llm-agent llm-inference llmops llms

Last synced: 13 Nov 2024

https://github.com/khaledsharif/openrag

openrag = ollama + dspy + chroma

llm-inference rag vector-database

Last synced: 23 Nov 2024

https://github.com/alexlnkp/remi

A basic LLM chatbot named Remi

chatbot llm-inference

Last synced: 19 Dec 2024

https://github.com/regular-baf/bafchat

Bringing local LLMs to a Minecraft front-end through commands.

ai api gpt llama llm llm-inference minecraft minecraft-mod redpajama

Last synced: 17 Jan 2025

https://github.com/siddhant-k-code/contextify

Python script designed to streamline the process of providing context to Large Language Models (LLMs) from your project files. It's particularly useful when working on coding tasks with LLMs, as it can automatically gather and format relevant code from your project.

context contextify llm llm-inference python

Last synced: 01 Dec 2024

https://github.com/duck4i/hnlp-translate

Translate JSON localizations automatically - Helsinki NLP model

helsinki-nlp llm-inference transformers

Last synced: 24 Dec 2024

https://github.com/kira94-hkz/powerserve

High-speed and easy-to-use LLM serving framework for local deployment

llama llm llm-inference llm-serving npu qwen smallthinker smartphone

Last synced: 20 Jan 2025

https://github.com/elskow/multilang-saas-paraphrasing-tool

Forked version of https://github.com/alfazh123/ParaFaze with state-of-the-art over-engineering :)

llm-inference nextjs page-router paraphrasing-tool python t5-model

Last synced: 12 Jan 2025

https://github.com/thansen0/fastllm.cpp

A low-latency, fault-tolerant API for accessing LLMs, written in C++ using llama.cpp.

llamacpp llm llm-inference

Last synced: 22 Dec 2024

https://github.com/hherpa/squidblock-ru-docs

SquidBlock is an innovative platform that opens up limitless possibilities for developing and running block-modular LLM systems.

agent-oriented-programming agentic-agi ai autogen chat-application chatbot documentation gpt-35-turbo gpt-4 graph llm llm-agents llm-inference llmops node pipeline rag retrieval-augmented-generation

Last synced: 24 Dec 2024

https://github.com/saherpathan/invoicify-ai-cohere

A Flask application that extracts invoice details from uploaded PDFs and images using LLM inference API

cohereapi flask-application llm-inference ocr pdfplumber python3

Last synced: 20 Jan 2025

https://github.com/zvoverman/yt-video-summarizer

Firefox extension that summarizes YouTube videos using AI

ai javascript llm-inference longformer-models

Last synced: 24 Nov 2024

https://github.com/nazago/meeting-minutes-generator

A script that takes a .wav audio file, performs speech-to-text using OpenAI Whisper, and then uses Llama3 to generate a summary and action points from the transcript.

langchain-python llm-inference local-inference meeting-minutes ollama speech-to-text summarization whisper

Last synced: 02 Jan 2025

https://github.com/vicky87883/urlshortner

UrlShortner maps long URLs to shorter ones. The app is written entirely in Python and uses a PostgreSQL database for the URL mapping.

django django-rest-framework large-language-models llm-inference postgresql-database url-parser url-shortener url-shortener-microservice

Last synced: 30 Nov 2024

https://github.com/jkevin2010/llms-for-dementia-detection

Fine-tuning Large Language Models (LLMs) for the early detection of Alzheimer's Disease and Related Dementias (ADRD)

alzheimer-disease-prediction chatgpt dementia-detection finetuning gpt-4 llm-inference machine-learning

Last synced: 07 Dec 2024

https://github.com/abhinav330/msc-project

AI-Powered Chatbot for University Websites. This project enhances the usability of university websites by providing an AI-driven chatbot powered by advanced Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG).

chatbot data-science data-visualization finetuning-llms gemma2 llama3 llama3-finetune llm llm-inference mistral-7b nlp ollama phi-3-mini rag research-project

Last synced: 02 Jan 2025

https://github.com/hrolive/large-language-models-on-supercomputers

Comprehensive exploration of LLMs, including cutting-edge techniques and tools such as parameter-efficient fine-tuning (PEFT), quantization, zero redundancy optimizers (ZeRO), fully sharded data parallelism (FSDP), DeepSpeed, and Huggingface accelerate.

deepspeed evaluation-metrics fsdp high-performance-computing hpc huggingface huggingface-transformers jupyter llm llm-inference llm-training monitoring peft python quantization slurm tokenization transformer unsloth

Last synced: 04 Jan 2025

https://github.com/howardchiang2/vllm

Quickly validate vLLM on Colab

llm llm-inference

Last synced: 13 Jan 2025

https://github.com/iamaziz/llm-cost-estimator

Estimate the cost of using OpenAI models based on the number of input and output tokens.

cost-estimation llm-inference openai-api

Last synced: 04 Jan 2025
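
The token-based estimate described in the entry above comes down to simple arithmetic. The following is a minimal sketch of that calculation and is not taken from the listed repository. The `Pricing` class, the `estimate_cost` function, and the per-million-token prices are all hypothetical placeholders, and real rates vary by model and change over time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-million-token prices in USD (placeholders, not real rates)."""
    input_per_million: float
    output_per_million: float


def estimate_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Return the estimated cost in USD for one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Input and output tokens are billed at separate rates, then summed.
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Illustrative prices only: 2 USD per million input tokens, 8 USD per million output tokens.
example = Pricing(input_per_million=2.0, output_per_million=8.0)
print(f"{estimate_cost(1_500, 500, example):.4f} USD")
```

Keeping the prices in a separate value object means a rate change only touches the data, not the arithmetic.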

https://github.com/cai991108/machine-learning-and-language-model

This project explores GPT-2 and Llama models through pre-training, fine-tuning, and Chain-of-Thought (CoT) prompting. It includes memory-efficient optimizations (SGD, LoRA, BAdam) and evaluations on math datasets (GSM8K, NumGLUE, StimulEq, SVAMP).

chainofthought finetune-llm gpt2 llama llm llm-inference pretrained-language-model

Last synced: 20 Jan 2025

https://github.com/kristofferv98/agent_nexus

Agentic framework for dynamic function calling across latest LLMs (gpt-4o, gemini-2.0-flash, groq modes, and anthropic models). Converts Python functions into provider-specific schemas for autonomous tool use. Features unified API, JSON schema generation, and integrated tool execution handling.

agent-orchestration agents anthropic function-calling gemini gemini-2-0-flash-exp gemini-tools groq json-schema llm-inference multi-llm openai parallel-processing schema-generation tool-generator tool-integration tools

Last synced: 20 Dec 2024

https://github.com/rfdzan/t5-llm-training

A workflow for training T5 language models

language-model llm-inference llm-training pytorch

Last synced: 28 Dec 2024

https://github.com/mohammad-nour-alawad/voice-interpreter-for-data-visualization

Django app providing a voice interpreter for manipulating and visualizing data.

data-visualization django javascript llm-inference voice-assistant

Last synced: 25 Nov 2024

https://github.com/saritaphd/medical-chatbot-using-llama2

This project is a medical chatbot powered by the open-source Llama 2 model and integrated with Pinecone for efficient vector search. The chatbot is designed to answer user queries based on information extracted from medical documents (PDFs).

chatbot-application generative-ai langchain llama2 llm-inference llm-training pinecone python

Last synced: 22 Dec 2024

https://github.com/omars44/open-assistant-demo

LangChain Open Assistant demo using the Hugging Face Hub (Inference API)

langchain langchain-python llm llm-inference open-assistant

Last synced: 19 Jan 2025

https://github.com/rs-py/howtofinetuneanllm

This is a step-by-step example of how to quickly fine-tune an LLM on simple text data without access to robust hardware. For a more in-depth treatment, read the article on Medium that is linked below.

fine-tuning huggingface huggingface-transformers llama llm llm-inference

Last synced: 18 Jan 2025

https://github.com/t-mohamed-shafeek/llm-for-language-translation

This repository contains a simple, beginner-level notebook that uses the mBART model to translate English text into Indian languages.

huggingface-transformers language-tra llm llm-inference mbart mbart50 nlp nmt

Last synced: 14 Dec 2024

https://github.com/thansen0/fast-llm-api

A low-latency, fault-tolerant API for accessing LLMs.

llm llm-inference

Last synced: 22 Dec 2024

https://github.com/ripan-roy/mcts-chain-of-thought

This project implements Monte Carlo Tree Search with a large language model to generate and evaluate reasoning chains (chains of thought) for advanced problem-solving.

chain-of-thought large-language-models llama3 llm-evaluation llm-inference llms monte-carlo-tree-search ollama-api python

Last synced: 12 Jan 2025

https://github.com/riolaf05/langchain-fastapi-rag-platform

A platform to test multiple LLM models inside a RAG workflow to choose the best model for embedding and retrieval and the best prompt according to the use case

artificial-intelligence aws cloud iac iac-terraform infrastructure langchain langchain-python llm llm-inference llms python rag serverless terraform

Last synced: 20 Nov 2024

https://github.com/howardchiang2/easy_mode_llm_inference

Simple llama.cpp usage

llama2 llm llm-inference

Last synced: 22 Dec 2024

https://github.com/mapluisch/llava-websocket-server

Python-based WebSocket for CLI LLaVA inference.

inference llama llama2 llava llm llm-inference python websocket websockets

Last synced: 12 Jan 2025

https://github.com/duck4i/node-llama

LLM inside your Node.js

llamacpp llm llm-inference local nodejs

Last synced: 05 Jan 2025

https://github.com/hoehrmann/ietf-cert

Experimental autonomous AI LLM & RAG IETF reviewer

ai autonomous ietf llm-inference quality-assurance retrieval-augmented-generation

Last synced: 05 Dec 2024

https://github.com/karan-parmar-007/blog-ai

Whisper AI is an innovative blogging platform that combines advanced artificial intelligence and cloud technologies to revolutionize the way users create, share, and engage with content. With a range of features designed to enhance the blogging experience, Whisper AI empowers users to express themselves freely while ensuring privacy and security.

artificial-intelligence bootstrap django html-css-javascript langchain large-language-models llama2 llamacpp llm-inference python3 security translation

Last synced: 22 Dec 2024

https://github.com/0xnu/fine_tune_llm_docker

Fine-tune large language models (LLMs) using the Hugging Face Transformers library.

docker llm llm-fine-tuning llm-finetuning llm-inference

Last synced: 15 Dec 2024

https://github.com/andrewkchan/deepseek.cpp

CPU inference for the DeepSeek family of large language models in pure C++

cpp deepseek llama llm llm-inference machine-learning transformers

Last synced: 20 Jan 2025

https://github.com/danielrosehill/the-llm-files

A blog about my adventures and discoveries working with large language models (LLMs).

large-language-models llm-benchmarking llm-inference llms

Last synced: 04 Dec 2024

https://github.com/mrseanryan/gpt-function-calling-bare-bones

Function Calling an LLM taking a bare-bones (no libraries) approach

function-calling llm llm-inference

Last synced: 28 Dec 2024

https://github.com/dluc/ai-cronjobs

A collection of simple cronjobs for macOS, leveraging LLMs.

ai cronjob llm-inference macos

Last synced: 09 Dec 2024

https://github.com/chloelavrat/personal-assistant

Access different personal assistants with HuggingChat and Streamlit

assistant-chat-bots huggingchat llm llm-agent llm-inference streamlit summary

Last synced: 02 Jan 2025

https://github.com/fahmiaziz98/large_language_model

This repository contains my practice work from learning LLMs, specifically BERT, T5, and GPT-2.

bert fine-tuning gpt-2 huggingface large-language-models llm-inference openai streamlit t5

Last synced: 20 Nov 2024

https://github.com/leozqin/hops

A load-balancing reverse proxy server that enables you to address a fleet of diverse Ollama instances as a single one

llama llm llm-inference load-balancer ollama ollama-api reverse-proxy self-hosted selfhosted

Last synced: 20 Jan 2025

https://github.com/tinybiggames/askllm

LLM API Access for Delphi™

ai claude-api llm-inference llms openai-api win64 windows-10 windows-11

Last synced: 05 Dec 2024

https://github.com/tybrucechen/llm-based-web-chat-robot

Web chat robot based on the Llama3.2-1B model, deployed server-side, with support for continuous conversation

audio-transcribing llama3 llm llm-inference text-generation

Last synced: 11 Dec 2024

https://github.com/wtlow003/ngram-decoding

(Re)-implementation of "Prompt Lookup Decoding" by Apoorv Saxena, with extended ideas from LLMA Decoding.

llm-inference n-gram ngram-decoding prompt-lookup-decoding speculative-decoding

Last synced: 16 Jan 2025