Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with llm-inference
A curated list of projects in awesome lists tagged with llm-inference.
https://github.com/ooridata/toolio
GenAI & agent toolkit for Apple Silicon Mac, implementing JSON schema-steered structured output (3SO) and tool-calling in Python. For more on 3SO: https://huggingface.co/blog/ucheog/llm-power-steering
agentic ai apple-silicon client-server genai json-schema llm llm-inference mac mlx tool-calling tools
Last synced: 28 Dec 2024
https://github.com/webgptorg/promptbook
It's time for a paradigm shift! The future of software is in plain English ✨
Last synced: 29 Dec 2024
https://github.com/xtekky/gpt4local
OpenAI-style, fast and lightweight local language model inference with documents
ai chatbot chatbots chatgpt chatgpt-api documents gpt gpt-4 gpt4free language-model llm llm-inference local local-llm openai openai-api python
Last synced: 27 Oct 2024
https://github.com/felladrin/MiniSearch
Minimalist web-searching app with an AI assistant that runs directly from your browser. Uses Web-LLM, Ratchet-ML, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space
ai artificial-intelligence generative-ai gpu-accelerated information-retrieval llm llm-inference machine-learning nlp question-answering ratchet-ml retrieval-augmented-generation search search-engine searxng typescript web-llm webapp wllama
Last synced: 29 Oct 2024
https://github.com/kddubey/cappr
Completion After Prompt Probability. Make your LLM make a choice
huggingface kv-cache llamacpp llm-inference probability prompt-engineering text-classification zero-shot
Last synced: 29 Dec 2024
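The idea named in the entry above (scoring each candidate completion by its probability after the prompt, then picking the most likely one) can be sketched roughly as follows. This is a toy illustration only, not the cappr API: the log-probabilities are made-up numbers standing in for what a real model would return, and `choose_completion` is a hypothetical helper.

```python
import math

# Hypothetical per-token log-probabilities that a model might assign to each
# candidate completion given some prompt. A real setup would query an LLM.
TOY_TOKEN_LOGPROBS = {
    "positive": [-0.2, -0.1],
    "negative": [-1.5, -0.7],
    "neutral": [-2.3, -0.4],
}


def choose_completion(token_logprobs):
    """Return (best completion, normalized probabilities over completions)."""
    # Average token log-probs so longer completions are not penalized
    # purely for having more tokens.
    scores = {c: sum(lps) / len(lps) for c, lps in token_logprobs.items()}
    # Softmax over the candidate set, shifted by the max for stability.
    max_score = max(scores.values())
    exps = {c: math.exp(s - max_score) for c, s in scores.items()}
    total = sum(exps.values())
    probs = {c: v / total for c, v in exps.items()}
    best = max(probs, key=probs.get)
    return best, probs


best, probs = choose_completion(TOY_TOKEN_LOGPROBS)
print(best, probs)
```

Because the model only scores a fixed set of candidates, the output is always one of the allowed choices, which is what makes this approach attractive for classification-style tasks.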
https://github.com/picovoice/pico-cookbook
Recipes for on-device voice AI and local LLM
cookbook llm llm-inference local-llm on-device-ai recipes voice-ai voice-assistant
Last synced: 01 Jan 2025
https://github.com/mobile-artificial-intelligence/maid_llm
maid_llm is a dart implementation of llama.cpp used by the mobile artificial intelligence distribution (maid)
facebook flutter-ai gemma ggml gguf llama llama2 llamacpp llm llm-inference local-ai meta mistral mixtral mobile-ai
Last synced: 02 Jan 2025
https://github.com/hkust-nlp/dart-math
Official implementation for the paper *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
deep-learning llm llm-evaluation llm-inference llm-training mathematics nlp
Last synced: 20 Nov 2024
https://github.com/Mobile-Artificial-Intelligence/maid_llm
maid_llm is a dart implementation of llama.cpp used by the mobile artificial intelligence distribution (maid)
facebook flutter-ai gemma ggml gguf llama llama2 llamacpp llm llm-inference local-ai meta mistral mixtral mobile-ai
Last synced: 25 Nov 2024
https://github.com/hoshinonyaruko/gensokyo-llm
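An open source agent project. Supports 6 chat platforms, Onebotv11 one-to-many connections, streaming messages, agents, and keyboard bubble generation for conversations. Supports 6 LLM APIs (with more being added) and can convert multiple LLM APIs into a common format that carries context.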
ai-agents ai-agents-framework chatbot llm llm-api llm-inference onebot onebot-plugin onebot11 qqbot
Last synced: 12 Nov 2024
https://github.com/monk1337/auto-ollama
run ollama & gguf easily with a single command
autogguf autoollama gguf inference llama llm llm-inference lora mergelora mistral ollama openai
Last synced: 24 Nov 2024
https://github.com/Hoshinonyaruko/Gensokyo-llm
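An open source agent project. Supports 6 chat platforms, Onebotv11 one-to-many connections, streaming messages, agents, and keyboard bubble generation for conversations. Supports 6 LLM APIs (with more being added) and can convert multiple LLM APIs into a common format that carries context.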
ai-agents ai-agents-framework chatbot llm llm-api llm-inference onebot onebot-plugin onebot11 qqbot
Last synced: 28 Oct 2024
https://github.com/mani-kantap/llm-inference-solutions
A collection of all available inference solutions for the LLMs
llm-inference llm-serving llmops
Last synced: 17 Nov 2024
https://github.com/friendliai/friendli-client
Friendli: the fastest serving engine for generative AI
ai generative-ai gpt gpt3 inference inference-engine inference-server llama2 llm llm-inference llm-ops llm-serving llmops llms mistral ml mlops serving stable-diffusion
Last synced: 29 Dec 2024
https://github.com/lofcz/llmtornado
One .NET library to consume OpenAI, Anthropic, Cohere, Google, Azure, Groq, and self-hosed APIs.
anthropic-ai chatbot cohere command-r-plus gemini gpt-4v gpt4o groq koboldcpp llm-inference o1 o1-mini o1-preview ollama openai sdk sonnet sonnet3-5
Last synced: 02 Jan 2025
https://github.com/harleyszhang/lite_llama
A lightweight Llama model inference framework built with Triton.
llama llama3 llm llm-inference python3 triton-kernels
Last synced: 03 Dec 2024
https://github.com/adithya-s-k/companionllm
CompanionLLM - A framework to finetune LLMs to be your own sentient conversational companion
fine-tuning finetuning hacktoberfest hacktoberfest-accepted hacktoberfest2023 huggingface llama llama2 llamacpp llm llm-inference llm-training lora mit-license open-source peft
Last synced: 03 Dec 2024
https://github.com/jndiogo/sibila
Extract structured data from local or remote LLM models
ai dataclasses gguf gpt large-language-models llamacpp llm-inference local-ai local-models openai pydantic python structured-data structured-extraction structured-generation
Last synced: 25 Dec 2024
https://github.com/ai-hypercomputer/jetstream-pytorch
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
attention batching gemma inference llama llama2 llm llm-inference model-serving pytorch tpu
Last synced: 03 Jan 2025
https://github.com/harleyszhang/llm_counts
Theoretical performance analysis tools for LLMs, supporting analysis of parameter counts, FLOPs, memory, and latency.
gpu-performance llama llm llm-inference profiler python3 transformer
Last synced: 21 Dec 2024
https://github.com/felladrin/awesome-ai-web-search
A list of software that allows searching the web with the assistance of AI.
ai ai-search-engine artificial-intelligence artificial-intelligence-projects awesome awesome-list generative-ai generative-ai-projects generative-ai-tools information-retrieval llm-inference metasearch question-answering rag retrieval-augmented-generation web-search
Last synced: 14 Nov 2024
https://github.com/ilyasmoutawwakil/py-txi
A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.
Last synced: 01 Jan 2025
https://github.com/mariochavez/aoororachain
Aoororachain is a Ruby chain tool for working with LLMs
artificial-intelligence large-language-models llm llm-inference
Last synced: 14 Nov 2024
https://github.com/opencsgs/llm-inference
llm-inference is a platform for publishing and managing llm inference, providing a wide range of out-of-the-box features for model deployment, such as UI, RESTful API, auto-scaling, computing resource management, monitoring, and more.
deepspeed llama-cpp llm-inference ray transformer vllm
Last synced: 07 Nov 2024
https://github.com/Xyntopia/taskyon
Browser based Interface for Generative AI. Chat/Agent/Taskmanager Hybrid.
ai chatbot gpt gpt-4 gui llm llm-agent llm-inference
Last synced: 25 Oct 2024
https://github.com/praful932/llmsearch
Find better generation parameters for your LLM
llm llm-evaluation llm-inference nlp
Last synced: 27 Oct 2024
https://github.com/Praful932/llmsearch
Find better generation parameters for your LLM
llm llm-evaluation llm-inference nlp
Last synced: 08 Nov 2024
https://github.com/phospho-app/fastassert
Dockerized LLM inference server with constrained output (JSON mode), built on top of vLLM and outlines. Faster, cheaper and without rate limits. Compare the quality and latency to your current LLM API provider.
docker llm llm-inference outlines vllm
Last synced: 09 Nov 2024
https://github.com/catallo/ht
ht - a shell command that answers your questions about shell commands
ai bash fish-shell gpt linux linux-shell llm llm-inference llms macos macos-shell macosx openai openai-api shell shellcode zsh
Last synced: 02 Nov 2024
https://github.com/pingcap/linguflow
LinguFlow, a low-code tool designed for LLM application development, simplifies the building, debugging, and deployment process for developers.
chatgpt gpt llm-framework llm-inference openai
Last synced: 06 Nov 2024
https://github.com/openmachine-ai/transformer-tricks
A collection of tricks to speed up LLMs
ai arxiv arxiv-papers llm llm-inference llmops machine-learning python transformer transformer-models transformer-pytorch
Last synced: 10 Nov 2024
https://github.com/kyegomez/exa
Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and minimal learning curve.
inference-engine llama2 llama2-7b llamacpp llamas llm-inference llms opensource
Last synced: 22 Dec 2024
https://github.com/waltonfuture/Diff-eRank
Code for https://arxiv.org/abs/2401.17139 (NeurIPS 2024)
evaluation-metrics llm llm-inference machine-learning mllm neurips-2024
Last synced: 26 Nov 2024
https://github.com/tinybiggames/lmengine
Local LLM Inference
c cpp indiedev library llamacpp llm-inference pascal win64 windows-10 windows-11
Last synced: 10 Oct 2024
https://github.com/hscspring/llama.np
Inference for Llama/Llama2 models in NumPy
llama llama2 llm llm-inference numpy
Last synced: 06 Dec 2024
https://github.com/zrzrzrzrzrzrzr/lm-fly
Accelerating large-model inference frameworks: make LLMs fly
llm llm-inference mlx openvino tensorrt-llm tgi vllm
Last synced: 28 Dec 2024
https://github.com/yav-ai/amazon-bedrock-node-js-samples
This repository contains Node.js examples to get started with the Amazon Bedrock service.
amazon-bedrock amazon-titan aws aws-bedrock aws-lambda-node aws-sdk aws-sdk-javascript claude claude-3 claude-api cohere jurassic-ultra language-model llama2 llm-inference mistral mixtral-8x7b-instruct nodejs nodejsexamples stable-diffusion
Last synced: 22 Dec 2024
https://github.com/monocle2ai/monocle
Monocle is a framework for tracing GenAI app code. This repo contains implementation of Monocle for GenAI apps written in Python.
generative-ai linux-foundation llm-agent llm-inference llms observability opentelemetry oss python telemetry tracing
Last synced: 20 Dec 2024
https://github.com/darrylbayliss/simon-says-android
An Android App recreating the Simon Says game. Uses MediaPipe to run an LLM on device
android efficientnetv2 gemma-2b gemma-2b-it generative-ai jetpack-compose jetpack-navigation-compose kotlin llm llm-inference mediapipe mediapipe-classifier
Last synced: 07 Nov 2024
https://github.com/jayzhang42/sled
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433
decoding factuality google large-language-models llama llama2 llama3 llm llm-inference meta openai
Last synced: 22 Dec 2024
https://github.com/darthph0enix7/docpoi_repo
A local chatbot for managing docs
document-management langchain llama3 llm-inference localllm metadata-extraction ocr
Last synced: 12 Oct 2024
https://github.com/arcee-ai/arcee-python
The Arcee client for executing domain-adapted language model routines
ai llm llm-inference llm-training llmops
Last synced: 09 Nov 2024
https://github.com/brave-experiments/melt-public
codebase for "MELTing Point: Mobile Evaluation of Language Transformers"
android benchmarks energy-consumption ios jetson llamacpp llm-inference llmfarm mlc-llm
Last synced: 22 Dec 2024
https://github.com/commonroad/drplanner
🩺 : Elevate Your Planner, Perfect Your Motion 🌠
diagnosis-tool llm llm-inference motion-planning repairer
Last synced: 13 Nov 2024
https://github.com/tinybiggames/infero
An easy-to-use, high-performance, CUDA-powered LLM inference library.
cuda llamacpp llm-inference win64 windows-10 windows-11
Last synced: 10 Oct 2024
https://github.com/hitz-zentroa/this-is-not-a-dataset
We introduce a large, semi-automatically generated dataset of ~400,000 descriptive sentences about commonsense knowledge that can be true or false. Negation is present, in different forms, in about 2/3 of the corpus, which we use to evaluate LLMs.
benchmark common-sense commonsense decoder huggingface llama llama2 llm llm-inference negation scorer transformer
Last synced: 15 Nov 2024
https://github.com/armbues/SiLLM-examples
Examples for using the SiLLM framework for training and running Large Language Models (LLMs) on Apple Silicon
apple-silicon dpo large-language-models llm llm-inference llm-training lora mlx
Last synced: 07 Nov 2024
https://github.com/damo-nlp-sg/multipurpose-chatbot
A chatbot UI for RAG, multimodal, text completion. (support Transformers, llama.cpp, MLX, vLLM)
chatbot-application gradio-interface gradio-python-llm llm-inference
Last synced: 13 Nov 2024
https://github.com/riccorl/llama-trainer
Llama Trainer Utility
huggingface llama llm llm-inference llm-training llms transformer
Last synced: 22 Oct 2024
https://github.com/notnaton/microllm
My own implementation to run inference on local LLM models
Last synced: 27 Oct 2024
https://github.com/woheller69/llama_tk_chat
Simple chat interface for local AI using llama-cpp-python and llama-cpp-agent
gui llama-cpp-agent llama-cpp-python llm-inference
Last synced: 07 Nov 2024
https://github.com/build-on-aws/bedrock-agents-infer-models
Use natural language to run inference on various LLMs via Bedrock agents
bedrock generative-ai llm-inference
Last synced: 07 Nov 2024
https://github.com/woheller69/gpt4all-tk-chat
A Tk-based graphical user interface for gpt4all. It uses the Python bindings. Run LLMs in a very slim environment and leave maximum resources for inference
ai gpt gpt4all gui-application llm-inference python
Last synced: 07 Nov 2024
https://github.com/allyson-ai/funcmaster
Function Calling LLMs that run locally on device.
llamacpp llm llm-inference python react-native
Last synced: 10 Oct 2024
https://github.com/aigptcode/advanced-prompt-hacking-tester
This code implements an Advanced Prompt Hacking Tester, which allows users to test the responses of an AI system by generating various types of prompts. It includes methods to generate random prompts, contextual adversarial prompts by modifying the original prompts semantically
ai api chatgpt chatgpt-api gemini-api hacking hacking-tool linux llm llm-inference microsoft openai openai-api openai-chatgpt python python3 windows
Last synced: 25 Nov 2024
https://github.com/aidatatools/llm_sentinel
A project (LLM Sentinel) that showcases NVIDIA's NeMo-Guardrails and LangChain for improving LLM safety
Last synced: 22 Dec 2024
https://github.com/jessonchan/chatalice
ChatAlice is a robust, cross-platform desktop application designed for MacOS, Windows, and Linux operating systems. It features support for API integration with major large language models (LLMs), notably ChatGPT, Claude, and others.
chatgpt-app claude-ai desktop-app llm-inference openai
Last synced: 12 Nov 2024
https://github.com/prithivsakthiur/strangerai
Turning Ideas to Product - StrangerAI - StrangerZone. Recommended to Deploy inside Huggingface Spaces SDK as GRADIO
api chat-application chatbot chatgpt llm-inference open-source openai openapi
Last synced: 17 Dec 2024
https://github.com/azminewasi/Awesome-LLMs-ICLR-24
It is a comprehensive resource hub compiling all LLM papers accepted at the International Conference on Learning Representations (ICLR) in 2024.
large-language-model large-language-models large-language-models-and-translation-systems large-language-models-for-graph-learning llm llm-agent llm-evaluation llm-framework llm-inference llm-privacy llm-prompting llm-security llm-serving llm-training llmops llms pretrained-language-model pretrained-models pretrained-weights
Last synced: 26 Sep 2024
https://github.com/darrylbayliss/simon-says-ios
An iOS App recreating the Simon Says game. Uses MediaPipe to run an LLM on device
clean-architecture cocoapods gemma-2b gemma-2b-it generative-ai ios ios-app ios-clean-architecture ios-swift llm llm-inference mediapipe mediapipe-classifier swift swift-async swift-clean-architecture swift-concurrency swift-observation swiftui
Last synced: 27 Dec 2024
https://github.com/tbogdala/sentient_core
A terminal style user interface to chat with AI characters using llama LLMs for locally processed AI.
ai chat-application ggml llama llamacpp llm llm-inference rust terminal-ui
Last synced: 17 Nov 2024
https://github.com/skywardai/kirin
API aggregator for inference, fine-tuning, and model building.
ai api container conversational-ai fastapi fine-tuning llamacpp llm-inference llm-training rag sentence-embeddings vector-database
Last synced: 10 Oct 2024
https://github.com/chriamue/chat-flame-backend
ChatFlameBackend is an innovative backend solution for chat applications, leveraging the power of the Candle AI framework with a focus on the Mistral model
backend-api candle huggingface-inference-endpoint llama2 llm-inference mistral phi rust-lang
Last synced: 15 Dec 2024
https://github.com/johnclaw/chatllm.v
V-lang api wrapper for llm-inference chatllm.cpp
api-wrapper bindings chatbot chatllm cpu-inference gemma ggml inference llama llm llm-inference llms mistral phi3 quantization qwen v-lang vlang
Last synced: 22 Dec 2024
https://github.com/tipani86/menagerai
Test various open source language models alongside ChatGPT and compare the differences.
chatgpt llama2 llm llm-inference opensource streamlit
Last synced: 05 Dec 2024
https://github.com/ehristoforu/tensorlm-webui
Simple and modern web UI for LLMs based on LLaMA.
ai fooocus gradio gradio-python-llm linux llamacpp llm llm-inference lm macos ml portable tensorlm text-generation-webui ui webui windows
Last synced: 10 Oct 2024
https://github.com/johnclaw/chatllm.vb
VB.NET api wrapper for llm-inference chatllm.cpp
api-wrapper bindings chatllm cpu-inference gemma ggml int8 int8-inference int8-quantization llama llm-inference mistral qwen vb-net vbnet
Last synced: 22 Dec 2024
https://github.com/amazon-science/tokenalign
Token Alignment via Character Matching for Subword Completion (ACL Findings 2024)
Last synced: 12 Nov 2024
https://github.com/tbogdala/mindmeld
A simple-to-use, open source GUI for local AI chat on desktop and mobile. Powered by llama.cpp.
ai flutter llamacpp llm llm-inference local-llm
Last synced: 22 Dec 2024
https://github.com/johnclaw/chatllm.cs
C# api wrapper for llm-inference chatllm.cpp
api-wrapper bindings chatllm cpu-inference csharp gemma ggml inference int8 int8-inference int8-quantization llama llm llm-inference llms mistral qwen
Last synced: 22 Dec 2024
https://github.com/johnclaw/chatllm.nim
Nim api-wrapper for llm-inference chatllm.cpp
api-wrapper bindings chatbot chatllm cpu-inference gemma ggml inference llama llm llm-inference llms mistral nim nim-lang nim-language nimlang phi quantization qwen
Last synced: 22 Dec 2024
https://github.com/shreyansh26/llm-sampling
A collection of various LLM sampling methods implemented in pure PyTorch
llm llm-inference sampling-methods torch transformers
Last synced: 22 Dec 2024
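Two widely used sampling methods of the kind the entry above refers to, top-k and top-p (nucleus) sampling, can be sketched as below. This is a minimal standard-library illustration, not code from that repository (which uses PyTorch); the logits, temperature, and function names are assumptions chosen for the example.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most likely tokens, zero out the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]


def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    kept = [q if i in keep else 0.0 for i, q in enumerate(probs)]
    total = sum(kept)
    return [q / total for q in kept]


def sample(probs, rng):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]          # toy logits for a 4-token vocabulary
probs = softmax(logits, temperature=0.8)
rng = random.Random(0)                   # seeded for repeatability
token_top_k = sample(top_k_filter(probs, k=2), rng)
token_top_p = sample(top_p_filter(probs, p=0.9), rng)
print(token_top_k, token_top_p)
```

Top-k fixes the number of candidates, while top-p adapts the candidate set to how peaked the distribution is; the two are often combined in practice.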
https://github.com/amlana21/llm-stream-publish
How to stream LLM responses using AWS API Gateway Websockets and Lambda
aws devops llm-inference terraform
Last synced: 25 Dec 2024
https://github.com/unifyai/aibench-llm-endpoints
Runner in charge of collecting metrics from LLM inference endpoints for the Unify Hub
benchmark endpoints llm llm-inference python
Last synced: 14 Nov 2024
https://github.com/actualwitch/experiment
🔬 Experiment is an LLM chat UI with advanced tool use debugging facilities.
anthropic bun chat experiment-tracking inference isomorphic llm llm-inference openai react
Last synced: 24 Dec 2024
https://github.com/tinybiggames/phippsai
A library for local LLM inference using Ollama to build AI tools and agents in Delphi.
library llm-inference local-inference ollama ollama-api win64 windows-10 windows-11
Last synced: 05 Dec 2024
https://github.com/dwyl/rag-elixir-doc
Livebook to run a Phoenix_LiveView documentation Retrieval-Augmented Generation (RAG) enhanced LLM
cross-encoder elixir embeddings livebook llm-inference rag retrieval-augmented-generation sbert
Last synced: 12 Oct 2024
https://github.com/hrolive/poland-end-to-end-llm-bootcamp
This bootcamp is designed to give NLP researchers an end-to-end overview of the fundamentals of the NVIDIA NeMo framework, a complete solution for building large language models. It also includes hands-on exercises complemented by tutorials, code snippets, and presentations to help researchers get started with the NeMo LLM Service and Guardrails.
gpt llama2 llm llm-inference llm-training nemo-guardrails nvidia nvidia-nemo p-tuning prompt-tuning tensorrt triton
Last synced: 09 Nov 2024
https://github.com/picovoice/llm-compression-benchmark
LLM Compression Benchmark
llm llm-compression llm-inference
Last synced: 22 Nov 2024
https://github.com/nickpotafiy/illama
A fast, lightweight, parallel inference server for Llama LLMs.
exllama exllamav2 flash-attention-2 inference llama llama2 llama3 llm-inference paged-attention server
Last synced: 10 Oct 2024
https://github.com/giuseppebellamacina/vulnerabilitybot
Vulnerability Bot with Database
cybersecurity-tool llm llm-inference
Last synced: 05 Dec 2024
https://github.com/muhtasham/simulator
🚀 A high-performance simulator for LLM inference optimization, modeling compute-bound prefill and memory-bound decode phases. Explore batching strategies, analyze throughput-latency trade-offs, and optimize inference deployments without real model overhead.
Last synced: 14 Dec 2024
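The distinction the entry above draws between compute-bound prefill and memory-bound decode can be made concrete with a rough back-of-the-envelope model. The sketch below is not taken from that project; it assumes about 2 FLOPs per parameter per token, ignores attention and KV-cache traffic, and uses hypothetical hardware numbers (100 TFLOP/s, 1 TB/s) and a hypothetical 7B-parameter model in 16-bit precision.

```python
from dataclasses import dataclass


@dataclass
class Hardware:
    peak_flops: float      # FLOP/s (assumed, illustrative)
    mem_bandwidth: float   # bytes/s (assumed, illustrative)


@dataclass
class Model:
    n_params: float
    bytes_per_param: float


def prefill_seconds(model, hw, prompt_tokens, batch_size=1):
    """Prefill processes all prompt tokens at once, so compute dominates."""
    flops = 2.0 * model.n_params * prompt_tokens * batch_size
    return flops / hw.peak_flops


def decode_step_seconds(model, hw, batch_size=1):
    """One decode step reads every weight once; it is limited by whichever
    is slower, streaming the weights or doing the batch's arithmetic."""
    memory_time = model.n_params * model.bytes_per_param / hw.mem_bandwidth
    compute_time = 2.0 * model.n_params * batch_size / hw.peak_flops
    return max(memory_time, compute_time)


def decode_tokens_per_second(model, hw, batch_size):
    """Aggregate decode throughput across the whole batch."""
    return batch_size / decode_step_seconds(model, hw, batch_size)


hw = Hardware(peak_flops=100e12, mem_bandwidth=1e12)
model = Model(n_params=7e9, bytes_per_param=2)

print("prefill of 1000 tokens (s):", prefill_seconds(model, hw, 1000))
for batch in (1, 8, 64):
    print("batch", batch, "decode tok/s:", decode_tokens_per_second(model, hw, batch))
```

Under these assumptions the decode step time stays flat as the batch grows (the weights are read once per step regardless), so throughput scales almost linearly with batch size until the compute term catches up. That is the throughput-latency trade-off batching strategies try to exploit.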
https://github.com/amanpriyanshu/api-llm-hub
A static-page vanilla-js interface for various LLM APIs (OpenAI, Claude, Gemini, Together).
anthropic anthropic-claude claude claude-ai gemini gemini-api gpt gpt-3 gpt-4 javascript llm llm-inference llms openai openai-api package togetherai vanilla-javascript vanilla-js
Last synced: 28 Oct 2024
https://github.com/firojalam/llamalens
This repository contains the resources, code, and documentation for LlamaLens, a specialized multilingual large language model (LLM) designed to analyze news and social media content effectively. LlamaLens supports multiple languages, including Arabic, English, and Hindi, and is tailored for diverse tasks such as sentiment analysis, misinformation.
arabic downstream-tasks emotion-detection english hindi llm llm-inference llm-training newsmedia sentiment-classification social-media
Last synced: 28 Dec 2024
https://github.com/bsenst/llm-enhanced-ehr
Contribution to the LabLabAI AI Challenge Hackathon October 2023
ehr-notes langchain-python llm-inference streamlit-ui
Last synced: 19 Oct 2024
https://github.com/nptt9/illama
A fast, lightweight, parallel inference server for Llama LLMs.
exllama exllamav2 flash-attention-2 inference llama llama2 llama3 llm-inference paged-attention server
Last synced: 27 Oct 2024
https://github.com/CentML/llm-inference-bench
Lightweight and extensible LLM Inference serving benchmark tool written in Rust.
benchmarking llm-inference llm-serving
Last synced: 10 Nov 2024
https://github.com/shekharp1536/ollama-web
Ollama Web UI is a simple yet powerful web-based interface for interacting with large language models. It offers chat history, voice commands, voice output, model download and management, conversation saving, terminal access, multi-model chat, and more—all in one streamlined platform.
llama llama-cpp llama3 llm-inference ollama ollama-app ollama-chat ollama-client ollama-gui ollama-interface ollama-python ollama-ui ollama-webui python-llm-integration
Last synced: 22 Dec 2024
https://github.com/niansa/libjustlm
Super easy to use library for doing LLaMA/GPT-J stuff! - Mirror of: https://gitlab.com/niansa/libjustlm
ai cpp17 cpp20 gpt-j llama llama2 llm llm-inference mpt python wrapper-library
Last synced: 13 Nov 2024
https://github.com/aigptcode/ai-battle-llama3-vs-qwen2
Welcome to our AI Battle! Ask a question and let our two AI models battle it out
ai android api llama3 llama3-meta-ai llm llm-inference qwen qwen2 windows
Last synced: 25 Nov 2024
https://github.com/gurpreetkaurjethra/llms-inference-and-fine-tuning
Estimate Memory Consumption of LLMs Inference and Fine Tuning
fine-tuning generative-ai large-language-models llm-inference llm-training llms memory-allocation
Last synced: 22 Nov 2024
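A rough version of the kind of estimate the entry above describes might look like the following. It is a sketch under stated assumptions, not that repository's method: 16-bit weights, a KV cache of one key and one value tensor per layer, and about 16 bytes of state per parameter for full fine-tuning with Adam in mixed precision (activations excluded). The layer and head counts are illustrative values for a 7B Llama-style shape.

```python
GIB = 1024 ** 3


def weight_bytes(n_params, bytes_per_param=2):
    """Memory for the model weights alone."""
    return n_params * bytes_per_param


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch_size,
                   bytes_per_value=2):
    """Memory for cached keys and values (factor 2 = keys + values)."""
    return (2 * n_layers * n_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


def full_finetune_bytes(n_params, bytes_per_param_state=16):
    """Weights, gradients, and optimizer state for full fine-tuning.

    16 bytes/param is an assumed rule of thumb for mixed-precision Adam;
    activation memory is not included.
    """
    return n_params * bytes_per_param_state


n_params = 7_000_000_000  # illustrative 7B-parameter model
inference = weight_bytes(n_params) + kv_cache_bytes(
    n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096, batch_size=1
)
training = full_finetune_bytes(n_params)

print(f"inference: {inference / GIB:.1f} GiB")
print(f"fine-tuning (excluding activations): {training / GIB:.1f} GiB")
```

The KV cache term grows linearly with both sequence length and batch size, which is why long-context or high-concurrency serving can need far more memory than the weights alone suggest.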
https://github.com/atelierarith/docstringtranslationexobackend.jl
Translate Julia's docstrings using `exo`: Run your own AI cluster at home with everyday devices
exo julia julialang llm-inference
Last synced: 03 Dec 2024
https://github.com/kaust-generative-ai/local-deployment-of-generative-ai-models
Training materials on how to deploy generative AI models locally on your laptop or workstation.
ai carpentries-incubator deployment english generative-ai lesson llama-cpp llamafile llm-inference ollama pre-alpha python
Last synced: 17 Dec 2024
https://github.com/amajji/llm-rag-chatbot-with-langchain
Development and deployment of a question-answer LLM model using Llama2 with 7B parameters and RAG with LangChain
ai chatbot chatbot-application cpu db inference langchain llama-index llama2 llm llm-inference question-answering rag retrieval-augmented-generation streamlit streamlit-webapp vector-database
Last synced: 22 Dec 2024
https://github.com/jankovicsandras/ml
Machine Learning, LLM and other Jupyter Notebooks and resources
ai embeddings jupyter jupyter-notebook llm llm-inference llms machine-learning natural-language-processing nlp nlp-machine-learning python python3 rag retrieval-augmented-generation vector-database vector-database-embedding
Last synced: 24 Dec 2024
https://github.com/sureshbeekhani/ai-quick-summaries
Developed an AI-powered web app using Streamlit and Google Gemini AI for generating concise summaries from PDFs, images, and text files. The app features real-time text summarization, file upload support, and a user-friendly interface.
chatbot gemini gpt image-and-pdf llm llm-inference python streamlit
Last synced: 07 Dec 2024
https://github.com/biosfood/intel-llm-guide
A guide on how to run LLMs on Intel CPUs
guide intel llm llm-inference llm-serving machine-learning setup setup-development-environment tutorial
Last synced: 22 Dec 2024
https://github.com/muhammad-fiaz/emsugi
EMSUGI is a prediction and analysis project covering various factors, such as floods, earthquakes, and disease outbreaks, that occur in your neighborhood.
ai emergency-management-system flask flask-application gemini gemini-ai gemini-api gemini-client genai huggingface langchain large-language-models llm-inference llms open-source open-source-project opensource python python3 transformers
Last synced: 12 Nov 2024
https://github.com/picovoice/serverless-picollm
LLM Inference on AWS Lambda
aws-lambda llm llm-compression llm-inference serverless serverless-inference
Last synced: 22 Nov 2024