awesome-golang-ai
Golang AI applications have incredible potential: Go offers exceptional speed, easy debugging, first-class concurrency, and excellent libraries for machine learning, deep learning, and reinforcement learning.
https://github.com/promacanthus/awesome-golang-ai
General Machine Learning libraries
Neural Networks
Linear Algebra
Probability Distributions
Regression
Bayesian Classifiers
Recommendation Engines
Evolutionary Algorithms
Graph
Cluster
Anomaly Detection
DataFrames
- gota
- dataframe-go - DataFrames for Go: for statistics, machine-learning, and data manipulation/exploration.
- qframe
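gota, dataframe-go, and qframe all build on the same columnar idea: data lives in typed columns, and operations like filter and select return new frames. A toy sketch of that pattern in plain Go (illustrative only, not any of these libraries' actual APIs):

```go
package main

import "fmt"

// frame is a minimal column-oriented table: each field is one typed column.
type frame struct {
	name  []string
	score []float64
}

// filter keeps rows whose score satisfies pred, returning a new frame
// rather than mutating the receiver -- the style dataframe libraries favor.
func (f frame) filter(pred func(float64) bool) frame {
	var out frame
	for i, s := range f.score {
		if pred(s) {
			out.name = append(out.name, f.name[i])
			out.score = append(out.score, s)
		}
	}
	return out
}

func main() {
	f := frame{
		name:  []string{"a", "b", "c"},
		score: []float64{0.2, 0.9, 0.7},
	}
	// Keep rows with score > 0.5.
	fmt.Println(f.filter(func(s float64) bool { return s > 0.5 }).name)
}
```

The real libraries add typed column abstractions, CSV/JSON IO, joins, and grouping on top of this core shape.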
Explaining Model
Large Language Model
DevTools
- go-attention
- swarmgo - A Go package that allows you to create AI agents capable of interacting, coordinating, and executing tasks.
- orra - The orra-dev/orra project offers resilience for AI agent workflows.
- core - Supports one-shot workflows, building autonomous agents, and working with LLM providers.
- gollm
- langchaingo - LangChain for Go, the easiest way to write LLM-based programs in Go.
- gpt4all-bindings - language interfaces to easily integrate and interact with GPT4All's local LLMs, simplifying model loading and inference for developers.
- go-openai - OpenAI ChatGPT, GPT-3, GPT-4, DALL·E, Whisper API wrapper for Go.
- llama.go
- eino
- fabric - An open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
- genkit - Build AI-powered apps with familiar code-centric patterns. Genkit makes it easy to develop, integrate, and test AI features with observability and evaluations. Genkit works with various models and platforms.
- ollama - Get up and running with DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
GPT
SDKs
- go-anthropic
- deepseek-go - Go client for the DeepSeek API, supporting R1, Chat V3, and Coder. Also supports external providers like Azure, OpenRouter, and local Ollama.
- openai-go
- generative-ai-go
- anthropic-sdk-go - Access to Anthropic's safety-first language model APIs via Go.
ChatGPT Apps
- feishu-openai - Feishu/Lark integrated with AI (GPT-4 + GPT-4V + DALL·E-3 + Whisper) delivers an extraordinary work experience.
- chatgpt-telegram
Pipeline and Data Version
- pachyderm - Data-Centric Pipelines and Data Versioning.
Vector Database
- milvus - A high-performance, cloud-native vector database built for scalable vector ANN search.
- weaviate - An open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering, with the fault tolerance and scalability of a cloud-native database.
- tidb - The open-source, cloud-native, distributed SQL database designed for modern applications.
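Vector databases like milvus and weaviate serve approximate nearest-neighbor (ANN) queries over embeddings at scale. Conceptually, they accelerate the exact brute-force search sketched below in plain Go (no library APIs assumed): score every stored vector against the query by cosine similarity and return the top k.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosine returns the cosine similarity between two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// nearest returns the indices of the k most similar corpus vectors to the
// query, by exhaustive scan -- the exact baseline that ANN indexes
// (HNSW, IVF, etc.) approximate in sublinear time.
func nearest(query []float64, corpus [][]float64, k int) []int {
	idx := make([]int, len(corpus))
	for i := range idx {
		idx[i] = i
	}
	sort.Slice(idx, func(x, y int) bool {
		return cosine(query, corpus[idx[x]]) > cosine(query, corpus[idx[y]])
	})
	if k > len(idx) {
		k = len(idx)
	}
	return idx[:k]
}

func main() {
	corpus := [][]float64{{1, 0, 0}, {0.9, 0.1, 0}, {0, 0, 1}}
	// The query is closest to the first two vectors.
	fmt.Println(nearest([]float64{1, 0.05, 0}, corpus, 2))
}
```

The brute-force scan is O(n·d) per query; the databases above exist precisely because that does not scale to millions of vectors.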
Reinforcement Learning
[Model Context Protocol](https://modelcontextprotocol.io/introduction)
- gateway - Universal MCP-Server for your Databases optimized for LLMs and AI-Agents.
- mcp-go
- mcp-golang
Benchmark
Code
- multi-swe-bench - The Multi-SWE-bench project, developed by ByteDance's Doubao team, is the first open-source multilingual dataset for evaluating and enhancing large language models' ability to automatically debug code, covering 7 major programming languages (e.g., Java, C++, JavaScript) with real-world GitHub issues to benchmark "full-stack engineering" capabilities.
- BigCodeBench
- Code4Bench
- CRUXEval
- HumanEval
- MBPP - Crowd-sourced Python programming problems, designed to be solvable by entry-level programmers, covering programming fundamentals, standard library functionality, and so on.
- MultiPL-E - A multi-programming language benchmark for LLMs.
- SWE-bench - A benchmark suite designed to evaluate the capabilities of large language models (LLMs) in solving real-world software engineering tasks, focusing on actual bug-fixing challenges extracted from open-source projects.
- AIDER - Evaluates LLMs on code-related tasks, such as code writing and editing.
- LiveCodeBench
- BFCL - Berkeley Function-Calling Leaderboard, evaluating the function-calling capability of different LLMs.
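Several of the code benchmarks above (HumanEval, MBPP, and descendants) report the pass@k metric: the probability that at least one of k sampled completions passes the tests. A sketch of the unbiased estimator from the HumanEval paper, pass@k = 1 − C(n−c, k)/C(n, k), given n samples per problem of which c pass:

```go
package main

import "fmt"

// passAtK computes the unbiased pass@k estimator for one problem:
// n generated samples, c of them passing, k drawn without replacement.
// Uses the product form 1 - prod_{i=n-c+1..n} (1 - k/i), which equals
// 1 - C(n-c, k)/C(n, k) while avoiding large binomial coefficients.
func passAtK(n, c, k int) float64 {
	if n-c < k {
		// Fewer than k failing samples: any draw of k must include a pass.
		return 1.0
	}
	prod := 1.0
	for i := n - c + 1; i <= n; i++ {
		prod *= 1 - float64(k)/float64(i)
	}
	return 1 - prod
}

func main() {
	// With k=1 the estimator reduces to c/n.
	fmt.Println(passAtK(10, 3, 1)) // → 0.3 (within floating-point error)
}
```

Benchmark harnesses average this value over all problems in the suite; sampling n > k completions and estimating, rather than drawing exactly k, is what removes the bias.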
English
- ARC-AGI
- GPQA - A Graduate-Level Google-Proof Q&A Benchmark.
- ARC-Challenge
- BBH - Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them.
- HellaSwag
- IFEval - Evaluates the instruction-following capabilities of large language models by incorporating 25 verifiable instruction types (e.g., format constraints, keyword inclusion) and applying dual strict-loose metrics for automated, objective assessment of model compliance.
- MMLU-CF - A Contamination-free Multi-task Language Understanding Benchmark.
- MMLU-Pro - A More Robust and Challenging Multi-Task Language Understanding Benchmark.
- PIQA
- WinoGrande
- BIG-bench
- MMLU
- LiveBench - A Challenging, Contamination-Free LLM Benchmark.
Math
- Omni-MATH - A comprehensive and challenging benchmark specifically designed to assess LLMs' mathematical reasoning at the Olympiad level.
- grade-school-math - Evaluates multi-step reasoning capabilities in language models, revealing that even large transformers struggle with these conceptually simple yet procedurally complex tasks.
- MATH - A benchmark for measuring mathematical problem-solving capabilities, offering dataset loaders, evaluation code, and pre-training data.
- MathVista
- TAU-bench - An open-source benchmark suite designed to evaluate the performance of large language models (LLMs) on complex reasoning tasks across multiple domains.
- AIME
Chinese
Tool Use
Open ended
- Arena-Hard - Arena-Hard-Auto: An automatic LLM benchmark.
False refusal
Multi-modal
- geneval - An object-focused framework for evaluating text-to-image alignment.
- LongVideoBench
- MLVU - Multi-task Long Video Understanding Benchmark.
- perception_test
- TempCompass
- Video-MME - The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis.
- VBench - An open-source project aiming to build a comprehensive evaluation benchmark for video generation models.
- DPG-Bench
Embedding Benchmark
- MTEB - An open-source benchmarking framework for evaluating and comparing text embedding models across 8 tasks (e.g., classification, retrieval, clustering) using 58 datasets in 112 languages, providing standardized performance metrics for model selection.
- BRIGHT - A benchmark for reasoning-intensive retrieval, featuring 12 diverse datasets (math, code, biology, etc.) to evaluate retrieval models across complex, context-rich queries requiring logical inference.
Decision Trees
- CloudForest - Fast, flexible, multi-threaded decision tree ensembles (Random Forest, Gradient Boosting, etc.) designed for high-dimensional heterogeneous data with missing values, emphasizing speed and robustness for real-world machine learning tasks.
2