Projects in Awesome Lists tagged with benchmark
A curated list of projects in awesome lists tagged with benchmark.
https://github.com/sharkdp/hyperfine
A command-line benchmarking tool
benchmark cli command-line rust terminal tool
Last synced: 12 May 2025
https://github.com/zalandoresearch/fashion-mnist
An MNIST-like fashion product database. Benchmark 👇
benchmark computer-vision convolutional-neural-networks dataset deep-learning fashion fashion-mnist gan machine-learning mnist zalando
Last synced: 29 Apr 2025
https://github.com/dotnet/benchmarkdotnet
Powerful .NET library for benchmarking
benchmark benchmarking c-sharp csharp dotnet hacktoberfest performance
Last synced: 09 Sep 2025
https://github.com/dotnet/BenchmarkDotNet
Powerful .NET library for benchmarking
benchmark benchmarking c-sharp csharp dotnet hacktoberfest performance
Last synced: 14 Mar 2025
https://github.com/hatoo/oha
Ohayou(おはよう), HTTP load generator, inspired by rakyll/hey with tui animation.
benchmark cli command-line http http2 load-generator load-testing rust tui
Last synced: 13 May 2025
https://github.com/techempower/frameworkbenchmarks
Source for the TechEmpower Framework Benchmarks project
benchmark framework frameworkbenchmarks performance suite
Last synced: 15 May 2025
https://github.com/TechEmpower/FrameworkBenchmarks
Source for the TechEmpower Framework Benchmarks project
benchmark framework frameworkbenchmarks performance suite
Last synced: 13 Mar 2025
https://github.com/the-benchmarker/web-frameworks
Which is the fastest web framework?
benchmark framework http measurement performance standard web
Last synced: 12 May 2025
https://github.com/open-mmlab/mmpose
OpenMMLab Pose Estimation Toolbox and Benchmark.
animal-pose-estimation benchmark cpm crowdpose face-keypoint freihand hand-pose-estimation higher-hrnet hourglass hrnet human-pose mmpose mpii mspn ochuman pose-estimation pytorch rsn rtmpose udp
Last synced: 12 May 2025
https://github.com/akopytov/sysbench
Scriptable database and system performance benchmark
benchmark console freebsd linux lua luajit macos micro-benchmarks mysql oltp postgresql sysbench
Last synced: 15 May 2025
https://github.com/open-compass/opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
benchmark chatgpt evaluation large-language-model llama2 llama3 llm openai
Last synced: 17 Nov 2025
https://github.com/erikbern/ann-benchmarks
Benchmarks of approximate nearest neighbor libraries in Python
benchmark docker nearest-neighbors
Last synced: 13 May 2025
https://github.com/spiritlhls/ecs
VPS Fusion Monster Server Test Script. The version with no environment dependencies is recommended instead: https://github.com/oneclickvirt/ecs
almalinux arch astralinux bench-script benchmark cdn centos checker debian fedora freebsd ipv6 lemonbench openai oracle-linux rockylinux speedtest sysbench ubuntu vps
Last synced: 13 May 2025
https://github.com/masonr/yet-another-bench-script
YABS - a simple bash script to estimate Linux server performance using fio, iperf3, & Geekbench
bash bench-script benchmark benchmark-scripts disk-performance fio geekbench iperf3 linux performance speedtest
Last synced: 13 May 2025
https://github.com/teddysun/across
Across the Great Wall we can reach every corner in the world
auto-transfer-backup backup bbr benchmark kms l2tp shell unixbench
Last synced: 13 May 2025
https://github.com/spiritLHLS/ecs
VPS Fusion Monster Server Test Script (aiming to be the most comprehensive server testing script)
almalinux arch astralinux bench-script benchmark cdn centos checker debian fedora freebsd ipv6 lemonbench openai oracle-linux rockylinux speedtest sysbench ubuntu vps
Last synced: 24 Mar 2025
https://github.com/bheisler/criterion.rs
Statistics-driven benchmarking library for Rust
benchmark criterion gnuplot rust statistics
Last synced: 12 May 2025
https://github.com/open-mmlab/mmaction2
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
action-recognition ava benchmark deep-learning i3d non-local openmmlab posec3d pytorch slowfast spatial-temporal-action-detection temporal-action-localization tsm tsn uniformerv2 video-classification video-understanding x3d
Last synced: 13 May 2025
https://github.com/projectphysx/fluidx3d
The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
benchmark cfd computational-fluid-dynamics fluid-dynamics fluid-simulation fluid-solver gpgpu gpu gpu-computing high-performance-computing hpc interactive-visualization lattice-boltzmann lbm opencl physics raytracing scientific-computing scientific-visualization simulation
Last synced: 13 May 2025
https://github.com/ProjectPhysX/FluidX3D
The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
benchmark cfd computational-fluid-dynamics fluid-dynamics fluid-simulation fluid-solver gpgpu gpu gpu-computing high-performance-computing hpc interactive-visualization lattice-boltzmann lbm opencl physics raytracing scientific-computing scientific-visualization simulation
Last synced: 26 Mar 2025
https://github.com/open-compass/OpenCompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
benchmark chatgpt evaluation large-language-model llama2 llama3 llm openai
Last synced: 30 Jul 2025
https://github.com/baichuan-inc/Baichuan2
A series of large language models developed by Baichuan Intelligent Technology
artificial-intelligence benchmark ceval chatgpt chinese gpt gpt-4 huggingface large-language-models llama2 mmlu natural-language-processing
Last synced: 01 Apr 2025
https://github.com/baichuan-inc/baichuan2
A series of large language models developed by Baichuan Intelligent Technology
artificial-intelligence benchmark ceval chatgpt chinese gpt gpt-4 huggingface large-language-models llama2 mmlu natural-language-processing
Last synced: 13 May 2025
https://github.com/cluebenchmark/clue
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
albert benchmark bert chinese chineseglue corpus dataset glue language-model nlu pretrained-models pytorch roberta tensorflow transformers
Last synced: 14 May 2025
https://github.com/CLUEbenchmark/CLUE
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
albert benchmark bert chinese chineseglue corpus dataset glue language-model nlu pretrained-models pytorch roberta tensorflow transformers
Last synced: 28 Mar 2025
https://github.com/foolwood/benchmark_results
Visual Tracking Paper List
benchmark deep-learning paper tracking visual-tracking
Last synced: 26 Jan 2026
https://github.com/michaelgrupp/evo
Python package for the evaluation of odometry and SLAM
benchmark euroc evaluation kitti mapping metrics odometry robotics ros ros2 slam trajectory trajectory-analysis trajectory-evaluation tum
Last synced: 16 May 2025
https://github.com/evolvinglmms-lab/lmms-eval
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
agi audio-evaluation benchmark evaluation large-language-models llm-evaluation multimodal multimodal-evaluation video-understanding vision-language-model vlm
Last synced: 28 Feb 2026
https://github.com/MichaelGrupp/evo
Python package for the evaluation of odometry and SLAM
benchmark euroc evaluation kitti mapping metrics odometry robotics ros ros2 slam trajectory trajectory-analysis trajectory-evaluation tum
Last synced: 11 Apr 2025
https://github.com/devmeremenko/xcodebenchmark
XcodeBenchmark measures the compilation time of a large codebase on iMac, MacBook, and Mac Pro
benchmark cocoapods swift xcode
Last synced: 13 May 2025
https://github.com/ruc-nlpir/flashrag
⚡FlashRAG: A Python Toolkit for Efficient RAG Research (WWW2025 Resource)
benchmark datasets large-language-models retrieval-augmented-generation
Last synced: 06 Apr 2026
https://github.com/devMEremenko/XcodeBenchmark
XcodeBenchmark measures the compilation time of a large codebase on iMac, MacBook, and Mac Pro
benchmark cocoapods swift xcode
Last synced: 27 Mar 2025
https://github.com/embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
benchmark bitext-mining clustering information-retrieval low-resource-nlp mteb multilingual-nlp multimodal neural-search reranking retrieval sbert semantic-search sentence-transformers sts text-classification text-embedding
Last synced: 19 Apr 2026
https://github.com/baichuan-inc/Baichuan-13B
A 13B large language model developed by Baichuan Intelligent Technology
artificial-intelligence benchmark ceval chatgpt chinese gpt-4 huggingface large-language-models mmlu natural-language-processing
Last synced: 19 Apr 2025
https://github.com/baichuan-inc/baichuan-13b
A 13B large language model developed by Baichuan Intelligent Technology
artificial-intelligence benchmark ceval chatgpt chinese gpt-4 huggingface large-language-models mmlu natural-language-processing
Last synced: 03 Oct 2025
https://github.com/swe-bench/swe-bench
SWE-bench [Multimodal]: Can Language Models Resolve Real-world GitHub Issues?
benchmark language-model software-engineering
Last synced: 12 May 2025
https://github.com/microsoftarchive/promptbench
A unified evaluation framework for large language models
adversarial-attacks benchmark chatgpt evaluation large-language-models prompt prompt-engineering robustness
Last synced: 31 Mar 2026
https://github.com/RUC-NLPIR/FlashRAG
⚡FlashRAG: A Python Toolkit for Efficient RAG Research (WWW2025 Resource)
benchmark datasets large-language-models retrieval-augmented-generation
Last synced: 08 Sep 2025
https://github.com/phoronix-test-suite/phoronix-test-suite
The Phoronix Test Suite is open-source, cross-platform automated testing/benchmarking software.
benchmark benchmarking bsd linux performance php profiling solaris testing
Last synced: 13 May 2025
https://github.com/microsoft/promptbench
A unified evaluation framework for large language models
adversarial-attacks benchmark chatgpt evaluation large-language-models prompt prompt-engineering robustness
Last synced: 13 May 2025
https://github.com/SWE-bench/SWE-bench
SWE-bench [Multimodal]: Can Language Models Resolve Real-world GitHub Issues?
benchmark language-model software-engineering
Last synced: 07 Mar 2025
https://github.com/processone/tsung
Tsung is a high-performance benchmark framework for various protocols including HTTP, XMPP, LDAP, etc.
amqp benchmark benchmarking erlang ldap mqtt mysql postgresql xmpp
Last synced: 13 May 2025
https://github.com/plummerssoftwarellc/primes
Prime Number Projects in C#/C++/Python
benchmark benchmarks docker drag-race primes primesieve programming-languages
Last synced: 29 Apr 2025
https://github.com/PlummersSoftwareLLC/Primes
Prime Number Projects in C#/C++/Python
benchmark benchmarks docker drag-race primes primesieve programming-languages
Last synced: 29 Apr 2025
https://github.com/kodezi/chronos
Kodezi Chronos: a debugging-first language model achieving 65.3% autonomous bug fixing (6-7x better than GPT-4). Research, benchmarks & evaluation framework. Model available Q1 2026 via Kodezi OS.
artificial-intelligence autonomous-debugging benchmark benchmark-report bug-fixing chronos code code-analysis code-analysis-tool code-debugger code-understanding debugging developer-tools kodezi language-model machine-learning program-repair software-engineering
Last synced: 09 Mar 2026
https://github.com/robotwin-Platform/robotwin
RoboTwin 2.0 Official Repo
benchmark data-generator embodied-ai robotics
Last synced: 07 May 2026
https://github.com/smallnest/go-web-framework-benchmark
⚡ Go web framework benchmark
benchmark concurrency http-router-benchmark mux webframework
Last synced: 14 May 2025
https://github.com/smallnest/Go-web-framework-benchmark
⚡ Go web framework benchmark
benchmark concurrency http-router-benchmark mux webframework
Last synced: 12 Mar 2025
https://github.com/swe-bench/SWE-bench
[ICLR 2024] SWE-bench: Can Language Models Resolve Real-world GitHub Issues?
benchmark language-model software-engineering
Last synced: 12 Aug 2025
https://github.com/tinylibs/tinybench
🔎 A simple, tiny and lightweight benchmarking library!
benchmark hacktoberfest light-weight performance tinylibs
Last synced: 14 May 2025
https://github.com/felixguendling/cista
Cista is a simple, high-performance, zero-copy C++ serialization & reflection library.
benchmark cpp cpp17 deserialization efficient high-performance reflection serialization zero-copy
Last synced: 14 May 2025
https://github.com/logpai/logparser
A machine learning toolkit for log parsing [ICSE'19, DSN'16]
anomaly-detection benchmark log log-analysis log-mining log-parser log-parsing
Last synced: 20 Feb 2026
https://github.com/smallnest/1m-go-tcp-server
benchmarks for implementation of servers which support 1 million connections
Last synced: 07 Oct 2025
https://github.com/evanwashere/mitata
benchmark tooling that loves you ❤️
benchmark bun cpp deno graaljs javascript jsc library microbenchmark node nodejs performance single-header spidermonkey v8
Last synced: 13 May 2025
https://github.com/xlang-ai/osworld
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
agent artificial-intelligence benchmark cli code-generation gui language-model large-action-model llm multimodal natural-language-processing reinforcement-learning rpa vlm
Last synced: 14 May 2025
https://github.com/beir-cellar/beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
benchmark bert colbert dataset deep-learning dpr elasticsearch information-retrieval llm nlp passage-retrieval pytorch question-generation rag retrieval retrieval-models sbert sentence-transformers zero-shot-retrieval
Last synced: 14 May 2025
https://github.com/ashvardanian/BenchmarkingTutorial
Playing around "Less Slow" coding practices in C++ 20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, networking and user-space IO
assembly assembly-language avx512 benchmark coroutines cpp cpp-programming cpp17 cpp20 cuda gcc google-benchmark hpc io-uring linux-kernel llvm ptx ranges tutorial tutorials
Last synced: 26 Jun 2025
https://github.com/elanis/web-to-desktop-framework-comparison
An objective comparison of multiple frameworks that allow us to "transform" our web apps to desktop applications.
benchmark characteristics desktop-framework-comparison electron electron-app electronjs flutter maui neutralinojs nodegui nwjs nwjs-application tauri tauri-app wails
Last synced: 14 May 2025
https://github.com/oneclickvirt/ecs
VPS Fusion Monster Server Test, Go version. Aiming to be the most comprehensive server testing project, implemented in Go with zero environment dependencies.
benchmark benchmarks darwin goecs golang linux macos windows
Last synced: 16 Apr 2026
https://github.com/martinus/nanobench
Simple, fast, accurate single-header microbenchmarking functionality for C++11/14/17/20
benchmark cpp cpp11 header-only microbenchmark single-file single-header single-header-lib
Last synced: 24 Oct 2025
https://github.com/intellabs/fastrag
Efficient Retrieval Augmentation and Generation Framework
benchmark colbert diffusion generative-ai information-retrieval knowledge-graph llm multi-modal nlp question-answering semantic-search sentence-transformers summarization transformers
Last synced: 14 May 2025
https://github.com/MLGroupJLU/LLM-eval-survey
The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".
benchmark evaluation large-language-models llm llms model-assessment
Last synced: 04 Apr 2025
https://github.com/google-deepmind/tapnet
Tracking Any Point (TAP)
benchmark computer-vision deep-learning point-tracking robotics
Last synced: 14 May 2025
https://github.com/mlgroupjlu/llm-eval-survey
The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".
benchmark evaluation large-language-models llm llms model-assessment
Last synced: 06 Feb 2026
https://github.com/OpenGVLab/InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
action-recognition benchmark contrastive-learning foundation-models instruction-tuning masked-autoencoder multimodal open-set-recognition self-supervised spatio-temporal-action-localization temporal-action-localization video-clip video-data video-dataset video-question-answering video-retrieval video-understanding vision-transformer zero-shot-classification zero-shot-retrieval
Last synced: 20 Mar 2025
https://github.com/opengenerativeai/llm-colosseum
Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM
benchmark genai llm streetfighterai
Last synced: 10 Mar 2026
https://github.com/Elanis/web-to-desktop-framework-comparison
An objective comparison of multiple frameworks that allow us to "transform" our web apps to desktop applications.
benchmark characteristics desktop-framework-comparison electron electron-app electronjs flutter maui neutralinojs nodegui nwjs nwjs-application tauri tauri-app wails
Last synced: 14 Mar 2025
https://github.com/cheind/py-motmetrics
📊 Benchmark multiple object trackers (MOT) in Python
benchmark clear-mot-metrics metrics mot mot-challenge object-detection object-tracking tracker
Last synced: 14 May 2025
https://github.com/mlcommons/inference
Reference implementations of MLPerf™ inference benchmarks
Last synced: 14 May 2025
https://github.com/OpenGenerativeAI/llm-colosseum
Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM
benchmark genai llm streetfighterai
Last synced: 19 Jul 2025
https://github.com/jsperf/jsperf.com
jsperf.com v2. https://github.com/h5bp/lazyweb-requests/issues/174
benchmark benchmarking javascript performance
Last synced: 16 May 2025
https://github.com/ionelmc/pytest-benchmark
pytest fixture for benchmarking code
benchmark benchmarking performance pytest python
Last synced: 23 Apr 2025
https://github.com/attaswift/attabench
Microbenchmarking app for Swift with nice log-log plots
app benchmark macos microbenchmarks swift
Last synced: 16 May 2025
https://github.com/xlang-ai/OSWorld
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
agent artificial-intelligence benchmark cli code-generation gui language-model large-action-model llm multimodal natural-language-processing reinforcement-learning rpa vlm
Last synced: 18 Apr 2025
https://github.com/attaswift/Attabench
Microbenchmarking app for Swift with nice log-log plots
app benchmark macos microbenchmarks swift
Last synced: 04 Aug 2025
https://github.com/kengz/slm-lab
Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".
a2c a3c benchmark deep-reinforcement-learning dqn policy-gradient ppo pytorch reinforcement-learning sac
Last synced: 11 Feb 2026
https://github.com/kengz/SLM-Lab
Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".
a2c a3c benchmark deep-reinforcement-learning dqn policy-gradient ppo pytorch reinforcement-learning sac
Last synced: 01 Apr 2025
https://github.com/dadhi/fastexpressioncompiler
Fast Compiler for C# Expression Trees and the lightweight LightExpression alternative. Diagnostic and code generation tools for the expressions.
benchmark closure code-generation compiler delegate delegates dryioc expression-tree il-optimizations performance
Last synced: 13 May 2025
https://github.com/benchmark-action/github-action-benchmark
GitHub Action for continuous benchmarking to keep performance
Last synced: 06 May 2026
https://github.com/evalplus/evalplus
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
benchmark chatgpt efficiency gpt-4 large-language-models program-synthesis testing
Last synced: 12 Jan 2026
https://github.com/myzhan/boomer
A better load generator for locust, written in golang.
benchmark benchmark-framework boomer locust performance performance-testing
Last synced: 14 May 2025
https://github.com/oxwhirl/smac
SMAC: The StarCraft Multi-Agent Challenge
benchmark machine-learning multiagent-systems reinforcement-learning starcraft-ii
Last synced: 15 May 2025
https://github.com/haydenjames/bench-scripts
A compilation of Linux server benchmarking scripts.
benchmark benchmarking linux performance scripts vps
Last synced: 16 May 2025
https://github.com/medmnist/medmnist
[pip install medmnist] 18x Standardized Datasets for 2D and 3D Biomedical Image Classification
2d 3d automl benchmark classification dataset decathlon deep-learning federated-learning few-shot-learning machine-learning medical medical-image-analysis medical-image-computing medical-imaging medmnist mnist multi-modal pytorch
Last synced: 14 May 2025
https://github.com/nvzqz/divan
Fast and simple benchmarking for Rust projects
benchmark fast performance rust simple
Last synced: 13 May 2025
https://github.com/flow-project/flow
Computational framework for reinforcement learning in traffic control
autonomous benchmark reinforcement-learning sumo traffic-control vehicle-control
Last synced: 15 May 2025
https://github.com/Tencent/AICGSecEval
A.S.E (AICGSecEval) is a repository-level AI-generated code security evaluation benchmark developed by Tencent Wukong Code Security Team.
agent aigc benchmark codesecurity llm
Last synced: 16 Feb 2026
https://github.com/dadhi/FastExpressionCompiler
Fast Compiler for C# Expression Trees and the lightweight LightExpression alternative. Diagnostic and code generation tools for the expressions.
benchmark closure code-generation compiler delegate delegates dryioc expression-tree il-optimizations performance
Last synced: 16 Mar 2025
https://github.com/redis/memtier_benchmark
NoSQL Redis and Memcache traffic generation and benchmarking tool.
benchmark load-testing memcached redis stress-testing
Last synced: 17 Mar 2026
https://github.com/itayinbarr/little-coder
A coding agent optimized for smaller LLMs
ai-coding-assistant aider-polygot benchmark code-generation coding-agent coding-agents local-llm ollama qwen small-language-models tool-use
Last synced: 22 May 2026
https://github.com/kimwalisch/primesieve
🚀 Fast prime number generator
arm-neon arm-sve avx512 benchmark eratosthenes math number-theory prime-numbers primes primesieve sieve sieve-of-eratosthenes stress-testing
Last synced: 14 May 2025
https://github.com/zilliztech/vectordbbench
Benchmark for vector databases.
benchmark cost-effectiveness performance vector-database vector-search vectordb
Last synced: 12 Feb 2026
https://github.com/pdebench/PDEBench
PDEBench: An Extensive Benchmark for Scientific Machine Learning
ai autoregressive-models benchmark deep-learning fluid-dynamics jax machine-learning navier-stokes-equations neural-networks neural-operators partial-differential-equations physics-informed-neural-networks pytorch scientific scientific-computing sciml simulation
Last synced: 11 Feb 2026