awesome-performance-engineering

A curated, opinionated collection of tools and resources dedicated to Performance Engineering, covering both Observability and Performance Testing.
https://github.com/be-next/awesome-performance-engineering

  • Observability

    • AI-Augmented Observability

      • Dynatrace Davis AI - 🟠 Deterministic and causal AI for topology-aware automatic root-cause analysis.
      • Datadog Watchdog - 🟠 ML-driven anomaly detection across metrics, logs, and APM data.
      • Moogsoft - 🟠 AIOps platform for alert correlation, noise reduction, and incident clustering.
      • New Relic AI - 🟠 Applied intelligence with anomaly detection, incident correlation, and natural-language querying.
      • Honeycomb BubbleUp - 🟠 Automated outlier correlation across high-cardinality dimensions.
      • Coroot - 🟢🔵 Open-source eBPF-powered observability with automated service map discovery.
    • Alerting & Incident Response

      • Alertmanager - ⭐🟢 Prometheus-native alert handling with grouping, silencing, inhibition, and routing.
      • Grafana OnCall - 🟢🔵 Open-source on-call management and alert routing with native Grafana integration.
      • Keep - 🟢🔵 Open-source alert management platform consolidating alerts from multiple sources.
      • Alerta - 🟢 Unified alert correlation and management across multiple monitoring systems.
      • PagerDuty - 🟠 Industry-standard incident response and on-call management platform.
      • Opsgenie - 🟠 Alerting and escalation platform, part of the Atlassian suite.
      • Rootly - 🟠 AI-assisted incident management with automated timelines and postmortem generation.
    • Database Observability

    • Distributed Tracing

      • OpenTelemetry - ⭐🟢🔵 Open standard for distributed tracing, metrics, and logs with language-specific SDKs and auto-instrumentation.
      • Jaeger - ⭐🟢🔵 CNCF graduated distributed tracing backend and UI, originally from Uber.
      • Grafana Tempo - ⭐🟢🔵 High-scale tracing backend requiring only object storage, with native Grafana integration.
      • Zipkin - 🟢 Pioneering distributed tracing system (Twitter, 2012) with a simple architecture.
      • Apache SkyWalking - ⭐🟢🔵 Observability platform with bytecode-injection-based tracing, popular in the Java ecosystem.
      • SigNoz - 🟢🔵 Open-source OpenTelemetry-native observability platform with unified metrics, traces, and logs.
      • Pinpoint - Bytecode-instrumentation-based APM and tracing for Java and PHP with a zero-code-change approach.
    • Legacy & Historical

      • Graphite - Pioneering time-series storage and graphing system with Whisper backend and Carbon collector.
      • Redash - SQL-first data visualization and collaboration connecting to many data sources.
    • Log Management & Log Pipelines

      • Grafana Loki - ⭐🟢🔵 Label-based log aggregation that indexes metadata instead of content for cost-efficient storage at scale.
      • Fluent Bit - ⭐🟢🔵🚀 Lightweight, high-performance log processor and forwarder for edge and containerized environments.
      • Fluentd - 🟢🔵 CNCF graduated unified logging layer with 1000+ plugins for complex routing.
      • Elasticsearch - ⭐🟢🟠 Distributed search and analytics engine with powerful full-text search capabilities.
      • OpenSearch - 🟢🔵 Community-driven, Apache-2.0-licensed fork of Elasticsearch, backed by AWS.
      • Logstash - Flexible log ingestion and transformation pipeline, part of the Elastic Stack.
      • Graylog - 🟢🟠 Centralized log management with built-in alerting and dashboards.
      • rsyslog - 🟢🚀 High-performance system logging daemon handling millions of messages per second.
    • Metrics Collection & Time-Series Storage

      • Prometheus - ⭐🟢🔵 Pull-based cloud-native metrics platform with a dimensional data model and the PromQL query language.
      • VictoriaMetrics - ⭐🟢🚀 High-performance, cost-efficient Prometheus-compatible TSDB with high-cardinality and long-retention support.
      • Thanos - ⭐🟢🔵 Long-term storage, global query view, and high availability layer for Prometheus via sidecar architecture.
      • Mimir - ⭐🟢🔵🚀 Horizontally scalable, multi-tenant Prometheus-compatible TSDB from Grafana Labs.
      • InfluxDB - 🟢🟠 Purpose-built time-series database with high write throughput and a Rust-based engine (v3).
      • Grafana Alloy - ⭐🟢🔵 OpenTelemetry-native telemetry collector supporting metrics, logs, traces, and profiles.
      • Telegraf - 🟢 Plugin-driven agent for collecting and reporting metrics with 300+ input plugins.
      • StatsD - Lightweight, UDP-based metrics aggregation daemon with broad application support.
      • Netdata - ⭐🟢🚀 Real-time per-second monitoring with built-in anomaly detection and a zero-configuration agent.
    • Monitoring Suites (Operations-Oriented)

      • Zabbix - 🟢 Enterprise-grade monitoring platform with agent-based and agentless monitoring.
      • Nagios - 🟢 Pioneering open-source check-based monitoring with an enormous plugin ecosystem.
      • Icinga - 🟢 Modern evolution of Nagios with improved APIs, configuration management, and scalability.
      • Checkmk - 🟢🟠 Infrastructure and application monitoring with auto-discovery for large environments.
    • Observability Pipelines and Telemetry Processing

      • OpenTelemetry Collector - ⭐🟢🔵 Standard telemetry processing pipeline with receivers, processors, and exporters for any signal.
      • Vector - 🟢🚀 End-to-end observability data routing and transformation with programmable VRL transforms.
      • Logstash - ETL-style processing for observability data with powerful filter plugins.
      • Cribl Stream - 🟠🚀 Commercial observability pipeline for routing, reducing, and enriching telemetry data.
    • Observability Platforms (Integrated)

      • Datadog - 🟠 SaaS observability platform with AI-powered anomaly detection and root-cause analysis.
      • Dynatrace - 🟠 AI-driven observability with automatic topology discovery and root-cause analysis (Davis AI).
      • New Relic - 🟠 Developer-centric observability platform with NRQL query language and a generous free tier.
      • Splunk Observability - 🟠 Observability built on Splunk's machine data analytics platform.
      • Elastic Observability - 🟠 Observability solution built on the Elastic Stack with self-managed and cloud options.
      • Honeycomb - 🟠 Observability platform for high-cardinality event data with BubbleUp automated correlation.
      • Grafana Cloud - 🟠 Managed Grafana stack (Mimir, Loki, Tempo, Pyroscope) with a generous free tier.
      • Instana (IBM) - 🟠 Automatic infrastructure and application discovery with real-time observability.
      • AppDynamics (Splunk/Cisco) - 🟠 Enterprise APM with business transaction monitoring and code-level diagnostics.
      • Chronosphere - 🟠 Cloud-native observability platform focused on metrics at scale with cost control.
      • Lightstep / ServiceNow Cloud Observability - 🟠 OpenTelemetry-native observability platform, now part of ServiceNow.
      • Sematext - 🟢🟠 SaaS observability platform with OpenTelemetry-native support and topology discovery.
    • Profiling & Continuous Performance Analysis

      • Parca - ⭐🟢🔵 eBPF-based continuous profiling platform with zero-instrumentation and differential flame graphs (CNCF sandbox).
      • Grafana Pyroscope - ⭐🟢🔵 Continuous profiling with flame graph visualization and multi-language support.
      • async-profiler - 🟢🚀 Low-overhead JVM sampling profiler capturing CPU, allocation, and lock contention profiles.
      • perf - 🚀 Linux kernel performance analysis tool with hardware counters, tracepoints, and sampling.
      • bpftrace - 🟢🚀 High-level tracing language for Linux eBPF with dynamic kernel and user-space tracing.
      • bcc (BPF Compiler Collection) - 🟢🚀 Toolkit for creating eBPF-based tracing programs with dozens of ready-to-use tools.
      • Grafana Beyla - 🟢🔵🚀 eBPF-based zero-code auto-instrumentation generating RED metrics and distributed traces.
      • Perfetto - 🟢 System-wide tracing and profiling toolkit from Google for Android, Chrome, and general system analysis.
    • Real User Monitoring (RUM) & Frontend Observability

      • Sentry - 🟢 Error tracking and performance monitoring with session replay and Web Vitals.
      • Grafana Faro - 🟢🔵 Open-source frontend observability SDK capturing errors, performance, and user events.
      • OpenTelemetry Browser SDK - 🟢 OTel instrumentation for web applications capturing page loads and resource timings.
      • LogRocket - 🟠 Session replay combined with frontend performance monitoring.
    • Service Mesh Observability

      • Kiali - 🟢🔵 Observability console for Istio with topology visualization and traffic flow analysis.
      • Linkerd Viz - 🟢🔵 Built-in telemetry and dashboard for Linkerd service mesh.
      • Hubble - 🟢🔵🚀 eBPF-powered network observability for Cilium with L3/L4/L7 flow visibility.
    • SLO Management

      • Sloth - 🟢🔵 SLO generation for Prometheus with YAML definitions and multi-window multi-burn-rate alerts.
      • Pyrra - 🟢🔵 Kubernetes-native SLO management generating Prometheus recording rules and alerts.
      • OpenSLO - 🟢 Open, vendor-neutral specification for defining SLOs as code.
      • Nobl9 - 🟠 Enterprise SLO platform with unified tracking and error budget management.
    • Synthetic Monitoring

      • Checkly - 🟢🔵 Monitoring as code for APIs and browsers with Playwright-based synthetic checks.
      • Grafana Synthetic Monitoring - 🟢🔵 Probe-based multi-location synthetic monitoring integrated into Grafana Cloud.
      • Uptime Kuma - ⭐🟢 Self-hosted monitoring tool with HTTP, TCP, DNS, and keyword checks.
    • Visualization & Dashboards

      • Grafana - ⭐🟢 Open-source observability dashboard platform supporting 100+ data sources with alerting and annotations.
      • Kibana - 🟢🟠 Visualization and log exploration for Elasticsearch and OpenSearch data.
      • OpenSearch Dashboards - 🟢🔵 Open-source fork of Kibana for OpenSearch.
      • Apache Superset - 🟢 SQL-first analytics and dashboarding platform for ad-hoc data exploration.
      • Perses - 🟢🔵 CNCF sandbox dashboards-as-code project with native PromQL and TraceQL support.
  • Performance Testing

    • API Testing & Contract Testing

      • Hurl - 🟢 Plain-text HTTP request runner for API testing in CI with assertions and chaining.
      • Postman - ⭐🟢🟠 API development and testing platform with Newman CLI for CI/CD integration.
      • REST-assured - 🟢 Java DSL for testing REST APIs with fluent syntax and JUnit/TestNG integration.
      • Karate - 🟢 BDD-style API testing framework combining API testing, mocking, and performance testing.
      • Step CI - 🟢 Open-source YAML-based API testing and monitoring framework for CI/CD.
      • Pact - 🟢 Contract testing framework ensuring provider-consumer compatibility for HTTP APIs and messaging.
      • Dredd - API testing tool that validates implementations against OpenAPI and API Blueprint specifications.
    • Browser & Frontend Performance

      • Lighthouse - ⭐🟢 Google's auditing tool for performance, accessibility, and SEO with actionable scores.
      • WebPageTest - ⭐🟢 Web performance analysis with filmstrip views, waterfall charts, and multi-location testing.
      • Playwright - ⭐🟢 Browser automation framework with built-in performance timing APIs for Chromium, Firefox, and WebKit.
      • Sitespeed.io - 🟢 Open-source web performance monitoring integrating Lighthouse, WebPageTest, and Grafana dashboards.
      • Puppeteer - 🟢 Chrome DevTools Protocol API enabling programmatic access to performance traces and network interception.
      • Yellow Lab Tools - 🟢 Frontend code quality and performance auditing for JavaScript, CSS, and rendering issues.
      • SpeedCurve - 🟠 Continuous frontend performance monitoring with Core Web Vitals tracking and competitive benchmarking.
    • Chaos Engineering & Fault Injection

      • Litmus - ⭐🟢🔵 CNCF incubating Kubernetes chaos engineering platform with an extensive experiment library.
      • Chaos Mesh - ⭐🟢🔵 CNCF incubating Kubernetes-native chaos platform with pod, network, and I/O fault injection.
      • Gremlin - 🟠 Enterprise chaos engineering platform with managed experiments and safety controls.
      • Chaos Monkey - ⭐🟢 Netflix's pioneering chaos tool that randomly terminates instances in production.
      • Pumba - 🟢🔵 Chaos testing for Docker containers with network delay and packet loss injection.
      • Steadybit - 🟠🔵 Enterprise reliability platform combining chaos engineering with resilience validation.
      • AWS Fault Injection Service - 🟠🔵 Managed fault injection for AWS resources with native service integration.
    • CI/CD Integration & Performance Gates

      • Gatling Enterprise - 🟠 Managed Gatling execution with CI/CD integrations and historical comparison.
      • Lighthouse CI - 🟢 Run Lighthouse in CI with performance budgets, baseline comparison, and trend tracking.
      • Taurus - 🟢 YAML-based automation wrapper for JMeter, Gatling, and Locust with unified reporting.
    • Cloud Provider Services

      • AWS Distributed Load Testing - 🟠🔵 Distributed load testing architecture on AWS via CloudFormation supporting JMeter, k6, and Locust.
      • Azure App Testing - 🟠🔵 Microsoft's managed load testing service supporting JMeter and Locust with multi-region simulation.
    • Database Performance Testing & Benchmarking

      • HammerDB - ⭐🟢 Open-source database benchmarking tool supporting TPC-C and TPC-H workloads across major databases.
      • sysbench - ⭐🟢🚀 Scriptable multi-threaded benchmark tool for OLTP, CPU, memory, and I/O tests.
      • pgbench - 🟢 PostgreSQL built-in benchmarking tool with custom scripts for workload simulation.
      • YCSB (Yahoo! Cloud Serving Benchmark) - ⭐🟢 Framework for benchmarking NoSQL and NewSQL databases with standard workloads.
      • BenchBase (formerly OLTPBench) - 🟢 Multi-DBMS benchmarking framework supporting TPC-C, TPC-H, and YCSB workloads.
      • mysqlslap - MySQL built-in load emulation client for quick benchmarks.
    • Developer-Centric Platforms

      • Grafana k6 Cloud - 🟠 Managed k6 execution with multi-region load zones and real-time Grafana visualization.
      • OctoPerf - 🟠 SaaS performance testing platform built on JMeter with distributed load generation.
    • Enterprise Platforms

      • BlazeMeter - 🟠 Cloud performance testing platform supporting JMeter, Gatling, Locust, Selenium, and Playwright.
    • gRPC & Protocol-Specific Testing

      • ghz - 🟢🚀 gRPC benchmarking and load testing tool supporting unary and streaming RPCs.
      • k6 + xk6-grpc - 🟢🔵 k6 extension for scriptable gRPC load testing scenarios.
      • k6 + xk6-kafka - 🟢🔵 k6 extension for Apache Kafka load testing at scale.
      • kafka-producer-perf-test / kafka-consumer-perf-test - 🟢 Built-in Kafka benchmarking tools for producer and consumer throughput.
      • RabbitMQ PerfTest - 🟢 Official RabbitMQ benchmarking tool for throughput and latency measurement.
      • k6 + xk6-websockets - 🟢🔵 Built-in k6 WebSocket support for testing real-time and bidirectional protocols.
    • HTTP Benchmarking & Micro-Benchmarking

      • wrk2 - 🚀 Constant-throughput HTTP benchmarking with accurate latency histograms that avoids coordinated omission.
      • wrk - 🚀 HTTP benchmarking tool with Lua scripting for quick relative performance comparisons.
      • Vegeta - 🟢🚀 HTTP load testing tool with constant request rate mode and built-in plotting.
      • hey - 🟢 Simple HTTP load generator, written as a modern replacement for ApacheBench (ab).
      • oha - 🟢🚀 Rust-based HTTP load generator with real-time TUI.
      • bombardier - 🟢🚀 Fast, cross-platform HTTP benchmarking tool with detailed latency reporting.
      • Hyperfoil - 🟢🔵🚀 Distributed benchmarking framework designed to avoid coordinated omission.
    • Load & Stress Testing

      • k6 - ⭐🟢🔵 Modern load testing tool with JavaScript ES6 scripting and native Prometheus/Grafana integration.
      • Gatling - ⭐🟢🚀 High-performance load testing framework with Scala/Java/Kotlin DSL and detailed HTML reports.
      • Locust - ⭐🟢 Python-based load testing framework defining user behavior in plain Python code.
      • Apache JMeter - ⭐🟢 Load testing tool with GUI and extensive protocol support (HTTP, JDBC, JMS, LDAP, SOAP).
      • Artillery - 🟢🔵 Node.js-based load testing toolkit with YAML scenarios supporting HTTP, WebSocket, and Socket.io.
      • NBomber - 🟢 Load testing framework for .NET with C#/F# scripting.
      • Tsung - 🚀 Erlang-based distributed load testing tool handling massive concurrent connections across multiple protocols.
      • GoReplay (gor) - 🟢🚀 Capture and replay production HTTP traffic for load testing with real traffic patterns.
      • Anteon (formerly Ddosify) - 🔵 eBPF-based Kubernetes performance testing platform with distributed load generation.
      • NeoLoad - 🟠 Enterprise performance testing platform with codeless and as-code options.
      • LoadRunner / OpenText - 🟠 Enterprise performance testing platform with broad protocol support.
    • Network Simulation & Traffic Shaping

      • tc (Traffic Control) - Linux kernel traffic shaping with netem qdisc for network emulation.
      • Comcast - CLI tool that simulates poor network conditions by wrapping tc/pfctl.
      • Clumsy - 🟢 Windows network condition simulator for packet drop, lag, throttle, and reordering.
    • Results Analysis & Reporting

      • k6 HTML Report - 🟢 Standalone HTML report generator for k6 test results.
      • HdrHistogram - 🟢🚀 High Dynamic Range Histogram for accurate latency measurement capturing the full distribution.
      • Gatling Reports - 🟢 Built-in HTML reports with percentile distributions and response time series.
      • Apache JMeter Dashboard - 🟢 Built-in HTML dashboard generating APDEX scores and response time distributions.
      • Taurus Reporting - 🟢 Unified reporting across multiple load testing engines with BlazeMeter integration.
    • Service Virtualization and Mocking

      • WireMock - ⭐🟢🔵 HTTP mock server with request matching, stateful behavior, response templating, and fault injection.
      • Mountebank - 🟢 Multi-protocol service virtualization supporting HTTP, HTTPS, TCP, and SMTP.
      • Hoverfly - 🟢🔵 Lightweight service virtualization with capture-and-replay mode for API simulation.
      • MockServer - 🟢 HTTP/HTTPS mock server with expectation-based matching and callback actions.
      • Microcks - 🟢🔵 Kubernetes-native API mocking and testing importing OpenAPI, AsyncAPI, gRPC, and GraphQL contracts.
    • Synthetic Data Generation

      • Faker - ⭐🟢 Realistic fake data generation for JavaScript/TypeScript with massive locale support.
      • DataFaker - 🟢 Modern Java data generation library with expression-based generation.
      • Mimesis - 🟢🚀 High-performance fake data generator for Python with strong locale support.
      • Neosync - 🔵 Open-source platform for anonymizing production data and generating synthetic datasets.
    • System & Infrastructure Benchmarking

      • fio - ⭐🟢🚀 Reference I/O benchmarking tool with configurable workloads and multiple engines (libaio, io_uring).
      • stress-ng - 🟢🚀 System stress testing tool with 300+ methods covering CPU, memory, I/O, and network.
      • Phoronix Test Suite - 🟢 Comprehensive benchmarking platform with 500+ test profiles and result comparison.
      • iperf3 - ⭐🟢🚀 Network bandwidth measurement tool for TCP/UDP throughput testing.
    • Tools & Integrations