{"id":13412585,"url":"https://github.com/AppsFlyer/go-sundheit","last_synced_at":"2025-03-14T18:31:46.112Z","repository":{"id":38418594,"uuid":"180148704","full_name":"AppsFlyer/go-sundheit","owner":"AppsFlyer","description":"A library built to provide support for defining service health for golang services. It allows you to register async health checks for your dependencies and the service itself, provides a health endpoint that exposes their status, and health metrics.","archived":false,"fork":false,"pushed_at":"2024-07-25T06:36:55.000Z","size":245,"stargazers_count":528,"open_issues_count":4,"forks_count":30,"subscribers_count":12,"default_branch":"master","last_synced_at":"2024-07-31T20:51:08.757Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AppsFlyer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-04-08T12:54:01.000Z","updated_at":"2024-07-25T06:35:12.000Z","dependencies_parsed_at":"2023-12-06T13:41:19.926Z","dependency_job_id":"7927b799-2d65-4e0f-b6a8-0c727cb9a978","html_url":"https://github.com/AppsFlyer/go-sundheit","commit_stats":null,"previous_names":[],"tags_count":20,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AppsFlyer%2Fgo-sundheit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AppsFlyer%2Fgo-sundheit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AppsFlyer%2Fgo-sundheit/releases","manifest
s_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AppsFlyer%2Fgo-sundheit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AppsFlyer","download_url":"https://codeload.github.com/AppsFlyer/go-sundheit/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243625150,"owners_count":20321241,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-30T20:01:26.453Z","updated_at":"2025-03-14T18:31:45.786Z","avatar_url":"https://github.com/AppsFlyer.png","language":"Go","funding_links":[],"categories":["Go","Distributed Systems","分布式系统","Relational Databases"],"sub_categories":["Search and Analytic Databases","Advanced Console UIs","检索及分析资料库","SQL 查询语句构建库"],"readme":"# go-sundheit\n[![Actions Status](https://github.com/AppsFlyer/go-sundheit/workflows/go-build/badge.svg)](https://github.com/AppsFlyer/go-sundheit/actions)\n[![CircleCI](https://circleci.com/gh/AppsFlyer/go-sundheit.svg?style=svg)](https://circleci.com/gh/AppsFlyer/go-sundheit)\n[![Coverage Status](https://coveralls.io/repos/github/AppsFlyer/go-sundheit/badge.svg?branch=master)](https://coveralls.io/github/AppsFlyer/go-sundheit?branch=master)\n[![Go Report Card](https://goreportcard.com/badge/github.com/AppsFlyer/go-sundheit)](https://goreportcard.com/report/github.com/AppsFlyer/go-sundheit)\n[![Godocs](https://img.shields.io/badge/golang-documentation-blue.svg)](https://godoc.org/github.com/AppsFlyer/go-sundheit)\n[![Mentioned in Awesome 
Go](https://awesome.re/mentioned-badge.svg)](https://github.com/avelino/awesome-go)  \n\n\u003cimg align=\"right\" src=\"docs/go-sundheit.png\" width=\"200\"\u003e\n\nA library built to provide support for defining service health for golang services.\nIt allows you to register async health checks for your dependencies and the service itself, \nand provides a health endpoint that exposes their status.\n\n## What's go-sundheit?\nThe project is named after the German word `Gesundheit` which means ‘health’, and it is pronounced `/ɡəˈzʊntˌhaɪ̯t/`.\n\n## Installation\nUsing Go modules:\n```\ngo get github.com/AppsFlyer/go-sundheit@v0.5.0\n```\n\n## Usage\n```go\nimport (\n\t\"fmt\"\n\t\"log\"\n\t\"net/http\"\n\t\"time\"\n\n\t\"github.com/AppsFlyer/go-sundheit\"\n\n\thealthhttp \"github.com/AppsFlyer/go-sundheit/http\"\n\t\"github.com/AppsFlyer/go-sundheit/checks\"\n)\n\nfunc main() {\n\t// create a new health instance\n\th := gosundheit.New()\n\t\n\t// define an HTTP dependency check\n\thttpCheckConf := checks.HTTPCheckConfig{\n\t\tCheckName: \"httpbin.url.check\",\n\t\tTimeout:   1 * time.Second,\n\t\t// dependency you're checking - use your own URL here...\n\t\t// this URL will fail 50% of the time\n\t\tURL:       \"http://httpbin.org/status/200,300\",\n\t}\n\t// create the HTTP check for the dependency\n\t// fail fast if the URL is misconfigured. 
Don't ignore errors!!!\n\thttpCheck, err := checks.NewHTTPCheck(httpCheckConf)\n\tif err != nil {\n\t\tfmt.Println(err)\n\t\treturn // your call...\n\t}\n\n\t// Alternatively, panic when creating a check fails\n\thttpCheck = checks.Must(checks.NewHTTPCheck(httpCheckConf))\n\n\terr = h.RegisterCheck(\n\t\thttpCheck,\n\t\tgosundheit.InitialDelay(time.Second),         // the check will run once after 1 sec\n\t\tgosundheit.ExecutionPeriod(10 * time.Second), // the check will be executed every 10 sec\n\t)\n\t\n\tif err != nil {\n\t\tfmt.Println(\"Failed to register check: \", err)\n\t\treturn // or whatever\n\t}\n\n\t// define more checks...\n\t\n\t// register a health endpoint\n\thttp.Handle(\"/admin/health.json\", healthhttp.HandleHealthJSON(h))\n\t\n\t// serve HTTP\n\tlog.Fatal(http.ListenAndServe(\":8080\", nil))\n}\n```\n### Using `Option` to Configure `Health` Service\nCreating a health service is as simple as calling:\n```go\ngosundheit.New(options ...Option)\n```\nThe optional `options` parameters let you configure the health service by passing configuration functions (implementing the `Option` signature).    \nAll option names start with the prefix `With` (`WithX`). 
Available options:\n- `WithCheckListeners` - enables you to act on check registration, start, and completion events\n- `WithHealthListeners` - enables you to act on changes in the health service results\n\n### Built-in Checks\nThe library comes with a set of built-in checks.\nCurrently implemented checks are as follows:\n\n#### HTTP built-in check\nThe HTTP check allows you to trigger an HTTP request to one of your dependencies, \nand verify the response status, and optionally the content of the response body.\nAn example is given above in the [usage](#usage) section.\n\n#### DNS built-in check(s)\nThe DNS checks allow you to perform a lookup of a given hostname / domain name / CNAME / etc., \nand validate that it resolves to at least the minimum number of required results.\n\nCreating a host lookup check is easy:\n```go\n// Schedule a host resolution check for `example.com`, requiring at least one result, and running every 10 sec\nh.RegisterCheck(\n\tchecks.NewHostResolveCheck(\"example.com\", 1),\n\tgosundheit.ExecutionPeriod(10 * time.Second),\n)\n```\n\nYou may also use the low-level `checks.NewResolveCheck`, specifying a custom `LookupFunc`, if you want to perform other kinds of lookups.\nFor example, you may register a reverse DNS lookup check like so:\n```go\nfunc ReverseDNLookup(ctx context.Context, addr string) (resolvedCount int, err error) {\n\tnames, err := net.DefaultResolver.LookupAddr(ctx, addr)\n\tresolvedCount = len(names)\n\treturn\n}\n\n//...\n\nh.RegisterCheck(\n\tchecks.NewResolveCheck(ReverseDNLookup, \"127.0.0.1\", 3),\n\tgosundheit.ExecutionPeriod(10 * time.Second),\n\tgosundheit.ExecutionTimeout(1*time.Second),\n)\n```\n\n#### Ping built-in check(s)\nThe ping checks allow you to verify that a resource is still alive and reachable.\nFor example, you can use it as a DB ping check (`sql.DB` implements the Pinger interface):\n```go\n\tdb, err := sql.Open(...)\n\tdbCheck, err := checks.NewPingCheck(\"db.check\", db)\n\t_ = 
h.RegisterCheck(\u0026gosundheit.Config{\n\t\tCheck: dbCheck,\n\t\t// ...\n\t})\n```\n\nYou can also use the ping check to test a generic connection like so:\n```go\n\t// net.Dial requires a host:port address for TCP\n\tpinger := checks.NewDialPinger(\"tcp\", \"example.com:80\")\n\tpingCheck, err := checks.NewPingCheck(\"example.com.reachable\", pinger)\n\th.RegisterCheck(pingCheck)\n``` \n\nThe `NewDialPinger` function supports all the network/address parameters supported by the `net.Dial()` function(s).\n\n### Custom Checks\nThe library provides two ways of defining a custom check.\nThe bottom line is that you need an implementation of the `Check` interface:\n```go\n// Check is the API for defining health checks.\n// A valid check has a non-empty Name() and a check (Execute()) function.\ntype Check interface {\n\t// Name is the name of the check.\n\t// Check names must be metric compatible.\n\tName() string\n\t// Execute runs a single time check, and returns an error when the check fails, and an optional details object.\n\tExecute() (details interface{}, err error)\n}\n```\nSee the examples in the following two sections.\n\n#### Use the CustomCheck struct\nThe `checks.CustomCheck` struct implements the `checks.Check` interface,\nand is the simplest way to implement a check if all you need is to define a check function.\n\nLet's define a check function that fails 50% of the time:\n```go\nfunc lotteryCheck() (details interface{}, err error) {\n\tlottery := rand.Float32()\n\tdetails = fmt.Sprintf(\"lottery=%f\", lottery)\n\tif lottery \u003c 0.5 {\n\t\terr = errors.New(\"Sorry, I failed\")\n\t}\n\treturn\n}\n```\n\nNow we register the check to start running right away, and execute once every 2 minutes with a timeout of 5 seconds:\n```go\nh := gosundheit.New()\n// ...\n\nh.RegisterCheck(\n\t\u0026checks.CustomCheck{\n\t\tCheckName: \"lottery.check\",\n\t\tCheckFunc: lotteryCheck,\n\t},\n\tgosundheit.InitialDelay(0),\n\tgosundheit.ExecutionPeriod(2 * time.Minute),\n\tgosundheit.ExecutionTimeout(5 * time.Second),\n)\n```\n\n#### 
Implement the Check interface\nSometimes you need to define a more elaborate custom check,\nfor example when you need to manage state.\nFor these cases it's best to implement the `Check` interface yourself.\n\nLet's define a more flexible version of the lottery check that lets you set the failure probability:\n```go\ntype Lottery struct {\n\tmyname string\n\tprobability float32\n}\n\nfunc (l Lottery) Execute() (details interface{}, err error) {\n\tlottery := rand.Float32()\n\tdetails = fmt.Sprintf(\"lottery=%f\", lottery)\n\tif lottery \u003c l.probability {\n\t\terr = errors.New(\"Sorry, I failed\")\n\t}\n\treturn\n}\n\nfunc (l Lottery) Name() string {\n\treturn l.myname\n}\n```\n\nThen register our custom check, scheduling it to run every 30 seconds (after a 1-second initial delay) with a 5-second timeout:\n```go\nh := gosundheit.New()\n// ...\n\nh.RegisterCheck(\n\tLottery{myname: \"custom.lottery.check\", probability: 0.3},\n\tgosundheit.InitialDelay(1*time.Second),\n\tgosundheit.ExecutionPeriod(30*time.Second),\n\tgosundheit.ExecutionTimeout(5*time.Second),\n)\n```\n\n#### Custom Checks Notes\n1. If a check takes longer than the specified execution period, the next execution will be delayed, \nbut checks will not be executed concurrently.\n1. Checks must complete within a reasonable time. If a check doesn't complete or gets hung, \nthe next check execution will be delayed. Use proper timeouts.\n1. Checks must respect the provided context. Specifically, a check must abort its execution, and return an error, if the context has been cancelled.  \n1. **A health-check name must be a metric name compatible string** \n  (i.e. 
no funky characters or spaces - just keep it simple, like `clicks-db-check`).\n  See here: https://help.datadoghq.com/hc/en-us/articles/203764705-What-are-valid-metric-names-\n\n### Expose Health Endpoint\nThe library provides an HTTP handler function for serving health stats in JSON format.\nYou can register it using your favorite HTTP implementation like so:\n```go\nhttp.Handle(\"/admin/health.json\", healthhttp.HandleHealthJSON(h))\n```\nThe endpoint can be called like so:\n```text\n~ $ curl -i http://localhost:8080/admin/health.json\nHTTP/1.1 503 Service Unavailable\nContent-Type: application/json\nDate: Tue, 22 Jan 2019 09:31:46 GMT\nContent-Length: 701\n\n{\n\t\"custom.lottery.check\": {\n\t\t\"message\": \"lottery=0.206583\",\n\t\t\"error\": {\n\t\t\t\"message\": \"Sorry, I failed\"\n\t\t},\n\t\t\"timestamp\": \"2019-01-22T11:31:44.632415432+02:00\",\n\t\t\"num_failures\": 2,\n\t\t\"first_failure_time\": \"2019-01-22T11:31:41.632400256+02:00\"\n\t},\n\t\"lottery.check\": {\n\t\t\"message\": \"lottery=0.865335\",\n\t\t\"timestamp\": \"2019-01-22T11:31:44.63244047+02:00\",\n\t\t\"num_failures\": 0,\n\t\t\"first_failure_time\": null\n\t},\n\t\"url.check\": {\n\t\t\"message\": \"http://httpbin.org/status/200,300\",\n\t\t\"error\": {\n\t\t\t\"message\": \"unexpected status code: '300' expected: '200'\"\n\t\t},\n\t\t\"timestamp\": \"2019-01-22T11:31:44.632442937+02:00\",\n\t\t\"num_failures\": 4,\n\t\t\"first_failure_time\": \"2019-01-22T11:31:38.632485339+02:00\"\n\t}\n}\n```\nOr for the shorter version:\n```text\n~ $ curl -i http://localhost:8080/admin/health.json?type=short\nHTTP/1.1 503 Service Unavailable\nContent-Type: application/json\nDate: Tue, 22 Jan 2019 09:40:19 GMT\nContent-Length: 105\n\n{\n\t\"custom.lottery.check\": \"PASS\",\n\t\"lottery.check\": \"PASS\",\n\t\"my.check\": \"FAIL\",\n\t\"url.check\": \"PASS\"\n}\n```\n\nThe `short` response type is suitable for Consul health checks / LB health checks.\n\nThe response code is `200` when 
the checks pass, and `503` when they fail.\n\n### CheckListener\nIt is sometimes desired to keep track of check execution and apply custom logic.\nFor example, you may want to add logging or external metrics to your checks, \nor trigger some recovery logic when a check fails 3 consecutive times.\n\nThe `gosundheit.CheckListener` interface allows you to hook in this custom logic.\n\nFor example, let's add a logging listener to our health repository:\n```go\ntype checkEventsLogger struct{}\n\nfunc (l checkEventsLogger) OnCheckRegistered(name string, res gosundheit.Result) {\n\tlog.Printf(\"Check %q registered with initial result: %v\\n\", name, res)\n}\n\nfunc (l checkEventsLogger) OnCheckStarted(name string) {\n\tlog.Printf(\"Check %q started...\\n\", name)\n}\n\nfunc (l checkEventsLogger) OnCheckCompleted(name string, res gosundheit.Result) {\n\tlog.Printf(\"Check %q completed with result: %v\\n\", name, res)\n}\n```\n\nTo register your listener:\n```go\nh := gosundheit.New(gosundheit.WithCheckListeners(\u0026checkEventsLogger{}))\n```\n\nPlease note that your `CheckListener` implementation must not block!\n\n### HealthListener\nIt is sometimes desired to track changes in the results of registered checks.\nFor example, you may want to log the number of results monitored, or send metrics on these results.\n\nThe `gosundheit.HealthListener` interface allows you to hook in this custom logic.\n\nFor example, let's add a logging listener:\n```go\ntype healthLogger struct{}\n\nfunc (l healthLogger) OnResultsUpdated(results map[string]gosundheit.Result) {\n\tlog.Printf(\"There are %d results, general health is %t\\n\", len(results), allHealthy(results))\n}\n```\n\nTo register your listener:\n```go\nh := gosundheit.New(gosundheit.WithHealthListeners(\u0026healthLogger{}))\n```\n\n## Metrics\nThe library can expose metrics using a `CheckListener`. 
At the moment, OpenCensus is available and exposes the following metrics:\n* `health/check_status_by_name` - An aggregated health status gauge (0/1 for fail/pass) at the time of sampling.\nThe aggregation uses the following tags:\n  * `check=allChecks`     - all checks aggregation\n  * `check=\u003ccheck-name\u003e`  - specific check aggregation\n*  `health/check_count_by_name_and_status` - Aggregated pass/fail counts for checks, with the following tags: \n   * `check=allChecks`     - all checks aggregation\n   * `check=\u003ccheck-name\u003e`  - specific check aggregation\n   * `check-passing=[true|false]` \n* `health/executeTime` - The time it took to execute a check, with the following tag:\n  * `check=\u003ccheck-name\u003e`  - specific check aggregation\n\n\nThe views can be registered like so:\n```go\nimport (\n\t\"github.com/AppsFlyer/go-sundheit\"\n\t\"github.com/AppsFlyer/go-sundheit/opencensus\"\n\t\"go.opencensus.io/stats/view\"\n)\n// This listener can act both as check and health listener for reporting metrics\noc := opencensus.NewMetricsListener()\nh := gosundheit.New(gosundheit.WithCheckListeners(oc), gosundheit.WithHealthListeners(oc))\n// ...\nview.Register(opencensus.DefaultHealthViews...)\n// or register individual views. For example:\nview.Register(opencensus.ViewCheckExecutionTime, opencensus.ViewCheckStatusByName, ...)\n```\n\n### Classification\n\nIt is sometimes required to report metrics for different check types (e.g. 
startup, liveness, readiness).\nTo report metrics with a `classification` tag, initialize the OpenCensus listener with a classification:\n\n```go\n// startup\nopencensus.NewMetricsListener(opencensus.WithStartupClassification())\n// liveness\nopencensus.NewMetricsListener(opencensus.WithLivenessClassification())\n// readiness\nopencensus.NewMetricsListener(opencensus.WithReadinessClassification())\n// custom\nopencensus.NewMetricsListener(opencensus.WithClassification(\"custom\"))\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FAppsFlyer%2Fgo-sundheit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FAppsFlyer%2Fgo-sundheit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FAppsFlyer%2Fgo-sundheit/lists"}