{"id":13412560,"url":"https://github.com/chrislusf/glow","last_synced_at":"2025-05-15T03:09:01.130Z","repository":{"id":33738637,"uuid":"37393283","full_name":"chrislusf/glow","owner":"chrislusf","description":"Glow is an easy-to-use distributed computation system written in Go, similar to Hadoop Map Reduce, Spark, Flink, Storm, etc. I am also working on another similar pure Go system, https://github.com/chrislusf/gleam , which is more flexible and more performant.","archived":false,"fork":false,"pushed_at":"2018-11-02T06:09:14.000Z","size":35626,"stargazers_count":3216,"open_issues_count":15,"forks_count":248,"subscribers_count":144,"default_branch":"master","last_synced_at":"2025-04-14T03:11:45.239Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chrislusf.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE-Apache 
2.0.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-06-14T00:33:48.000Z","updated_at":"2025-04-02T12:48:31.000Z","dependencies_parsed_at":"2022-08-09T09:15:29.064Z","dependency_job_id":null,"html_url":"https://github.com/chrislusf/glow","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chrislusf%2Fglow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chrislusf%2Fglow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chrislusf%2Fglow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chrislusf%2Fglow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chrislusf","download_url":"https://codeload.github.com/chrislusf/glow/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254264770,"owners_count":22041794,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-30T20:01:26.134Z","updated_at":"2025-05-15T03:08:56.113Z","avatar_url":"https://github.com/chrislusf.png","language":"Go","funding_links":[],"categories":["Distributed Systems","开源类库","Misc","Open source library","Go","Relational Databases","分布式系统","分佈式系統","\u003cspan id=\"分布式系统-distributed-systems\"\u003e分布式系统 Distributed Systems\u003c/span\u003e"],"sub_categories":["Search and Analytic Databases","Advanced 
Console UIs","机器学习","Machine Learning","检索及分析资料库","高级控制台界面","SQL 查询语句构建库","高級控制台界面","\u003cspan id=\"高级控制台用户界面-advanced-console-uis\"\u003e高级控制台用户界面 Advanced Console UIs\u003c/span\u003e"],"readme":"# glow\n[![Build Status](https://travis-ci.org/chrislusf/glow.svg?branch=master)](https://travis-ci.org/chrislusf/glow)\n[![GoDoc](https://godoc.org/github.com/chrislusf/glow?status.svg)](https://godoc.org/github.com/chrislusf/glow)\n\n# Purpose\n\nGlow provides a library for computing easily in parallel threads or distributed across clusters of machines. It is written in pure Go.\n\nI am also working on another pure-Go system, https://github.com/chrislusf/gleam , which is more flexible and more performant.\n\n# Installation\n```\n$ go get github.com/chrislusf/glow\n$ go get github.com/chrislusf/glow/flow\n```\n\n# One minute tutorial\n\n## Simple Start\n\nHere is a simple full example:\n\n```go\npackage main\n\nimport (\n\t\"flag\"\n\t\"strings\"\n\n\t\"github.com/chrislusf/glow/flow\"\n)\n\nfunc main() {\n\tflag.Parse()\n\n\tflow.New().TextFile(\n\t\t\"/etc/passwd\", 3,\n\t).Filter(func(line string) bool {\n\t\treturn !strings.HasPrefix(line, \"#\")\n\t}).Map(func(line string, ch chan string) {\n\t\tfor _, token := range strings.Split(line, \":\") {\n\t\t\tch \u003c- token\n\t\t}\n\t}).Map(func(key string) int {\n\t\treturn 1\n\t}).Reduce(func(x int, y int) int {\n\t\treturn x + y\n\t}).Map(func(x int) {\n\t\tprintln(\"count:\", x)\n\t}).Run()\n}\n\n```\n\nTry it.\n```\n  $ ./word_count\n```\n\nIt will read the input text file, '/etc/passwd', in 3 goroutines, run filter/map/map on each, then reduce the results to one number in one goroutine (not exactly one goroutine, but let's skip the details for now) and print it out.\n\nThis is useful already, saving lots of idiomatic but repetitive code on channels, sync wait, etc., to fully utilize more CPU cores.\n\nHowever, there is one more thing! 
It can run across a Glow cluster, which can span multiple servers/racks/data centers!\n\n## Scale it out\nTo set up the Glow cluster, we do not need experts on Zookeeper/HDFS/Mesos/YARN etc. Just build or download one binary file.\n\n### Setup the cluster\n```shell\n  # Fetch and install via go, or just download it from somewhere.\n  $ go get github.com/chrislusf/glow\n  # Run a script from the root directory of the repo to start a test cluster.\n  $ etc/start_local_glow_cluster.sh\n```\nGlow Master and Glow Agent run very efficiently. They take about 6.5MB and 5.5MB of memory respectively in my environment. I recommend setting up agents on any server you can find. You can tap into the computing power whenever you need to.\n\n### Start the driver program\nTo leap from one computer to clusters of computers, add this line to the import list:\n\n```go\n\t_ \"github.com/chrislusf/glow/driver\"\n```\n\nAnd put this line as the first statement in the main() function:\n\n```go\n\tflag.Parse()\n```\n\nThis will \"steroidize\" the code to run in cluster mode!\n\n```\n$ ./word_count -glow -glow.leader=\"localhost:8930\"\n```\nThe word_count program will become a driver program, dividing the execution into a directed acyclic graph (DAG) and sending tasks to agents.\n\n### Visualize the flow\n\nTo understand how each executor works, you can visualize the flow by generating a dot file of the flow and rendering it to a PNG file via the \"dot\" command provided by Graphviz.\n```\n$ ./word_count -glow -glow.flow.plot \u003e x.dot\n$ dot -Tpng -otestSelfJoin.png x.dot\n```\n\n![Glow Hello World Execution Plan](https://raw.githubusercontent.com/chrislusf/glow/master/etc/helloworld.png)\n\n# Read More\n\n1. Wiki page: https://github.com/chrislusf/glow/wiki\n2. Mailing list: https://groups.google.com/forum/#!forum/glow-user-discussion\n3. Examples: https://github.com/chrislusf/glow/tree/master/examples/\n\n\n## Docker container\nDocker is not required. 
But if you like Docker, here are the instructions.\n\n```\n# Cross compile artefact for docker\n$ GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build .\n# build container\n$ docker build -t glow .\n```\nSee the `examples/` directory for docker-compose setups.\n\n# Contribution\nStart using it! Report or fix any issues you see, and add any features you want.\n\nFork it, code it, and send pull requests. It is better to first discuss the feature you want on the mailing list.\nhttps://groups.google.com/forum/#!forum/glow-user-discussion\n\n# License\nhttp://www.apache.org/licenses/LICENSE-2.0\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchrislusf%2Fglow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchrislusf%2Fglow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchrislusf%2Fglow/lists"}