{"id":13412117,"url":"https://github.com/HDT3213/rdb","last_synced_at":"2025-03-14T17:31:35.266Z","repository":{"id":37340323,"uuid":"426669980","full_name":"HDT3213/rdb","owner":"HDT3213","description":"Golang implemented  Redis RDB parser for secondary development and memory analysis","archived":false,"fork":false,"pushed_at":"2024-08-26T12:43:08.000Z","size":582,"stargazers_count":475,"open_issues_count":2,"forks_count":104,"subscribers_count":23,"default_branch":"master","last_synced_at":"2025-03-12T01:07:05.565Z","etag":null,"topics":["analyzer","go","parser","rdb","redis"],"latest_commit_sha":null,"homepage":"https://www.cnblogs.com/Finley/p/16251360.html","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/HDT3213.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-11-10T15:14:53.000Z","updated_at":"2025-03-07T09:54:19.000Z","dependencies_parsed_at":"2023-11-22T02:30:44.484Z","dependency_job_id":"7db1d061-89c5-461e-a1a2-20ae4ebd4d57","html_url":"https://github.com/HDT3213/rdb","commit_stats":{"total_commits":85,"total_committers":10,"mean_commits":8.5,"dds":"0.14117647058823535","last_synced_commit":"8bbf93a9dd95c2e4760438fa9dff1a83f57b44d1"},"previous_names":[],"tags_count":19,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HDT3213%2Frdb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HDT3213%2Frdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HDT3213%2Frdb/releases","manifes
ts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HDT3213%2Frdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/HDT3213","download_url":"https://codeload.github.com/HDT3213/rdb/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243136021,"owners_count":20241990,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analyzer","go","parser","rdb","redis"],"created_at":"2024-07-30T20:01:21.177Z","updated_at":"2025-03-14T17:31:35.220Z","avatar_url":"https://github.com/HDT3213.png","language":"Go","funding_links":[],"categories":["Data Integration Frameworks","开源类库","Database","Open source library","数据库","Go","Generators"],"sub_categories":["Database Tools","数据库","Database","数据库工具"],"readme":"![license](https://img.shields.io/github/license/HDT3213/rdb)\n![download](https://img.shields.io/github/downloads/hdt3213/rdb/total)\n[![Go Reference](https://pkg.go.dev/badge/github.com/hdt3213/rdb.svg)](https://pkg.go.dev/github.com/hdt3213/rdb)\n\u003cbr\u003e\n[![Build Status](https://github.com/hdt3213/rdb/actions/workflows/main.yml/badge.svg)](https://github.com/HDT3213/rdb/actions?query=branch%3Amaster)\n[![Coverage Status](https://coveralls.io/repos/github/HDT3213/rdb/badge.svg?branch=master)](https://coveralls.io/github/HDT3213/rdb?branch=master)\n[![Go Report Card](https://goreportcard.com/badge/github.com/HDT3213/rdb)](https://goreportcard.com/report/github.com/HDT3213/rdb)\n\u003cbr\u003e\n[![Mentioned in Awesome 
Go](https://awesome.re/mentioned-badge-flat.svg)](https://github.com/avelino/awesome-go)\n\n[中文版](https://github.com/HDT3213/rdb/blob/master/README_CN.md)\n\nThis is a Redis RDB parser implemented in Go, intended for secondary development and memory analysis.\n\nIt can:\n\n- Generate a memory report for an RDB file\n- Convert RDB files to JSON\n- Convert RDB files to Redis Serialization Protocol (or AOF file)\n- Find the biggest N keys in RDB files\n- Draw a flame graph to analyze which kinds of keys occupy the most memory\n- Customize data usage\n- Generate RDB files\n\nSupported RDB versions: 1 \u003c= version \u003c= 12 (Redis 7.2)\n\nIf you read Chinese, you can find a thorough introduction to the RDB file format here: [Golang 实现 Redis(11): RDB 文件格式](https://www.cnblogs.com/Finley/p/16251360.html)\n\nThanks to sripathikrishnan for his [redis-rdb-tools](https://github.com/sripathikrishnan/redis-rdb-tools)\n\n# Install\n\nIf you have `go` installed on your computer, simply run:\n\n```\ngo install github.com/hdt3213/rdb@latest\n```\n\n### Package Managers\n\nIf you're a [Homebrew](https://brew.sh/) user, you can install [rdb](https://formulae.brew.sh/formula/rdb) via:\n\n```sh\n$ brew install rdb\n```\n\nOr, you can download an executable binary from [releases](https://github.com/HDT3213/rdb/releases) and add its location to\nthe PATH environment variable.\n\nRun the `rdb` command in a terminal to see its manual:\n\n```\nThis is a tool to parse Redis' RDB files\nOptions:\n  -c command, including: json/memory/aof/bigkey/prefix/flamegraph\n  -o output file path, if there is no `-o` option, output to stdout\n  -n number of result, using in command: bigkey/prefix\n  -port listen port for flame graph web service\n  -sep separator for flamegraph, rdb will separate key by it, default value is \":\". 
\n                supporting multi separators: -sep sep1 -sep sep2 \n  -regex using regex expression filter keys\n  -no-expired reserve expired keys\n\nExamples:\nparameters between '[' and ']' is optional\n1. convert rdb to json\n  rdb -c json -o dump.json dump.rdb\n2. generate memory report\n  rdb -c memory -o memory.csv dump.rdb\n3. convert to aof file\n  rdb -c aof -o dump.aof dump.rdb\n4. get largest keys\n  rdb -c bigkey [-o dump.aof] [-n 10] dump.rdb\n5. get number and size by prefix\n  rdb -c prefix [-n 10] [-max-depth 3] [-o prefix-report.csv] dump.rdb\n6. draw flamegraph\n  rdb -c flamegraph [-port 16379] [-sep :] dump.rdb\n```\n\n# Convert to Json\n\nUsage:\n\n```\nrdb -c json -o \u003coutput_path\u003e \u003csource_path\u003e\n```\n\nExample:\n\n```\nrdb -c json -o intset_16.json cases/intset_16.rdb\n```\n\nYou can find sample RDB files in [cases](https://github.com/HDT3213/rdb/tree/master/cases)\n\nExample JSON output:\n\n```json\n[\n    {\"db\":0,\"key\":\"hash\",\"size\":64,\"type\":\"hash\",\"hash\":{\"ca32mbn2k3tp41iu\":\"ca32mbn2k3tp41iu\",\"mddbhxnzsbklyp8c\":\"mddbhxnzsbklyp8c\"}},\n    {\"db\":0,\"key\":\"string\",\"size\":10,\"type\":\"string\",\"value\":\"aaaaaaa\"},\n    {\"db\":0,\"key\":\"expiration\",\"expiration\":\"2022-02-18T06:15:29.18+08:00\",\"size\":8,\"type\":\"string\",\"value\":\"zxcvb\"},\n    {\"db\":0,\"key\":\"list\",\"expiration\":\"2022-02-18T06:15:29.18+08:00\",\"size\":66,\"type\":\"list\",\"values\":[\"7fbn7xhcnu\",\"lmproj6c2e\",\"e5lom29act\",\"yy3ux925do\"]},\n    {\"db\":0,\"key\":\"zset\",\"expiration\":\"2022-02-18T06:15:29.18+08:00\",\"size\":57,\"type\":\"zset\",\"entries\":[{\"member\":\"zn4ejjo4ths63irg\",\"score\":1},{\"member\":\"1ik4jifkg6olxf5n\",\"score\":2}]},\n    {\"db\":0,\"key\":\"set\",\"expiration\":\"2022-02-18T06:15:29.18+08:00\",\"size\":39,\"type\":\"set\",\"members\":[\"2hzm5rnmkmwb3zqd\",\"tdje6bk22c6ddlrw\"]}\n]\n```\n\n\u003cdetails\u003e\n\u003csummary\u003eJSON Format 
Detail\u003c/summary\u003e\n  \n## string \n\n```json\n{\n    \"db\": 0,\n    \"key\": \"string\",\n    \"size\": 10, // estimated memory size\n    \"type\": \"string\",\n\t\"expiration\":\"2022-02-18T06:15:29.18+08:00\",\n    \"value\": \"aaaaaaa\"\n}\n```\n\n## list\n\n```json\n{\n    \"db\": 0,\n    \"key\": \"list\",\n    \"expiration\": \"2022-02-18T06:15:29.18+08:00\",\n    \"size\": 66,\n    \"type\": \"list\",\n    \"values\": [\n        \"7fbn7xhcnu\",\n        \"lmproj6c2e\",\n        \"e5lom29act\",\n        \"yy3ux925do\"\n    ]\n}\n```\n\n## set\n\n```json\n{\n    \"db\": 0,\n    \"key\": \"set\",\n    \"expiration\": \"2022-02-18T06:15:29.18+08:00\",\n    \"size\": 39,\n    \"type\": \"set\",\n    \"members\": [\n        \"2hzm5rnmkmwb3zqd\",\n        \"tdje6bk22c6ddlrw\"\n    ]\n}\n```\n\n## hash\n\n```json\n{\n    \"db\": 0,\n    \"key\": \"hash\",\n    \"size\": 64,\n    \"type\": \"hash\",\n\t\"expiration\": \"2022-02-18T06:15:29.18+08:00\",\n    \"hash\": {\n        \"ca32mbn2k3tp41iu\": \"ca32mbn2k3tp41iu\",\n        \"mddbhxnzsbklyp8c\": \"mddbhxnzsbklyp8c\"\n    }\n}\n```\n\n## zset\n\n```json\n{\n    \"db\": 0,\n    \"key\": \"zset\",\n    \"expiration\": \"2022-02-18T06:15:29.18+08:00\",\n    \"size\": 57,\n    \"type\": \"zset\",\n    \"entries\": [\n        {\n            \"member\": \"zn4ejjo4ths63irg\",\n            \"score\": 1\n        },\n        {\n            \"member\": \"1ik4jifkg6olxf5n\",\n            \"score\": 2\n        }\n    ]\n}\n```\n\n## stream\n\n```json\n{\n    \"db\": 0,\n    \"key\": \"mystream\",\n    \"size\": 1776,\n    \"type\": \"stream\",\n    \"encoding\": \"\",\n    \"version\": 3, // Version 2 means is RDB_TYPE_STREAM_LISTPACKS_2, 3 means is RDB_TYPE_STREAM_LISTPACKS_3\n\t// StreamEntry is a node in the underlying radix tree of redis stream, of type listpacks, which contains several messages. 
There is no need to care about which entry the message belongs to when using it.\n    \"entries\": [ \n        {\n            \"firstMsgId\": \"1704557973866-0\", // ID of the master entry at listpack head \n            \"fields\": [ // master fields, used for compressing size\n                \"name\",\n                \"surname\"\n            ],\n            \"msgs\": [ // messages in entry\n                {\n                    \"id\": \"1704557973866-0\",\n                    \"fields\": {\n                        \"name\": \"Sara\",\n                        \"surname\": \"OConnor\"\n                    },\n                    \"deleted\": false\n                }\n            ]\n        }\n    ],\n    \"groups\": [ // consumer groups\n        {\n            \"name\": \"consumer-group-name\",\n            \"lastId\": \"1704557973866-0\",\n            \"pending\": [ // pending messages\n                {\n                    \"id\": \"1704557973866-0\",\n                    \"deliveryTime\": 1704557998397,\n                    \"deliveryCount\": 1\n                }\n            ],\n            \"consumers\": [ // consumers in the group\n                {\n                    \"name\": \"consumer-name\",\n                    \"seenTime\": 1704557998397,\n                    \"pending\": [\n                        \"1704557973866-0\"\n                    ],\n                    \"activeTime\": 1704557998397\n                }\n            ],\n            \"entriesRead\": 1\n        }\n    ],\n    \"len\": 1, // current number of messages inside this stream\n    \"lastId\": \"1704557973866-0\",\n    \"firstId\": \"1704557973866-0\",\n    \"maxDeletedId\": \"0-0\",\n    \"addedEntriesCount\": 1\n}\n```\n\n\u003c/details\u003e\n\n# Generate Memory Report\n\nRDB uses rdb encoded size to estimate redis memory usage.\n\n```bash\nrdb -c memory -o \u003coutput_path\u003e \u003csource_path\u003e\n```\n\nExample:\n\n```bash\nrdb -c memory -o mem.csv 
cases/memory.rdb\n```\n\nExample CSV output:\n\n```csv\ndatabase,key,type,size,size_readable,element_count\n0,hash,hash,64,64B,2\n0,s,string,10,10B,0\n0,e,string,8,8B,0\n0,list,list,66,66B,4\n0,zset,zset,57,57B,2\n0,large,string,2056,2K,0\n0,set,set,39,39B,2\n```\n\n# Analyze By Prefix\n\nIf modules can be distinguished by key prefix (for example, user data under `User:\u003cuid\u003e`, posts under `Post:\u003cpostid\u003e`, user statistics under `Stat:User:???`, and post statistics under `Stat:Post:???`), prefix analysis reports the memory usage of each module:\n\n```csv\ndatabase,prefix,size,size_readable,key_count\n0,Post:,1170456184,1.1G,701821\n0,Stat:,405483812,386.7M,3759832\n0,Stat:Post:,291081520,277.6M,2775043\n0,User:,241572272,230.4M,265810\n0,Topic:,171146778,163.2M,694498\n0,Topic:Post:,163635096,156.1M,693758\n0,Stat:Post:View,133201208,127M,1387516\n0,Stat:User:,114395916,109.1M,984724\n0,Stat:Post:Comment:,80178504,76.5M,693758\n0,Stat:Post:Like:,77701688,74.1M,693768\n```\n\nFormat:\n\n```bash\nrdb -c prefix [-n \u003ctop-n\u003e] [-max-depth \u003cmax-depth\u003e] -o \u003coutput_path\u003e \u003csource_path\u003e\n```\n\n- Results are sorted by memory usage in descending order. The `-n` option limits the number of prefixes in the output; by default all of them are output.\n\n- `-max-depth` can limit the maximum depth of the prefix tree. 
In the above example, the depth of `Stat:` is 1, and the depth of `Stat:User:` and `Stat:Post:` is 2.\n\nExample:\n\n```bash\nrdb -c prefix -n 10 -max-depth 2 -o prefix.csv cases/memory.rdb\n```\n\n# Flame Graph\n\nIn many cases it is not a few very large keys but a large number of small keys that occupy most of the memory.\n\nThe RDB tool can split keys by the given delimiters, then aggregate keys with the same prefix.\n\nFinally, the RDB tool presents the result as a flame graph, from which you can find out which kinds of keys consume the most\nmemory.\n\n![Screenshot 2022-10-30 12.06.00.png](https://s2.loli.net/2022/11/08/HW9ZxGfeEzArUhM.png)\n\nIn this example, the keys matching the pattern `Comment:*` use 8.463% of the memory.\n\nUsage:\n\n```\nrdb -c flamegraph [-port \u003cport\u003e] [-sep \u003cseparator1\u003e] [-sep \u003cseparator2\u003e] \u003csource_path\u003e\n```\n\nExample:\n\n```\nrdb -c flamegraph -port 16379 -sep : dump.rdb\n```\n\n# Find The Biggest Keys\n\nRDB can find the N biggest keys in a file:\n\n```\nrdb -c bigkey -n \u003cresult_number\u003e \u003csource_path\u003e\n```\n\nExample:\n\n```\nrdb -c bigkey -n 5 cases/memory.rdb\n```\n\nExample CSV output:\n\n```csv\ndatabase,key,type,size,size_readable,element_count\n0,large,string,2056,2K,0\n0,list,list,66,66B,4\n0,hash,hash,64,64B,2\n0,zset,zset,57,57B,2\n0,set,set,39,39B,2\n```\n\n# Convert to AOF\n\nUsage:\n\n```\nrdb -c aof -o \u003coutput_path\u003e \u003csource_path\u003e\n```\n\nExample:\n\n```\nrdb -c aof -o mem.aof cases/memory.rdb\n```\n\nExample AOF output:\n\n```\n*3\n$3\nSET\n$1\ns\n$7\naaaaaaa\n```\n\n# Regex Filter\n\nThe RDB tool supports filtering keys with a regular expression.\n\nExample:\n\n```bash\nrdb -c json -o regex.json -regex '^l.*' cases/memory.rdb\n```\n\n# Customize data usage\n\n```go\npackage main\n\nimport (\n\t\"github.com/hdt3213/rdb/parser\"\n\t\"os\"\n)\n\nfunc main() {\n\trdbFile, err := os.Open(\"dump.rdb\")\n\tif err != nil {\n\t\tpanic(\"open dump.rdb failed\")\n\t}\n\tdefer func() {\n\t\t_ = 
rdbFile.Close()\n\t}()\n\tdecoder := parser.NewDecoder(rdbFile)\n\terr = decoder.Parse(func(o parser.RedisObject) bool {\n\t\tswitch o.GetType() {\n\t\tcase parser.StringType:\n\t\t\tstr := o.(*parser.StringObject)\n\t\t\tprintln(str.Key, str.Value)\n\t\tcase parser.ListType:\n\t\t\tlist := o.(*parser.ListObject)\n\t\t\tprintln(list.Key, list.Values)\n\t\tcase parser.HashType:\n\t\t\thash := o.(*parser.HashObject)\n\t\t\tprintln(hash.Key, hash.Hash)\n\t\tcase parser.ZSetType:\n\t\t\tzset := o.(*parser.ZSetObject)\n\t\t\tprintln(zset.Key, zset.Entries)\n\t\tcase parser.StreamType:\n\t\t\tstream := o.(*parser.StreamObject)\n\t\t\tprintln(stream.Entries, stream.Groups)\n\t\t}\n\t\t// return true to continue, return false to stop the iteration\n\t\treturn true\n\t})\n\tif err != nil {\n\t\tpanic(err)\n\t}\n}\n```\n\n# Generate RDB file\n\nThis library can generate RDB file: \n\n```go\npackage main\n\nimport (\n\t\"github.com/hdt3213/rdb/encoder\"\n\t\"github.com/hdt3213/rdb/model\"\n\t\"os\"\n\t\"time\"\n)\n\nfunc main() {\n\trdbFile, err := os.Create(\"dump.rdb\")\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\tdefer rdbFile.Close()\n\tenc := encoder.NewEncoder(rdbFile)\n\terr = enc.WriteHeader()\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\tauxMap := map[string]string{\n\t\t\"redis-ver\":    \"4.0.6\",\n\t\t\"redis-bits\":   \"64\",\n\t\t\"aof-preamble\": \"0\",\n\t}\n\tfor k, v := range auxMap {\n\t\terr = enc.WriteAux(k, v)\n\t\tif err != nil {\n\t\t\tpanic(err)\n\t\t}\n\t}\n\n\terr = enc.WriteDBHeader(0, 5, 1)\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\texpirationMs := uint64(time.Now().Add(time.Hour*8).Unix() * 1000)\n\terr = enc.WriteStringObject(\"hello\", []byte(\"world\"), encoder.WithTTL(expirationMs))\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\terr = enc.WriteListObject(\"list\", [][]byte{\n\t\t[]byte(\"123\"),\n\t\t[]byte(\"abc\"),\n\t\t[]byte(\"la la la\"),\n\t})\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\terr = enc.WriteSetObject(\"set\", 
[][]byte{\n\t\t[]byte(\"123\"),\n\t\t[]byte(\"abc\"),\n\t\t[]byte(\"la la la\"),\n\t})\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\terr = enc.WriteHashMapObject(\"hash\", map[string][]byte{\n\t\t\"1\":  []byte(\"123\"),\n\t\t\"a\":  []byte(\"abc\"),\n\t\t\"la\": []byte(\"la la la\"),\n\t})\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\terr = enc.WriteZSetObject(\"zset\", []*model.ZSetEntry{\n\t\t{\n\t\t\tScore: 1.234,\n\t\t\tMember: \"a\",\n\t\t},\n\t\t{\n\t\t\tScore: 2.71828,\n\t\t\tMember: \"b\",\n\t\t},\n\t})\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\terr = enc.WriteEnd()\n\tif err != nil {\n\t\tpanic(err)\n\t}\n}\n```\n\n# Benchmark\n\nTested on a MacBook Pro (16-inch, 2019) with a 2.6 GHz 6-core Intel Core i7, using a 1.3 GB RDB file in the v9 format, taken from a Redis 5.0 production environment.\n\n|usage|elapsed|speed|\n|:-:|:-:|:-:|\n|ToJson|74.12s|17.96MB/s|\n|Memory|18.585s|71.62MB/s|\n|AOF|104.77s|12.76MB/s|\n|Top10|14.8s|89.95MB/s|\n|FlameGraph|21.83s|60.98MB/s|\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FHDT3213%2Frdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FHDT3213%2Frdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FHDT3213%2Frdb/lists"}