{"id":13412025,"url":"https://github.com/linkedin/goavro","last_synced_at":"2025-08-17T01:34:52.443Z","repository":{"id":27740268,"uuid":"31228039","full_name":"linkedin/goavro","owner":"linkedin","description":"Goavro is a library that encodes and decodes Avro data.","archived":false,"fork":false,"pushed_at":"2025-06-09T02:40:15.000Z","size":753,"stargazers_count":1034,"open_issues_count":81,"forks_count":225,"subscribers_count":24,"default_branch":"master","last_synced_at":"2025-08-15T02:59:01.522Z","etag":null,"topics":["avro","golang"],"latest_commit_sha":null,"homepage":"https://pkg.go.dev/github.com/linkedin/goavro/v2","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/linkedin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2015-02-23T20:28:46.000Z","updated_at":"2025-08-10T06:16:27.000Z","dependencies_parsed_at":"2024-05-02T20:57:18.528Z","dependency_job_id":"a83132da-b1a9-420c-b79f-0bf9ae8ad9b7","html_url":"https://github.com/linkedin/goavro","commit_stats":{"total_commits":293,"total_committers":53,"mean_commits":5.528301886792453,"dds":0.590443686006826,"last_synced_commit":"8eb9f0e2d756cea165f593f80c6780f4b0b4dbb6"},"previous_names":[],"tags_count":44,"template":false,"template_full_name":null,"purl":"pkg:github/linkedin/goavro","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedin%2Fgoavro","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedin%2Fgoavro/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/linkedin%2Fgoavro/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedin%2Fgoavro/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/linkedin","download_url":"https://codeload.github.com/linkedin/goavro/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedin%2Fgoavro/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270796216,"owners_count":24647319,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-16T02:00:11.002Z","response_time":91,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["avro","golang"],"created_at":"2024-07-30T20:01:20.246Z","updated_at":"2025-08-17T01:34:52.424Z","avatar_url":"https://github.com/linkedin.png","language":"Go","funding_links":[],"categories":["Uncategorized","Database","数据库","Go","Generators","Data Integration Frameworks"],"sub_categories":["Database Schema Migration","数据库模式迁移","Advanced Console UIs"],"readme":"\u003e [!IMPORTANT]\n\u003e Internally, most of LinkedIn has moved over to use https://github.com/hamba/avro for Avro serialization/deserialization needs as we found it to be significantly more performant in large-scale scenarios. 
goavro is not actively in development.\n\n# goavro\n\nGoavro is a library that encodes and decodes Avro data.\n\n## Description\n\n* Encodes to and decodes from both binary and textual JSON Avro data.\n* `Codec` is stateless and is safe to use by multiple goroutines.\n\nWith the exception of features not yet supported, goavro attempts to\nbe fully compliant with the most recent version of the\n[Avro specification](http://avro.apache.org/docs/1.8.2/spec.html).\n\n## Dependency Notice\n\nAll usage of `gopkg.in` has been removed in favor of Go modules.\nPlease update your import paths to `github.com/linkedin/goavro/v2`.  v1\nusers can still use old versions of goavro by adding a constraint to\nyour `go.mod` or `Gopkg.toml` file.\n\n```\nrequire (\n    github.com/linkedin/goavro v1.0.5\n)\n```\n\n```toml\n[[constraint]]\nname = \"github.com/linkedin/goavro\"\nversion = \"=1.0.5\"\n```\n\n## Major Improvements in v2 over v1\n\n### Avro namespaces\n\nThe original version of this library was written prior to my really\nunderstanding how Avro namespaces ought to work. After using Avro for\na long time now, and after a lot of research, I think I grok Avro\nnamespaces properly, and the library now correctly handles every test\ncase the Apache Avro distribution has for namespaces, including being\nable to refer to a previously defined data type later on in the same\nschema.\n\n### Getting Data into and out of Records\n\nThe original version of this library required creating `goavro.Record`\ninstances, and use of getters and setters to access a record's\nfields. When schemas were complex, this required a lot of work to\ndebug and get right. The original version also required users to break\nschemas in chunks, and have a different schema for each record\ntype. This was cumbersome, annoying, and error prone.\n\nThe new version of this library eliminates the `goavro.Record` type,\nand accepts a native Go map for all records to be encoded. 
Keys are\nthe field names, and values are the field values. Nothing could be\neasier. Conversely, decoding Avro data yields a native Go map for\nthe upstream client to pull data back out of.\n\nFurthermore, there is never a reason to break your schema\ndown into record schemas. Merely feed the entire schema into the\n`NewCodec` function once when you create the `Codec`, then use\nit. This library knows how to parse the data provided to it and ensure\ndata values for records and their fields are properly encoded and\ndecoded.\n\n### 3x--4x Performance Improvement\n\nThe original version of this library was truly written with Go's idea\nof `io.Reader` and `io.Writer` composition in mind. Although\ncomposition is a powerful tool, the original library had to pull bytes\noff the `io.Reader`--often one byte at a time--check for read errors,\ndecode the bytes, and repeat. This version operates on a native Go byte\nslice, and both decoding and encoding complex Avro data here at LinkedIn\nare between three and four times faster than before.\n\n### Avro JSON Support\n\nThe original version of this library did not support JSON encoding or\ndecoding, because it wasn't deemed useful for our internal use at the\ntime. When writing the new version of the library I decided to tackle\nthis issue once and for all, because so many engineers needed this\nfunctionality for their work.\n\n### Better Handling of Record Field Default Values\n\nThe original version of this library did not handle default\nvalues for record fields well. This version of the library uses the default\nvalue of a record field when encoding from native Go data to Avro data\nif the record field is not specified. 
Additionally, when decoding\nfrom Avro JSON data to native Go data, if a field is not specified,\nthe default value will be used to populate the field.\n\n## Contrast With Code Generation Tools\n\nIf you have the ability to rebuild and redeploy your software whenever\ndata schemas change, code generation tools might be the best solution\nfor your application.\n\nThere are numerous excellent tools for generating source code to\ntranslate data between native and Avro binary or textual data. One\nsuch tool is linked below. If a particular application is designed to\nwork with a rarely changing schema, programs that use code-generated\nfunctions can potentially be more performant than a program that uses\ngoavro to create a `Codec` dynamically at run time.\n\n* [gogen-avro](https://github.com/actgardner/gogen-avro)\n\nI recommend benchmarking the resultant programs on typical data,\nusing both the code-generated functions and goavro, to see which\nperforms better. Not all code-generated functions will outperform\ngoavro for all data corpuses.\n\nIf you don't have the ability to rebuild and redeploy software updates\nwhenever a data schema change occurs, goavro could be a great fit for\nyour needs. With goavro, your program can be given a new schema while\nrunning, compile it into a `Codec` on the fly, and immediately start\nencoding or decoding data using that `Codec`. Because Avro encoding\nspecifies that encoded data always be accompanied by a schema, this is\nnot usually a problem. 
If the schema change is backwards compatible,\nand the portion of your program that handles the decoded data is still\nable to reference the decoded fields, there is nothing that needs to\nbe done when the schema change is detected by your program when using\ngoavro `Codec` instances to encode or decode data.\n\n## Resources\n\n* [Avro CLI Examples](https://github.com/miguno/avro-cli-examples)\n* [Avro](https://avro.apache.org/)\n* [Google Snappy](https://google.github.io/snappy/)\n* [JavaScript Object Notation, JSON](https://www.json.org/)\n* [Kafka](https://kafka.apache.org)\n\n## Usage\n\nDocumentation is available via\n[![GoDoc](https://godoc.org/github.com/linkedin/goavro?status.svg)](https://godoc.org/github.com/linkedin/goavro).\n\n```Go\npackage main\n\nimport (\n    \"fmt\"\n\n    \"github.com/linkedin/goavro/v2\"\n)\n\nfunc main() {\n    codec, err := goavro.NewCodec(`\n        {\n          \"type\": \"record\",\n          \"name\": \"LongList\",\n          \"fields\" : [\n            {\"name\": \"next\", \"type\": [\"null\", \"LongList\"], \"default\": null}\n          ]\n        }`)\n    if err != nil {\n        fmt.Println(err)\n    }\n\n    // NOTE: May omit fields when using default value\n    textual := []byte(`{\"next\":{\"LongList\":{}}}`)\n\n    // Convert textual Avro data (in Avro JSON format) to native Go form\n    native, _, err := codec.NativeFromTextual(textual)\n    if err != nil {\n        fmt.Println(err)\n    }\n\n    // Convert native Go form to binary Avro data\n    binary, err := codec.BinaryFromNative(nil, native)\n    if err != nil {\n        fmt.Println(err)\n    }\n\n    // Convert binary Avro data back to native Go form\n    native, _, err = codec.NativeFromBinary(binary)\n    if err != nil {\n        fmt.Println(err)\n    }\n\n    // Convert native Go form to textual Avro data\n    textual, err = codec.TextualFromNative(nil, native)\n    if err != nil {\n        fmt.Println(err)\n    }\n\n    // NOTE: Textual encoding will show 
all fields, even those with values that\n    // match their default values\n    fmt.Println(string(textual))\n    // Output: {\"next\":{\"LongList\":{\"next\":null}}}\n}\n```\n\nAlso please see the example programs in the `examples` directory for\nreference.\n\n## OCF file reading and writing\n\nThis library supports reading and writing data in [Object Container File (OCF)](https://avro.apache.org/docs/current/spec.html#Object+Container+Files) format.\n\n```go\npackage main\n\nimport (\n\t\"bytes\"\n\t\"fmt\"\n\t\"strings\"\n\n\t\"github.com/linkedin/goavro/v2\"\n)\n\nfunc main() {\n\tavroSchema := `\n\t{\n\t  \"type\": \"record\",\n\t  \"name\": \"test_schema\",\n\t  \"fields\": [\n\t\t{\n\t\t  \"name\": \"time\",\n\t\t  \"type\": \"long\"\n\t\t},\n\t\t{\n\t\t  \"name\": \"customer\",\n\t\t  \"type\": \"string\"\n\t\t}\n\t  ]\n\t}`\n\n\t// Writing OCF data\n\tvar ocfFileContents bytes.Buffer\n\twriter, err := goavro.NewOCFWriter(goavro.OCFConfig{\n\t\tW:      \u0026ocfFileContents,\n\t\tSchema: avroSchema,\n\t})\n\tif err != nil {\n\t\tfmt.Println(err)\n\t}\n\terr = writer.Append([]map[string]interface{}{\n\t\t{\n\t\t\t\"time\":     1617104831727,\n\t\t\t\"customer\": \"customer1\",\n\t\t},\n\t\t{\n\t\t\t\"time\":     1717104831727,\n\t\t\t\"customer\": \"customer2\",\n\t\t},\n\t})\n\tif err != nil {\n\t\tfmt.Println(err)\n\t}\n\tfmt.Println(\"ocfFileContents\", ocfFileContents.String())\n\n\t// Reading OCF data\n\tocfReader, err := goavro.NewOCFReader(strings.NewReader(ocfFileContents.String()))\n\tif err != nil {\n\t\tfmt.Println(err)\n\t}\n\tfmt.Println(\"Records in OCF File\")\n\tfor ocfReader.Scan() {\n\t\trecord, err := ocfReader.Read()\n\t\tif err != nil {\n\t\t\tfmt.Println(err)\n\t\t}\n\t\tfmt.Println(\"record\", record)\n\t}\n}\n```\n\nA version of the above code is available in the [Go Playground](https://play.golang.org/p/RuHQONqBXeg).\n\n### ab2t\n\nThe `ab2t` program is similar to the reference standard\n`avrocat` program and converts Avro OCF files to Avro JSON\nencoding.\n\n### arw\n\nThe Avro-ReWrite program, `arw`, can 
be used to rewrite an\nAvro OCF file while optionally changing the block counts and the\ncompression algorithm. `arw` can also upgrade the schema provided the\nexisting datum values can be encoded with the newly provided schema.\n\n### avroheader\n\nThe Avro Header program, `avroheader`, can be used to print various\nheader information from an OCF file.\n\n### splice\n\nThe `splice` program can be used to splice together an OCF file from\nan Avro schema file and a raw Avro binary data file.\n\n### Translating Data\n\nA `Codec` provides four methods for translating between a byte slice\nof either binary or textual Avro data and native Go data.\n\nThe following methods convert data between native Go data and byte\nslices of the binary Avro representation:\n\n    BinaryFromNative\n    NativeFromBinary\n\nThe following methods convert data between native Go data and byte\nslices of the textual Avro representation:\n\n    NativeFromTextual\n    TextualFromNative\n\nEach `Codec` also exposes the `Schema` method to return a simplified\nversion of the JSON schema string used to create the `Codec`.\n\n#### Translating From Avro to Go Data\n\nGoavro does not use Go's structure tags to translate data between\nnative Go types and Avro encoded data.\n\nWhen translating from either binary or textual Avro to native Go data,\ngoavro returns primitive Go data values for corresponding Avro data\nvalues. 
The table below shows how goavro translates Avro types to Go\ntypes.\n\n| Avro               | Go                       |\n| ------------------ | ------------------------ |\n| `null`             | `nil`                    |\n| `boolean`          | `bool`                   |\n| `bytes`            | `[]byte`                 |\n| `float`            | `float32`                |\n| `double`           | `float64`                |\n| `long`             | `int64`                  |\n| `int`              | `int32`                  |\n| `string`           | `string`                 |\n| `array`            | `[]interface{}`          |\n| `enum`             | `string`                 |\n| `fixed`            | `[]byte`                 |\n| `map` and `record` | `map[string]interface{}` |\n| `union`            | *see below*              |\n\nBecause of encoding rules for Avro unions, when a union's value is\n`null`, a simple Go `nil` is returned. However, when a union's value\nis non-`nil`, a Go `map[string]interface{}` with a single key is\nreturned for the union. The map's single key is the Avro type name and\nits value is the datum's value.\n\n#### Translating From Go to Avro Data\n\nGoavro does not use Go's structure tags to translate data between\nnative Go types and Avro encoded data.\n\nWhen translating from native Go to either binary or textual Avro data,\ngoavro generally requires the same native Go data types as the decoder\nwould provide, with some exceptions for programmer convenience. Goavro\nwill accept any numerical data type provided there is no precision\nlost when encoding the value. For instance, providing `float64(3.0)`\nto an encoder expecting an Avro `int` would succeed, while sending\n`float64(3.5)` to the same encoder would return an error.\n\nWhen providing a slice of items for an encoder, the encoder will\naccept either `[]interface{}`, or any slice of the required type. 
For\ninstance, when the Avro schema specifies:\n`{\"type\":\"array\",\"items\":\"string\"}`, the encoder will accept either\n`[]interface{}`, or `[]string`. If given `[]int`, the encoder will\nreturn an error when it attempts to encode the first non-string array\nvalue using the string encoder.\n\nWhen providing a value for an Avro union, the encoder will accept\n`nil` for a `null` value. If the value is non-`nil`, it must be a\n`map[string]interface{}` with a single key-value pair, where the key\nis the Avro type name and the value is the datum's value. As a\nconvenience, the `Union` function wraps any datum value in a map as\nspecified above.\n\n```Go\nfunc ExampleUnion() {\n    codec, err := goavro.NewCodec(`[\"null\",\"string\",\"int\"]`)\n    if err != nil {\n        fmt.Println(err)\n    }\n    buf, err := codec.TextualFromNative(nil, goavro.Union(\"string\", \"some string\"))\n    if err != nil {\n        fmt.Println(err)\n    }\n    fmt.Println(string(buf))\n    // Output: {\"string\":\"some string\"}\n}\n```\n\n## Limitations\n\nGoavro is a fully featured encoder and decoder of binary and textual\nJSON Avro data. It fully supports recursive data structures, unions,\nand namespacing. It does have a few limitations that have yet to be\nimplemented.\n\n### Aliases\n\nThe Avro specification allows an implementation to optionally map a\nwriter's schema to a reader's schema using aliases. Although goavro\ncan compile schemas with aliases, it does not yet implement this\nfeature.\n\n### Kafka Streams\n\n[Kafka](http://kafka.apache.org) is the reason goavro was\nwritten. Similar to Avro Object Container Files being a layer of\nabstraction above Avro Data Serialization format, Kafka's use of Avro\nis a layer of abstraction that also sits above Avro Data Serialization\nformat, but has its own schema. 
Like Avro Object Container Files, this\nhas been implemented but removed until the API can be improved.\n\n### Default Maximum Block Counts, and Block Sizes\n\nWhen decoding arrays, maps, and OCF files, the Avro specification\nstates that the binary includes block counts and block sizes that\nspecify how many items are in the next block, and how many bytes are\nin the next block. To prevent possible denial-of-service attacks on\nclients that use this library caused by attempting to decode\nmaliciously crafted data, decoded block counts and sizes are compared\nagainst public library variables `MaxBlockCount` and `MaxBlockSize`. When\nthe decoded values exceed these limits, the decoder returns an error.\n\nBecause not every upstream client is the same, we've chosen some sane\ndefaults for these values, but left them as mutable variables, so that\nclients are able to override them if deemed necessary for their\npurposes. Their initial default values are `math.MaxInt32`, or\n~2.1GB.\n\n### Schema Evolution\n\nPlease see [my reasons why schema evolution is broken for Avro\n1.x](https://github.com/linkedin/goavro/blob/master/SCHEMA-EVOLUTION.md).\n\n## License\n\n### Goavro license\n\nCopyright 2017 LinkedIn Corp. Licensed under the Apache License,\nVersion 2.0 (the \"License\"); you may not use this file except in\ncompliance with the License. You may obtain a copy of the License at\nhttp://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\nimplied.\n\n### Google Snappy license\n\nCopyright (c) 2011 The Snappy-Go Authors. 
All rights reserved.\n\nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions are\nmet:\n\n   * Redistributions of source code must retain the above copyright\nnotice, this list of conditions and the following disclaimer.\n   * Redistributions in binary form must reproduce the above\ncopyright notice, this list of conditions and the following disclaimer\nin the documentation and/or other materials provided with the\ndistribution.\n   * Neither the name of Google Inc. nor the names of its\ncontributors may be used to endorse or promote products derived from\nthis software without specific prior written permission.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n\"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\nLIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\nA PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT\nOWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\nSPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\nLIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\nDATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\nTHEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\nOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\n## Third Party Dependencies\n\n### Google Snappy\n\nGoavro links with [Google Snappy](http://google.github.io/snappy/)\nto provide Snappy compression and decompression support.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinkedin%2Fgoavro","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flinkedin%2Fgoavro","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinkedin%2Fgoavro/lists"}