{"id":19212487,"url":"https://github.com/fasterthanlime/stop-optimizing-me","last_synced_at":"2025-05-12T20:38:11.657Z","repository":{"id":66881798,"uuid":"135475368","full_name":"fasterthanlime/stop-optimizing-me","owner":"fasterthanlime","description":"How NOT to optimize something","archived":false,"fork":false,"pushed_at":"2018-05-30T17:19:52.000Z","size":12,"stargazers_count":14,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-20T17:40:03.466Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fasterthanlime.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-05-30T17:19:42.000Z","updated_at":"2023-10-07T10:25:41.000Z","dependencies_parsed_at":"2023-02-23T14:15:58.845Z","dependency_job_id":null,"html_url":"https://github.com/fasterthanlime/stop-optimizing-me","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fasterthanlime%2Fstop-optimizing-me","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fasterthanlime%2Fstop-optimizing-me/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fasterthanlime%2Fstop-optimizing-me/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fasterthanlime%2Fstop-optimizing-me/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fasterthan
lime","download_url":"https://codeload.github.com/fasterthanlime/stop-optimizing-me/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253817589,"owners_count":21969007,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-09T13:47:08.303Z","updated_at":"2025-05-12T20:38:11.590Z","avatar_url":"https://github.com/fasterthanlime.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# stop optimizing me!\n\nThis github repo is sort of a runnable blog post? I promise it gets good.\n\n\u003e Don't run the benchmarks now if you want to avoid spoilers.\n\n### Background story\n\nWe changed an API from a format like: \n\n```json\n{\n  \"p_osx\": true,\n  \"p_linux\": false,\n  \"p_windows\": true,\n  \"p_android\": false,\n  \"can_be_bought\": true,\n  \"has_demo\": false,\n  \"in_press_system\": false,\n}\n```\n\nTo a format like:\n\n```json\n{\n  \"traits\": [\"p_osx\", \"p_windows\", \"can_be_bought\"]\n}\n```\n\n... 
which takes up less space (and is more human-friendly).\n\nBut on the client-side, we still need a data representation:\n\n  * that lets us look up quickly whether an object has a particular trait or not\n  * that can be easily persisted to a database (where each trait has its own column)\n\nLuckily, Go lets us write custom marshalers/unmarshalers from json, so\nwe can let it know how we want to decode the `Traits` field of a struct\nlike this for example:\n\n```go\ntype Game struct {\n  Title string\n  Traits GameTraits\n}\n\n// create a type alias here, so we can\n// define methods on GameTraits\ntype GameTraits something\n\nfunc (gt GameTraits) MarshalJSON() ([]byte, error) {\n  return nil, errors.New(\"implement me!\")\n}\n\nfunc (gt GameTraits) UnmarshalJSON(data []byte) error {\n  return errors.New(\"implement me too!\")\n}\n\n// this is a Golang trick to make sure GameTraits implements\n// the interfaces we care about\nvar _ json.Marshaler = GameTraits{}\n// we simply declare an unnamed variable of the interface type,\n// and assign an empty object of our type to it. 
If it doesn't\n// implement the interface properly, it'll fail at compile time\n// (even if our type isn't used anywhere - which isn't that uncommon\n// for a library)\nvar _ json.Unmarshaler = GameTraits{}\n\n```\n\n### Solution 001: Using a map\n\n\u003e See the [001_map_simplest.go](001_map_simplest.go) file for runnable code\n\nA relatively straightforward solution is to represent our traits as a map.\n\nWe'll add a distinct named type for our keys, and add consts for all traits:\n\n```go\ntype GameTrait string\n\nconst (\n  GameTraitPlatformWindows GameTrait = \"p_windows\"\n  GameTraitPlatformLinux GameTrait   = \"p_linux\"\n  // etc.\n)\n```\n\nThen we can define a set of traits as a map with our key type:\n\n```go\n// notice the plural here\ntype GameTraits map[GameTrait]bool\n```\n\n\u003e Note: `bool` is overkill here, since we'll only ever have true\n\u003e values.\n\u003e\n\u003e We could use a `map[GameTrait]struct{}` instead, if we wanted\n\u003e to actually use this solution - then, values would take no memory at all.\n\nAnd implement marshalling:\n\n```go\nfunc (gt GameTraits) MarshalJSON() ([]byte, error) {\n  // in JSON, our traits are stored in an array of strings,\n  // so let's build just that\n  var traits []string\n\n  // gt is of type GameTraits, which is a map, so we can iterate\n  // over its keys with a for-range:\n  for k := range gt {\n    // we don't need to check the value stored in the map, since\n    // it's always true. k is a GameTrait, so convert it to a string\n    traits = append(traits, string(k))\n  }\n\n  // golang knows how to marshal a string array, so let's let it do\n  // the thing. 
`json.Marshal()` also returns `([]byte, error)`\n  // so this works just fine\n  return json.Marshal(traits)\n}\n```\n\nAnd unmarshalling:\n\n```go\nfunc (gt GameTraits) UnmarshalJSON(data []byte) error {\n  // we're given a slice of bytes, let's first unmarshal\n  // it as an array of strings, then work from that\n  var traits []string\n\n  err := json.Unmarshal(data, \u0026traits)\n  if err != nil {\n    return err\n  }\n\n  // for each element of the array (we don't care about the index)\n  for _, trait := range traits {\n    // store it in the map (keys are GameTrait, so convert)\n    gt[GameTrait(trait)] = true\n  }\n\n  // no errors, yay!\n  return nil\n}\n```\n\nWhoops, this version of `UnmarshalJSON` crashes with `assignment to entry in nil map`.\n\nThat's because a `map[K]V` is actually a pointer, and its zero value is nil - it needs to be created explicitly with `make`.\n\nNo problem, let's define `UnmarshalJSON` on a pointer to GameTraits instead,\nand make a map from scratch every time it's called.\n\n\u003e Note: Go compilers know to turn `gt.UnmarshalJSON(data)` into `(\u0026gt).UnmarshalJSON(data)`. It's just one of those things you need to know.\n\n\n```go\n// taking a pointer over here\nfunc (gtp *GameTraits) UnmarshalJSON(data []byte) error {\n  var traits []string\n  // making a fresh map over there\n  gt := make(GameTraits)\n\n  err := json.Unmarshal(data, \u0026traits)\n  if err != nil {\n    return err\n  }\n\n  for _, trait := range traits {\n    gt[GameTrait(trait)] = true\n  }\n\n  // make gtp point to our fresh map\n  *gtp = gt\n\n  return nil\n}\n```\n\nAnd it works pretty well! A very bad and inaccurate microbenchmark shows:\n\n```\n5900 ns/op\t    1257 B/op\t      20 allocs/op\n```\n\n### Maps are bad actually\n\nThere's a bunch of annoying things about our map implementation.\n\nLet's go through them in no particular order.\n\nThe first is **memory layout**. Our `GameTraits` type isn't meant to\nbe used in the void. 
It's supposed to be embedded into another type, like so:\n\n```go\ntype Game struct {\n  Title string\n  Traits GameTraits\n}\n```\n\nIn solution 001, `Traits` is effectively a pointer to the actual map data, which means we're actually\ndealing with two different objects, at different places in memory,\nallocated at different times - and every access means dereferencing the pointer\nto the map, hashing the key, finding the bucket, and returning the value.\n\nThe second is **ease of use**. Declaring a `GameTraits` map with some\nvalues set isn't the hardest thing in the world thanks to map literals:\n\n```go\ntraits := GameTraits{\n  // we can use consts as map keys\n  GameTraitPlatformLinux: true,\n  GameTraitPlatformWindows: true,\n}\n```\n\nChecking for a value isn't that bad either:\n\n```go\nif traits[GameTraitPlatformLinux] {\n  // do something linux-specific\n}\n```\n\nBut if we were going to use this approach in the real world, we'd have\ndefined `GameTraits` like so:\n\n```go\ntype GameTraits map[GameTrait]struct{}\n```\n\nThen declaring and accessing would look like this, respectively:\n\n```go\ntraits := GameTraits{\n  // uglyyyyyy (but compact in memory)\n  GameTraitPlatformLinux: struct{}{},\n  GameTraitPlatformWindows: struct{}{},\n}\n\n// oh yeah, map lookups can also return (V, bool)\n// just another lil' Go thing\nif _, ok := traits[GameTraitPlatformLinux]; ok {\n  // do something linux-specific\n}\n```\n\nAnd, like, ok, this is Go, we're all used to writing more code than we\nshould be just so the compiler can be fast and the language simple and yadda-yadda, but **come on**.\n\nThe third is that **it's not reflect-friendly**. 
A map is really friendly\nto iterate, but that only tells us about the keys currently set - there's\nno way to list all possible keys!\n\n\u003e Note: the fact that Go doesn't have *real* enums (or generic types) doesn't\n\u003e help our case here at all.\n\n\u003e Note 2: Oh btw, this whole article has an implicit \"yes, I've heard about\n\u003e language X and it's wonderful, but this is Go we're talking about\" agreement.\n\nThis is problematic because we're going to end up shoving these values\nin an SQLite table, and we definitely want to know what values there are,\nas they're each going to have their own column.\n\n### Solution 002: Using a struct\n\n\u003e See the [002_struct_simplest.go](002_struct_simplest.go) file for runnable code\n\nI'm actually sorta proud of this one.\n\nSo we start out by defining `GameTraits` as the simplest struct, with\nonly boolean fields - and we annotate each of them with the string\nthat appears in the JSON array:\n\n```go\ntype GameTraits struct {\n  PlatformOSX     bool `trait:\"p_osx\"`\n  PlatformWindows bool `trait:\"p_windows\"`\n  // etc.\n}\n```\n\nLet's review **memory layout** here - remember our type is used in another\ntype, like so:\n\n```go\ntype Game struct {\n  Title string\n  Traits GameTraits\n}\n```\n\nThis is effectively equivalent to:\n\n```go\ntype Game struct {\n  Title string\n  Traits struct{\n    PlatformOSX bool\n    PlatformWindows bool\n    // etc.\n  }\n}\n```\n\nWhich is effectively equivalent to:\n\n```go\ntype Game struct {\n  Title string\n  TraitPlatformOSX bool\n  TraitPlatformWindows bool\n  // etc.\n}\n```\n\nEverything gets allocated at the same time, in the same memory block, so\nit's cache-friendly and everything.\n\n\u003e Note: ok there's a lot more to consider when we want to design CPU-friendly\n\u003e things - not the least of which, alignment. 
And depending on how often\n\u003e traits are accessed, indirection might actually *help* avoid cache misses.\n\u003e\n\u003e But y'all realize that's way outside the scope of this document right.\n\u003e We're doing *microbenchmarks* on a piece of code that we *shouldn't be optimizing*, so let's keep it nice and simple.\n\nLet's do a quick **ease of use** review:\n\n```go\ntraits := \u0026GameTraits{\n  PlatformOSX: true,\n  CanBeBought: true,\n}\n\nif traits.PlatformLinux {\n  // do something linux-specific here\n}\n```\n\nBeautiful! Well, the beautifulest that Go will allow, but still, that's something.\n\nSo, we love it, the CPU loves it (*), all that's left is to implement marshalling\nand unmarshalling.\n\n\u003e (*) The CPU might not love it\n\nNow, we don't want to write specific code for each field, because then we'd have\n3 places where fields are listed, and it's just too easy to miss one.\n\nInstead, let's use reflection.\n\n\u003e Note: I don't care if you think that it's a bad excuse to use reflection.\n\u003e Try and stop me.\n\nIt's all relatively straightforward, *provided you've spent weeks getting to\nknow the specifics of Go reflection*.\n\n```go\nfunc (gt GameTraits) MarshalJSON() ([]byte, error) {\n  // eventually we're going to marshal this\n  var traits []string\n\n  // let's turn 'gt' into something we can inspect\n  val := reflect.ValueOf(gt)\n  // let's grab its type too, because we're going to\n  // need those `trait:\"XXX\"` annotations\n  typ := val.Type()\n\n  // probably the fastest way to go through all the fields\n  // of a struct\n  for i := 0; i \u003c typ.NumField(); i++ {\n    // this evaluates to true if the ith field of gt is set\n    // (in the order in which they were defined - which Go\n    // spec says is the same order in which they're laid out in memory)\n    if val.Field(i).Bool() {\n      // if it's true let's grab the annotation\n      trait := typ.Field(i).Tag.Get(\"trait\")\n\n      // ...and add it to our result 
array\n      traits = append(traits, trait)\n    }\n  }\n\n  // finally we can emit some JSON\n  return json.Marshal(traits)\n}\n```\n\nUnmarshalling is a similar bundle of fun:\n\n```go\n// we need to be receiving a pointer, because we're going\n// to be modifying the contents of gt\nfunc (gt *GameTraits) UnmarshalJSON(data []byte) error {\n  // we have to call `Elem()` here because `gt` is a pointer\n  val := reflect.ValueOf(gt).Elem()\n  typ := val.Type()\n\n  // let's unmarshal into an array first\n  var traits []string\n\n  err := json.Unmarshal(data, \u0026traits)\n  if err != nil {\n    return err\n  }\n\n  for _, trait := range traits {\n    // oh no, we have to do an O(n) lookup to find\n    // the right field of the struct. I'm sure this won't\n    // completely bomb the microbenchmark...\n    for i := 0; i \u003c typ.NumField(); i++ {\n      // oh noooo, string comparison\n      if trait == typ.Field(i).Tag.Get(\"trait\") {\n        // at least we're using the fastest way to index fields...\n        // small consolation prize\n        val.Field(i).SetBool(true)\n      }\n    }\n  }\n\n  // woo we did it\n  return nil\n}\n```\n\nOkay that was a weird value of \"fun\", but hey, it works.\n\nLet's take a look at the benchmark:\n\n```\n 5900 ns/op\t    1257 B/op\t      20 allocs/op\n11700 ns/op\t    1392 B/op\t      60 allocs/op\n```\n\nOh noooooo. Not only is it 2x slower, it does 3x as many allocations.\n\n\u003e Repeat after me: it *does not matter*. We could ship this and all meet\n\u003e at the pub. This is in no way performance-critical. We won't be unmarshalling\n\u003e billions or even millions of records in a tight loop. What follows is\n\u003e completely gratuitous.\n\u003e\n\u003e The article should stop right here. 
But it doesn't.\n\n### Solution 003: bye bye O(n)\n\n\u003e See the [003_struct_cachereflect](003_struct_cachereflect.go) file for runnable code\n\nOkay so, at this point, our beautiful struct-based solution is slower than\na map - and being beaten by a general-purpose hash map is really vexing.\n\nWe don't know (because we haven't measured) how expensive exactly reflection is,\nbut what we **do** know is that for every trait, we have to go through up to\nN struct fields and compare strings just to find the right one - and that's,\nlike, criminal.\n\n\u003e It's not. It's fine. Go home. Stop optimizing me.\n\nBut here's the thing: the layout of our `GameTraits` struct does not change.\nIt's always the same from one execution of `{Unm,M}arshalJSON` to the next.\n\nYet we always do the same amount of work, as if we didn't know anything about\nthe type.\n\nI say let's do all the work we can in advance, and cache it in a structure\nwith an O(1) lookup.\n\n\u003e map: So, you've come crawling back...\n\nWe'll need a way to map a trait to the index of the corresponding field in the struct:\n\n```go\n// int is fine, we don't have 2^31-ish fields\nvar gameTraitToIndex map[string]int\n```\n\nAnd we could also use a list of traits (in the same order as the struct),\nso we don't need to use reflection to iterate them:\n\n```go\nvar gameTraits []string\n```\n\nWe'll use an `init` function (that is guaranteed to run on program startup)\nand reflect that struct once and for all:\n\n```go\nfunc init() {\n  // making a dummy struct just to get a reflect.Type\n  typ := reflect.TypeOf(GameTraitsStruct{})\n\n  // remember maps have to be `make()'d` - assigning to a\n  // nil map will crash.\n  gameTraitToIndex = make(map[string]int)\n\n  // let's also allocate the gameTraits array instead of\n  // using append(), since we know exactly how many items it should contain\n  gameTraits = make([]string, typ.NumField())\n\n  for i := 0; i \u003c typ.NumField(); i++ {\n    // scanning 
annotations only once, good!\n    trait := typ.Field(i).Tag.Get(\"trait\")\n    // all fairly straight-forward bookkeeping here.\n    gameTraitToIndex[trait] = i\n    gameTraits[i] = trait\n  }\n}\n```\n\nNow, we can make our marshalling function faster - using the `gameTraits`\narray to go through each field. We still need to use reflection to see if\nit's set, though:\n\n```go\nfunc (gt GameTraitsStruct) MarshalJSON() ([]byte, error) {\n  // we're going to marshal that in the end\n  var traits []string\n\n  val := reflect.ValueOf(gt)\n  // we actually care about the index (i) here, since\n  // it's also the index of the field in the GameTraits struct\n  for i, trait := range gameTraits {\n    // `val.Field(i)` returns a `reflect.Value`, we need to call `Bool()`\n    // on it to evaluate it as a boolean.\n    if val.Field(i).Bool() {\n      // using append is bad (it can cause traits to be reallocated several\n      // times if we're unlucky), but apart from iterating gameTraits twice,\n      // we can't really do much better\n      traits = append(traits, trait)\n    }\n  }\n  return json.Marshal(traits)\n}\n```\n\nSimilarly, we can write a better unmarshaler:\n\n```go\nfunc (gt *GameTraitsStruct) UnmarshalJSON(data []byte) error {\n  // ok this part is second nature by now\n  var traits []string\n  err := json.Unmarshal(data, \u0026traits)\n  if err != nil {\n    return err\n  }\n\n  // we have a pointer receiver (the (gt *GameTraitsStruct) bit in our\n  // function declaration), so we need to call `Elem()` here\n  val := reflect.ValueOf(gt).Elem()\n  for _, trait := range traits {\n    // our handy map lets us know which field to set in O(1)\n    // looks pretty good!\n    val.Field(gameTraitToIndex[trait]).SetBool(true)\n  }\n  return nil\n}\n```\n\nLet's check the benchmarks:\n\n```\n 5900 ns/op\t    1257 B/op\t      20 allocs/op\n11700 ns/op\t    1392 B/op\t      60 allocs/op\n 5300 ns/op\t    1072 B/op\t      20 allocs/op\n```\n\nWow, this is much better! 
\"Ship it\" levels of better.\n\nIt even beats the map approach by a hair - with the same number of\nmemory allocations (20), and slightly lower memory usage.\n\nFor real this time, **this is where the article should stop**. We took\na dumb approach, it was slower than another dumb approach, so we made\nit slightly smarter *without making an unholy mess*, and now it consistently\nbeats the first dumb approach.\n\nThere **is** such a thing as good enough, and that is most definitely it\nright here. Please, please stop reading here.\n\n### Solution 004: it's all bytes in the end\n\nOh.. you're still here.\n\n\u003e See the [004_struct_handrolled](004_struct_handrolled.go) file for runnable code\n\nSo 20 allocations to marshal and unmarshal a bunch of traits feels like it's too much.\n\n\u003e It's not. It's not too much. Turn back, it's still time!\n\n...after all, we're building a whole `[]string` just to call some function\nof the standard Go library and to throw it away.\n\nThat's bad!\n\n\u003e It's not that bad.\n\nNo, it is! We're not even pooling it, so it's a fresh allocation every time.\nThat means we're generating garbage the GC will have to free eventually - it'll\nhave to keep track of these temporary objects, and reclaim then in a sweep phase,\nand..\n\n\u003e THAT'S THE GC'S JOB. IT'S FIIINE.\n\nWe just need to return a `[]byte`, right? 
Why don't we build that directly?\n\n\u003e Because...\n\nAnd JSON is not that hard\n\n\u003e ...it really is though.\n\nOkay, JSON done right is really tricky, but we don't care about all valid JSON,\nwe only care about:\n\n  * an array\n  * of values that can only contain `[A-Za-z0-9_]`\n\nI'm sure we can parse and emit that easily!\n\n\u003e Ok buddy you're on your own\n\nLet's goooooooooooo\n\n```go\nfunc (gt GameTraitsStruct) MarshalJSON() ([]byte, error) {\n  // ok, this is the thing we *won't* handroll: bytes.Buffer\n  // is pretty well-optimized, it'll be hard to beat.\n  // we even allocate it on the stack, and it has a `bootstrap [64]byte`\n  // field that will *probably* be enough for most calls\n  var bb bytes.Buffer\n\n  // starting a JSON array, dum-de-dum\n  bb.WriteByte('[')\n\n  first := true\n  val := reflect.ValueOf(gt)\n  for i, trait := range gameTraits {\n    // still using reflection to read each field\n    if val.Field(i).Bool() {\n      if first {\n        // if it's the first value we're writing,\n        // the next value will not be the first anymore\n        first = false\n      } else {\n        // if it's *not* the first value we're writing,\n        // we need a comma separator\n        bb.WriteByte(',')\n      }\n      // JSON strings are double-quoted\n      bb.WriteByte('\"')\n      // we don't need to escape anything, as long as our\n      // values are [A-Za-z0-9_]\n      bb.WriteString(trait)\n      // let's not forget to close the string\n      bb.WriteByte('\"')\n    }\n  }\n  \n  // ...and close the array\n  bb.WriteByte(']')\n\n  // oh btw we completely did forgo error handling\n  // we're writing to memory, what could go wrong?\n  // (ok, we could be out of RAM, but then we'd\n  // have bigger problems)\n\n  // tada! 
no json.Marshal()!\n  return bb.Bytes(), nil\n}\n```\n\nOk that wasn't so bad.\n\nWhat I mean is that the unmarshaller is much worse still:\n\n```go\nfunc (gt *GameTraitsStruct) UnmarshalJSON(data []byte) error {\n  val := reflect.ValueOf(gt).Elem()\n  // oh no that's never a good sign\n  i := 0\n  // ok so I guess our function will accept incomplete inputs\n  // that's fine (:fire:)\n  for i \u003c len(data) {\n    switch data[i] {\n    case '\"':\n      // oh look a string started, let's find the matching double quote\n      j := i + 1\n    scanString:\n      for {\n        switch data[j] {\n        case '\"':\n          // we found the matching double quote!\n          // better hope we don't have an off-by-one error here\n          // (there isn't, but there was the first time I wrote this)\n          trait := string(data[i+1 : j])\n          // i is our main cursor, skip over the whole quoted string\n          i = j + 1\n          // still using our map of \"trait name to field index\" to\n          // speed things up.\n          // still using reflection though.\n          val.Field(gameTraitToIndex[trait]).SetBool(true)\n          // oh yeah go has labels, very handy\n          // probably unneeded here, but years of C/JS \n          // have made me paranoid about switch statements.\n          break scanString\n        default:\n          // if we have anything other than a double quote, keep reading.\n          // this would break *badly* if double quotes were allowed\n          // in our trait names, because we do not handle escaping at all.\n\n          // in fact, our input could be valid but made entirely of `\\uXXXX`\n          // escapes and this wouldn't handle it correctly.\n\n          // it's unlikely in the real world, but let's just say we're not\n          // writing a JSON-compliant parser - we're writing for a very\n          // narrow subset.\n          j++\n        }\n      }\n    case ']':\n      // ah, that must mean the array is 
terminated (fingers crossed)\n      // so many checks we're not making here, I can't even...\n\n      // oh btw, notice that we never checked if the array *started*...\n      // in other terms, our function will be happy with this input:\n      // \n      //   \"a\",DINOSAUR\"b\"]haha whoops trailing data\n      //\n      // which is fantastic and disgusting. it's fantasting.\n      return nil\n    default:\n      i++\n    }\n  }\n  \n  // is it lunch time already\n  return nil\n}\n```\n\nAt this point in the game, we better find a good lawyer, because I've\nlost track of how many crimes we've committed.\n\nBut let's run benchmarks:\n\n```\n 5900 ns/op\t    1257 B/op\t      20 allocs/op\n11700 ns/op\t    1392 B/op\t      60 allocs/op\n 5300 ns/op\t    1072 B/op\t      20 allocs/op\n  700 ns/op\t     128 B/op\t       3 allocs/op \n```\n\n**B e a u t i f u l**.\n\nWe're doing **a sixth** of the allocations, using **a tenth** of the memory,\nand are running **almost 10x faster**.\n\nSurely we can stop there right.\n\n\u003e YOU COULD HAVE STOPPED EIGHT PAGES AGO\n\n### Solution 005: oh god why\n\n...ok let's try one more thing.\n\n\u003e See the [005_unreasonably_custom](005_unreasonably_custom.go) file for runnable code\n\nI'm sorta bothered by solution 004, because it's dirty and bad, but it's not\n100% dirty and bad. 
We're still using reflection, which is one of the Great Evils\nand the source of all misery on earth and beyond.\n\nSurely we can go, like, much further into stupidity.\n\nI know I talked about code duplication before, and how we shouldn't be listing\ntraits in three different places but hey fuck it let's do exactly that.\n\n```go\nfunc (gt GameTraitsStruct) MarshalJSON() ([]byte, error) {\n  // we know that part\n  var bb bytes.Buffer\n  bb.WriteByte('[')\n\n  first := true\n  // let's have one of these blocks for each value, ok sure why not\n  if gt.PlatformAndroid {\n    if first {\n      first = false\n    } else {\n      bb.WriteByte(',')\n    }\n    // oh neat, we turned 3 calls into one!\n    bb.WriteString(`\"p_android\"`)\n\n    // ...we could even have `,\"p_android\"` in a branch\n    // ...no let's just finish this freakin' post.\n  }\n  // yeah one more\n  if gt.PlatformWindows {\n    if first {\n      first = false\n    } else {\n      bb.WriteByte(',')\n    }\n    bb.WriteString(`\"p_windows\"`)\n  }\n  // and so on and so forth\n\n  // \u003csnip.......\n  //\n  // SO MUCH CODE OMITTED\n  //\n  // ........snip\u003e\n\n  // aw yiss bb\n  bb.WriteByte(']')\n  return bb.Bytes(), nil\n}\n```\n\nThat's just marshalling though!\n\nSurely we can do something equally stupid for unmarshalling?\n\nSomething along the lines of\n\n```go\nswitch trait {\n  case \"p_osx\":     gt.PlatformOSX = true\n  case \"p_windows\": gt.PlatformWindows = true\n  // etc.\n}\n```\n\nNo. 
No no no.\n\nSee, a sufficiently smart compiler would generate efficient code for that\n(instead of doing up to N string comparisons).\n\nBut we're writing Go here, so let's not assume the compiler is sufficiently\nsmart.\n\nInstead, let's write exactly what we would expect a smart compiler to generate:\n\n```go\nfunc (gt *GameTraitsStruct) UnmarshalJSON_UnreasonablyCustom(data []byte) error {\n  // same boring parser as solution 004, skip until next comment, right...\n  i := 0\n  for i \u003c len(data) {\n    switch data[i] {\n    case '\"':\n      j := i + 1\n    scanString:\n      for {\n        switch data[j] {\n        case '\"':\n          // ...there. welcome back!\n          trait := data[i+1 : j]\n          // `trait` is now a byte slice ([]byte) that holds, well, our trait.\n          // let's not compare strings, that's dumb.\n          // let's compare as little as possible\n          switch trait[0] {\n          case 'p':\n            // if it starts with a `p`, it's a platform.\n            // we know trait[1] is always going to be '_', so\n            // let's not bother checking it\n            switch trait[2] {\n            case 'w':\n              // luckily all platforms start with a unique letter!\n              gt.PlatformWindows = true\n            case 'l':\n              gt.PlatformLinux = true\n            case 'o':\n              gt.PlatformOSX = true\n            case 'a':\n              gt.PlatformAndroid = true\n            }\n          // only \"has_demo\" starts with 'h'\n          case 'h':\n            gt.HasDemo = true\n          // etc.\n          case 'c':\n            gt.CanBeBought = true\n          case 'i':\n            gt.InPressSystem = true\n          }\n          // rest of the boring parser omitted for brevity (lol)\n}\n```\n\nI mean, if *I* was a compiler, that's what I would generate. Maybe the ordering\nwould be different, but maybe the CPU's pipelining is good enough that it doesn't\nmatter. 
I would do some profile-guided optimization (PGO) if the pay was good\nenough, but I hear most compilers are free these days, so I'm not holding my digital breath.\n\nDoes this perform as well as it looks ugly?\n\n```\n 5900 ns/op\t    1257 B/op\t      20 allocs/op\n11700 ns/op\t    1392 B/op\t      60 allocs/op\n 5300 ns/op\t    1072 B/op\t      20 allocs/op\n  700 ns/op\t     128 B/op\t       3 allocs/op \n  300 ns/op\t     112 B/op\t       1 allocs/op\n```\n\nOH YOU BET IT DOES.\n\n---\n\nOk so this is a good example of what *not* to do.\n\nI was curious how fast I could get it, and I had already written that code\n(which I'm now going to trash), I figured I might as well turn it into an\nexploratory lesson of what *not* to do.\n\nTake care y'all, I'm off shipping solution 003 because it's better than good enough.\n\n * You can [follow me on Twitter](https://twitter.com/fasterthanlime) if you like this\n\n(Please do, so this won't have been all in vain)\n\n\u003e Note: to run the benchmarks for yourself, clone this repo\n\u003e and run `go test -benchmem -bench .`\n\u003e\n\u003e In the paragraphs above, I've rounded the `ns/op` values liberally\n\u003e to make comparison easier\n\u003e\n\u003e These microbenchmarks are never reliable to start with, so I hope you'll forgive me.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffasterthanlime%2Fstop-optimizing-me","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffasterthanlime%2Fstop-optimizing-me","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffasterthanlime%2Fstop-optimizing-me/lists"}