{"id":16881580,"url":"https://github.com/benbjohnson/application-development-using-boltdb","last_synced_at":"2025-04-11T11:45:23.644Z","repository":{"id":66318058,"uuid":"62244799","full_name":"benbjohnson/application-development-using-boltdb","owner":"benbjohnson","description":"Repository for my \"Application Development Using BoltDB\" talk","archived":false,"fork":false,"pushed_at":"2016-07-01T21:34:39.000Z","size":20,"stargazers_count":27,"open_issues_count":0,"forks_count":4,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-02T21:05:49.027Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/benbjohnson.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-06-29T17:28:39.000Z","updated_at":"2025-03-07T15:33:42.000Z","dependencies_parsed_at":null,"dependency_job_id":"f4a370c9-f5e2-4c8c-bfbc-242301f11d10","html_url":"https://github.com/benbjohnson/application-development-using-boltdb","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benbjohnson%2Fapplication-development-using-boltdb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benbjohnson%2Fapplication-development-using-boltdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benbjohnson%2Fapplication-development-using-boltdb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/benbjohnson%2Fapplication-development-using-boltdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/benbjohnson","download_url":"https://codeload.github.com/benbjohnson/application-development-using-boltdb/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248387768,"owners_count":21095271,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-13T16:04:05.083Z","updated_at":"2025-04-11T11:45:23.625Z","avatar_url":"https://github.com/benbjohnson.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"Application Development Using BoltDB\n====================================\n\n### Abstract\n\nWe've been taught for decades that we need a complex database server to run our\napplications. However, these servers incur a huge performance hit for most\nqueries and they are frequently misconfigured because of their operational\ncomplexity which can cause slowness or downtime. In this talk, I'll show you how\nto use a local, pure Go key/value store called BoltDB to build applications that\nare both simple and fast. We will see how development and deployment become a\nbreeze once you ditch your complex database server.\n\n\n### Introduction\n\nSoftware is too complex and too slow. We've seen the speed of CPUs increase\nby orders of magnitude in the past few decades yet our applications seem to\nrequire more hardware than ever. 
In the past several years I've used embedded\nkey/value databases for my applications because of their simplicity and speed.\nToday I'm going to walk you through building a simple application using a\npure Go key/value store I wrote called BoltDB.\n\n\n### What is an embedded key/value database?\n\nBefore we dive in, let's talk about what an embedded key/value store even is!\n\"Embedded\" refers to the database actually being compiled into your application\ninstead of a database server which you connect to over a socket. A good example\nof an embedded database is SQLite.\n\nHowever, SQLite is a relational database so let's talk about what \"key/value\"\nmeans next. Key/value databases are extremely simple. They map a \"key\", which\nis just a unique set of bytes, to a \"value\", which is an arbitrary set of bytes.\nIt helps to think of this in terms of relational databases. Your \"key\" would be\nyour primary key and your value would be the encoded row. In fact, most database\nservers utilize a key/value database internally to store rows. Essentially,\nthe key/value database is just a persisted map.\n\nSome key/value databases allow you to have multiple key/value mappings. In\nBoltDB, these are called \"buckets\". Every key is unique in a bucket and points\nto a value. Many times you can think of buckets like tables in a relational\ndatabase. You may have a \"users\" bucket or a \"products\" bucket.\n\nJust as there are many database servers to choose from, there are also many\ntypes of embedded key/value databases with different trade-offs. Sometimes\nyou'll trade write performance for read performance or you'll trade\ntransactional safety for performance. For example, BoltDB is read optimized and\nsupports fully serializable ACID transactions. This makes it good for many read-\nheavy applications that require strong guarantees.\n\n\n### Getting started with BoltDB\n\nOne of the best things about using BoltDB is that the installation process is so\nsimple. 
You don't need to install a server or even configure it. You just use\n\"go get\" like any other Go package and it'll work on Windows, Mac, Linux,\nRaspberry Pi, and even iOS and Android.\n\n```\n$ go get github.com/boltdb/bolt\n```\n\n\n#### Object encoding\n\nOne feature of relational databases that many of us take for granted is that\nthey handle encoding rows into bytes on disk. Since Bolt only works with byte\nslices we'll need to handle that manually. Lucky for us, there are a LOT of\noptions for object serialization.\n\nIn Go, one of the most popular serialization libraries is Protocol Buffers\n(also called \"protobufs\"). One implementation, called gogoprotobuf, is also\none of the fastest. With protobufs, we declare our serialization format and\nthen generate Go files for doing this quickly.\n\nLet's take a look at an example application to see how we'd do this. This\napplication shows how to do CRUD for a simple \"User\" data store but I've also\nbuilt many other less traditional applications on Bolt such as message queues\nand analytics.\n\n\n#### Domain types\n\nIn our app, we have a single domain type called \"User\" with two fields: ID\n\u0026 username. You can expand out to more types and nest objects but we'll stick\nwith a single object to keep things simple.\n\nI like to separate out my domain types from my encoding types by placing my\nencoding types in a subpackage called \"internal\". I do this for two reasons.\nFirst, it keeps the generated protobufs code separate. And second, the\n\"internal\" package is inaccessible from other packages outside our app and it's\nhidden from godoc.\n\nInside our protobuf definition we can see that it matches our domain type with\na few exceptions. Since it's our binary representation we have to specify a size\nfor our integer type. Also, you'll notice numbers on the right. These are\nessentially field IDs when it's encoded. 
When you add or remove fields you\ndon't need to do a migration like with a relational database. You simply add a\nfield with a higher number or delete a field.\n\nBack in our store.go we can add code to generate our protobufs. This line\ncalls the protobuf compiler, \"protoc\", and will generate code to\n`internal/internal.pb.go`. If we look in there we can see it's a bunch of ugly\ngenerated code.\n\nOur domain type will convert to and from this protobuf type but we can hide it\nall behind the `encoding.BinaryMarshaler` \u0026 `encoding.BinaryUnmarshaler`\ninterfaces. Our `MarshalBinary()` simply copies our fields in and marshals them\nand the `UnmarshalBinary()` unmarshals the data and copies the fields out. This\nis a bit more work than in relational databases but it's easy to write, test,\nand migrate.\n\n\n#### Initializing the store\n\nOur `Store` type will be our application wrapper around our Bolt database. To\nopen a `bolt.DB`, we simply need to pass in the file path and the file\npermissions to set if the file doesn't exist. This will use the umask so I\ntypically set my permissions to `0666` and let users set the umask to filter\nthat at runtime.\n\nOnce the database is open, we'll start a writable transaction. The `Begin()`\nfunction is what starts a transaction. The `true` argument means that it's a\n\"writable\" transaction. Bolt can have as many read transactions as you want but\nonly one write transaction gets processed at a time. This means it's important\nto keep updates small and break up really large updates into smaller chunks. All\ntransactions operate under serializable isolation which means that all data will\nbe a snapshot of exactly how it was when the transaction started -- even if\nother write transactions commit while the read transaction is in progress.\n\nThe deferred rollback can look odd since we want to commit the transaction at\nthe end. It's important in case you return an error early or your application\npanics. 
All transactions need to be rolled back or committed when they're done or else they\ncan block other operations later.\n\nWithin our transaction we call `CreateBucketIfNotExists()` to create our bucket\nfor our users. This is similar to a \"CREATE TABLE IF NOT EXISTS\" in SQL. If the\nbucket doesn't exist then it's created. Otherwise it's ignored. Calling this\nduring initialization means that we won't have to check for it whenever we use\nthe \"users\" bucket. It's guaranteed to be there.\n\nFinally, we commit our transaction and return the error, if any occurred while\nsaving. Bolt does not allow partial transactions so if a disk error occurs then\nyour entire transaction will be rolled back. The deferred rollback that we\ncalled earlier will be ignored for this transaction since we have successfully\ncommitted.\n\nClosing the store is simple. Just call `Close()` on the Bolt database\nand it will release its exclusive file lock and close the file descriptor.\n\n\n#### Creating a user\n\nNow that we have our database ready, let's create a user. In our `CreateUser()`\nmethod, we'll start by creating a writable transaction just like we did before.\nThen we'll grab the \"Users\" bucket from the transaction. We don't need to check\nif it exists because we created it during initialization.\n\nNext, we'll create a new ID for the user. Bolt has a nice feature called\nsequences, which are transactionally safe auto-incrementing integers for each\nbucket. Whenever we call `NextSequence()`, we'll get the bucket's next integer.\nOnce we grab it, we assign it to our user's ID.\n\nNow our user is ready to be marshaled. We call `MarshalBinary()` and we get a\nset of bytes which represents our encoded user. Easy peasy!\n\nSince Bolt only works with bytes, we'll need to convert our ID to bytes. I\nrecommend using the `binary.BigEndian.PutUint64()` function for this. 
I use\nbig endian because it will sort our IDs in the right order.\n\n[show big endian vs little endian on slide]\n\nWe'll use the bucket's `Put()` method to associate our encoded user with the\nencoded ID. Then we'll commit and our data is saved.\n\n\n#### Retrieving the user\n\nCreating a user is just a matter of converting objects to bytes so retrieving a\nuser is simply converting bytes to objects. In our `User()` method we'll start\na transaction but this time we'll pass in `false` to specify a read-only\ntransaction. Again, read-only transactions can run completely in parallel so\nthis scales really well across multiple CPU cores.\n\nOnce we have our transaction, we call `Get()` on our bucket with an encoded\nuser ID and we get back the encoded user bytes. We can call `UnmarshalBinary()`\non a new `User` and decode the data. If the encoded bytes come back as `nil`\nthen we know that the user doesn't exist and we can simply return a `nil` user.\n\n\n#### Retrieving multiple users\n\nReading one user is good but many times we want to return a list of all users.\nFor this we'll need to use a Bolt cursor. A cursor is simply an object for\niterating over a bucket in order. It has a handful of methods we can use to\nmove forward, back or even jump around.\n\nIn our `Users()` method we'll grab a read-only transaction and a cursor from\nour bucket. Then we'll iterate over every key/value in our bucket. We can\ncollapse it all into a simple \"for\" loop where we call `First()` at the\nbeginning and then `Next()` until we receive a `nil` key. For each value, we'll\nunmarshal the user and add it to our slice.\n\nIf you need reverse sorting, you can call `Last()` and then `Prev()` on the\ncursor. You can also use the `Seek()` method to jump to a specific spot. For\nexample, if we wanted to do pagination we could pass in an \"options\" object\ninto the method and have an offset and limit.\n\n\n#### Updating a user\n\nNow that we've created a user, let's update it. 
Let's look at the\n`SetUsername()` method. This time we'll mix it up and use the `Update()` method\ninstead of `Begin()`. This method works just like `Begin(true)` except that\nit executes a function in the context of the transaction. If the function\nreturns `nil` then the transaction commits. Otherwise if it returns an error or\npanics then it will roll back the transaction.\n\nFirst we'll retrieve our user by ID and unmarshal. In this case we're combining\nthe `Get()` and `UnmarshalBinary()` into a compound `if` block. I find it easier\nto read if I group these related types of calls together. Next we simply update\nthe username on our user we just unmarshaled.\n\nNow that we have our updated user object, we can simply remarshal it and\noverwrite the previous value by calling `Put()` again.\n\n\n#### Deleting a user\n\nFinally, the last part of our CRUD store is the deletion. Deletion is incredibly\nsimple. Just call the `Delete()` method on the bucket. That's it!\n\n\n\n### Bolt in Practice\n\nThose were the basics of doing CRUD operations with Bolt and we can talk about\nmore advanced use cases in a minute but first let's look at what running Bolt\nin production looks like.\n\nInternally, Bolt structures itself as a B+tree of pages which requires a lot of\nrandom access at the file system so it's recommended that you run Bolt on an\nSSD. Other embedded databases such as LevelDB are optimized for spinning disks.\n\nBolt also maps the database to memory using a read-only `mmap()` so byte slices\nreturned from buckets cannot be updated (or else it will SEGFAULT) and the\nbyte slices are only valid for the life of the transaction. The memory map\nprovides two amazing benefits. First, it means that data is never copied from\nthe database. You're accessing it directly from the operating system's page\ncache. 
Second, since it's in the OS page cache, your hot data will persist in\nmemory across application restarts.\n\n\n#### Backup \u0026 restore\n\nFrom an operations standpoint, Bolt just uses a single file on disk so it's\nsimple to manage. However, as a library, there's not a standard CLI command to\nbackup your database but Bolt does provide a great option.\n\nTransactions in Bolt implement the `io.WriterTo` interface which means they\ncan copy an entire snapshot of the database to an `io.Writer` with one line of\ncode. Depending on your application you may wish to provide an HTTP endpoint so\nyou can `curl` your backups or you can build an hourly snapshot to Amazon S3.\n\nAnother option in the works is a streaming transaction log so that you can\nattach a process over the network to be an async replica. This is similar to\nhow Postgres replication works. This is still early in development though and\nis not currently available.\n\n\n### Performance\n\nWhile performance is not a primary goal of Bolt, it is an important feature to\ntalk about. Bolt is read optimized so if your workload regularly consists of\ntens of thousands of writes per second then you may want to look at write\noptimized databases such as LevelDB.\n\n\n#### Benchmarks\n\nBenchmarks are not typically very useful but it's good to know a ballpark of\nwhat a database can handle. I typically tell people the following on a machine\nwith an SSD. Expect to get up to 2,000 random writes/sec without optimization.\nIf you're bulk loading and sorting data then you can get up to 400K+ writes/sec.\nTypically you're throttled by your hard drive speed.\n\nOn the read side, it depends on if your data is in the page cache. Typical CRUD\napplications have maybe 10-20% of their data hot at any given time. That means\nthat if you have a 20GB database then 2 - 4GB is hot and that will be resident\nin memory assuming you have that much RAM. 
For hot data, you can expect a read\ntransaction and a single bucket read to take 1-2µs. If you're iterating over\nhot data then you can see speeds of 20M keys/second. Again, all this data is\nin the page cache and there's no data copy so it's really fast.\n\nIf your data is not hot then you'll be limited by the speed of your hard drive.\nExpect reads of cold data on an SSD to take hundreds of microseconds or a few\nmilliseconds depending on the size of your database.\n\n\n### Scaling with Bolt\n\nOne of the biggest criticisms of embedded databases is that they are embedded\nand don't have a means of scaling or clustering. This is true; however, it's not\nnecessarily a reason to avoid embedded databases. There are several strategies\nthat can be used to obtain the safety required for your application while still\nusing an embedded database.\n\n\n#### Vertical scaling\n\nThe easiest way to scale is still, by far, by scaling vertically. If your\nbottleneck is with read queries then simply adding more CPU cores will scale\nyour Bolt application near linearly. This answer may sound too simplistic but\nthat's the beauty of it. It's really simple.\n\n\n#### Horizontal scaling via sharding\n\nMany applications -- especially SaaS applications -- can be partitioned by users\nor accounts. This means that you can either assign each account to its own\ndatabase or use strategies like consistent hashing to group accounts into a\nnumber of partitions. This will allow you to add additional machines and\nrebalance the load of your application.\n\n\n#### Data integrity in the face of catastrophic failure\n\nUntil streaming replication is ready for Bolt, windows of data loss are still\nan issue that needs consideration when building with an embedded database. Many\ncloud providers provide highly safe, redundant storage but even those can fail\nevery once in a great while.\n\nFor some applications, a daily backup may be reliable enough. 
For others, a\nstandby machine taking snapshots every 10 minutes may be required. Many\napplications store critical data such as finances in a third party service like\nStripe so a 10 minute window may be acceptable.\n\nAgain, a Bolt database is simply a file and can be copied extremely quickly.\nExpect a copy to take 3-5 seconds per gigabyte on an SSD.\n\n\n\n### Conclusion\n\nI believe that local key/value databases meet the requirements for many\napplications while providing a simple, fast development experience. I have\nshown you how to get up and running with Bolt and build a simple CRUD data\nstore. The code is not just straightforward but also incredibly fast and\ncan vertically scale for many workloads. Please consider using a local\nkey/value database in your next project!\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenbjohnson%2Fapplication-development-using-boltdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbenbjohnson%2Fapplication-development-using-boltdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenbjohnson%2Fapplication-development-using-boltdb/lists"}