{"id":19345964,"url":"https://github.com/tolitius/mongodb-write-performance-playground","last_synced_at":"2025-07-08T22:03:41.440Z","repository":{"id":1440001,"uuid":"1669303","full_name":"tolitius/mongodb-write-performance-playground","owner":"tolitius","description":"Playing with MongoDB Write Performance","archived":false,"fork":false,"pushed_at":"2022-12-14T20:44:45.000Z","size":187,"stargazers_count":24,"open_issues_count":2,"forks_count":4,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-07-08T22:02:38.314Z","etag":null,"topics":["c","java","mongodb","performance"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tolitius.png","metadata":{"files":{"readme":"README.markdown","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2011-04-27T06:27:26.000Z","updated_at":"2023-02-09T15:27:36.000Z","dependencies_parsed_at":"2022-07-29T13:19:16.289Z","dependency_job_id":null,"html_url":"https://github.com/tolitius/mongodb-write-performance-playground","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/tolitius/mongodb-write-performance-playground","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tolitius%2Fmongodb-write-performance-playground","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tolitius%2Fmongodb-write-performance-playground/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tolitius%2Fmongodb-write-performance-playground/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tolitius%2Fmongodb-write-performance-playgr
ound/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tolitius","download_url":"https://codeload.github.com/tolitius/mongodb-write-performance-playground/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tolitius%2Fmongodb-write-performance-playground/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264357292,"owners_count":23595575,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","java","mongodb","performance"],"created_at":"2024-11-10T04:08:33.753Z","updated_at":"2025-07-08T22:03:41.419Z","avatar_url":"https://github.com/tolitius.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003e :zap: for `TL;DR` jump to [conclusion](#conclusion)\n\n# What is it?\n\nA place where a mongo-java-driver and a mongo-c-driver are used to INSERT X number of \"some\" records into MongoDB and see how it performs\n\n+ Java version is run with [MongoKiller](https://github.com/tolitius/mongodb-write-performance-playground/blob/master/java/src/main/java/org/dotkam/killer/MongoKiller.java)\n+ Relies on [mongo_killer.yaml](https://github.com/tolitius/mongodb-write-performance-playground/blob/master/java/src/main/resources/mongo_killer.yaml) config by default, but a custom config may be provided through a \"--config\" parameter\n+ Can be run against a single MongoDB instance (MongoS or MongoD), as well as multiple MongoDB instances by specifying hosts in YAML config and running with a \"`--multiple-hosts`\" parameter\n+ When running against 
multiple hosts, batches of documents ( = numberOfDocuments / gridSize ) are sent to hosts in a Round Robin fashion\n\n# NOTE(:exclamation:)\n\nThe results below are the best benchmarks that could be squeezed out of Mongo on the given hardware. \n\n### HOWEVER: \nAll these results are for a \"Fire and Forget\" writing mode, where WriteConcern is set to NORMAL (which is the default setting btw). Which means the data was pushed through the socket and \"hopefully\" got persisted. In case the WriteConcern is set to something more durable e.g. SAFE / FSYNC_SAFE, the performance goes down really fast.\n\n### HOWEVER II: \nIf plans are to work with \"Big Data\", which (or at least its index) most likely will not fit into RAM, MongoDB performance is unpredictably bad, and mostly averages to low hundreds ( 200 / 300 ) of documents per second. More about this topic here: [NoRAM DB =\u003e “If It Does Not Fit in RAM, I Will Quietly Die For You”](http://www.dotkam.com/2011/07/06/noram-db-if-it-does-not-fit-in-ram-i-will-quietly-die-for-you/)\n\n### HOWEVER III:\nSince Mongo documents are BSON, the size of a document greatly depends on the key name lengths. For example, a key with a name of \"firstName\" will take 9 bytes (10 with the terminating null byte) JUST for the key name. 
This creates two immediate disadvantages:\n\n+ A lot more needs to be pushed through the socket =\u003e decreases performance and/or increases cost to maintain a decent performance\n+ Need a lot more storage =\u003e that TB of documents will only really have a fraction of \"useful\" data, everything else is keys, mostly duplicated across documents\n\n### CONCLUSION\n\nFor a lightweight CRUD webapp, which does not really need high throughput, does not need to keep GB/TB of data, and might benefit from a document oriented schemaless data store, MongoDB would be a perfect choice: very nice query language, fun to work with.\n\n:point_down:\u003cbr/\u003e\nMongo database is good at fire and forget workloads.\n\nIn case of\n\n* medium to high reliable throughput\n* medium to large datasets\n* stored reliably\n\nThere are much better databases.\u003cbr/\u003e\nThey lose to Mongo in marketing, but win in quality, reliability, memory efficiency, throughput and features such as _real_ JOINs and others.\n\n# Things Tried Here:\n\n+ Inserting documents One By One\n+ Inserting documents All At Once\n+ Partitioning documents for a given number of threads, and inserting them in parallel ( ThreadPoolExecutor )\n+ Partitioning documents for a given number of threads... Inserting to MongoS having multiple Shards ( Shard cluster )\n+ Partitioning documents for a given number of threads... 
Inserting to multiple MongoDs directly\n+ Pre-splitting, moving chunks for a known number of threads, so the shard key is effective [ tags: partitioning, sharding ]\n\n## Sample MongoKiller run with '--multiple-hosts':\n\n```bash\nloading mongo killer config from: src/main/resources/mongo_killer.yaml\nkilling multiple Mongo hosts..\n[{name=localhost, port=30000}, {name=localhost, port=30001}, {name=localhost, port=30002}, {name=localhost, port=30003}]\nBringing MultipleHostMongoKiller to life with:\n\n     Number Of Documents:   5200000\n     Document Size ~:       643 bytes\n     Grid Size:             4\n     Number Of Hosts:       4\n     Batch Threshold:       1000000\n\n=\u003e sending 1000000 documents down the Mongo pipe\n=\u003e sending 1000000 documents down the Mongo pipe\n=\u003e sending 1000000 documents down the Mongo pipe\n=\u003e sending 1000000 documents down the Mongo pipe\n=\u003e sending 1000000 documents down the Mongo pipe\n=\u003e sending 200000 documents down the Mongo pipe\n\nStopWatch 'Killing Mongo': running time (millis) = 70427\n-----------------------------------------\nms     %     Task name\n-----------------------------------------\n70427  100%  adding 5200000 number of documents..\n```\n\n## What is \"all this\" for..\n\nThis creation is _meant_ to be \"cloned\" and changed to reflect what _you_ really need: e.g. change documents, indexes, collections, number of documents, etc..\n\n# \"Show Me The Money\"\n\n## Mr. 
C goes first\n\n+ Running it on Mac Book Pro i7 2.8 GHz..\n+ Single document has 25 fields and its size is roughly *320* bytes\n\n### to compile\n\n```bash\ngcc -Isrc --std=c99 ./mongo-c-driver/src/*.c -I ./mongo-c-driver/src/ batch_insert.c -o batch_insert\n```\n\n### to run\n\n```bash\n$ ./batch_insert\n    usage: ./batch_insert number_of_records batch_size\n```\n\n## 100,000 ( One Hundred Thousand ) records \n\n```bash\n$ ./batch_insert 100000 100000\n\ninserting 100000 records with a batch size of 100000 =\u003e took 0.393889 seconds...\n```\n\n```bash\n$ ./batch_insert 100000 10000\n\ninserting 100000 records with a batch size of 10000 =\u003e took 1.351205 seconds...\n```\n\n```bash\n$ ./batch_insert 100000 50000\n\ninserting 100000 records with a batch size of 50000 =\u003e took 0.864108 seconds...\n```\n\n## 10,000,000 ( Ten Million ) records \n\n```bash\n$ ./batch_insert 10000000 100000\n\ninserting 10000000 records with a batch size of 100000 =\u003e took 173.898534 seconds...\n```\n\n## 100,000,000 ( One Hundred Million ) records \n\n```bash\n$ ./batch_insert 100000000 100000\n\ninserting 100000000 records with a batch size of 100000 =\u003e took 2346.321261 seconds...\n```\n\nNOTE(!) C Driver is still in an alpha state where it does not support things like WriteConcern, replica sets, etc..\n\n## Now Ms. Java..\n\n+ Running it on Mac Book Pro i7 2.8 GHz.. 
=\u003e NOTE: Same trends ( including a \"dead end slow down\" ) are observed when running on a 12 node Linux cluster\n+ Single document has 30 fields, and its size is *665* bytes\n+ A 1,000,000 documents is hungry, so: \"-Xms512m -Xmx1024m -XX:MaxPermSize=384m -Xss128k\"\n\n## 10,000 ( Ten Thousand ) records\n\n    StopWatch '-- MongoDB Insert One By One --': running time (millis) = 901\n    -----------------------------------------\n    ms     %     Task name\n    -----------------------------------------\n    00901  100%  adding 10000 number of records..\n\n    StopWatch '-- MongoDB Insert All At Once --': running time (millis) = 185\n    -----------------------------------------\n    ms     %     Task name\n    -----------------------------------------\n    00185  100%  adding 10000 number of records..\n\n    StopWatch '-- MongoDB Insert All With Partitioning [ grid size = 3 ] --': running time (millis) = 359\n    -----------------------------------------\n    ms     %     Task name\n    -----------------------------------------\n    00359  100%  adding 10000 number of documents..\n\n## 100,000 ( One Hundred Thousand ) records\n\n    StopWatch '-- MongoDB Insert One By One --': running time (millis) = 3692\n    -----------------------------------------\n    ms     %     Task name\n    -----------------------------------------\n    03692  100%  adding 100000 number of records..\n\n    StopWatch '-- MongoDB Insert All At Once --': running time (millis) = 2038\n    -----------------------------------------\n    ms     %     Task name\n    -----------------------------------------\n    02038  100%  adding 100000 number of records..\n\n    StopWatch '-- MongoDB Insert All With Partitioning [ grid size = 4 ] --': running time (millis) = 1142\n    -----------------------------------------\n    ms     %     Task name\n    -----------------------------------------\n    01142  100%  adding 100000 number of documents..\n\n## 1,000,000 ( One Million ) records\n\n    
StopWatch '-- MongoDB Insert One By One --': running time (millis) = 31157\n    -----------------------------------------\n    ms     %     Task name\n    -----------------------------------------\n    31157  100%  adding 1000000 number of records..\n\n    StopWatch '-- MongoDB Insert All At Once --': running time (millis) = 20238\n    -----------------------------------------\n    ms     %     Task name\n    -----------------------------------------\n    20238  100%  adding 1000000 number of records..\n\n    StopWatch '-- MongoDB Insert All With Partitioning [ grid size = 3 ] --': running time (millis) = 12785\n    -----------------------------------------\n    ms     %     Task name\n    -----------------------------------------\n    12785  100%  adding 1000000 number of documents..\n\n    StopWatch '-- MongoDB Partitioning / Multiple Hosts [ grid size = 15 / number of hosts = 5 ] --': running time (millis) = 9602\n    -----------------------------------------\n    ms     %     Task name\n    -----------------------------------------\n    09602  100%  adding 1000000 number of documents..\n\n## 100,000,000 ( One Hundred Million ) records\n\n    StopWatch '-- MongoDB Insert All With Partitioning [ grid size = 4 ] --': running time (millis) = 3025403\n    -----------------------------------------\n    ms     %     Task name\n    -----------------------------------------\n    3025403  100%  adding 100000000 number of documents..\n\n### Current version of MongoDB does not provide even distribution over shards\n\n  Unfortunately.\n  The way sharding is done, Mongo looks at the chunk size and moves chunks around in an async manner.\n  Which means when X number of records are sent to MongoS it will only use the \"next\" shard once the chunk size is reached.\n  Hence inserts are still \"sequential\".\n  JIRA that is \"kind of\" supposed to address that: https://jira.mongodb.org/browse/SERVER-939 which has been open since 2009 (!).\n  \n  Even if chunks are 'pre-split' for a known 
number of shards / threads, INSERTing speed is way ( at least 3 times ) slower than a manual even distribution with [MongoMultipleHostDocumentWriter.java](https://github.com/tolitius/mongodb-write-performance-playground/blob/master/java/src/main/java/org/dotkam/mongodb/concurrent/MongoMultipleHostDocumentWriter.java)\n\n### Hence real time Even Distribution is needed. Which is done via manual partitioning by:\n\n+ number of documents / grid size [ where in this example grid size = number of threads ]\n+ number of documents / grid size Evenly Distributed over multiple MongoDB Daemons [ this.nextCollectionHost % collectionDataSources.size() ]\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftolitius%2Fmongodb-write-performance-playground","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftolitius%2Fmongodb-write-performance-playground","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftolitius%2Fmongodb-write-performance-playground/lists"}