Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/duckie/mongo_smasher
A small tool to generate randomized datasets
https://github.com/duckie/mongo_smasher
Last synced: about 1 month ago
JSON representation
A small tool to generate randomized datasets
- Host: GitHub
- URL: https://github.com/duckie/mongo_smasher
- Owner: duckie
- License: mit
- Created: 2015-12-13T23:36:38.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2016-03-21T16:55:40.000Z (over 8 years ago)
- Last Synced: 2024-08-01T17:23:16.215Z (4 months ago)
- Language: C++
- Size: 718 KB
- Stars: 34
- Watchers: 3
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-mongodb - MongoDB Smasher - Generate randomized datasets and benchmark your setup (Tools / Administration)
- awesome-mongodb - MongoDB Smasher - Generate randomized datasets and benchmark your setup (Tools / Administration)
- awesome-ccamel - duckie/mongo_smasher - A small tool to generate randomized datasets (C++)
README
MongoDB Smasher
==================This project is a tool to fill up MongoDB databases with huge sets of randomized data to test your architecture and your data model performances.
# How to build
The *only* local dependency is `boost`, be sure it is installed along with a modern C++ compiler and `cmake` then:
git clone https://github.com/duckie/mongo_smasher.git
cd mongo_smasher
git submodule update --init --recursive
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE:STRING=Release ../
make -j${number_of_machine_CPUs}The generated executable is found as `mongo_smasher/build/src/mongo_smasher`. Optional:
make install
**Tip**: Do not use more CPUs than your machine has with `make`or the compilation of the mongo C++11 driver could fail.
**Tip**: If you use Windows, make yourself a favor and install [cash](https://github.com/dthree/cash).
# How to use
`mongo_smasher` must be fed a json file containing your collections and schemas. Check the `examples` directory. Here is one of them:
{
"values":{
"id":{"type":"incremental_id"},
"restaurant_seed":{"type":"string","min_size":10,"max_size":150},
"score":{"type":"int","min":-1,"max":131},
"grade":{"type":"pick", "values":["A","B","C","D","E","F"]},
"grade_date":{"type":"date","min":"2012-01-01 00:00:00","max":"2016-02-29 00:00:00"},
"cuisine":{"type":"pick","file":"ny/cuisines.txt"},
"nyzip":{"type":"pick","file":"ny/zipcodes.txt"},
"nyst":{"type":"pick","file":"ny/street-names.txt"},
"nybuilding":{"type":"pick","file":"ny/buildings.txt"},
"borough":{"type":"pick","values":["Missing","Queens","Bronx","Brooklyn","Manhattan" ,"Staten Island"]},
"geo_x":{"type":"double","min":-28.0168595,"max":52.5388779},
"geo_y":{"type":"double","min":-157.8887924,"max":153.1628795},
"price":{"type":"double","min":1.0,"max":500.0},
"firstname":{"type":"pick","file":"dictionaries/first-names.txt"},
"lastname":{"type":"pick","file":"dictionaries/names.txt"}
},
"collections":{
"restaurants":{
"schema":{
"address":{
"building":"$nybuilding",
"street":"$nyst",
"coord":["$geo_x","$geo_y"],
"zipcode":"$nyzip"
},
"borough":"$borough",
"cuisine?0.7":"$cuisine",
"grades[0:4]":{
"date":"$grade_date",
"grade":"$grade",
"score":"$score"
},
"name":"At $firstname $lastname's $cuisine $estaurant_seed",
"restaurant_id":"$id"
},
"actions":{
"insert":{
"bulk_size":1000
}
}
}
}
}Use `--help` switch to get some details about command line options.
# Contribute
The tool is in very active development. Any contribution is welcome.
Some planned features:
- Updates and finds
- Oplog replayer