https://github.com/jehiah/mongosort
Sort records on disk in a mongo database
- Host: GitHub
- URL: https://github.com/jehiah/mongosort
- Owner: jehiah
- Created: 2014-06-01T18:12:58.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2014-06-01T19:16:40.000Z (over 10 years ago)
- Last Synced: 2024-04-17T15:23:23.227Z (7 months ago)
- Language: Go
- Size: 129 KB
- Stars: 0
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# mongosort
Note: this is at the prototype stage and is not yet functional.
Disk access, even when mmapped, is slow, and the cost is dominated by the number of disk seeks required to reach the data.
mongo mmaps its data files, but that only gives a speedup when the data is already in RAM; when the data is still on disk, it doesn't help. When you scan, say, 10k records in a mongo query, it must perform disk seeks on both the index and the data extents to complete the query. This creates a significant cold-start problem.
mongosort attempts to sort on-disk data by the primary key `_id`, so that when you use custom `_id` values and query in that sort order, mapping the data in from disk takes as few seeks as possible.
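To make the seek argument concrete, here is a toy model in Go. It is only a sketch and not mongosort's implementation: it does not read mongo data files, and the `record` type, `countSeeks`, and `sortOnDisk` are hypothetical names invented for illustration. It assumes a seek happens whenever the next record in `_id` order does not start where the previous one ended.

```go
package main

import (
	"fmt"
	"sort"
)

// record is a hypothetical stand-in for a stored document: its _id plus
// the byte offset and length of its location on disk.
type record struct {
	ID     string
	Offset int64
	Length int64
}

// countSeeks visits records in ascending _id order and counts how often the
// read position has to jump, i.e. the next record does not start exactly
// where the previous one ended. The first read always counts as a seek.
func countSeeks(recs []record) int {
	byID := make([]record, len(recs))
	copy(byID, recs)
	sort.Slice(byID, func(i, j int) bool { return byID[i].ID < byID[j].ID })

	seeks := 0
	pos := int64(-1) // no valid position yet
	for _, r := range byID {
		if r.Offset != pos {
			seeks++
		}
		pos = r.Offset + r.Length
	}
	return seeks
}

// sortOnDisk returns a new layout in which records are packed contiguously
// in _id order, starting at offset 0. The input slice is left unchanged.
func sortOnDisk(recs []record) []record {
	out := make([]record, len(recs))
	copy(out, recs)
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })

	var off int64
	for i := range out {
		out[i].Offset = off
		off += out[i].Length
	}
	return out
}

func main() {
	// Records laid out in insertion order, which differs from _id order.
	scattered := []record{
		{ID: "c", Offset: 0, Length: 100},
		{ID: "a", Offset: 100, Length: 100},
		{ID: "d", Offset: 200, Length: 100},
		{ID: "b", Offset: 300, Length: 100},
	}
	fmt.Println("seeks before sorting:", countSeeks(scattered))
	fmt.Println("seeks after sorting: ", countSeeks(sortOnDisk(scattered)))
}
```

In this model the scattered layout needs one seek per record, while the sorted layout needs a single seek followed by a sequential read. Real storage adds index lookups, extents, and padding, which the sketch ignores.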
## Storage Format References
- http://2013.nosql-matters.org/bcn/wp-content/uploads/2013/12/storage-talk-mongodb.pdf
- https://speakerdeck.com/mathias/storage-internals