https://github.com/foursquare/mongo-hdfs-export
- Host: GitHub
- URL: https://github.com/foursquare/mongo-hdfs-export
- Owner: foursquare
- License: apache-2.0
- Archived: true
- Created: 2014-01-24T21:21:19.000Z (almost 11 years ago)
- Default Branch: master
- Last Pushed: 2014-01-27T21:58:17.000Z (almost 11 years ago)
- Last Synced: 2024-08-01T22:58:18.453Z (3 months ago)
- Language: Scala
- Size: 137 KB
- Stars: 31
- Watchers: 190
- Forks: 10
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
To run this, copy a `mongod` executable to this directory (you can get a copy [here](http://www.mongodb.org/downloads)). Then run it with `./sbt run <args>`, where the args are:
* databaseName - the name of the database you are dumping from
* shardName - the shard you are dumping
* inputDir - mongod directory to dump from
* hdfsPath - path to dump data to
* dbPort - any free port for mongod to use
* localTmpDir - local path for temporary data

`ThriftBsonInputFormat` can be used to read BSON files generated in this way from MapReduce jobs. It's configured with:
```scala
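// Read the job's input with ThriftBsonInputFormat (`conf` is the job's configuration object).
conf.setInputFormat(classOf[ThriftBsonInputFormat])
// Tell the input format which Thrift class the records correspond to; MyThriftClass is a placeholder for your own.
conf.set(ThriftBsonInputFormat.thriftClass, classOf[MyThriftClass].getName)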
```
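
For the export step itself, the argument list above can be turned into a concrete command line. The following is a minimal dry-run sketch, not a tested invocation: every value is a hypothetical placeholder, and it assumes the arguments are positional in the order they are listed. It also assumes that `run` and its arguments need to be quoted together, since sbt's launcher normally treats each separate command-line word as its own command; if the bundled `./sbt` script behaves differently, drop the inner quotes.

```shell
#!/bin/sh
# All values below are hypothetical placeholders; substitute your own.
DATABASE_NAME="mydb"                     # databaseName: database to dump from
SHARD_NAME="shard0001"                   # shardName: shard being dumped
INPUT_DIR="/data/mongo/shard0001"        # inputDir: mongod directory to dump from
HDFS_PATH="/backups/mongo/mydb"          # hdfsPath: path to dump data to
DB_PORT="27018"                          # dbPort: any free port for mongod to use
LOCAL_TMP_DIR="/tmp/mongo-hdfs-export"   # localTmpDir: local path for temporary data

# Assumption: arguments are positional, in the order the README lists them.
EXPORT_CMD="./sbt \"run $DATABASE_NAME $SHARD_NAME $INPUT_DIR $HDFS_PATH $DB_PORT $LOCAL_TMP_DIR\""

# Dry run: print the command so it can be reviewed before running it by hand.
echo "$EXPORT_CMD"
```

Printing the command rather than executing it keeps the sketch safe to try on a machine without `mongod` or HDFS access; once the values look right, the printed line can be pasted into a shell in the repository directory.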