https://github.com/jake-low/hadoop.docker

Run a highly configurable, fully distributed Hadoop cluster with Docker!

# Run a Hadoop cluster in Docker

This repository is a collection of Dockerfiles that define images for various
Hadoop ecosystem components. Using these images, you can launch a Hadoop
cluster in any configuration you want, running anywhere (your laptop, a VM, or
the cloud), in seconds.

Apache Hadoop is an open source framework that enables storage and processing
of huge data sets. Hadoop is typically installed and run on commodity physical
hardware. The term "Hadoop" has come to refer both to the core project and to
the ecosystem of data processing tools that exists around it.

Docker is an open source container engine. It allows you to package software
and distribute it so that it can be run anywhere. Docker containers are
frequently likened to lightweight virtual machines.

### Usage

Please see the EXAMPLES directory for guidance on how to make use of these
images.

### Contributing

This project is under active development. Feature requests and PRs are welcome!