https://github.com/tzolov/vagrant-pivotalhd
Use Vagrant and Ambari Blueprint API to install PivotalHD 3.0 (or Hortonworks HDP2.x) Hadoop cluster with HAWQ 1.3 (SQL on Hadoop) and Spring XD 1.2
- Host: GitHub
- URL: https://github.com/tzolov/vagrant-pivotalhd
- Owner: tzolov
- License: apache-2.0
- Created: 2014-03-23T20:21:18.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2016-07-20T16:20:18.000Z (over 8 years ago)
- Last Synced: 2023-03-23T02:47:31.474Z (over 1 year ago)
- Language: Shell
- Homepage:
- Size: 406 KB
- Stars: 24
- Watchers: 11
- Forks: 18
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
Multi-VMs PivotalHD3.0 (or Hortonworks HDP2.x) Hadoop Cluster with HAWQ and SpringXD
=================
This project leverages Vagrant and [Apache Ambari](https://ambari.apache.org/) to create a multi-VM [PivotalHD 3.0](http://pivotal.io/big-data/pivotal-hd) or Hortonworks HDP2.x Hadoop cluster including [HAWQ 1.3 (SQL on Hadoop)](http://pivotal.io/big-data/pivotal-hawq) and [Spring XD 1.2](http://projects.spring.io/spring-xd/).

![alt text](doc/VAGRANT_AMBARI_PHD3_HAWQ_SPRINGXD.png "Ambari with Vagrant")

The logical structure of the cluster is defined in a [`Blueprint`](blueprints). A related [`Host-Mapping`](blueprints) defines how the blueprint is mapped onto physical machines. The [Vagrantfile](Vagrantfile) script provisions Virtual Machines (VMs) for the hosts defined in the `Host-Mapping` and, with the help of the [Ambari Blueprint API](https://cwiki.apache.org/confluence/display/AMBARI/Blueprints), deploys the `Blueprint` in the cluster. Vagrant supports PivotalHD3.0 (`PHD`) and Hortonworks 2.x (`HDP`) blueprint stacks.
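
As a rough illustration of how the two files relate, the sketch below writes a minimal blueprint and a matching host mapping using the general Ambari Blueprint JSON layout. The blueprint name `demo-blueprint`, the host group `master`, the short component list, and the helper function names are hypothetical; the blueprints shipped in the `blueprints` folder are far more complete, and this tiny example is not meant to be deployable as-is.

```shell
#!/usr/bin/env bash
# Minimal sketch of an Ambari blueprint and its host mapping.
# All names (demo-blueprint, master, phd1.localdomain) are illustrative only.

# Logical layout: which components run in which host group.
write_blueprint() {
  cat > "$1" <<'EOF'
{
  "Blueprints": {
    "blueprint_name": "demo-blueprint",
    "stack_name": "PHD",
    "stack_version": "3.0"
  },
  "host_groups": [
    {
      "name": "master",
      "cardinality": "1",
      "components": [
        { "name": "NAMENODE" },
        { "name": "DATANODE" },
        { "name": "ZOOKEEPER_SERVER" }
      ]
    }
  ]
}
EOF
}

# Physical layout: which hosts belong to each host group of the blueprint.
write_host_mapping() {
  cat > "$1" <<'EOF'
{
  "blueprint": "demo-blueprint",
  "host_groups": [
    {
      "name": "master",
      "hosts": [ { "fqdn": "phd1.localdomain" } ]
    }
  ]
}
EOF
}

workdir=$(mktemp -d)
write_blueprint "$workdir/demo-blueprint.json"
write_host_mapping "$workdir/demo-hostmapping.json"
ls "$workdir"
```

The host mapping refers to the blueprint by name and to each host group by the `name` used in the blueprint, which is what ties the logical and physical descriptions together.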
The default [All-Services-Blueprint](blueprints#all-services-pivotalhd30-hawq-and-springxd) creates four virtual machines — one for Apache Ambari and three for the Pivotal HD cluster where Apache Hadoop® (HDFS, YARN, Pig, Zookeeper, HBase), HAWQ (SQL-on-Hadoop) and SpringXD are installed.
## Prerequisites
* From a hardware standpoint, you need a 64-bit architecture. The default blueprint requires at least 16GB of physical memory and around 120GB of free disk space (you can get by with only 24GB of disk space, but you will not be able to install all Pivotal services together).
* Install [Vagrant](http://www.vagrantup.com/downloads.html) (1.7.2+).
* Install [VirtualBox](https://www.virtualbox.org/) or VMware Fusion (note that VMware Fusion requires a [paid Vagrant license](http://www.vagrantup.com/vmware)).

## Environment Setup
* Clone this project
```
git clone https://github.com/tzolov/vagrant-pivotalhd.git
```
* Follow the [Packages download](https://github.com/tzolov/vagrant-pivotalhd/tree/master/packages) instructions to collect all required tarballs and store them inside the `/packages` subfolder.
* Edit the `BLUEPRINT_FILE_NAME` and `HOST_MAPPING_FILE_NAME` properties in the [Vagrantfile](Vagrantfile) to select the `Blueprint`/`Host-Mapping` pair to deploy. All blueprints and mapping files are in the [`/blueprints`](blueprints) subfolder. By default the [4-node, All-Services](blueprints#all-services-pivotalhd30-hawq-and-springxd) blueprint is used.

## Create Hadoop cluster
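
Before starting the cluster, it can help to confirm which `Blueprint`/`Host-Mapping` pair the Vagrantfile currently selects and that both files exist. The sketch below is a hypothetical helper, not part of this project; it assumes each property is assigned on a single line in the form `NAME = "value"`, which may not match the actual Vagrantfile layout.

```shell
#!/usr/bin/env bash
# Sketch: read a property from the Vagrantfile and verify the selected files.
# Assumes one-line assignments such as  BLUEPRINT_FILE_NAME = "some-file.json"

# Print the quoted value assigned to property $1 in file $2 (default: ./Vagrantfile).
vagrantfile_prop() {
  sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*[\"']\([^\"']*\)[\"'].*/\1/p" \
    "${2:-Vagrantfile}" | head -n 1
}

# Check that both selected files exist in the blueprints subfolder.
# $1 = Vagrantfile path, $2 = blueprints directory.
check_blueprint_selection() {
  local vagrantfile="${1:-Vagrantfile}" dir="${2:-blueprints}" name file rc=0
  for name in BLUEPRINT_FILE_NAME HOST_MAPPING_FILE_NAME; do
    file=$(vagrantfile_prop "$name" "$vagrantfile")
    if [ -n "$file" ] && [ -f "$dir/$file" ]; then
      echo "$name -> $dir/$file"
    else
      echo "$name -> '$file' not found in $dir" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage, from the top directory of the project:
#   check_blueprint_selection Vagrantfile blueprints
```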
From the top directory run:
```
vagrant up --provider virtualbox
```
Depending on the blueprint stack, either a PivotalHD or a Hortonworks cluster is created. The default [`blueprint/host-mapping`](blueprints#all-services-pivotalhd30-hawq-and-springxd) creates 4 Virtual Machines.
When the `vagrant up` command returns, the VMs are provisioned, the Ambari Server is installed and the cluster deployment is in progress. Open the Ambari interface to monitor the deployment progress:
```
http://10.211.55.100:8080
```
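
The same address also serves the Ambari REST API, so readiness can be checked from a terminal. The sketch below polls the request list of a cluster until Ambari answers with HTTP 200. The helper names, the retry count, and the delay are arbitrary assumptions; the cluster name `CLUSTER1` is the documented `CLUSTER_NAME` default.

```shell
#!/usr/bin/env bash
# Sketch: poll the Ambari REST API until the cluster's request list is served.
# Host and port are taken from the URL above; retry settings are arbitrary.
AMBARI_URL="${AMBARI_URL:-http://10.211.55.100:8080}"

# Build the REST endpoint that lists the requests (deployment tasks) of a cluster.
# $1 = cluster name as shown in Ambari.
ambari_requests_url() {
  echo "$AMBARI_URL/api/v1/clusters/$1/requests"
}

# Return 0 once the URL answers with HTTP 200, or 1 after all attempts fail.
# $1 = URL, $2 = attempts (default 30), $3 = delay in seconds (default 10).
wait_for_ambari() {
  local url="$1" attempts="${2:-30}" delay="${3:-10}" i=0 code
  while [ "$i" -lt "$attempts" ]; do
    # curl prints 000 as the status code when the connection fails.
    code=$(curl -s -o /dev/null -w '%{http_code}' -u admin:admin "$url" || true)
    [ "$code" = "200" ] && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage:
#   wait_for_ambari "$(ambari_requests_url CLUSTER1)" && echo "Ambari is answering"
```

The REST API accepts the same login as the web interface.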
(username: `admin`, password: `admin`)

## Vagrant Configuration Properties
The following [Vagrantfile](Vagrantfile) configuration properties can be used to customize a cluster deployment.
For instructions on how to create a custom `Blueprint` or `Host-Mapping`, read the [blueprints](blueprints) section.

| Property | Description | Default Value |
| --- | --- | --- |
| `BLUEPRINT_FILE_NAME` | Specifies the Blueprint file name to deploy. The file must exist in the `/blueprints` subfolder. | `phd-all-services-blueprint.json` |
| `HOST_MAPPING_FILE_NAME` | Specifies the Host-Mapping file name to deploy. The file must exist in the `/blueprints` subfolder. | `4-node-all-services-hostmapping.json` |
| `CLUSTER_NAME` | Sets the cluster name as it will appear in Ambari. | `CLUSTER1` |
| `VM_BOX` | Vagrant box name to use. Tested options are:<br>- `bigdata/centos6.4_x86_64` (40G disk)<br>- `bigdata/centos6.4_x86_64_small` (just 8G of disk space)<br>- `chef/centos-6.6` (CentOS 6.6 box) | `chef/centos-6.6` |
| `AMBARI_NODE_VM_MEMORY_MB` | Memory (MB) allocated for the Ambari VM. | `768` |
| `PHD_NODE_VM_MEMORY_MB` | Memory (MB) allocated for every PHD VM. | `2048` |
| `AMBARI_HOSTNAME_PREFIX` | Sets the Ambari host name prefix. The suffix is fixed to `.localdomain`. Note: the FQDN should NOT be in the `phd[1-N].localdomain` range. | `ambari` |
| `DEPLOY_BLUEPRINT_CLUSTER` | Set to `TRUE` to deploy the cluster defined by `BLUEPRINT_FILE_NAME` and `HOST_MAPPING_FILE_NAME`. Set to `FALSE` if you prefer to install the cluster with the Ambari wizard. | `TRUE` |
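
To get a feel for how the two memory properties add up, the sketch below computes the total memory requested by the VMs for a given number of PHD nodes. The function name is hypothetical, and the calculation simply assumes one Ambari VM plus N identically sized PHD VMs, as in the default four-VM layout.

```shell
#!/usr/bin/env bash
# Sketch: total memory (MB) requested by the VMs, using the documented defaults.
AMBARI_NODE_VM_MEMORY_MB=768
PHD_NODE_VM_MEMORY_MB=2048

# $1 = number of PHD nodes (3 in the default four-VM layout).
total_vm_memory_mb() {
  echo $(( AMBARI_NODE_VM_MEMORY_MB + $1 * PHD_NODE_VM_MEMORY_MB ))
}

total_vm_memory_mb 3   # prints 6912
```

The host machine needs headroom beyond this figure for its own operating system and the hypervisor.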