https://github.com/tzolov/vagrant-pivotalhd

Use Vagrant and Ambari Blueprint API to install PivotalHD 3.0 (or Hortonworks HDP2.x) Hadoop cluster with HAWQ 1.3 (SQL on Hadoop) and Spring XD 1.2

Host: GitHub
URL: https://github.com/tzolov/vagrant-pivotalhd
Owner: tzolov
License: apache-2.0
Created: 2014-03-23T20:21:18.000Z (over 10 years ago)
Default Branch: master
Last Pushed: 2016-07-20T16:20:18.000Z (over 8 years ago)
Last Synced: 2023-03-23T02:47:31.474Z (over 1 year ago)
Language: Shell
Homepage:
Size: 406 KB
Stars: 24
Watchers: 11
Forks: 18
Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE

README

Multi-VMs PivotalHD3.0 (or Hortonworks HDP2.x) Hadoop Cluster with HAWQ and SpringXD
=================

This project uses Vagrant and [Apache Ambari](https://ambari.apache.org/) to create a multi-VM [PivotalHD 3.0](http://pivotal.io/big-data/pivotal-hd) or Hortonworks HDP2.x Hadoop cluster that includes [HAWQ 1.3 (SQL on Hadoop)](http://pivotal.io/big-data/pivotal-hawq) and [Spring XD 1.2](http://projects.spring.io/spring-xd/).

![alt text](doc/VAGRANT_AMBARI_PHD3_HAWQ_SPRINGXD.png "Ambari with Vagrant")

The logical structure of the cluster is defined in a [`Blueprint`](blueprints). A related [`Host-Mapping`](blueprints) defines how the blueprint is mapped onto physical machines. The [Vagrantfile](Vagrantfile) script provisions Virtual Machines (VMs) for the hosts defined in the `Host-Mapping` and, with the help of the [Ambari Blueprint API](https://cwiki.apache.org/confluence/display/AMBARI/Blueprints), deploys the `Blueprint` in the cluster. Both PivotalHD3.0 (`PHD`) and Hortonworks 2.x (`HDP`) blueprint stacks are supported.

The default [All-Services-Blueprint](blueprints#all-services-pivotalhd30-hawq-and-springxd) creates four virtual machines: one for Apache Ambari and three for the Pivotal HD cluster, where Apache Hadoop® (HDFS, YARN, Pig, Zookeeper, HBase), HAWQ (SQL-on-Hadoop) and SpringXD are installed.
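
As a rough illustration of the Blueprint deployment described above, the Ambari Blueprint API flow comes down to two REST calls: register the blueprint, then create a cluster from it using a host mapping. The sketch below is not taken from this project's provisioning scripts, which may work differently. The `deploy_blueprint` function and its arguments are hypothetical, and it assumes an Ambari server reachable at the given URL with the default `admin`/`admin` credentials.

```shell
#!/usr/bin/env bash
# Sketch only: a hypothetical helper, not part of this project's scripts.

# deploy_blueprint BASE_URL BLUEPRINT_NAME BLUEPRINT_FILE CLUSTER_NAME MAPPING_FILE
deploy_blueprint() {
  local base=$1 blueprint_name=$2 blueprint_file=$3 cluster=$4 mapping_file=$5

  # 1. Register the blueprint (the logical cluster layout) under a name.
  curl -sS -u admin:admin -H 'X-Requested-By: ambari' -X POST \
       --data-binary "@${blueprint_file}" \
       "${base}/api/v1/blueprints/${blueprint_name}" || return 1

  # 2. Create the cluster by posting the host mapping, which assigns
  #    the blueprint's host groups to concrete machines.
  curl -sS -u admin:admin -H 'X-Requested-By: ambari' -X POST \
       --data-binary "@${mapping_file}" \
       "${base}/api/v1/clusters/${cluster}"
}
```

A call might look like `deploy_blueprint http://10.211.55.100:8080 my-blueprint blueprints/phd-all-services-blueprint.json CLUSTER1 blueprints/4-node-all-services-hostmapping.json`, where `my-blueprint` is a placeholder that has to match the `blueprint` field inside the host-mapping file. When you use `vagrant up`, the deployment is done for you, so running these calls by hand is not required.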

## Prerequisites

* Hardware: a 64-bit architecture is required. The default blueprint needs at least 16GB of physical memory and around 120GB of free disk space. You can get by with only 24GB of disk space, but then you will not be able to install all Pivotal services together.

* Install [Vagrant](http://www.vagrantup.com/downloads.html) (1.7.2+).

* Install [VirtualBox](https://www.virtualbox.org/) or VMware Fusion (note that VMware Fusion requires a [paid Vagrant license](http://www.vagrantup.com/vmware)).
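
The requirements above can be checked before provisioning. The following is a minimal sketch and is not shipped with this project: the `version_ge`, `free_disk_gb` and `preflight` helpers are hypothetical names, and the memory check is left out because it is platform-specific.

```shell
#!/usr/bin/env bash
# Preflight sketch: hypothetical helpers, not shipped with this project.

# version_ge A B: succeeds when dotted version A is >= B.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)" = "$2" ]
}

# free_disk_gb DIR: free space, in whole GB, on the filesystem holding DIR.
free_disk_gb() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# preflight [DIR]: DIR is where your provider stores VM disks (default: .).
preflight() {
  local rc=0 vagrant_version vm_dir=${1:-.}

  # The tested boxes are x86_64 images, so expect a 64-bit x86 host.
  case "$(uname -m)" in
    x86_64|amd64) ;;
    *) echo "warning: 64-bit (x86_64) host expected" >&2; rc=1 ;;
  esac

  # `vagrant --version` prints a line such as "Vagrant 1.7.2".
  vagrant_version=$(vagrant --version 2>/dev/null | awk '{ print $2 }')
  if [ -z "$vagrant_version" ] || ! version_ge "$vagrant_version" 1.7.2; then
    echo "warning: Vagrant 1.7.2+ required (found: ${vagrant_version:-none})" >&2
    rc=1
  fi

  # The default blueprint needs around 120GB of free disk space.
  if [ "$(free_disk_gb "$vm_dir")" -lt 120 ]; then
    echo "warning: less than 120GB free in ${vm_dir}" >&2
    rc=1
  fi
  return "$rc"
}
```

After sourcing the script, run `preflight` (optionally with the directory that holds your VM disks) and review any warnings before continuing.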

## Environment Setup

* Clone this project

```
git clone https://github.com/tzolov/vagrant-pivotalhd.git
```

* Follow the [Packages download](https://github.com/tzolov/vagrant-pivotalhd/tree/master/packages) instructions to collect all required tarballs and store them inside the `/packages` subfolder.

* Edit the [Vagrantfile](Vagrantfile) `BLUEPRINT_FILE_NAME` and `HOST_MAPPING_FILE_NAME` properties to select the `Blueprint`/`Host-Mapping` pair to deploy. All blueprint and mapping files are in the [`/blueprints`](blueprints) subfolder. By default, the [4 nodes, All-Services](blueprints#all-services-pivotalhd30-hawq-and-springxd) blueprint is used.
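
If you prefer to script the property edit rather than open an editor, something along these lines may work. It is only a sketch: the `set_vagrant_property` helper is hypothetical, and it assumes each property is assigned on its own line in the form `NAME = "value"`, which you should verify against your copy of the Vagrantfile.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helper; assumes lines of the form NAME = "value".

# set_vagrant_property FILE NAME VALUE
# Replaces the quoted value assigned to NAME and keeps a FILE.bak backup.
# VALUE must not contain '|', '&' or '"' (fine for plain file names).
set_vagrant_property() {
  local file=$1 name=$2 value=$3
  sed -i.bak -E \
    "s|^([[:space:]]*${name}[[:space:]]*=[[:space:]]*)\"[^\"]*\"|\1\"${value}\"|" \
    "$file"
}
```

For example, `set_vagrant_property Vagrantfile BLUEPRINT_FILE_NAME my-blueprint.json` followed by `set_vagrant_property Vagrantfile HOST_MAPPING_FILE_NAME my-hostmapping.json`, where both file names are placeholders for files that exist in the blueprints subfolder.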

## Create Hadoop cluster

From the top-level project directory, run:

```
vagrant up --provider virtualbox
```

Depending on the blueprint stack, either a PivotalHD or a Hortonworks cluster is created. The default [`blueprint/host-mapping`](blueprints#all-services-pivotalhd30-hawq-and-springxd) creates 4 Virtual Machines.

When the `vagrant up` command returns, the VMs are provisioned, the Ambari Server is installed and the cluster deployment is in progress. Open the Ambari interface to monitor the deployment progress:

```
http://10.211.55.100:8080
```

(username: `admin`, password: `admin`)
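
Deployment progress can also be followed from the command line through Ambari's REST API. The sketch below is an illustration, not part of this project. The function names are hypothetical, and it assumes that on a freshly installed Ambari server the blueprint deployment is request `1`, that the cluster uses the default name `CLUSTER1`, and that the terminal states are `COMPLETED`, `FAILED`, `ABORTED` and `TIMEDOUT`.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers for polling an Ambari request.

# extract_request_status: reads Ambari JSON on stdin and prints the
# value of the first "request_status" field it finds.
extract_request_status() {
  sed -n 's/.*"request_status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# ambari_request_status BASE_URL CLUSTER [REQUEST_ID]
# REQUEST_ID defaults to 1 (assumed to be the blueprint deployment).
ambari_request_status() {
  local base=$1 cluster=$2 id=${3:-1}
  curl -s -u admin:admin "${base}/api/v1/clusters/${cluster}/requests/${id}" |
    extract_request_status
}

# wait_for_deployment BASE_URL CLUSTER [REQUEST_ID]
# Polls every 30 seconds until the request reaches a terminal state.
wait_for_deployment() {
  local status
  while :; do
    status=$(ambari_request_status "$@")
    echo "deployment status: ${status:-unknown}"
    case "$status" in
      COMPLETED) return 0 ;;
      FAILED|ABORTED|TIMEDOUT) return 1 ;;
    esac
    sleep 30
  done
}
```

With the defaults used in this README, the call would be `wait_for_deployment http://10.211.55.100:8080 CLUSTER1`.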

## Vagrant Configuration Properties

The following [Vagrantfile](Vagrantfile) configuration properties can be used to customize a cluster deployment. 

For instructions on how to create a custom `Blueprint` or `Host-Mapping`, read the [blueprints](blueprints) section.

	

		

| Property | Description | Default Value |
| --- | --- | --- |
| `BLUEPRINT_FILE_NAME` | Specifies the Blueprint file name to deploy. File must exist in the /blueprints subfolder. | `phd-all-services-blueprint.json` |
| `HOST_MAPPING_FILE_NAME` | Specifies the Host-Mapping file name to deploy. File must exist in the /blueprints subfolder. | `4-node-all-services-hostmapping.json` |
| `CLUSTER_NAME` | Sets the cluster name as it will appear in Ambari. | `CLUSTER1` |
| `VM_BOX` | Vagrant box name to use. Tested options are: `bigdata/centos6.4_x86_64` (40G disk), `bigdata/centos6.4_x86_64_small` (just 8G of disk space) and `chef/centos-6.6` (CentOS6.6 box). | `chef/centos-6.6` |
| `AMBARI_NODE_VM_MEMORY_MB` | Memory (MB) allocated for the Ambari VM. | `768` |
| `PHD_NODE_VM_MEMORY_MB` | Memory (MB) allocated for every PHD VM. | `2048` |
| `AMBARI_HOSTNAME_PREFIX` | Sets the Ambari host name prefix. The suffix is fixed to '.localdomain'. Note: the FQDN must NOT be in the phd[1-N].localdomain range. | `ambari` |
| `DEPLOY_BLUEPRINT_CLUSTER` | Set to TRUE to deploy a cluster defined by BLUEPRINT_FILE_NAME and HOST_MAPPING_FILE_NAME. Set to FALSE if you prefer to install the cluster with the Ambari wizard. | `TRUE` |