https://github.com/sakserv/hadoop-mini-clusters
hadoop-mini-clusters provides an easy way to test Hadoop projects directly in your IDE
https://github.com/sakserv/hadoop-mini-clusters
hadoop hadoop-mini-clusters ide java test-automation
- Host: GitHub
- URL: https://github.com/sakserv/hadoop-mini-clusters
- Owner: sakserv
- License: apache-2.0
- Created: 2015-01-04T14:36:24.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2023-01-02T22:05:14.000Z (about 2 years ago)
- Last Synced: 2025-01-26T01:08:16.052Z (8 days ago)
- Topics: hadoop, hadoop-mini-clusters, ide, java, test-automation
- Language: Java
- Homepage:
- Size: 212 MB
- Stars: 291
- Watchers: 34
- Forks: 105
- Open Issues: 11
Metadata Files:
- Readme: README.md
- Changelog: CHANGES.md
- License: LICENSE
hadoop-mini-clusters
====================
hadoop-mini-clusters provides an easy way to test Hadoop projects directly in your IDE, without the need for a full-blown development cluster or container orchestration. It allows the user to debug with the full power of the IDE. It provides a consistent API around the existing Mini Clusters across the ecosystem, eliminating the tedious task of learning the nuances of each project's approach.

Modules:
------------
The project structure changed with 0.1.0. Each mini cluster now resides in a module of its own. See the module names below.

Modules Included:
-----------------
* hadoop-mini-clusters-hdfs - Mini HDFS Cluster
* hadoop-mini-clusters-yarn - Mini YARN Cluster (no MR)
* hadoop-mini-clusters-mapreduce - Mini MapReduce Cluster
* hadoop-mini-clusters-hbase - Mini HBase Cluster
* hadoop-mini-clusters-zookeeper - Curator based Local Cluster
* hadoop-mini-clusters-hiveserver2 - Local HiveServer2 instance
* hadoop-mini-clusters-hivemetastore - Derby backed HiveMetaStore
* hadoop-mini-clusters-storm - Storm LocalCluster
* hadoop-mini-clusters-kafka - Local Kafka Broker
* hadoop-mini-clusters-oozie - Local Oozie Server - Thanks again Vladimir
* hadoop-mini-clusters-mongodb - I know... not Hadoop
* hadoop-mini-clusters-activemq - Thanks Vladimir Zlatkin!
* hadoop-mini-clusters-hyperscaledb - For testing various databases
* hadoop-mini-clusters-knox - Local Knox Gateway
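* hadoop-mini-clusters-kdc - Local Key Distribution Center (KDC)

Tests:
------
Tests are included to show how to configure and use each of the mini clusters. See the *IntegrationTest classes.

Using:
------
* Maven Central - latest release

```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters</artifactId>
    <version>0.1.16</version>
</dependency>

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-common</artifactId>
    <version>0.1.16</version>
</dependency>
```
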
Profile Support:
----------------
Multiple versions of HDP are available. The current list is:

* HDP 2.6.5.0 (default)
* HDP 2.6.3.0
* HDP 2.6.2.0
* HDP 2.6.1.0
* HDP 2.6.0.3
* HDP 2.5.3.0
* HDP 2.5.0.0
* HDP 2.4.2.0
* HDP 2.4.0.0
* HDP 2.3.4.0
* HDP 2.3.2.0
* HDP 2.3.0.0

To use a different profile, add the profile name to your Maven build:
```
mvn test -P2.3.0.0
```

Note that backwards compatibility is not guaranteed.

Examples:
---------

### HDFS Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-hdfs</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
HdfsLocalCluster hdfsLocalCluster = new HdfsLocalCluster.Builder()
    .setHdfsNamenodePort(12345)
    .setHdfsNamenodeHttpPort(12341)
    .setHdfsTempDir("embedded_hdfs")
    .setHdfsNumDatanodes(1)
    .setHdfsEnablePermissions(false)
    .setHdfsFormat(true)
    .setHdfsEnableRunningUserAsProxyUser(true)
    .setHdfsConfig(new Configuration())
    .build();

hdfsLocalCluster.start();
```

### YARN Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-yarn</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
YarnLocalCluster yarnLocalCluster = new YarnLocalCluster.Builder()
    .setNumNodeManagers(1)
    .setNumLocalDirs(1)
    .setNumLogDirs(1)
    .setResourceManagerAddress("localhost:37001")
    .setResourceManagerHostname("localhost")
    .setResourceManagerSchedulerAddress("localhost:37002")
    .setResourceManagerResourceTrackerAddress("localhost:37003")
    .setResourceManagerWebappAddress("localhost:37004")
    .setUseInJvmContainerExecutor(false)
    .setConfig(new Configuration())
    .build();

yarnLocalCluster.start();
```

### MapReduce Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-mapreduce</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
MRLocalCluster mrLocalCluster = new MRLocalCluster.Builder()
    .setNumNodeManagers(1)
    .setJobHistoryAddress("localhost:37005")
    .setResourceManagerAddress("localhost:37001")
    .setResourceManagerHostname("localhost")
    .setResourceManagerSchedulerAddress("localhost:37002")
    .setResourceManagerResourceTrackerAddress("localhost:37003")
    .setResourceManagerWebappAddress("localhost:37004")
    .setUseInJvmContainerExecutor(false)
    .setConfig(new Configuration())
    .build();

mrLocalCluster.start();
```

### HBase Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-hbase</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
HbaseLocalCluster hbaseLocalCluster = new HbaseLocalCluster.Builder()
    .setHbaseMasterPort(25111)
    .setHbaseMasterInfoPort(-1)
    .setNumRegionServers(1)
    .setHbaseRootDir("embedded_hbase")
    .setZookeeperPort(12345)
    .setZookeeperConnectionString("localhost:12345")
    .setZookeeperZnodeParent("/hbase-unsecure")
    .setHbaseWalReplicationEnabled(false)
    .setHbaseConfiguration(new Configuration())
    .activeRestGateway()
        .setHbaseRestHost("localhost")
        .setHbaseRestPort(28000)
        .setHbaseRestReadOnly(false)
        .setHbaseRestThreadMax(100)
        .setHbaseRestThreadMin(2)
        .build() // ends the REST gateway settings
    .build();

hbaseLocalCluster.start();
```

### Zookeeper Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-zookeeper</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
ZookeeperLocalCluster zookeeperLocalCluster = new ZookeeperLocalCluster.Builder()
    .setPort(12345)
    .setTempDir("embedded_zookeeper")
    .setZookeeperConnectionString("localhost:12345")
    .setMaxClientCnxns(60)
    .setElectionPort(20001)
    .setQuorumPort(20002)
    .setDeleteDataDirectoryOnClose(false)
    .setServerId(1)
    .setTickTime(2000)
    .build();

zookeeperLocalCluster.start();
```

### HiveServer2 Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-hiveserver2</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
HiveLocalServer2 hiveLocalServer2 = new HiveLocalServer2.Builder()
    .setHiveServer2Hostname("localhost")
    .setHiveServer2Port(12348)
    .setHiveMetastoreHostname("localhost")
    .setHiveMetastorePort(12347)
    .setHiveMetastoreDerbyDbDir("metastore_db")
    .setHiveScratchDir("hive_scratch_dir")
    .setHiveWarehouseDir("warehouse_dir")
    .setHiveConf(new HiveConf())
    .setZookeeperConnectionString("localhost:12345")
    .build();

hiveLocalServer2.start();
```

### HiveMetastore Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-hivemetastore</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
HiveLocalMetaStore hiveLocalMetaStore = new HiveLocalMetaStore.Builder()
    .setHiveMetastoreHostname("localhost")
    .setHiveMetastorePort(12347)
    .setHiveMetastoreDerbyDbDir("metastore_db")
    .setHiveScratchDir("hive_scratch_dir")
    .setHiveWarehouseDir("warehouse_dir")
    .setHiveConf(new HiveConf())
    .build();

hiveLocalMetaStore.start();
```

### Storm Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-storm</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
StormLocalCluster stormLocalCluster = new StormLocalCluster.Builder()
    .setZookeeperHost("localhost")
    .setZookeeperPort(12345)
    .setEnableDebug(true)
    .setNumWorkers(1)
    .setStormConfig(new Config())
    .build();

stormLocalCluster.start();
```

### Kafka Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-kafka</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
KafkaLocalBroker kafkaLocalBroker = new KafkaLocalBroker.Builder()
    .setKafkaHostname("localhost")
    .setKafkaPort(11111)
    .setKafkaBrokerId(0)
    .setKafkaProperties(new Properties())
    .setKafkaTempDir("embedded_kafka")
    .setZookeeperConnectionString("localhost:12345")
    .build();

kafkaLocalBroker.start();
```

### Oozie Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-oozie</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
OozieLocalServer oozieLocalServer = new OozieLocalServer.Builder()
    .setOozieTestDir("embedded_oozie")
    .setOozieHomeDir("oozie_home")
    .setOozieUsername(System.getProperty("user.name"))
    .setOozieGroupname("testgroup")
    .setOozieYarnResourceManagerAddress("localhost")
    .setOozieHdfsDefaultFs("hdfs://localhost:8020/")
    .setOozieConf(new Configuration())
    .setOozieHdfsShareLibDir("/tmp/oozie_share_lib")
    .setOozieShareLibCreate(Boolean.TRUE)
    .setOozieLocalShareLibCacheDir("share_lib_cache")
    .setOoziePurgeLocalShareLibCache(Boolean.FALSE)
    .setOozieShareLibFrameworks(
        Lists.newArrayList(Framework.MAPREDUCE_STREAMING, Framework.OOZIE))
    .build();

// hdfsLocalCluster is the running HDFS mini cluster from the HDFS example
OozieShareLibUtil oozieShareLibUtil = new OozieShareLibUtil(
    oozieLocalServer.getOozieHdfsShareLibDir(),
    oozieLocalServer.getOozieShareLibCreate(),
    oozieLocalServer.getOozieLocalShareLibCacheDir(),
    oozieLocalServer.getOoziePurgeLocalShareLibCache(),
    hdfsLocalCluster.getHdfsFileSystemHandle(),
    oozieLocalServer.getOozieShareLibFrameworks());
oozieShareLibUtil.createShareLib();

oozieLocalServer.start();
```

### MongoDB Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-mongodb</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
MongodbLocalServer mongodbLocalServer = new MongodbLocalServer.Builder()
    .setIp("127.0.0.1")
    .setPort(11112)
    .build();

mongodbLocalServer.start();
```

### ActiveMQ Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-activemq</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
ActivemqLocalBroker amq = new ActivemqLocalBroker.Builder()
    .setHostName("localhost")
    .setPort(11113)
    .setQueueName("defaultQueue")
    .setStoreDir("activemq-data")
    .setUriPrefix("vm://")
    .setUriPostfix("?create=false")
    .build();

amq.start();
```

### HyperSQL DB Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-hyperscaledb</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
HsqldbLocalServer hsqldbLocalServer = new HsqldbLocalServer.Builder()
    .setHsqldbHostName("127.0.0.1")
    .setHsqldbPort("44111")
    .setHsqldbTempDir("embedded_hsqldb")
    .setHsqldbDatabaseName("testdb")
    .setHsqldbCompatibilityMode("mysql")
    .setHsqldbJdbcDriver("org.hsqldb.jdbc.JDBCDriver")
    .setHsqldbJdbcConnectionStringPrefix("jdbc:hsqldb:hsql://")
    .build();

hsqldbLocalServer.start();
```

### Knox Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-knox</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
KnoxLocalCluster knoxCluster = new KnoxLocalCluster.Builder()
    .setPort(8888)
    .setPath("gateway")
    .setHomeDir("embedded_knox")
    .setCluster("mycluster")
    .setTopology(XMLDoc.newDocument(true)
        .addRoot("topology")
            .addTag("gateway")
                .addTag("provider")
                    .addTag("role").addText("authentication")
                    .addTag("enabled").addText("false")
                    .gotoParent()
                .addTag("provider")
                    .addTag("role").addText("identity-assertion")
                    .addTag("enabled").addText("false")
                    .gotoParent()
                .gotoParent()
            .addTag("service")
                .addTag("role").addText("NAMENODE")
                .addTag("url").addText("hdfs://localhost:8020")
                .gotoParent()
            .addTag("service")
                .addTag("role").addText("WEBHDFS")
                .addTag("url").addText("http://localhost:50070/webhdfs")
        .gotoRoot().toString())
    .build();

knoxCluster.start();
```

### KDC Example
```XML
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-kdc</artifactId>
    <version>0.1.16</version>
</dependency>
```
```Java
KdcLocalCluster kdcLocalCluster = new KdcLocalCluster.Builder()
    .setPort(34340)
    .setHost("127.0.0.1")
    .setBaseDir("embedded_kdc")
    .setOrgDomain("ORG")
    .setOrgName("ACME")
    .setPrincipals("hdfs,hbase,yarn,oozie,oozie_user,zookeeper,storm,mapreduce,HTTP".split(","))
    .setKrbInstance("127.0.0.1")
    .setInstance("DefaultKrbServer")
    .setTransport("TCP")
    .setMaxTicketLifetime(86400000)
    .setMaxRenewableLifetime(604800000)
    .setDebug(false)
    .build();

kdcLocalCluster.start();
```

To see how to integrate the KDC with HDFS, Zookeeper or HBase, look at the tests under hadoop-mini-clusters-kdc/src/test/java/com/github/sakserv/minicluster/impl.

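
The examples in this section hard-code ports such as 12345 and 37001. When several test runs share a machine, those ports may already be taken. The following is a minimal sketch, not part of hadoop-mini-clusters, of asking the operating system for a free port using only `java.net.ServerSocket`. The class name `FreePorts` and the method `findFreePort` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

public class FreePorts {

    // Bind to port 0 so the OS picks an unused ephemeral port, read the
    // chosen port number, then release the socket again.
    // Caveat: another process could take the port between this call and
    // the moment a mini cluster binds to it.
    public static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int zookeeperPort = findFreePort();
        // Assumption: values like these would be handed to builder setters
        // such as setPort(...) and setZookeeperConnectionString(...).
        String connectionString = "localhost:" + zookeeperPort;
        System.out.println(connectionString);
    }
}
```

Because the port is released before the cluster starts, this reduces collisions rather than ruling them out; a test that must be fully robust would also need to retry on a bind failure.
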
Modifying Properties
--------------------
To change the defaults used to construct the mini clusters, modify src/main/java/resources/default.properties as needed.

IntelliJ Testing
----------------
If you want to run the full test suite from IntelliJ, make sure Fork Mode is set to method (Run -> Edit Configurations -> fork mode).

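
Relating back to the default.properties file mentioned under Modifying Properties: the sketch below shows one way a test could read port settings from properties text and fall back to a default when a key is missing. It uses only `java.util.Properties`. The class `ClusterDefaults`, its methods, and the keys `example.zookeeper.port` and `example.kafka.port` are hypothetical and are not taken from the project's actual properties file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class ClusterDefaults {

    // Parse key=value lines in the standard java.util.Properties format.
    public static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Resolution order: JVM system property (-Dkey=value), then the
    // parsed properties, then the supplied fallback.
    public static int intValue(Properties props, String key, int fallback) {
        String raw = System.getProperty(key, props.getProperty(key));
        return raw == null ? fallback : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        // Hypothetical keys, for illustration only.
        Properties props = parse("example.zookeeper.port=12345\n");
        System.out.println(intValue(props, "example.zookeeper.port", 2181));
        System.out.println(intValue(props, "example.kafka.port", 9092));
    }
}
```

Letting a system property take precedence means a single test run can override a value from the command line without editing the properties file.
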
InJvmContainerExecutor
----------------------
YarnLocalCluster now supports Oleg Z's InJvmContainerExecutor. See [Oleg Z's GitHub](https://github.com/hortonworks/mini-dev-cluster/wiki/Core-Features) for more.