{"id":15297533,"url":"https://github.com/sakserv/hadoop-mini-clusters","last_synced_at":"2025-04-04T16:13:30.134Z","repository":{"id":25349024,"uuid":"28776576","full_name":"sakserv/hadoop-mini-clusters","owner":"sakserv","description":"hadoop-mini-clusters provides an easy way to test Hadoop projects directly in your IDE","archived":false,"fork":false,"pushed_at":"2023-01-02T22:05:14.000Z","size":222248,"stargazers_count":291,"open_issues_count":11,"forks_count":105,"subscribers_count":33,"default_branch":"master","last_synced_at":"2025-03-28T15:05:39.078Z","etag":null,"topics":["hadoop","hadoop-mini-clusters","ide","java","test-automation"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":"philidem/rest-options-parser","license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sakserv.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-01-04T14:36:24.000Z","updated_at":"2025-01-17T09:06:20.000Z","dependencies_parsed_at":"2022-08-24T02:30:53.856Z","dependency_job_id":null,"html_url":"https://github.com/sakserv/hadoop-mini-clusters","commit_stats":null,"previous_names":[],"tags_count":32,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sakserv%2Fhadoop-mini-clusters","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sakserv%2Fhadoop-mini-clusters/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sakserv%2Fhadoop-mini-clusters/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sakserv%2Fhadoop-mini-clusters/manifests","owner_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/owners/sakserv","download_url":"https://codeload.github.com/sakserv/hadoop-mini-clusters/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247208139,"owners_count":20901570,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hadoop","hadoop-mini-clusters","ide","java","test-automation"],"created_at":"2024-09-30T19:18:12.017Z","updated_at":"2025-04-04T16:13:30.115Z","avatar_url":"https://github.com/sakserv.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"hadoop-mini-clusters\n====================\nhadoop-mini-clusters provides an easy way to test Hadoop projects directly in your IDE, without the need for a full blown development cluster or container orchestration. It allows the user to debug with the full power of the IDE. 
It provides a consistent API around the existing Mini Clusters across the ecosystem, eliminating the tedious task of learning the nuances of each project's approach.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://travis-ci.org/sakserv/hadoop-mini-clusters.svg?branch=master\"/\u003e     \u003ca href='https://coveralls.io/github/sakserv/hadoop-mini-clusters?branch=master'\u003e\u003cimg src='https://coveralls.io/repos/sakserv/hadoop-mini-clusters/badge.svg?branch=master\u0026service=github' alt='Coverage Status' /\u003e\u003c/a\u003e     \u003cimg src=\"https://maven-badges.herokuapp.com/maven-central/com.github.sakserv/hadoop-mini-clusters/badge.svg\"/\u003e\n\u003c/p\u003e\n\nModules:\n------------\nThe project structure changed with 0.1.0. Each mini cluster now resides in a module of its own. See the module names below.\n\nModules Included:\n-----------------\n*   hadoop-mini-clusters-hdfs - Mini HDFS Cluster\n*   hadoop-mini-clusters-yarn - Mini YARN Cluster (no MR)\n*   hadoop-mini-clusters-mapreduce - Mini MapReduce Cluster\n*   hadoop-mini-clusters-hbase - Mini HBase Cluster\n*   hadoop-mini-clusters-zookeeper - Curator based Local Cluster\n*   hadoop-mini-clusters-hiveserver2 - Local HiveServer2 instance\n*   hadoop-mini-clusters-hivemetastore - Derby backed HiveMetaStore\n*   hadoop-mini-clusters-storm - Storm LocalCluster\n*   hadoop-mini-clusters-kafka - Local Kafka Broker\n*   hadoop-mini-clusters-oozie - Local Oozie Server - Thanks again Vladimir\n*   hadoop-mini-clusters-mongodb - I know... not Hadoop\n*   hadoop-mini-clusters-activemq - Thanks Vladimir Zlatkin!\n*   hadoop-mini-clusters-hyperscaledb - For testing various databases\n*   hadoop-mini-clusters-knox - Local Knox Gateway\n*   hadoop-mini-clusters-kdc - Local Key Distribution Center (KDC)\n\nTests:\n------\nTests are included to show how to configure and use each of the mini clusters. 
See the *IntegrationTest classes.\n\nUsing:\n------\n*  Maven Central - latest release\n\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-common\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\nProfile Support:\n----------------\nMultiple versions of HDP are available. The current list is:\n\n*   HDP 2.6.5.0 (default)\n*   HDP 2.6.3.0\n*   HDP 2.6.2.0\n*   HDP 2.6.1.0\n*   HDP 2.6.0.3\n*   HDP 2.5.3.0\n*   HDP 2.5.0.0\n*   HDP 2.4.2.0\n*   HDP 2.4.0.0\n*   HDP 2.3.4.0\n*   HDP 2.3.2.0\n*   HDP 2.3.0.0\n\nTo use a different profile, add the profile name to your Maven build:\n```\nmvn test -P2.3.0.0\n```\n\nNote that backwards compatibility is not guaranteed.\n\nExamples:\n---------\n\n### HDFS Example\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-hdfs\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nHdfsLocalCluster hdfsLocalCluster = new HdfsLocalCluster.Builder()\n    .setHdfsNamenodePort(12345)\n    .setHdfsNamenodeHttpPort(12341)\n    .setHdfsTempDir(\"embedded_hdfs\")\n    .setHdfsNumDatanodes(1)\n    .setHdfsEnablePermissions(false)\n    .setHdfsFormat(true)\n    .setHdfsEnableRunningUserAsProxyUser(true)\n    .setHdfsConfig(new Configuration())\n    .build();\n                \nhdfsLocalCluster.start();\n```\n\n### YARN Example\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-yarn\u003c/artifactId\u003e\n    
\u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nYarnLocalCluster yarnLocalCluster = new YarnLocalCluster.Builder()\n    .setNumNodeManagers(1)\n    .setNumLocalDirs(1)\n    .setNumLogDirs(1)\n    .setResourceManagerAddress(\"localhost:37001\")\n    .setResourceManagerHostname(\"localhost\")\n    .setResourceManagerSchedulerAddress(\"localhost:37002\")\n    .setResourceManagerResourceTrackerAddress(\"localhost:37003\")\n    .setResourceManagerWebappAddress(\"localhost:37004\")\n    .setUseInJvmContainerExecutor(false)\n    .setConfig(new Configuration())\n    .build();\n   \nyarnLocalCluster.start();\n```\n\n### MapReduce Example\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-mapreduce\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nMRLocalCluster mrLocalCluster = new MRLocalCluster.Builder()\n    .setNumNodeManagers(1)\n    .setJobHistoryAddress(\"localhost:37005\")\n    .setResourceManagerAddress(\"localhost:37001\")\n    .setResourceManagerHostname(\"localhost\")\n    .setResourceManagerSchedulerAddress(\"localhost:37002\")\n    .setResourceManagerResourceTrackerAddress(\"localhost:37003\")\n    .setResourceManagerWebappAddress(\"localhost:37004\")\n    .setUseInJvmContainerExecutor(false)\n    .setConfig(new Configuration())\n    .build();\n\nmrLocalCluster.start();\n```\n\n### HBase Example\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-hbase\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nHbaseLocalCluster hbaseLocalCluster = new HbaseLocalCluster.Builder()\n    .setHbaseMasterPort(25111)\n    .setHbaseMasterInfoPort(-1)\n    .setNumRegionServers(1)\n    
.setHbaseRootDir(\"embedded_hbase\")\n    .setZookeeperPort(12345)\n    .setZookeeperConnectionString(\"localhost:12345\")\n    .setZookeeperZnodeParent(\"/hbase-unsecure\")\n    .setHbaseWalReplicationEnabled(false)\n    .setHbaseConfiguration(new Configuration())\n    .activeRestGateway()\n        .setHbaseRestHost(\"localhost\")\n        .setHbaseRestPort(28000)\n        .setHbaseRestReadOnly(false)\n        .setHbaseRestThreadMax(100)\n        .setHbaseRestThreadMin(2)\n        .build()\n    .build();\n\nhbaseLocalCluster.start();\n```\n\n### Zookeeper Example\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-zookeeper\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nZookeeperLocalCluster zookeeperLocalCluster = new ZookeeperLocalCluster.Builder()\n    .setPort(12345)\n    .setTempDir(\"embedded_zookeeper\")\n    .setZookeeperConnectionString(\"localhost:12345\")\n    .setMaxClientCnxns(60)\n    .setElectionPort(20001)\n    .setQuorumPort(20002)\n    .setDeleteDataDirectoryOnClose(false)\n    .setServerId(1)\n    .setTickTime(2000)\n    .build();\n\nzookeeperLocalCluster.start();\n```\n\n### HiveServer2 Example\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-hiveserver2\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nHiveLocalServer2 hiveLocalServer2 = new HiveLocalServer2.Builder()\n    .setHiveServer2Hostname(\"localhost\")\n    .setHiveServer2Port(12348)\n    .setHiveMetastoreHostname(\"localhost\")\n    .setHiveMetastorePort(12347)\n    .setHiveMetastoreDerbyDbDir(\"metastore_db\")\n    .setHiveScratchDir(\"hive_scratch_dir\")\n    .setHiveWarehouseDir(\"warehouse_dir\")\n    .setHiveConf(new HiveConf())\n    
.setZookeeperConnectionString(\"localhost:12345\")\n    .build();\n\nhiveLocalServer2.start();\n```\n\n### HiveMetastore Example\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-hivemetastore\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nHiveLocalMetaStore hiveLocalMetaStore = new HiveLocalMetaStore.Builder()\n    .setHiveMetastoreHostname(\"localhost\")\n    .setHiveMetastorePort(12347)\n    .setHiveMetastoreDerbyDbDir(\"metastore_db\")\n    .setHiveScratchDir(\"hive_scratch_dir\")\n    .setHiveWarehouseDir(\"warehouse_dir\")\n    .setHiveConf(new HiveConf())\n    .build();\n\nhiveLocalMetaStore.start();\n```\n\n### Storm Example\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-storm\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nStormLocalCluster stormLocalCluster = new StormLocalCluster.Builder()\n    .setZookeeperHost(\"localhost\")\n    .setZookeeperPort(12345)\n    .setEnableDebug(true)\n    .setNumWorkers(1)\n    .setStormConfig(new Config())\n    .build();\n\nstormLocalCluster.start();\n```\n\n### Kafka Example\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-kafka\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nKafkaLocalBroker kafkaLocalBroker = new KafkaLocalBroker.Builder()\n    .setKafkaHostname(\"localhost\")\n    .setKafkaPort(11111)\n    .setKafkaBrokerId(0)\n    .setKafkaProperties(new Properties())\n    .setKafkaTempDir(\"embedded_kafka\")\n    .setZookeeperConnectionString(\"localhost:12345\")\n    .build();\n\nkafkaLocalBroker.start();\n```\n\n### Oozie 
Example\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-oozie\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nOozieLocalServer oozieLocalServer = new OozieLocalServer.Builder()\n    .setOozieTestDir(\"embedded_oozie\")\n    .setOozieHomeDir(\"oozie_home\")\n    .setOozieUsername(System.getProperty(\"user.name\"))\n    .setOozieGroupname(\"testgroup\")\n    .setOozieYarnResourceManagerAddress(\"localhost\")\n    .setOozieHdfsDefaultFs(\"hdfs://localhost:8020/\")\n    .setOozieConf(new Configuration())\n    .setOozieHdfsShareLibDir(\"/tmp/oozie_share_lib\")\n    .setOozieShareLibCreate(Boolean.TRUE)\n    .setOozieLocalShareLibCacheDir(\"share_lib_cache\")\n    .setOoziePurgeLocalShareLibCache(Boolean.FALSE)\n    .setOozieShareLibFrameworks(\n        Lists.newArrayList(Framework.MAPREDUCE_STREAMING, Framework.OOZIE))\n    .build();\n\nOozieShareLibUtil oozieShareLibUtil = new OozieShareLibUtil(\n    oozieLocalServer.getOozieHdfsShareLibDir(),\n    oozieLocalServer.getOozieShareLibCreate(), \n    oozieLocalServer.getOozieLocalShareLibCacheDir(),\n    oozieLocalServer.getOoziePurgeLocalShareLibCache(), \n    hdfsLocalCluster.getHdfsFileSystemHandle(),\n    oozieLocalServer.getOozieShareLibFrameworks());\noozieShareLibUtil.createShareLib();\n\noozieLocalServer.start();\n```\n\n### MongoDB Example\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-mongodb\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nMongodbLocalServer mongodbLocalServer = new MongodbLocalServer.Builder()\n    .setIp(\"127.0.0.1\")\n    .setPort(11112)\n    .build();\n\nmongodbLocalServer.start();\n```\n\n### ActiveMQ Example\n```XML\n\u003cdependency\u003e\n    
\u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-activemq\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nActivemqLocalBroker amq = new ActivemqLocalBroker.Builder()\n    .setHostName(\"localhost\")\n    .setPort(11113)\n    .setQueueName(\"defaultQueue\")\n    .setStoreDir(\"activemq-data\")\n    .setUriPrefix(\"vm://\")\n    .setUriPostfix(\"?create=false\")\n    .build();\n\namq.start();\n```\n\n### HyperSQL DB Example\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-hyperscaledb\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nHsqldbLocalServer hsqldbLocalServer = new HsqldbLocalServer.Builder()\n    .setHsqldbHostName(\"127.0.0.1\")\n    .setHsqldbPort(\"44111\")\n    .setHsqldbTempDir(\"embedded_hsqldb\")\n    .setHsqldbDatabaseName(\"testdb\")\n    .setHsqldbCompatibilityMode(\"mysql\")\n    .setHsqldbJdbcDriver(\"org.hsqldb.jdbc.JDBCDriver\")\n    .setHsqldbJdbcConnectionStringPrefix(\"jdbc:hsqldb:hsql://\")\n    .build();\n\nhsqldbLocalServer.start();\n```\n\n### Knox Example\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-knox\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nKnoxLocalCluster knoxCluster = new KnoxLocalCluster.Builder()\n    .setPort(8888)\n    .setPath(\"gateway\")\n    .setHomeDir(\"embedded_knox\")\n    .setCluster(\"mycluster\")\n    .setTopology(XMLDoc.newDocument(true)\n        .addRoot(\"topology\")\n            .addTag(\"gateway\")\n                .addTag(\"provider\")\n                    .addTag(\"role\").addText(\"authentication\")\n                    .addTag(\"enabled\").addText(\"false\")\n                    
.gotoParent()\n                .addTag(\"provider\")\n                    .addTag(\"role\").addText(\"identity-assertion\")\n                    .addTag(\"enabled\").addText(\"false\")\n                    .gotoParent()\n                .gotoParent()\n            .addTag(\"service\")\n                .addTag(\"role\").addText(\"NAMENODE\")\n                .addTag(\"url\").addText(\"hdfs://localhost:8020\")\n                .gotoParent()\n            .addTag(\"service\")\n                .addTag(\"role\").addText(\"WEBHDFS\")\n                .addTag(\"url\").addText(\"http://localhost:50070/webhdfs\")\n        .gotoRoot().toString())\n        .build();\n\nknoxCluster.start();\n```\n\n### KDC Example\n```XML\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.sakserv\u003c/groupId\u003e\n    \u003cartifactId\u003ehadoop-mini-clusters-kdc\u003c/artifactId\u003e\n    \u003cversion\u003e0.1.16\u003c/version\u003e\n\u003c/dependency\u003e\n```\n```Java\nKdcLocalCluster kdcLocalCluster = new KdcLocalCluster.Builder()\n        .setPort(34340)\n        .setHost(\"127.0.0.1\")\n        .setBaseDir(\"embedded_kdc\")\n        .setOrgDomain(\"ORG\")\n        .setOrgName(\"ACME\")\n        .setPrincipals(\"hdfs,hbase,yarn,oozie,oozie_user,zookeeper,storm,mapreduce,HTTP\".split(\",\"))\n        .setKrbInstance(\"127.0.0.1\")\n        .setInstance(\"DefaultKrbServer\")\n        .setTransport(\"TCP\")\n        .setMaxTicketLifetime(86400000)\n        .setMaxRenewableLifetime(604800000)\n        .setDebug(false)\n        .build();\nkdcLocalCluster.start();\n```\n\nTo see how to integrate KDC with HDFS, Zookeeper, or HBase, refer to the tests under hadoop-mini-clusters-kdc/src/test/java/com/github/sakserv/minicluster/impl.\n\nModifying Properties\n--------------------\nTo change the defaults used to construct the mini clusters, modify src/main/java/resources/default.properties as needed.\n\n\nIntelliJ Testing\n----------------\n\nTo run the full test suite from IntelliJ, 
make sure Fork Mode is set to method (Run -\u003e Edit Configurations -\u003e fork mode)\n\n\nInJvmContainerExecutor\n----------------------\nYarnLocalCluster now supports Oleg Z's InJvmContainerExecutor. See [Oleg Z's Github](https://github.com/hortonworks/mini-dev-cluster/wiki/Core-Features) for more.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsakserv%2Fhadoop-mini-clusters","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsakserv%2Fhadoop-mini-clusters","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsakserv%2Fhadoop-mini-clusters/lists"}