{"id":16532835,"url":"https://github.com/mjstealey/hadoop","last_synced_at":"2025-10-28T11:31:30.986Z","repository":{"id":93420400,"uuid":"121538635","full_name":"mjstealey/hadoop","owner":"mjstealey","description":"Apache Hadoop - Docker distribution based on CentOS 7 and Oracle Java 8","archived":false,"fork":false,"pushed_at":"2018-02-20T19:46:33.000Z","size":26,"stargazers_count":11,"open_issues_count":1,"forks_count":3,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-01T14:51:12.650Z","etag":null,"topics":["centos7","docker","hadoop","java8"],"latest_commit_sha":null,"homepage":null,"language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mjstealey.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-02-14T17:18:26.000Z","updated_at":"2024-09-21T19:33:23.000Z","dependencies_parsed_at":"2023-03-13T17:18:27.155Z","dependency_job_id":null,"html_url":"https://github.com/mjstealey/hadoop","commit_stats":{"total_commits":6,"total_committers":1,"mean_commits":6.0,"dds":0.0,"last_synced_commit":"37b0239dfe1ba7a48e367553866b2fef7fedf15f"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mjstealey%2Fhadoop","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mjstealey%2Fhadoop/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mjstealey%2Fhadoop/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mjstealey%2Fhadoop/manifes
ts","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mjstealey","download_url":"https://codeload.github.com/mjstealey/hadoop/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":238645729,"owners_count":19506902,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["centos7","docker","hadoop","java8"],"created_at":"2024-10-11T18:13:32.192Z","updated_at":"2025-10-28T11:31:25.716Z","avatar_url":"https://github.com/mjstealey.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Apache Hadoop in Docker\n\nThis work has been inspired by:\n\n- tecadmin.net: [Setup Hadoop cluster on CentOS](https://tecadmin.net/setup-hadoop-single-node-cluster-on-centos-redhat/)\n- Oracle Java 8: [binarybabel/docker-jdk](https://github.com/binarybabel/docker-jdk/blob/master/src/centos.Dockerfile)\n- CentOS 7 base image: [krallin/tini-images](https://github.com/krallin/tini-images)\n- ExoGENI Recipes: [RENCI-NRIG/exogeni-recipes/hadoop](https://github.com/RENCI-NRIG/exogeni-recipes/tree/master/hadoop/hadoop-2)\n\n### What Is Apache Hadoop?\n\nThe Apache Hadoop project develops open-source software for reliable, scalable, distributed computing.\n\nThe Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. 
Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.\n\nSee [official documentation](http://hadoop.apache.org) for more information.\n\n## How to use this image\n\n### Build locally\n\n\n```\n$ docker build -t renci/hadoop:2.9.0 ./2.9.0/\n  ...\n$ docker images\nREPOSITORY            TAG                 IMAGE ID            CREATED             SIZE\nrenci/hadoop          2.9.0               4a4de8ed48b2        3 minutes ago       1.92GB\n...\n```\n\nExample `docker-compose.yml` file included that builds from local repository and deploys a single node cluster based on [[1](https://tecadmin.net/setup-hadoop-single-node-cluster-on-centos-redhat/)].\n\n```\n$ docker-compose build\n  ...\n$ docker-compose up -d\n  ...\n$ docker-compose ps\n Name               Command               State                                             Ports\n--------------------------------------------------------------------------------------------------------------------------------------------\nhadoop   /usr/local/bin/tini -- /do ...   
Up      22/tcp, 0.0.0.0:50070-\u003e50070/tcp, 0.0.0.0:50075-\u003e50075/tcp, 0.0.0.0:50090-\u003e50090/tcp,\n                                                  0.0.0.0:8042-\u003e8042/tcp, 0.0.0.0:8088-\u003e8088/tcp\n```\n\n- Port mappings from above:\n\n\t```\n\tports:\n\t  - '8042:8042'    # NodeManager web ui\n\t  - '8088:8088'    # ResourceManager web ui\n\t  - '50070:50070'  # NameNode web ui \n\t  - '50075:50075'  # DataNode web ui\n\t  - '50090:50090'  # Secondary NameNode web ui\n\t```\n\n### From Docker Hub\n\nAutomated builds are generated at: [https://hub.docker.com/u/renci](https://hub.docker.com/u/renci/dashboard/) and can be pulled as follows.\n\n```\n$ docker pull renci/hadoop:2.9.0\n```\n\n## Example: Five node cluster\n\nUse the provided [`5-node-cluster.yml`](5-node-cluster.yml) file to stand up a five node Hadoop cluster that includes a `namenode`, `resourcemanager` and three workers (`worker1`, `worker2` and `worker3`).\n\nHadoop docker network and port mappings (specific network values subject to change based on system):\n\n\u003cimg width=\"80%\" alt=\"Hadoop docker network\" src=\"https://user-images.githubusercontent.com/5332509/36402998-16456864-15b0-11e8-823e-807e434ebab8.png\"\u003e\n\nThe nodes will use the definitions found in the [site-files](site-files) directory to configure the cluster. 
These files can be modified as needed to configure your cluster at runtime.\n\nA docker volume named `hadoop-public` is also created to allow the nodes to exchange SSH key information between themselves on startup.\n\n```yaml\nversion: '3.1'\n\nservices:\n  namenode:\n    image: renci/hadoop:2.9.0\n    container_name: namenode\n    volumes:\n      - hadoop-public:/home/hadoop/public\n      - ./site-files:/site-files\n    restart: always\n    hostname: namenode\n    networks:\n      - hadoop\n    ports:\n      - '50070:50070'\n    environment:\n      IS_NODE_MANAGER: 'false'\n      IS_NAME_NODE: 'true'\n      IS_SECONDARY_NAME_NODE: 'false'\n      IS_DATA_NODE: 'false'\n      IS_RESOURCE_MANAGER: 'false'\n      CLUSTER_NODES: namenode resourcemanager worker1 worker2 worker3\n\n  resourcemanager:\n    image: renci/hadoop:2.9.0\n    depends_on:\n      - namenode\n    container_name: resourcemanager\n    volumes:\n      - hadoop-public:/home/hadoop/public\n      - ./site-files:/site-files\n    restart: always\n    hostname: resourcemanager\n    networks:\n      - hadoop\n    ports:\n      - '8088:8088'\n    environment:\n      IS_NODE_MANAGER: 'false'\n      IS_NAME_NODE: 'false'\n      IS_SECONDARY_NAME_NODE: 'false'\n      IS_DATA_NODE: 'false'\n      IS_RESOURCE_MANAGER: 'true'\n      CLUSTER_NODES: namenode resourcemanager worker1 worker2 worker3\n\n  worker1:\n    image: renci/hadoop:2.9.0\n    depends_on:\n      - namenode\n    container_name: worker1\n    volumes:\n      - hadoop-public:/home/hadoop/public\n      - ./site-files:/site-files\n    restart: always\n    hostname: worker1\n    networks:\n      - hadoop\n    ports:\n      - '8042:8042'\n      - '50075:50075'\n    environment:\n      IS_NODE_MANAGER: 'true'\n      IS_NAME_NODE: 'false'\n      IS_SECONDARY_NAME_NODE: 'false'\n      IS_DATA_NODE: 'true'\n      IS_RESOURCE_MANAGER: 'false'\n      CLUSTER_NODES: namenode resourcemanager worker1 worker2 worker3\n\n  worker2:\n    image: 
renci/hadoop:2.9.0\n    depends_on:\n      - namenode\n    container_name: worker2\n    volumes:\n      - hadoop-public:/home/hadoop/public\n      - ./site-files:/site-files\n    restart: always\n    hostname: worker2\n    networks:\n      - hadoop\n    ports:\n      - '8043:8042'\n      - '50076:50075'\n    environment:\n      IS_NODE_MANAGER: 'true'\n      IS_NAME_NODE: 'false'\n      IS_SECONDARY_NAME_NODE: 'false'\n      IS_DATA_NODE: 'true'\n      IS_RESOURCE_MANAGER: 'false'\n      CLUSTER_NODES: namenode resourcemanager worker1 worker2 worker3\n\n  worker3:\n    image: renci/hadoop:2.9.0\n    depends_on:\n      - namenode\n    container_name: worker3\n    volumes:\n      - hadoop-public:/home/hadoop/public\n      - ./site-files:/site-files\n    restart: always\n    hostname: worker3\n    networks:\n      - hadoop\n    ports:\n      - '8044:8042'\n      - '50077:50075'\n    environment:\n      IS_NODE_MANAGER: 'true'\n      IS_NAME_NODE: 'false'\n      IS_SECONDARY_NAME_NODE: 'false'\n      IS_DATA_NODE: 'true'\n      IS_RESOURCE_MANAGER: 'false'\n      CLUSTER_NODES: namenode resourcemanager worker1 worker2 worker3\n\nvolumes:\n  hadoop-public:\n\nnetworks:\n  hadoop:\n```\n\n### Start the cluster \n\nUsing `docker-compose`\n\n```\n$ docker-compose -f 5-node-cluster.yml up -d\n```\n\nAfter a few moments all containers will be running and should display in a `ps` call.\n\n```\n$ docker-compose -f 5-node-cluster.yml ps\n     Name                    Command               State                            Ports\n-------------------------------------------------------------------------------------------------------------------\nnamenode          /usr/local/bin/tini -- /do ...   Up      22/tcp, 0.0.0.0:50070-\u003e50070/tcp\nresourcemanager   /usr/local/bin/tini -- /do ...   Up      22/tcp, 0.0.0.0:8088-\u003e8088/tcp\nworker1           /usr/local/bin/tini -- /do ...   
Up      22/tcp, 0.0.0.0:50075-\u003e50075/tcp, 0.0.0.0:8042-\u003e8042/tcp\nworker2           /usr/local/bin/tini -- /do ...   Up      22/tcp, 0.0.0.0:50076-\u003e50075/tcp, 0.0.0.0:8043-\u003e8042/tcp\nworker3           /usr/local/bin/tini -- /do ...   Up      22/tcp, 0.0.0.0:50077-\u003e50075/tcp, 0.0.0.0:8044-\u003e8042/tcp\n```\n\nSince the container ports are mapped to the host, the various web UIs can be viewed in a local browser.\n\n**namenode container**: NameNode Web UI on port 50070\n\nNameNode: [http://localhost:50070/dfshealth.html#tab-datanode](http://localhost:50070/dfshealth.html#tab-datanode)\n\n\u003cimg width=\"50%\" alt=\"NameNode\" src=\"https://user-images.githubusercontent.com/5332509/36226272-5546e344-119b-11e8-9076-ca65ae2c0c55.png\"\u003e\n\n**resource manager container**: ResourceManager Web UI on port 8088\n\nResourceManager: [http://localhost:8088/cluster](http://localhost:8088/cluster)\n\n\u003cimg width=\"50%\" alt=\"ResourceManager\" src=\"https://user-images.githubusercontent.com/5332509/36403411-c540a2e6-15b2-11e8-9857-bf5d605d52c7.png\"\u003e\n\n\n**worker1, worker2 and worker3 containers**: DataNode Web UI on ports 50075, 50076 and 50077, NodeManager Web UI on ports 8042, 8043 and 8044.\n\nDataNode (worker1): [http://localhost:50075/datanode.html](http://localhost:50075/datanode.html)\n\n\u003cimg width=\"50%\" alt=\"Worker1 DataManager\" src=\"https://user-images.githubusercontent.com/5332509/36226302-6c3f2fac-119b-11e8-8d90-824c8cd39490.png\"\u003e\n\nNodeManager (worker1): [http://localhost:8042/node](http://localhost:8042/node)\n\n\u003cimg width=\"50%\" alt=\"NodeManager\" src=\"https://user-images.githubusercontent.com/5332509/36226239-434059a0-119b-11e8-8c08-d33dd66bfdce.png\"\u003e\n\nWorker2 DataNode: [http://localhost:50076/datanode.html](http://localhost:50076/datanode.html)\n\n\u003cimg width=\"50%\" alt=\"Worker2 DataManager\" 
src=\"https://user-images.githubusercontent.com/5332509/36226329-8322fa3c-119b-11e8-8f96-4111eebe0c0e.png\"\u003e\n\nWorker3 DataNode: [http://localhost:50077/datanode.html](http://localhost:50077/datanode.html)\n\n\u003cimg width=\"50%\" alt=\"Worker3 DataManager\" src=\"https://user-images.githubusercontent.com/5332509/36226346-8fd9fa0a-119b-11e8-9a08-0133c36ed3ee.png\"\u003e\n\n### Stop the cluster\n\nThe cluster can be stopped by issuing a `stop` call.\n\n```\n$ docker-compose -f 5-node-cluster.yml stop\nStopping worker2         ... done\nStopping resourcemanager ... done\nStopping worker1         ... done\nStopping worker3         ... done\nStopping namenode        ... done\n```\n\n### Restart the cluster\n\nSo long as the container definitions have not been removed, the cluster can be restarted by using a `start` call.\n\n```\n$ docker-compose -f 5-node-cluster.yml start\nStarting namenode        ... done\nStarting worker1         ... done\nStarting worker3         ... done\nStarting worker2         ... done\nStarting resourcemanager ... done\n```\n\nAfter a few moments all cluster activity should be back to normal.\n\n### Remove the cluster\n\nThe entire cluster can be removed by first stopping it, and then removing the containers from the local machine.\n\n```\n$ docker-compose -f 5-node-cluster.yml stop \u0026\u0026 docker-compose -f 5-node-cluster.yml rm -f\nStopping worker2         ... done\nStopping resourcemanager ... done\nStopping worker1         ... done\nStopping worker3         ... done\nStopping namenode        ... done\nGoing to remove worker2, resourcemanager, worker1, worker3, namenode\nRemoving worker2         ... done\nRemoving resourcemanager ... done\nRemoving worker1         ... done\nRemoving worker3         ... done\nRemoving namenode        ... 
done\n```\n\n## Example: Map Reduce\n\n**NOTE**: Assumes the existence of the five node cluster from the previous example.\n\nA simple map reduce example has been provided in the [mapreduce-example.sh](mapreduce-example.sh) script.\n\nThe script is meant to be run from the host machine and uses `docker exec` to relay commands to the docker `namenode` container as the `hadoop` user.\n\n\n```\n$ ./mapreduce-example.sh\nINFO: remove input/output HDFS directories if they already exist\nrm: `input': No such file or directory\nrm: `output': No such file or directory\nINFO: hdfs dfs -mkdir -p /user/hadoop/input\nINFO: hdfs dfs -put hadoop/README.txt /user/hadoop/input/\nINFO: hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar wordcount input output\n18/02/17 19:42:38 INFO client.RMProxy: Connecting to ResourceManager at resourcemanager/172.19.0.5:8032\n18/02/17 19:42:39 INFO input.FileInputFormat: Total input files to process : 1\n18/02/17 19:42:39 INFO mapreduce.JobSubmitter: number of splits:1\n18/02/17 19:42:39 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. 
Instead, use yarn.system-metrics-publisher.enabled\n18/02/17 19:42:39 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1518896527275_0001\n18/02/17 19:42:40 INFO impl.YarnClientImpl: Submitted application application_1518896527275_0001\n18/02/17 19:42:40 INFO mapreduce.Job: The url to track the job: http://resourcemanager:8088/proxy/application_1518896527275_0001/\n18/02/17 19:42:40 INFO mapreduce.Job: Running job: job_1518896527275_0001\n18/02/17 19:42:51 INFO mapreduce.Job: Job job_1518896527275_0001 running in uber mode : false\n18/02/17 19:42:51 INFO mapreduce.Job:  map 0% reduce 0%\n18/02/17 19:42:58 INFO mapreduce.Job:  map 100% reduce 0%\n18/02/17 19:43:05 INFO mapreduce.Job:  map 100% reduce 100%\n18/02/17 19:43:05 INFO mapreduce.Job: Job job_1518896527275_0001 completed successfully\n18/02/17 19:43:05 INFO mapreduce.Job: Counters: 49\n\tFile System Counters\n\t\tFILE: Number of bytes read=1836\n\t\tFILE: Number of bytes written=407057\n\t\tFILE: Number of read operations=0\n\t\tFILE: Number of large read operations=0\n\t\tFILE: Number of write operations=0\n\t\tHDFS: Number of bytes read=1480\n\t\tHDFS: Number of bytes written=1306\n\t\tHDFS: Number of read operations=6\n\t\tHDFS: Number of large read operations=0\n\t\tHDFS: Number of write operations=2\n\tJob Counters\n\t\tLaunched map tasks=1\n\t\tLaunched reduce tasks=1\n\t\tRack-local map tasks=1\n\t\tTotal time spent by all maps in occupied slots (ms)=3851\n\t\tTotal time spent by all reduces in occupied slots (ms)=3718\n\t\tTotal time spent by all map tasks (ms)=3851\n\t\tTotal time spent by all reduce tasks (ms)=3718\n\t\tTotal vcore-milliseconds taken by all map tasks=3851\n\t\tTotal vcore-milliseconds taken by all reduce tasks=3718\n\t\tTotal megabyte-milliseconds taken by all map tasks=3943424\n\t\tTotal megabyte-milliseconds taken by all reduce tasks=3807232\n\tMap-Reduce Framework\n\t\tMap input records=31\n\t\tMap output records=179\n\t\tMap output bytes=2055\n\t\tMap output 
materialized bytes=1836\n\t\tInput split bytes=114\n\t\tCombine input records=179\n\t\tCombine output records=131\n\t\tReduce input groups=131\n\t\tReduce shuffle bytes=1836\n\t\tReduce input records=131\n\t\tReduce output records=131\n\t\tSpilled Records=262\n\t\tShuffled Maps =1\n\t\tFailed Shuffles=0\n\t\tMerged Map outputs=1\n\t\tGC time elapsed (ms)=114\n\t\tCPU time spent (ms)=1330\n\t\tPhysical memory (bytes) snapshot=482201600\n\t\tVirtual memory (bytes) snapshot=3950104576\n\t\tTotal committed heap usage (bytes)=281018368\n\tShuffle Errors\n\t\tBAD_ID=0\n\t\tCONNECTION=0\n\t\tIO_ERROR=0\n\t\tWRONG_LENGTH=0\n\t\tWRONG_MAP=0\n\t\tWRONG_REDUCE=0\n\tFile Input Format Counters\n\t\tBytes Read=1366\n\tFile Output Format Counters\n\t\tBytes Written=1306\nINFO: hdfs dfs -ls /user/hadoop/output\nFound 2 items\n-rw-r--r--   2 hadoop supergroup          0 2018-02-17 19:43 /user/hadoop/output/_SUCCESS\n-rw-r--r--   2 hadoop supergroup       1306 2018-02-17 19:43 /user/hadoop/output/part-r-00000\nINFO: cat hadoop/README.txt\nFor the latest information about Hadoop, please visit our website at:\n\n   http://hadoop.apache.org/core/\n\nand our wiki, at:\n\n   http://wiki.apache.org/hadoop/\n\nThis distribution includes cryptographic software.  The country in\nwhich you currently reside may have restrictions on the import,\npossession, use, and/or re-export to another country, of\nencryption software.  BEFORE using any encryption software, please\ncheck your country's laws, regulations and policies concerning the\nimport, possession, or use, and re-export of encryption software, to\nsee if this is permitted.  See \u003chttp://www.wassenaar.org/\u003e for more\ninformation.\n\nThe U.S. Government Department of Commerce, Bureau of Industry and\nSecurity (BIS), has classified this software as Export Commodity\nControl Number (ECCN) 5D002.C.1, which includes information security\nsoftware using or performing cryptographic functions with asymmetric\nalgorithms.  
The form and manner of this Apache Software Foundation\ndistribution makes it eligible for export under the License Exception\nENC Technology Software Unrestricted (TSU) exception (see the BIS\nExport Administration Regulations, Section 740.13) for both object\ncode and source code.\n\nThe following provides more details on the included cryptographic\nsoftware:\n  Hadoop Core uses the SSL libraries from the Jetty project written\nby mortbay.org.\nINFO: hdfs dfs -cat /user/hadoop/output/part-r-00000\n(BIS),\t1\n(ECCN)\t1\n(TSU)\t1\n(see\t1\n5D002.C.1,\t1\n740.13)\t1\n\u003chttp://www.wassenaar.org/\u003e\t1\nAdministration\t1\nApache\t1\nBEFORE\t1\nBIS\t1\nBureau\t1\nCommerce,\t1\nCommodity\t1\nControl\t1\nCore\t1\nDepartment\t1\nENC\t1\nException\t1\nExport\t2\nFor\t1\nFoundation\t1\nGovernment\t1\nHadoop\t1\nHadoop,\t1\nIndustry\t1\nJetty\t1\nLicense\t1\nNumber\t1\nRegulations,\t1\nSSL\t1\nSection\t1\nSecurity\t1\nSee\t1\nSoftware\t2\nTechnology\t1\nThe\t4\nThis\t1\nU.S.\t1\nUnrestricted\t1\nabout\t1\nalgorithms.\t1\nand\t6\nand/or\t1\nanother\t1\nany\t1\nas\t1\nasymmetric\t1\nat:\t2\nboth\t1\nby\t1\ncheck\t1\nclassified\t1\ncode\t1\ncode.\t1\nconcerning\t1\ncountry\t1\ncountry's\t1\ncountry,\t1\ncryptographic\t3\ncurrently\t1\ndetails\t1\ndistribution\t2\neligible\t1\nencryption\t3\nexception\t1\nexport\t1\nfollowing\t1\nfor\t3\nform\t1\nfrom\t1\nfunctions\t1\nhas\t1\nhave\t1\nhttp://hadoop.apache.org/core/\t1\nhttp://wiki.apache.org/hadoop/\t1\nif\t1\nimport,\t2\nin\t1\nincluded\t1\nincludes\t2\ninformation\t2\ninformation.\t1\nis\t1\nit\t1\nlatest\t1\nlaws,\t1\nlibraries\t1\nmakes\t1\nmanner\t1\nmay\t1\nmore\t2\nmortbay.org.\t1\nobject\t1\nof\t5\non\t2\nor\t2\nour\t2\nperforming\t1\npermitted.\t1\nplease\t2\npolicies\t1\npossession,\t2\nproject\t1\nprovides\t1\nre-export\t2\nregulations\t1\nreside\t1\nrestrictions\t1\nsecurity\t1\nsee\t1\nsoftware\t2\nsoftware,\t2\nsoftware.\t2\nsoftware:\t1\nsource\t1\nthe\t8\nthis\t3\nto\t2\nunder\t1\nuse,\t2\nuses\t1\nusing\t
2\nvisit\t1\nwebsite\t1\nwhich\t2\nwiki,\t1\nwith\t1\nwritten\t1\nyou\t1\nyour\t1\nHDFS directories at: http://localhost:50070/explorer.html#/user/hadoop\n```\n\nNameNode: [http://localhost:50070/explorer.html#/user/hadoop](http://localhost:50070/explorer.html#/user/hadoop)\n\n\u003cimg width=\"50%\" alt=\"MapReduce Example\" src=\"https://user-images.githubusercontent.com/5332509/36345032-c4957f5a-13f1-11e8-95a1-6fadb9988157.png\"\u003e\n\n### References\n\n1. [https://tecadmin.net/setup-hadoop-single-node-cluster-on-centos-redhat/](https://tecadmin.net/setup-hadoop-single-node-cluster-on-centos-redhat/)\n2. [https://github.com/RENCI-NRIG/exogeni-recipes/hadoop/hadoop-2/hadoop\\_exogeni\\_postboot.sh](https://github.com/RENCI-NRIG/exogeni-recipes/blob/master/hadoop/hadoop-2/hadoop_exogeni_postboot.sh)\n3. Hadoop configuration files\n\t- Common: [hadoop-common/core-default.xml](http://hadoop.apache.org/docs/r2.9.0/hadoop-project-dist/hadoop-common/core-default.xml)\n\t- HDFS: [hadoop-hdfs/hdfs-default.xml](http://hadoop.apache.org/docs/r2.9.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml)\n\t- MapReduce: [hadoop-mapreduce-client-core/mapred-default.xml](http://hadoop.apache.org/docs/r2.9.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml)\n\t- Yarn: [hadoop-yarn-common/yarn-default.xml](http://hadoop.apache.org/docs/r2.9.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml)\n\t- Deprecated Properties: [hadoop-common/DeprecatedProperties.html](http://hadoop.apache.org/docs/r2.9.0/hadoop-project-dist/hadoop-common/DeprecatedProperties.html)\n4. 
Example MapReduce: [https://tecadmin.net/hadoop-running-a-wordcount-mapreduce-example/](https://tecadmin.net/hadoop-running-a-wordcount-mapreduce-example/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmjstealey%2Fhadoop","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmjstealey%2Fhadoop","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmjstealey%2Fhadoop/lists"}