Install Apache Hadoop on Windows


Last modified: July 11, 2017

Installing Hadoop on Windows 7


Hadoop runs in three different modes. They are:

local/standalone mode

This is the default configuration for Hadoop out of the box. In standalone mode, Hadoop runs as a single process on your machine.

pseudo-distributed mode

In this mode, Hadoop runs each daemon as a separate Java process. This mimics a distributed implementation while running on a single machine.

fully distributed mode

This is a production-level implementation that runs across two or more machines.

For this tutorial, we will be implementing Hadoop in pseudo-distributed mode. This will allow you to practice a distributed implementation without the physical hardware needed to run a fully distributed cluster.

Configuring Hadoop

If you followed Hadoop Environment Setup, you should already have Java and Hadoop installed. To configure Hadoop for pseudo-distributed mode, you'll need to edit the following files located in /usr/local/hadoop/etc/hadoop:


core-site.xml

This file defines core settings such as the port number, memory limits, and the size of the read/write buffers used by Hadoop. Find this file in the etc/hadoop directory and give it the following contents:
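A minimal sketch of such a core-site.xml, written from the shell. The port 9000 and the CONF_DIR variable are assumptions for illustration, not values taken from this tutorial; fs.defaultFS is the standard Hadoop property for the default filesystem URI.

```shell
# Assumption: HADOOP_CONF_DIR is the etc/hadoop directory of your install.
# If it is unset, a scratch directory is used so nothing real is overwritten.
CONF_DIR="${HADOOP_CONF_DIR:-$PWD/hadoop-conf}"
mkdir -p "$CONF_DIR"

# Write a core-site.xml with a single property: the default filesystem URI.
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- localhost:9000 is an assumed host and port -->
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
```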


This sets the URI for all filesystem requests in Hadoop.

hdfs-site.xml

This is the main configuration file for HDFS. It defines the namenode and datanode paths as well as the replication factor. Find this file in the etc/hadoop/ directory and replace it with the following:
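A minimal sketch of an hdfs-site.xml for a single machine. The replication factor of 1 and the paths under /home/hadoop/hdfs are assumptions for illustration; adjust them to the hadoop user's actual home directory.

```shell
# Assumption: HADOOP_CONF_DIR is the etc/hadoop directory of your install.
# If it is unset, a scratch directory is used so nothing real is overwritten.
CONF_DIR="${HADOOP_CONF_DIR:-$PWD/hadoop-conf}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- one copy of each block is enough on a single machine -->
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <!-- assumed path; newer releases name this dfs.namenode.name.dir -->
    <name>dfs.name.dir</name>
    <value>file:///home/hadoop/hdfs/namenode</value>
  </property>
  <property>
    <!-- assumed path; newer releases name this dfs.datanode.data.dir -->
    <name>dfs.data.dir</name>
    <value>file:///home/hadoop/hdfs/datanode</value>
  </property>
</configuration>
EOF
```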

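Notice how we set the replication factor via the dfs.replication property. We define the namenode path dfs.name.dir to point to an hdfs directory under the hadoop user folder, and we point the datanode path dfs.data.dir to a similar destination. (These are the older property names; Hadoop 2 and later also accept dfs.namenode.name.dir and dfs.datanode.data.dir.)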

It's important to remember that the paths we define for the namenode and datanode should be under the user we created for hadoop. This keeps hdfs isolated within the context of the hadoop user and also ensures the hadoop user will have read/write access to the file paths it needs to create.

yarn-site.xml

Yarn is a resource management platform for Hadoop. To configure Yarn, find the yarn-site.xml file in the etc/hadoop/ directory and replace it with the following:
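A minimal sketch of a yarn-site.xml. It sets only the auxiliary shuffle service that MapReduce jobs need on Yarn; whether the original file set further properties is not known, so treat this as a starting point.

```shell
# Assumption: HADOOP_CONF_DIR is the etc/hadoop directory of your install.
# If it is unset, a scratch directory is used so nothing real is overwritten.
CONF_DIR="${HADOOP_CONF_DIR:-$PWD/hadoop-conf}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- lets node managers serve map output to reducers -->
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF
```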

mapred-site.xml

This file defines the MapReduce framework for Hadoop. Hadoop provides a mapred-site.xml.template file out of the box, so first copy this into a new mapred-site.xml file via:
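The copy step can be sketched as below. The helper name copy_mapred_template is hypothetical, and the default directory assumes the /usr/local/hadoop install location used earlier.

```shell
# Hypothetical helper: copy the shipped template to mapred-site.xml
# inside the configuration directory passed as the first argument.
copy_mapred_template() {
    conf_dir="$1"
    cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
}

# Assumption: the configuration lives in /usr/local/hadoop/etc/hadoop
# unless HADOOP_CONF_DIR says otherwise.
CONF_DIR="${HADOOP_CONF_DIR:-/usr/local/hadoop/etc/hadoop}"
if [ -f "$CONF_DIR/mapred-site.xml.template" ]; then
    copy_mapred_template "$CONF_DIR"
else
    echo "no mapred-site.xml.template found in $CONF_DIR" >&2
fi
```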

Now replace the contents of the mapred-site.xml with the following:
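A minimal sketch of the mapred-site.xml contents, assuming the only setting needed is to run MapReduce jobs on Yarn. As before, CONF_DIR is an illustrative variable.

```shell
# Assumption: HADOOP_CONF_DIR is the etc/hadoop directory of your install.
# If it is unset, a scratch directory is used so nothing real is overwritten.
CONF_DIR="${HADOOP_CONF_DIR:-$PWD/hadoop-conf}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- run MapReduce jobs on Yarn rather than the local runner -->
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
```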

Configuring the Hadoop User Environment

Now that you've configured your Hadoop instance for pseudo-distributed mode, it's time to configure the hadoop user environment.

Log in as the hadoop user you created in Hadoop Environment Setup, for example with su - hadoop if hadoop is the account name.


As the Hadoop user, add the following to your ~/.bashrc profile:
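A minimal sketch of the profile lines, assuming Hadoop is installed in /usr/local/hadoop as in the configuration section. The original profile may have exported more variables; these three are enough to put the Hadoop commands and scripts on the PATH.

```shell
# Assumption: Hadoop is unpacked in /usr/local/hadoop.
export HADOOP_HOME=/usr/local/hadoop
# Directory holding core-site.xml, hdfs-site.xml and the other config files.
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
# bin holds the hadoop/hdfs/yarn commands, sbin holds the start/stop scripts.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```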

This will add all of the required path variables to your profile so you can execute Hadoop commands and scripts. To apply the changes to your current shell, run source ~/.bashrc.

Configuring Java for Hadoop

To use Java with Hadoop, you must set the JAVA_HOME environment variable in hadoop-env.sh. Find the hadoop-env.sh file in the same etc/hadoop/ directory and add the following:
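A sketch of the line to add. The JDK path shown is a placeholder, not a value from this tutorial; replace it with the directory where Java was installed on your machine.

```shell
# Placeholder path (an assumption): point this at your own JDK directory.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
```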

This points Hadoop to your Java installation from Hadoop Environment Setup. You don't need to worry about running the source command, just update and save the file.

Verify Hadoop Configuration

You should be all set to start working with HDFS. To make sure everything is configured properly, navigate to the home directory for the hadoop user and run:
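The format step can be sketched as follows. The guard and the format_status variable are additions for illustration, so the snippet does nothing harmful where Hadoop is not yet on the PATH.

```shell
# Format the HDFS namenode. Do this once, as the hadoop user;
# reformatting later erases the existing HDFS metadata.
if command -v hdfs >/dev/null 2>&1; then
    hdfs namenode -format
    format_status=$?
else
    echo "hdfs not found on PATH; check the ~/.bashrc settings" >&2
    format_status=127
fi
```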

This will set up the namenode for HDFS. If everything is configured correctly, the output should end without errors and report that the storage directory has been successfully formatted.

Verify Yarn

To start Yarn, run the following:
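A sketch of the start command, assuming the sbin directory of the Hadoop install is on the PATH. The guard and the yarn_status variable are additions for illustration.

```shell
# Start the Yarn daemons (resource manager and node manager).
if command -v start-yarn.sh >/dev/null 2>&1; then
    start-yarn.sh
    yarn_status=$?
else
    echo "start-yarn.sh not found on PATH; check the ~/.bashrc settings" >&2
    yarn_status=127
fi
```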

If Yarn is configured properly, the output should show the resourcemanager and nodemanager daemons starting, with no stack-trace errors.


Verify HDFS

To ensure dfs is working properly, run the following command to start dfs:
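A sketch of the start command, again assuming the sbin directory of the Hadoop install is on the PATH. The guard and the dfs_status variable are additions for illustration.

```shell
# Start the HDFS daemons (namenode, datanode, secondary namenode).
if command -v start-dfs.sh >/dev/null 2>&1; then
    start-dfs.sh
    dfs_status=$?
else
    echo "start-dfs.sh not found on PATH; check the ~/.bashrc settings" >&2
    dfs_status=127
fi
```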


If dfs starts successfully, you won't see any stack-trace errors, and the output should show the namenode, datanode, and secondary namenode daemons starting.

Conclusion


Hadoop should now be properly configured for pseudo-distributed mode. You can verify things are working through the browser as well. Visit http://localhost:50070/ to see the currently running Hadoop services and http://localhost:8088/ to see a list of all applications running on the cluster.

Next we'll look at HDFS including basic architecture and commands.

Be sure to check out Hadoop Environment Setup for information on configuring SSH and a dedicated hadoop user for your environment.
