1. Download Hadoop, available at: http://hadoop.apache.org/releases.html#Download
2. Unpack the downloaded file on your local filesystem and take note of the directory created; let's call this the installation directory, $HADOOP_HOME. E.g. my Hadoop installation directory is: /Users/wagied/Libs/hadoop-1.1.2
3. Edit your .bashrc file in your HOME directory, e.g. /Users/wagied/.bashrc, adding:

export HADOOP_HOME="${LIBS_HOME}/hadoop-1.1.2"
export PATH="${PATH}:${HADOOP_HOME}/bin"
4. Execute the following for the changes to take effect: > source ~/.bashrc
5. Test the hadoop command:
[Screenshot: executing the hadoop command]
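A quick way to verify the binary is on your PATH (a minimal check, assuming the exports above) is to print the version:

> hadoop version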

6. Format the HDFS (Hadoop Distributed File System)

[Screenshot: formatting HDFS]
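In Hadoop 1.x the NameNode is formatted with the hadoop command itself; the step should look roughly like:

> hadoop namenode -format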

7. Create a password-less ssh key:

[Screenshot: generating a password-less SSH key]
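A typical sequence (a sketch; adjust the key path if you already have a key you want to keep) is:

> ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
> cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys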

8. Try to execute bin/start-all.sh. You may see "Connection refused" on port 22. [Screenshot: connection refused on port 22]

Troubleshoot by executing: > ssh localhost

9. Enable Remote Login on your Mac (System Preferences > Sharing > Remote Login), which starts the sshd that Hadoop's start-up scripts connect to.

[Screenshot: enabling Remote Login]

10. Set up a pseudo-cluster configuration

Edit conf/core-site.xml:

<configuration>
  <!-- URI of the default filesystem: the local NameNode -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

Edit conf/hdfs-site.xml:

<configuration>
  <!-- Single-node setup: keep only one replica of each block -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Edit conf/mapred-site.xml:

<configuration>
  <!-- Address of the JobTracker for the pseudo-cluster -->
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

11. Re-format the HDFS and re-run bin/start-all.sh

[Screenshot: re-formatting HDFS and executing bin/start-all.sh]
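The sequence should look roughly like this; re-formatting an existing filesystem may prompt for confirmation:

> hadoop namenode -format
> bin/start-all.sh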

12. Browse the NameNode web UI: http://localhost:50070/dfshealth.jsp

[Screenshot: NameNode dfshealth page]
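Once the NameNode is up, a quick smoke test (a sketch; any HDFS path works) is to list the root of HDFS:

> hadoop fs -ls /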

Browse the JobTracker web UI: http://localhost:50030/jobtracker.jsp

[Screenshot: JobTracker web UI]

13. Run an example job

[Screenshot: running an example job]
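Hadoop 1.1.2 ships with an examples jar in the installation directory; a typical run (assuming the default jar name) estimates pi with 10 map tasks of 100 samples each:

> hadoop jar ${HADOOP_HOME}/hadoop-examples-1.1.2.jar pi 10 100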

14. Check job status

[Screenshot: job status]
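Besides the JobTracker UI, running jobs can also be listed from the command line in Hadoop 1.x:

> hadoop job -list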


15. Stop the servers:

[Screenshot: stopping the servers]
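All daemons started by start-all.sh are shut down with its counterpart:

> bin/stop-all.sh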