In this article, I will take you step by step through installing Hadoop 3.3.0 on macOS Big Sur (version 11.2.1) with Homebrew, as a single-node cluster in pseudo-distributed mode.
Install Hadoop on Mac
The installation of Hadoop is divided into these steps:
- Install Java environment
- Install Homebrew
- Install SSH
- Install Hadoop through Homebrew
- Health Check
Install Java Environment
- Open the terminal and enter java -version to check the current Java version. If no version is returned, download and install a JDK from the official website.
- After the installation is complete, configure the JAVA_HOME environment variable. JDKs are installed in the directory /Library/Java/JavaVirtualMachines.
- Enter vim ~/.bash_profile in the terminal to configure the Java path. (If your login shell is zsh, the default since macOS Catalina, edit ~/.zshrc instead.)
- Add the following statement on a blank line, replacing <jdk version> with the name of your JDK directory: export JAVA_HOME="/Library/Java/JavaVirtualMachines/<jdk version>.jdk/Contents/Home"
- Then execute source ~/.bash_profile in the terminal to apply the configuration.
- Then enter java -version in the terminal again; it should now print the Java version.
Note: Your Mac may already have Java installed, but possibly an older version, and an older version can cause problems with the Hadoop installation.
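The export line can also be generated instead of typed by hand. Here is a minimal sketch, assuming the macOS helper /usr/libexec/java_home is present and that 1.8 is the JDK you want (both are assumptions, adjust as needed). The function name java_home_line is made up for illustration.

```shell
# Hypothetical helper: print the export line for a given JDK home directory.
java_home_line() {
  printf 'export JAVA_HOME="%s"\n' "$1"
}

# On macOS, /usr/libexec/java_home -v <version> prints the home of a matching JDK.
# The guard keeps the snippet harmless on systems without that helper.
if [ -x /usr/libexec/java_home ]; then
  jdk_home="$(/usr/libexec/java_home -v 1.8 2>/dev/null)" &&
    java_home_line "$jdk_home"
fi
```

To persist the result, append it to your profile, for example java_home_line "$jdk_home" >> ~/.bash_profile, and then run source ~/.bash_profile.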
Install Homebrew
Homebrew is the most commonly used package manager on the Mac, so it needs little introduction. Install it with the command published on the official Homebrew site (brew.sh).
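As a rough sketch, the install can be wrapped so that it only runs when brew is missing. The installer URL is the one published on brew.sh at the time of writing, so please check there for the current command. The function name ensure_homebrew is my own.

```shell
# Install Homebrew only when the brew command is not already available.
ensure_homebrew() {
  if command -v brew >/dev/null 2>&1; then
    echo "Homebrew already installed: $(brew --version | head -n 1)"
  else
    # Installer command as published on brew.sh (verify it there before running).
    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
  fi
}
```

Run ensure_homebrew once in a terminal; on a machine that already has Homebrew it just prints the installed version.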
Install SSH
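Hadoop's start scripts connect to localhost over SSH, so enable Remote Login (System Preferences > Sharing) and add your own public key to ~/.ssh/authorized_keys. After this step, open a terminal and enter "ssh localhost". You should log in without a password, which indicates that the setup was successful.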
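The key setup can be sketched as below. This assumes an RSA key with an empty passphrase, which is what the Apache Hadoop single-node guide uses. The function name setup_passwordless_ssh and its optional directory argument are hypothetical, added so the steps can be tried on a scratch directory first.

```shell
# Hypothetical helper: create a passphrase-less key pair in the given directory
# (default ~/.ssh) and authorize it for logins to this machine.
setup_passwordless_ssh() {
  ssh_dir="${1:-$HOME/.ssh}"
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
  # Generate a key only if none exists, so an existing key is never overwritten.
  [ -f "$ssh_dir/id_rsa" ] || ssh-keygen -q -t rsa -P '' -f "$ssh_dir/id_rsa"
  pub="$(cat "$ssh_dir/id_rsa.pub")"
  # Append the public key only once, even if the function is run again.
  grep -qxF "$pub" "$ssh_dir/authorized_keys" 2>/dev/null ||
    printf '%s\n' "$pub" >> "$ssh_dir/authorized_keys"
  chmod 600 "$ssh_dir/authorized_keys"
}
```

Call setup_passwordless_ssh with no argument to work on ~/.ssh, then try ssh localhost.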
Install hadoop
Note: Homebrew installs the latest Hadoop release, which was Hadoop 3.3.0 at the time of writing this article.
After the installation is complete, enter "hadoop version" to view the version. If version information is printed, the installation was successful.
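Put together, the install and the version check look roughly like this. It is only a sketch: it assumes the Homebrew formula is named hadoop, and install_hadoop is a name I made up.

```shell
# Install the hadoop formula, then print the first line of "hadoop version".
install_hadoop() {
  brew install hadoop || return 1
  hadoop version | head -n 1
}
```

The first line that hadoop version prints names the release, so that is the line to compare against the version you expected.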
Hadoop configuration
Make the changes described below to the following files under $HADOOP_HOME/etc/hadoop/ to set up HDFS.
- core-site.xml
- hdfs-site.xml
- mapred-site.xml
- yarn-site.xml
- hadoop-env.sh
Set the Hadoop environment variables in the ~/.bash_profile file in your $HOME directory.
Hadoop Installed Path
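A possible set of variables is sketched below. The layout is an assumption on my part: Homebrew's hadoop formula normally keeps the distribution under the libexec folder of its prefix, so check the path on your machine first. The function hadoop_env_lines is hypothetical; it only prints the lines.

```shell
# Hypothetical helper: print the profile lines for a given Hadoop home directory.
hadoop_env_lines() {
  printf 'export HADOOP_HOME="%s"\n' "$1"
  # Single quotes keep $HADOOP_HOME and $PATH unexpanded in the printed lines.
  printf '%s\n' 'export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"'
  printf '%s\n' 'export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"'
}
```

For example, hadoop_env_lines "$(brew --prefix hadoop)/libexec" >> ~/.bash_profile appends the three lines, and source ~/.bash_profile applies them.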
core-site.xml
Open $HADOOP_HOME/etc/hadoop/core-site.xml in the terminal and add the properties below.
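A minimal sketch of this file, assuming the value from the Apache Hadoop single-node setup guide for pseudo-distributed mode (fs.defaultFS pointing at localhost port 9000). The function write_core_site is hypothetical, and it overwrites the file, so back up the original first.

```shell
# Hypothetical helper: write core-site.xml into the given configuration directory.
write_core_site() {
  conf_dir="${1:-$HADOOP_HOME/etc/hadoop}"
  cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Default file system: the NameNode on this machine -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
}
```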
hdfs-site.xml
Open the $HADOOP_HOME/etc/hadoop/hdfs-site.xml file in the terminal and add the properties below.
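For a single node there is only one DataNode, so a replication factor of 1 is the usual choice. The sketch below assumes that value, taken from the Apache single-node guide. The function write_hdfs_site is hypothetical and overwrites the existing file.

```shell
# Hypothetical helper: write hdfs-site.xml into the given configuration directory.
write_hdfs_site() {
  conf_dir="${1:-$HADOOP_HOME/etc/hadoop}"
  cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- One DataNode, so keep a single copy of each block -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
}
```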
yarn-site.xml
Open the $HADOOP_HOME/etc/hadoop/yarn-site.xml file in the terminal and add the properties below.
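A sketch of the YARN settings, assuming the two properties that the Apache Hadoop 3.3.0 single-node guide lists for running MapReduce on YARN. The function write_yarn_site is hypothetical and overwrites the existing file.

```shell
# Hypothetical helper: write yarn-site.xml into the given configuration directory.
write_yarn_site() {
  conf_dir="${1:-$HADOOP_HOME/etc/hadoop}"
  cat > "$conf_dir/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Shuffle service that MapReduce jobs need on each NodeManager -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Environment variables that containers may inherit -->
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ,HADOOP_MAPRED_HOME</value>
  </property>
</configuration>
EOF
}
```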
mapred-site.xml
Open the $HADOOP_HOME/etc/hadoop/mapred-site.xml file in the terminal and add the properties below.
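A sketch of the MapReduce settings, again assuming the values from the Apache single-node guide: jobs run on YARN, and the classpath points at the bundled MapReduce jars. The function write_mapred_site is hypothetical and overwrites the existing file. Hadoop itself expands $HADOOP_MAPRED_HOME, so the heredoc is quoted to keep it literal.

```shell
# Hypothetical helper: write mapred-site.xml into the given configuration directory.
write_mapred_site() {
  conf_dir="${1:-$HADOOP_HOME/etc/hadoop}"
  cat > "$conf_dir/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Run MapReduce jobs on YARN instead of the local runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
  </property>
</configuration>
EOF
}
```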
hadoop-env.sh
Open the $HADOOP_HOME/etc/hadoop/hadoop-env.sh file in the terminal and add an export JAVA_HOME line with the same value as in ~/.bash_profile.
HDFS format
Note: Before starting the cluster for the first time, open a terminal and initialize it by formatting the HDFS NameNode directory with hdfs namenode -format.
Final Step
Run start-all.sh from the $HADOOP_HOME/sbin folder.
Use the jps command to check that the NameNode, DataNode, and ResourceManager have all started successfully.
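Reading the jps list by eye works, but the check can also be scripted. The sketch below assumes the daemons that start-all.sh normally brings up in pseudo-distributed mode, which adds SecondaryNameNode and NodeManager to the three named above. The names EXPECTED_DAEMONS and missing_daemons are my own.

```shell
# Daemons expected in pseudo-distributed mode after start-all.sh.
EXPECTED_DAEMONS="NameNode DataNode SecondaryNameNode ResourceManager NodeManager"

# Read jps output on stdin and print each expected daemon that is absent.
missing_daemons() {
  jps_output="$(cat)"
  for daemon in $EXPECTED_DAEMONS; do
    # -w matches whole words, so NameNode does not match SecondaryNameNode.
    printf '%s\n' "$jps_output" | grep -qw "$daemon" || printf '%s\n' "$daemon"
  done
}
```

Run it as jps | missing_daemons. No output means every expected daemon is up; otherwise look for the named daemon's log file under $HADOOP_HOME/logs.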
Health Check
Running Basic HDFS Commands
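A quick round trip through HDFS makes a reasonable health check. This is a sketch under assumptions: it creates a home directory named after your login user and a small test file called hello.txt, both chosen for illustration. The function name hdfs_smoke_test is hypothetical.

```shell
# Hypothetical helper: create a directory, write a file, list it, and read it back.
hdfs_smoke_test() {
  dir="/user/$(whoami)"
  hdfs dfs -mkdir -p "$dir" &&
    # "-put -f -" reads the file content from stdin and overwrites an old copy.
    echo "hello hdfs" | hdfs dfs -put -f - "$dir/hello.txt" &&
    hdfs dfs -ls "$dir" &&
    hdfs dfs -cat "$dir/hello.txt"
}
```

If all four commands succeed, HDFS is accepting reads and writes. With default ports, the NameNode web UI is at http://localhost:9870 and the ResourceManager UI at http://localhost:8088.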
05 Feb 2019
macOS has OpenJDK installed by default; however, I prefer to use Oracle's version of the JDK because it's the official version. I don't want to install it the way Oracle's docs instruct, as I find that very tedious. I'm a guy who loves automating stuff, so I prefer to install it via Homebrew. I frequently do a clean install on my Mac every time there is a new version of OS X, so I have to install the JDK again and again. I'd rather just run a single installation script instead of heading over to Oracle's website and following their instructions.
Steps to install and configure the Oracle JDK:
Homebrew and Cask
Homebrew is a package manager for Mac and has always been my preferred way to install my command line tools because I can integrate it with my setup scripts. To install it I'll run
Notes
then I'll install Homebrew Cask, which is an extension of Homebrew. It makes the installation of large binaries and graphical applications simpler.
JDK Installation
Before I install the JDK, I'll first check which version it will install by default. I'm very picky about the version because most of the time I just use Java for Android development. I also prefer the older and more stable version of the JDK, so I run
which will output
This means that the latest version is JDK 11. I can install it now by running
but I prefer to install JDK 8 over 11 so instead I’ll run
Set up the JAVA_HOME environment variable
Once installed, I will set the JAVA_HOME environment variable by editing my .bash_profile
and inserting this line
and applying these changes by running
Verifying Installation
Now, to confirm that the installation was successful, I'll run this command.
If the installation is successful, its output will be similar to this
This tells me that I have installed the Oracle version of the JDK. However, if the output is like this
then I may have failed to install the JDK properly, or the changes may not have been applied yet, because I can see that OpenJDK is still being used. I'll try to fix this by restarting my Mac and then running "java -version" again.
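The Oracle-or-OpenJDK check can also be done by a small function instead of by eye. This is a sketch that relies on Oracle builds printing "Java(TM)" and OpenJDK builds printing "OpenJDK" in the version banner. The function name jdk_vendor is made up.

```shell
# Hypothetical helper: read "java -version" output on stdin and name the build.
jdk_vendor() {
  out="$(cat)"
  case "$out" in
    *OpenJDK*) echo "openjdk" ;;
    *'Java(TM)'*) echo "oracle" ;;
    *) echo "unknown" ;;
  esac
}
```

Since java -version writes to standard error, run it as java -version 2>&1 | jdk_vendor.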
Automating installation with a script
Below is a simple script to automate the installation of the latest Oracle JDK.
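A script along these lines could look like the sketch below. It is not the original script. It assumes a current Homebrew, where casks are installed with brew install --cask rather than the older brew cask install, and it assumes the Oracle JDK cask is named oracle-jdk, so confirm the name with brew search jdk first. The function name install_jdk is hypothetical.

```shell
#!/bin/bash
# Sketch only: the default cask name is an assumption; check "brew search jdk".
install_jdk() {
  cask="${1:-oracle-jdk}"
  if ! command -v brew >/dev/null 2>&1; then
    echo "Homebrew is required; install it from brew.sh first" >&2
    return 1
  fi
  brew install --cask "$cask" || return 1
  # Print the JAVA_HOME line to add to ~/.bash_profile (macOS-only helper).
  if [ -x /usr/libexec/java_home ]; then
    printf 'export JAVA_HOME="%s"\n' "$(/usr/libexec/java_home)"
  fi
}
```

Call install_jdk with no argument for the default cask, or pass another cask name to install a different JDK.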
That's it! Now you can automate your JDK installation on your Mac by running the script.