Thursday, October 13, 2022

Setting up a three node Spark cluster on Ubuntu using VirtualBox (Apr 2020)

Setting the hostname in the three Guest OSes:
$ sudo gedit /etc/hostname
    Set it to master, slave1, and slave2 respectively, one name per machine.

-----------------------------------------------------------

ON MASTER (Host OS IP: 192.168.1.12):

$ cat /etc/hosts

192.168.1.12	master
192.168.1.3		slave1
192.168.1.4		slave2

Note: Mappings "127.0.0.1  master" and "127.0.1.1  master" should not be there.
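The note above can be checked mechanically. The following is a minimal sketch, not part of the original setup: the function name check_loopback_mappings is hypothetical, and it assumes the cluster hostnames are passed as arguments. It prints any loopback (127.x.x.x) line in a hosts file that mentions one of those names and returns non-zero if it finds one.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): report loopback lines
# in a hosts file that map one of the given cluster hostnames.
# Prints each offending line; returns 1 if any were found, 0 otherwise.
check_loopback_mappings() {
    local hosts_file="$1"
    shift
    local name found=0
    for name in "$@"; do
        if awk -v n="$name" '
            { sub(/#.*/, "") }                 # drop comments
            $1 ~ /^127\./ {                    # loopback addresses only
                for (i = 2; i <= NF; i++)
                    if ($i == n) { print; hit = 1; break }
            }
            END { exit (hit ? 0 : 1) }
        ' "$hosts_file"; then
            found=1
        fi
    done
    return "$found"
}

# Example (run on each node):
#   check_loopback_mappings /etc/hosts master slave1 slave2 || echo "remove the lines printed above"
```

Lines such as "127.0.0.1  localhost" are left alone, since only the names passed as arguments are checked.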

$ cd /usr/local/spark/conf
$ sudo gedit slaves

slave1
slave2

$ sudo gedit spark-env.sh

# YOU CAN SET PORTS HERE, IF PORT-USE ISSUE COMES: SPARK_MASTER_PORT=10000 / SPARK_MASTER_WEBUI_PORT=8080
# REMEMBER TO ADD THESE PORTS ALSO IN THE VM SETTING FOR PORT FORWARDING.

export SPARK_WORKER_CORES=2
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
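The comment in spark-env.sh says that any port set there must also be forwarded in the VM settings. As a rough sketch of doing that from the host's command line instead of the VirtualBox GUI, the helper below builds NAT port-forwarding rules in the format that VBoxManage's --natpf1 option expects. The VM name "master-vm", the rule names, and the choice of identical host and guest ports are assumptions made for illustration; the port numbers are the ones mentioned in the comment. The commands are only printed, not executed.

```shell
#!/usr/bin/env bash
# Build a VirtualBox NAT port-forwarding rule of the form
#   <rule name>,tcp,<host ip>,<host port>,<guest ip>,<guest port>
# Host and guest IPs are left empty so the rule is not tied to one address.
natpf_rule() {
    local name="$1" host_port="$2" guest_port="$3"
    printf '%s,tcp,,%s,,%s\n' "$name" "$host_port" "$guest_port"
}

# Print (do not run) the VBoxManage commands for a powered-off VM.
# The rule names "spark-master" and "spark-webui" are hypothetical.
print_forwarding_commands() {
    local vm="$1"
    printf 'VBoxManage modifyvm "%s" --natpf1 "%s"\n' "$vm" "$(natpf_rule spark-master 10000 10000)"
    printf 'VBoxManage modifyvm "%s" --natpf1 "%s"\n' "$vm" "$(natpf_rule spark-webui 8080 8080)"
}

# Example ("master-vm" is a placeholder for the actual VM name):
#   print_forwarding_commands master-vm
```

Review the printed commands, then run them on the Host OS while the VM is powered off.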

$ sudo gedit ~/.bashrc
Add the below line at the end of the file.
export SPARK_HOME=/usr/local/spark

Later we will start the whole Spark cluster, using the following commands:
$ cd /usr/local/spark/sbin
$ source start-all.sh

-----------------------------------------------------------

ON SLAVE2 (Host OS IP: 192.168.1.4):

$ cat /etc/hostname
slave2

$ cat /etc/hosts
192.168.1.12	master
192.168.1.3		slave1
192.168.1.4		slave2

Note: Localhost mappings are removed.
---

$ cd /usr/local/spark/conf
$ sudo gedit spark-env.sh
#Setting SPARK_LOCAL_IP to "192.168.1.4" (the Host OS IP) would be wrong and would result in port failure logs.
SPARK_LOCAL_IP=127.0.0.1

SPARK_MASTER_HOST=192.168.1.12
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/

-----------------------------------------------------------

FOLLOW THE STEPS MENTIONED FOR SLAVE2 ALSO FOR SLAVE1 (Host OS IP: 192.168.1.3)

-----------------------------------------------------------

Configuring Key Based Login

Set up SSH on every node so that the nodes can communicate with one another without being prompted for a password.

First: Follow steps 1 to 7 on every node.

1) sudo apt-get install openssh-server openssh-client
2) sudo iptables -A INPUT -p tcp --dport ssh -j ACCEPT
3) Use the network adapter 'NAT' in the Guest OS settings, and create a new port forwarding rule "SSH" for port 22.
4) sudo reboot
5) ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ""
6) sudo service ssh stop
7) sudo service ssh start

Second: After 'First' is done, follow steps 8 to 10 on every node.

8) ssh-copy-id -i ~/.ssh/id_rsa.pub ashish@master
9) ssh-copy-id -i ~/.ssh/id_rsa.pub ashish@slave1
10) ssh-copy-id -i ~/.ssh/id_rsa.pub ashish@slave2

LOGS:

(base) ashish@ashish-VirtualBox:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub ashish@ashish-VirtualBox
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/ashish/.ssh/id_rsa.pub"
The authenticity of host 'ashish-virtualbox (127.0.1.1)' can't be established.
ECDSA key fingerprint is SHA256:FfT9M7GMzBA/yv8dw+7hKa91B1D68gLlMCINhbj3mt4.
Are you sure you want to continue connecting (yes/no)? y
Please type 'yes' or 'no': yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
ashish@ashish-virtualbox's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'ashish@ashish-VirtualBox'"
and check to make sure that only the key(s) you wanted were added.

-----------------------------------------------------------

FEW SSH COMMANDS:

1) Checking the SSH status:

(base) ashish@ashish-VirtualBox:~$ sudo service ssh status
[sudo] password for ashish:
● ssh.service - OpenBSD Secure Shell server
   Loaded: loaded (/lib/systemd/system/ssh.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-07-24 18:03:50 IST; 1h 1min ago
  Process: 953 ExecReload=/bin/kill -HUP $MAINPID (code=exited, status=0/SUCCESS)
  Process: 946 ExecReload=/usr/sbin/sshd -t (code=exited, status=0/SUCCESS)
  Process: 797 ExecStartPre=/usr/sbin/sshd -t (code=exited, status=0/SUCCESS)
 Main PID: 819 (sshd)
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/ssh.service
           └─819 /usr/sbin/sshd -D

Jul 24 18:03:28 ashish-VirtualBox systemd[1]: Starting OpenBSD Secure Shell server...
Jul 24 18:03:50 ashish-VirtualBox sshd[819]: Server listening on 0.0.0.0 port 22.
Jul 24 18:03:50 ashish-VirtualBox sshd[819]: Server listening on :: port 22.
Jul 24 18:03:50 ashish-VirtualBox systemd[1]: Started OpenBSD Secure Shell server.
Jul 24 18:04:12 ashish-VirtualBox systemd[1]: Reloading OpenBSD Secure Shell server.
Jul 24 18:04:12 ashish-VirtualBox sshd[819]: Received SIGHUP; restarting.
Jul 24 18:04:12 ashish-VirtualBox sshd[819]: Server listening on 0.0.0.0 port 22.
Jul 24 18:04:12 ashish-VirtualBox sshd[819]: Server listening on :: port 22.

2) (base) ashish@ashish-VirtualBox:/etc/ssh$ sudo gedit ssh_config
   You can change the SSH port here.
   Note: ssh_config sets the port the SSH client connects to by default; the port the SSH server listens on is set in /etc/ssh/sshd_config.

3) Use the network adapter 'NAT' in the Guest OS settings, and create a new port forwarding rule "SSH" for the port you set in Step 2.

4.A) sudo service ssh stop
4.B) sudo service ssh start

LOGS:

ashish@master:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub ashish@slave1
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/ashish/.ssh/id_rsa.pub"
The authenticity of host 'slave1 (192.168.1.3)' can't be established.
ECDSA key fingerprint is SHA256:+GsO1Q6ilqwIYfZLIrBTtt/5HqltZPSjVlI36C+f7ZE.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
ashish@slave1's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'ashish@slave1'"
and check to make sure that only the key(s) you wanted were added.

ashish@master:~$ ssh slave1
Welcome to Ubuntu 19.04 (GNU/Linux 5.0.0-21-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

99 updates can be installed immediately.
0 of these updates are security updates.
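
Once the keys have been copied to every node, passwordless login can be verified in one pass instead of logging in to each node by hand. This is a minimal sketch under a few assumptions: the function name check_passwordless_ssh is hypothetical, the user "ashish" and the hostnames are the ones used in this post, and the 5-second timeout is an arbitrary choice. It relies on the OpenSSH client option BatchMode=yes, which makes ssh fail rather than prompt for a password.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): try a key-based login
# to each host and report OK/FAIL. Returns 1 if any host failed.
check_passwordless_ssh() {
    local user="$1"
    shift
    local host failed=0
    for host in "$@"; do
        # BatchMode=yes: never prompt; fail if key authentication does not work.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "${user}@${host}" true; then
            echo "OK   ${host}"
        else
            echo "FAIL ${host}"
            failed=1
        fi
    done
    return "$failed"
}

# Example (run on every node):
#   check_passwordless_ssh ashish master slave1 slave2
```

The first connection to a host still asks to confirm its fingerprint, so a host may show FAIL here until it has been accepted once interactively, as in the logs above.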
-----------------------------------------------------------

LOGS FROM SPARK MASTER ON SUCCESSFUL START:

$ cd /usr/local/spark/sbin
$ source start-all.sh

ashish@master:/usr/local/spark/sbin$ cat /usr/local/spark/logs/spark-ashish-org.apache.spark.deploy.master.Master-1-master.out
Spark Command: /usr/lib/jvm/java-8-openjdk-amd64//bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host master --port 60000 --webui-port 50000
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/08/05 17:32:09 INFO Master: Started daemon with process name: 1664@master
19/08/05 17:32:10 INFO SignalUtils: Registered signal handler for TERM
19/08/05 17:32:10 INFO SignalUtils: Registered signal handler for HUP
19/08/05 17:32:10 INFO SignalUtils: Registered signal handler for INT
19/08/05 17:32:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/08/05 17:32:11 INFO SecurityManager: Changing view acls to: ashish
19/08/05 17:32:11 INFO SecurityManager: Changing modify acls to: ashish
19/08/05 17:32:11 INFO SecurityManager: Changing view acls groups to:
19/08/05 17:32:11 INFO SecurityManager: Changing modify acls groups to:
19/08/05 17:32:11 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ashish); groups with view permissions: Set(); users with modify permissions: Set(ashish); groups with modify permissions: Set()
19/08/05 17:32:13 INFO Utils: Successfully started service 'sparkMaster' on port 60000.
19/08/05 17:32:13 INFO Master: Starting Spark master at spark://master:60000
19/08/05 17:32:13 INFO Master: Running Spark version 2.4.3
19/08/05 17:32:13 INFO Utils: Successfully started service 'MasterUI' on port 50000.
19/08/05 17:32:13 INFO MasterWebUI: Bound MasterWebUI to 127.0.0.1, and started at http://master:50000
19/08/05 17:32:14 INFO Master: I have been elected leader! New state: ALIVE

-----------------------------------------------------------
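
The two log lines that matter most in the output above are "Starting Spark master at spark://..." and "New state: ALIVE". As a small sketch (the function names master_url_from_log and master_is_alive are hypothetical, and the patterns assume the log wording shown above for Spark 2.4.3), they can be pulled out of the master log like this:

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the original post) for reading a Spark master log.

# Print the spark:// URL announced by the most recent master start in the log.
master_url_from_log() {
    sed -n 's|.*Starting Spark master at \(spark://[^ ]*\).*|\1|p' "$1" | tail -n 1
}

# Return 0 if the log shows the master reached the ALIVE state.
master_is_alive() {
    grep -q 'New state: ALIVE' "$1"
}

# Example, using the log path from this post:
#   log=/usr/local/spark/logs/spark-ashish-org.apache.spark.deploy.master.Master-1-master.out
#   master_is_alive "$log" && echo "Master is up at $(master_url_from_log "$log")"
```

The extracted URL is the address the workers connect to.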
Tags: Technology,Spark,Linux
