Setting hostname in three Guest OS(s): $ sudo gedit /etc/hostname "to master, slave1, and slave2 on different machines" ----------------------------------------------------------- ON MASTER (Host OS IP: 192.168.1.12): $ cat /etc/hosts 192.169.1.12 master 192.168.1.3 slave1 192.168.1.4 slave2 Note: Mappings "127.0.0.1 master" and "127.0.1.1 master" should not be there. $ cd /usr/local/spark/conf $ sudo gedit slaves slave1 slave2 $ sudo gedit spark-env.sh # YOU CAN SET PORTS HERE, IF PORT-USE ISSUE COMES: SPARK_MASTER_PORT=10000 / SPARK_MASTER_WEBUI_PORT=8080 # REMEMBER TO ADD THESE PORTS ALSO IN THE VM SETTING FOR PORT FORWARDING. export SPARK_WORKER_CORES=2 export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/ $ sudo gedit ~/.bashrc Add the below line at the end of the file. export SPARK_HOME=/usr/local/spark Later we will start all the Spark cluster, using the following command: $ cd /usr/local/spark/sbin $ source start-all.sh ----------------------------------------------------------- ON SLAVE2 (Host OS IP: 192.168.1.4): $ cat /etc/hostname slave2 $ cat /etc/hosts 192.169.1.12 master 192.168.1.3 slave1 192.168.1.4 slave2 Note: Localhost mappings are removed. --- $ cd /usr/local/spark/conf $ sudo gedit spark-env.sh #Setting SPARK_LOCAL_IP to "192.168.1.4" (the Host OS IP) would be wrong and would result in port failure logs. SPARK_LOCAL_IP=127.0.0.1 SPARK_MASTER_HOST=192.168.1.12 export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/ ----------------------------------------------------------- FOLLOW THE STEPS MENTIONED FOR SLAVE2 ALSO FOR SLAVE1 (Host OS IP: 192.168.1.3) ----------------------------------------------------------- Configuring Key Based Login Setup SSH in every node such that they can communicate with one another without any prompt for password.First: Follow steps 1 to 7 on every node.
1) sudo apt-get install openssh-server openssh-client 2) sudo iptables -A INPUT -p tcp --dport ssh -j ACCEPT 3) Use the network adapter 'NAT' in the Guest OS settings, and create a new port forwarding rule "SSH" for port 22. 4) sudo reboot 5) ssh-keygen -t rsa -f ~/.ssh/id_rsa -P "" 6) sudo service ssh stop 7) sudo service ssh startSecond: After 'First' is done, follow steps 8 to 10 on every node.
8) ssh-copy-id -i ~/.ssh/id_rsa.pub ashish@master 9) ssh-copy-id -i ~/.ssh/id_rsa.pub ashish@slave1 10) ssh-copy-id -i ~/.ssh/id_rsa.pub ashish@slave2 LOGS: (base) ashish@ashish-VirtualBox:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub ashish@ashish-VirtualBox /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/ashish/.ssh/id_rsa.pub" The authenticity of host 'ashish-virtualbox (127.0.1.1)' can't be established. ECDSA key fingerprint is SHA256:FfT9M7GMzBA/yv8dw+7hKa91B1D68gLlMCINhbj3mt4. Are you sure you want to continue connecting (yes/no)? y Please type 'yes' or 'no': yes /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys ashish@ashish-virtualbox's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'ashish@ashish-VirtualBox'" and check to make sure that only the key(s) you wanted were added. ----------------------------------------------------------- FEW SSH COMMANDS: 1) Checking the SSH status: (base) ashish@ashish-VirtualBox:~$ sudo service ssh status [sudo] password for ashish: ● ssh.service - OpenBSD Secure Shell server Loaded: loaded (/lib/systemd/system/ssh.service; enabled; vendor preset: enabled) Active: active (running) since Wed 2019-07-24 18:03:50 IST; 1h 1min ago Process: 953 ExecReload=/bin/kill -HUP $MAINPID (code=exited, status=0/SUCCESS) Process: 946 ExecReload=/usr/sbin/sshd -t (code=exited, status=0/SUCCESS) Process: 797 ExecStartPre=/usr/sbin/sshd -t (code=exited, status=0/SUCCESS) Main PID: 819 (sshd) Tasks: 1 (limit: 4915) CGroup: /system.slice/ssh.service └─819 /usr/sbin/sshd -D Jul 24 18:03:28 ashish-VirtualBox systemd[1]: Starting OpenBSD Secure Shell server... Jul 24 18:03:50 ashish-VirtualBox sshd[819]: Server listening on 0.0.0.0 port 22. Jul 24 18:03:50 ashish-VirtualBox sshd[819]: Server listening on :: port 22. Jul 24 18:03:50 ashish-VirtualBox systemd[1]: Started OpenBSD Secure Shell server. Jul 24 18:04:12 ashish-VirtualBox systemd[1]: Reloading OpenBSD Secure Shell server. Jul 24 18:04:12 ashish-VirtualBox sshd[819]: Received SIGHUP; restarting. Jul 24 18:04:12 ashish-VirtualBox sshd[819]: Server listening on 0.0.0.0 port 22. Jul 24 18:04:12 ashish-VirtualBox sshd[819]: Server listening on :: port 22. 2) (base) ashish@ashish-VirtualBox:/etc/ssh$ sudo gedit ssh_config You can change SSH port here. 3) Use the network adapter 'NAT' in the GuestOS settings, and create a new port forwarding rule "SSH" for port you are mentioning in Step 2. 4.A) sudo service ssh stop 4.B) sudo service ssh start LOGS: ashish@master:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub ashish@slave1 /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/ashish/.ssh/id_rsa.pub" The authenticity of host 'slave1 (192.168.1.3)' can't be established. ECDSA key fingerprint is SHA256:+GsO1Q6ilqwIYfZLIrBTtt/5HqltZPSjVlI36C+f7ZE. Are you sure you want to continue connecting (yes/no)? yes /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys ashish@slave1's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'ashish@slave1'" and check to make sure that only the key(s) you wanted were added. ashish@master:~$ ssh slave1 Welcome to Ubuntu 19.04 (GNU/Linux 5.0.0-21-generic x86_64) * Documentation: https://help.ubuntu.com * Management: https://landscape.canonical.com * Support: https://ubuntu.com/advantage 99 updates can be installed immediately. 0 of these updates are security updates. ----------------------------------------------------------- LOGS FROM SPARK MASTER ON SUCCESSFUL START: $ cd /usr/local/spark/sbin $ source start-all.sh ashish@master:/usr/local/spark/sbin$ cat /usr/local/spark/logs/spark-ashish-org.apache.spark.deploy.master.Master-1-master.out Spark Command: /usr/lib/jvm/java-8-openjdk-amd64//bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host master --port 60000 --webui-port 50000 ======================================== Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 19/08/05 17:32:09 INFO Master: Started daemon with process name: 1664@master 19/08/05 17:32:10 INFO SignalUtils: Registered signal handler for TERM 19/08/05 17:32:10 INFO SignalUtils: Registered signal handler for HUP 19/08/05 17:32:10 INFO SignalUtils: Registered signal handler for INT 19/08/05 17:32:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 19/08/05 17:32:11 INFO SecurityManager: Changing view acls to: ashish 19/08/05 17:32:11 INFO SecurityManager: Changing modify acls to: ashish 19/08/05 17:32:11 INFO SecurityManager: Changing view acls groups to: 19/08/05 17:32:11 INFO SecurityManager: Changing modify acls groups to: 19/08/05 17:32:11 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ashish); groups with view permissions: Set(); users with modify permissions: Set(ashish); groups with modify permissions: Set() 19/08/05 17:32:13 INFO Utils: Successfully started service 'sparkMaster' on port 60000. 19/08/05 17:32:13 INFO Master: Starting Spark master at spark://master:60000 19/08/05 17:32:13 INFO Master: Running Spark version 2.4.3 19/08/05 17:32:13 INFO Utils: Successfully started service 'MasterUI' on port 50000. 19/08/05 17:32:13 INFO MasterWebUI: Bound MasterWebUI to 127.0.0.1, and started at http://master:50000 19/08/05 17:32:14 INFO Master: I have been elected leader! New state: ALIVE -----------------------------------------------------------
Thursday, October 13, 2022
Setting up a three node Spark cluster on Ubuntu using VirtualBox (Apr 2020)
Labels:
Linux,
Spark,
Technology
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment