Offline Installation of a Fully Distributed Hadoop Environment on CentOS 7
1. Edit the hosts file
[root@master ~]# vim /etc/hosts
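The hosts file maps each node's IP address to its hostname. Based on the addresses used later in this article, the entries would look like the following (adjust to your own network):
192.168.5.128 master
192.168.5.129 slave1
192.168.5.130 slave2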
2. Copy the /etc/hosts file from master to each node with scp
[root@master ~]# scp /etc/hosts [email protected]:/etc/hosts
The authenticity of host '192.168.5.128 (192.168.5.128)' can't be established.
ECDSA key fingerprint is SHA256:WIen0BimPcoPziD6DYeAJzV2JeHQSBZVHosXWrczvaU.
ECDSA key fingerprint is MD5:b5:9b:b6:b7:ee:3a:75:2a:98:60:65:d2:69:43:84:02.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.5.128' (ECDSA) to the list of known hosts.
[email protected]'s password:
hosts 100% 221 137.2KB/s 00:00
[root@master ~]# scp /etc/hosts [email protected]:/etc/hosts
You can test name resolution as follows:
[hadoop@master ~]$ ping slave1
PING slave1 (192.168.5.129) 56(84) bytes of data.
64 bytes from slave1 (192.168.5.129): icmp_seq=1 ttl=64 time=1.67 ms
64 bytes from slave1 (192.168.5.129): icmp_seq=2 ttl=64 time=0.580 ms
[hadoop@master ~]$ ping slave2
PING slave2 (192.168.5.130) 56(84) bytes of data.
64 bytes from slave2 (192.168.5.130): icmp_seq=1 ttl=64 time=2.58 ms
64 bytes from slave2 (192.168.5.130): icmp_seq=2 ttl=64 time=1.19 ms
3. Configure passwordless SSH login
Since this fully distributed setup uses 3 machines that need to communicate with each other constantly, logging in between them would require typing a username and password every time unless passwordless trust is configured, which is very tedious.
(1) Generate the public and private keys
[hadoop@master ~]$ ssh-keygen
Press Enter at each prompt to accept the defaults.
(2) Store the public key
What is authorized_keys? On Linux, authorized_keys is the file dedicated to storing public keys for SSH; only once your public key is placed in the correct location on the server, with the correct permissions, can you log in to that server without a password using the matching private key.
[hadoop@master ~]$ cat .ssh/id_rsa.pub >> .ssh/authorized_keys
//append the public key in .ssh/id_rsa.pub to authorized_keys
[hadoop@master ~]$ chmod 600 .ssh/authorized_keys
//restrict the permissions of authorized_keys
(3) Copy the .ssh directory to slave1 and slave2
[hadoop@master ~]$ scp -r .ssh hadoop@slave1:~/
[hadoop@master ~]$ scp -r .ssh hadoop@slave2:~/
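Note: for passwordless login to work, the .ssh directory must be mode 700 and authorized_keys mode 600 on every node; check these first if the test below still prompts for a password.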
Test whether passwordless login works:
[hadoop@master ~]$ ssh slave1
Last login: Fri Mar 19 02:40:48 2021 from master
[hadoop@master ~]$ ssh slave2
Last login: Fri Mar 19 02:15:56 2021
If you can log in without entering a password, SSH is configured correctly.
4. Transfer the two packages jre-8u281-linux-x64.tar.gz and hadoop-3.2.2.tar.gz to the /home/hadoop/ directory on master, slave1, and slave2, and extract them.
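Assuming the two archives are already in /home/hadoop/, they can be unpacked with, for example:
[hadoop@master ~]$ tar -zxvf jre-8u281-linux-x64.tar.gz
[hadoop@master ~]$ tar -zxvf hadoop-3.2.2.tar.gz
This produces the jre1.8.0_281 and hadoop-3.2.2 directories referenced in the later steps.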
5. Because CentOS 7 was installed from the Minimal image, the jps command is not available, so the java-1.8.0-openjdk packages need to be installed (jps itself comes from the -devel subpackage, which the wildcard below covers).
Installation: run yum -y install java-1.8.0-openjdk* as root.
6. Add Hadoop to the PATH environment variable so that Hadoop commands can be run from anywhere
[root@master ~]# su - hadoop
[hadoop@master ~]$ vim .bash_profile
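The contents of the file are not shown above; a minimal sketch of the lines to append, assuming Hadoop was extracted to /home/hadoop/hadoop-3.2.2 as in the later steps:
export HADOOP_HOME=/home/hadoop/hadoop-3.2.2
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
After the file is in place on a node (including after it is distributed in the next step), run source ~/.bash_profile or log out and back in so the new PATH takes effect.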
7. Distribute the environment file to the slave1 and slave2 nodes
[hadoop@master ~]$ scp -r .bash_profile [email protected]:~/
[hadoop@master ~]$ scp -r .bash_profile [email protected]:~/
8. Edit the Hadoop configuration files (located in hadoop-3.2.2/etc/hadoop/)
(1)[hadoop@master hadoop]$ vim core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/hadoop/tmp</value>
    </property>
</configuration>
(2)[hadoop@master hadoop]$ vim hdfs-site.xml
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop/tmp/dfs/data</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:50090</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
(3)[hadoop@master hadoop]$ vim yarn-site.xml
<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
(4)[hadoop@master hadoop]$ vim mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
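Note: with only mapreduce.framework.name set, MapReduce jobs on Hadoop 3.x often fail with a "Could not find or load main class ... MRAppMaster" error because the application master cannot locate the MapReduce jars. One commonly used fix (a sketch; the path assumes Hadoop lives in /home/hadoop/hadoop-3.2.2) is to add the following inside the <configuration> element of the same file:
    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=/home/hadoop/hadoop-3.2.2</value>
    </property>
    <property>
        <name>mapreduce.map.env</name>
        <value>HADOOP_MAPRED_HOME=/home/hadoop/hadoop-3.2.2</value>
    </property>
    <property>
        <name>mapreduce.reduce.env</name>
        <value>HADOOP_MAPRED_HOME=/home/hadoop/hadoop-3.2.2</value>
    </property>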
(5)[hadoop@master hadoop]$ vim workers
slave1
slave2
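In Hadoop 3.x the workers file replaces the old slaves file; it lists the hosts on which the worker daemons (DataNode and NodeManager) are started. Since only slave1 and slave2 are listed, master itself runs no DataNode, which matches the jps output shown later.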
9. Copy the Java and Hadoop directories to the slave1 and slave2 hosts
[hadoop@master hadoop]$ scp -r jre1.8.0_281/ hadoop@slave1:~/
[hadoop@master hadoop]$ scp -r jre1.8.0_281/ hadoop@slave2:~/
[hadoop@master hadoop]$ scp -r /home/hadoop/hadoop-3.2.2 hadoop@slave1:~/
[hadoop@master hadoop]$ scp -r /home/hadoop/hadoop-3.2.2 hadoop@slave2:~/
10. Format the HDFS filesystem on master:
[hadoop@master ~]$ hdfs namenode -format
11. If the following message appears, the filesystem was formatted successfully:
2021-03-21 01:59:10,350 INFO common.Storage: Storage directory /home/hadoop/tmp/dfs/name has been successfully formatted.
12. Start Hadoop
[hadoop@master hadoop]$ start-dfs.sh
Starting namenodes on [master]
master: Warning: Permanently added 'master,192.168.5.128' (ECDSA) to the list of known hosts.
master: ERROR: JAVA_HOME is not set and could not be found.
Starting datanodes
slave2: ERROR: JAVA_HOME is not set and could not be found.
slave1: ERROR: JAVA_HOME is not set and could not be found.
Starting secondary namenodes [master]
master: ERROR: JAVA_HOME is not set and could not be found.
If the errors above appear, JAVA_HOME has not been set in hadoop-env.sh.
Edit the following file:
[hadoop@master ~]$ vim hadoop-3.2.2/etc/hadoop/hadoop-env.sh
Add the following line:
export JAVA_HOME=/home/hadoop/jre1.8.0_281
Then copy this file to slave1 and slave2:
[hadoop@master ~]$ scp hadoop-3.2.2/etc/hadoop/hadoop-env.sh hadoop@slave1:/home/hadoop/hadoop-3.2.2/etc/hadoop/hadoop-env.sh
hadoop-env.sh 100% 16KB 4.6MB/s 00:00
[hadoop@master ~]$ scp hadoop-3.2.2/etc/hadoop/hadoop-env.sh hadoop@slave2:/home/hadoop/hadoop-3.2.2/etc/hadoop/hadoop-env.sh
hadoop-env.sh 100% 16KB 1.7MB/s 00:00
Then start Hadoop again:
[hadoop@master ~]$ start-dfs.sh
Starting namenodes on [master]
Starting datanodes
slave1: WARNING: /home/hadoop/hadoop-3.2.2/logs does not exist. Creating.
slave2: WARNING: /home/hadoop/hadoop-3.2.2/logs does not exist. Creating.
Starting secondary namenodes [master]
Startup succeeded this time; use jps to check the running processes:
[hadoop@master ~]$ jps
10178 Jps
9840 NameNode
10060 SecondaryNameNode
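Running jps on slave1 and slave2 should show a DataNode process on each. A quick way to confirm from master that both DataNodes have registered with the NameNode is:
[hadoop@master ~]$ hdfs dfsadmin -report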
Start the YARN processes:
[hadoop@master ~]$ start-yarn.sh
Starting resourcemanager
Starting nodemanagers
[hadoop@master ~]$ jps
9840 NameNode
10610 Jps
10310 ResourceManager
10060 SecondaryNameNode
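To confirm that the NodeManagers on slave1 and slave2 have registered with the ResourceManager, you can run:
[hadoop@master ~]$ yarn node -list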
Start the MapReduce JobHistory Server on the designated server (the official documentation runs it as the mapred user; here it is started as the hadoop user on master):
[hadoop@master ~]$ mapred --daemon start historyserver
[hadoop@master ~]$ jps
9840 NameNode
10722 Jps
10310 ResourceManager
10666 JobHistoryServer
10060 SecondaryNameNode
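As an optional end-to-end check, you can submit one of the bundled example jobs. This is only a sketch: the jar path assumes the default layout of the hadoop-3.2.2 distribution, and on Hadoop 3.x the job may fail to locate MRAppMaster unless the extra mapred-site.xml settings noted in step 8 are in place:
[hadoop@master ~]$ hadoop jar ~/hadoop-3.2.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.2.jar pi 2 10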
To stop the above processes:
[hadoop@master ~]$ mapred --daemon stop historyserver
[hadoop@master ~]$ stop-yarn.sh
[hadoop@master ~]$ stop-dfs.sh
13. NameNode web UI after startup
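(With default ports, the NameNode web UI in Hadoop 3.x is served at http://master:9870.)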
14. ResourceManager web UI after startup
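(With default ports, the ResourceManager web UI is served at http://master:8088.)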
15. MapReduce JobHistory Server web UI after startup
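(With default ports, the JobHistory Server web UI is served at http://master:19888.)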
Text by: 王世刚
Editing and layout: 祝润丽、刘雨轩
Technical guidance: 袁鸿琴