Home Database Mysql Tutorial hadoop1.0 高可靠性(HA)安装与总结

hadoop1.0 高可靠性(HA)安装与总结

Jun 07, 2016 pm 04:41 PM
reliability Install Summarize

继上次安装完Kerberos安全认证后,现在我在这基础上,又给CDH加上了HA(high availability),也就是高可靠性,具体来讲就是双NameNode,双Jobtracker(我还是在MRv1模式下),有了HA后,这下集群的健壮性就能够得到很好的保证了。 我还是按照官方文档来操作

继上次安装完Kerberos安全认证后,现在我在这基础上,又给CDH加上了HA(high availability),也就是高可靠性,具体来讲就是双NameNode,双Jobtracker(我还是在MRv1模式下),有了HA后,这下集群的健壮性就能够得到很好的保证了。

我还是按照官方文档来操作的,有了上次的经验,建议大家在具体操作实施前,先快速阅读一遍,做到心中有数,我还阅读了Apache官方的说明,也不用怎么详细,大概知道怎么回事就行了。

首先说明一点的就是,CDH5 只支持Quorum Journal Manager(QJM)模式下的HA,不支持NFS模式的,这点和Apache官方的不一样,大家要留意下。

下面说说我遇到的坑:

  • 按照software_config上面说的配置一步步来,如果要实现自动的Failover,需要安装zookeeper,安装也很简单,从http://archive.cloudera.com/cdh5/cdh/5/下载zookeeper-3.4.5-cdh5.0.2.tar.gz,然后按照zookeeper的安装说明安装即可,官方推荐zookeeper的集群数目为奇数,推荐值为3,我这样也配置了3台,zookeeper服务在在启动时会向集群内其他服务器发送认证数据,但是在第一次启动时难免有个先后顺序,所以先启动的节点向还没有启动的服务器发生数据时会报错,类型下面的错误信息
2014-07-17 14:49:06,151 [myid:1] - INFO  [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 1 (n.leader), 0x100000106 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x2 (n.peerEPoch), LOOKING (my state)2014-07-17 14:49:06,153 [myid:1] - WARN  [WorkerSender[myid=1]:QuorumCnxManager@368] - Cannot open channel to 2 at election address node1/10.4.13.63:3888java.net.ConnectException: Connection refused    at java.net.PlainSocketImpl.socketConnect(Native Method)    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)    at java.net.Socket.connect(Socket.java:579)    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:354)    at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:327)    at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:393)    at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:365)    at java.lang.Thread.run(Thread.java:745)
Copy after login

这个是正常的,等3台全部启动后,有如下日志就证明没问题了

2014-07-17 11:26:44,425 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 3 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)2014-07-17 11:26:44,426 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)
Copy after login
  • 在配置Securing access to ZooKeeper这步时,我也能得到像官方教程上说的
digest:hdfs-zkfcs:vlUvLnd8MlacsE80rDuu6ONESbM=:rwcda
Copy after login

与这个类似的信息,但是在执行zkfc -formatZK时,老是说的我得到的字符串(’->’ 后面的那部分)不对,我也不知道为什么,不知道是不是哪步少了什么,因为zookeeper集群在内网,集群内安全性一般不用考虑,我这里就直接忽略了这步,以后机会再找原因。

  • 在配置Fencing Configuration时,我用了sshfence的方式,这里需要配置ssh的密钥,我直接把hdfs用户的密钥路径给上,后来在我验证双Namenode是否生效(通过kill掉active的NN,看看standby的NN能不能变为active的NN)发现不对,老是报错,连接不上另一个Namenode,后来发现需要用root的密钥,但是hdfs用户又不能读取root的密钥,所以我这里直接把root的.ssh文件下的文件全copy到hdfs用户的$HOME下,并设置为hdfs为其owner(我的root用户在集群内也是可以免密码登录的),这样就没问题了。

  • 需要说明的是,在开启namenode之前,必须先开启journalnode,因为namenode开启时会去连接journalnode

  • 然后就是开启双Namenode的步骤了,下面记录一些需要用到的命令

sudo -u hdfs bin/hdfs zkfc -formatZKsudo -u hdfs sbin/hadoop-daemon.sh start journalnode #开启journalnode进程sudo -u hdfs sbin/hadoop-daemon.sh start zkfc #开启automatic failover进程sudo -u hdfs bin/hdfs namenode -initializeSharedEdits #把一个non-HA的NameNode转为HA时用到sudo -u hdfs bin/hdfs namenode -bootstrapStandby sudo -u hdfs sbin/hadoop-daemon.sh start namenode#上面命这两个命令在运行第二个Namenode服务器上执行,必须先执行-bootstrapStandby 这行命令再开启namenode #下面这些命令之前,需要以hdfs用户用kinit拿到TGT,否则会报错sudo -u hdfs bin/hdfs haadmin -getServiceState nn1 #查看nn1是active的还是standby的
Copy after login
  • 按照jobtracker的HA官方配置进行配置后,使用
sudo -u mapred sbin/hadoop-daemon.sh start jobtrackerha
Copy after login

命令开启jobtrackerha

  • 通过
#运行下面这些命令之前,要先以mapred用户用kinit拿到TGT,否则会报错sudo -u mapred bin/hadoop mrhaadmin -getServiceState jt1
Copy after login

查看jt1是active的还是standby的

  • 最后一个,还是关于HDFS的权限问题,因为mapreduce在执行任务时会向HDFS上写一些临时文件,如果权限不对,肯定就会报错了,不过这种错误也很好该,根据错误信息就能知道那个目录权限不对,然后改过来就行了,我这里进行下总结:

  • 根据官方的教程配置教程,配置了如下选项:

<property>  <name>mapred.job.tracker.persist.jobstatus.dir</name>  <value>/jobtracker/jobsInfo</value></property>
Copy after login

所以需要在HDFS上创建相应目录,并修改其owner为mapred

  • 其次是staging目录,如果没有配置,其默认值从默认配置可以看到mapreduce.jobtracker.staging.root.dir的值为${hadoop.tmp.dir}/mapred/staging,而${hadoop.tmp.dir}的值从这里可以看到值默认是/tmp/hadoop-${user.name},有因为我们使用mapred用户来执行tasktracker进行的,所以需要创建/tmp/hadoop-mapred/mapred/staging文件夹,并且其owner为mapred,权限为1777,可以用下面的命令来实现:
sudo -u hdfs bin/hdfs dfs -mkdir -p /tmp/hadoop-mapred/mapred/stagingsudo -u hdfs bin/hdfs dfs -chown mapred /tmp/hadoop-mapred/mapred/stagingsudo -u hdfs bin/hdfs dfs -chmod 1777 /tmp/hadoop-mapred/mapred/staging
Copy after login

此外,还需要配置mapreduce.jobtracker.system.dir指定的文件,默认为${hadoop.tmp.dir}/mapred/system,所以还需要执行下面的命令:

sudo -u hdfs bin/hdfs dfs -mkdir -p /tmp/hadoop-mapred/mapred/systemsudo -u hdfs bin/hdfs dfs -chown mapred /tmp/hadoop-mapred/mapred/system
Copy after login

这个目录只由mapred用户来写入,所以不用再修改其权限(的755即可)。

总结:这次配置HA的整个过程还是比较顺利的,除了烦人的各种权限问题,我觉得这也是我没有弄明白hadoop各个进程是如何工作导致的,通过支持配置HA,算是对job的运行又有了更深的的认识。

继上次安装完Kerberos安全认证后,现在我在这基础上,又给CDH加上了HA(high availability),也就是高可靠性,具体来讲就是双NameNode,双Jobtracker(我还是在MRv1模式下),有了HA后,这下集群的健壮性就能够得到很好的保证了。

我还是按照官方文档来操作的,有了上次的经验,建议大家在具体操作实施前,先快速阅读一遍,做到心中有数,我还阅读了Apache官方的说明,也不用怎么详细,大概知道怎么回事就行了。

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Java Tutorial
1663
14
PHP Tutorial
1266
29
C# Tutorial
1239
24
Solution to the problem that Win11 system cannot install Chinese language pack Solution to the problem that Win11 system cannot install Chinese language pack Mar 09, 2024 am 09:48 AM

Solution to the problem that Win11 system cannot install Chinese language pack With the launch of Windows 11 system, many users began to upgrade their operating system to experience new functions and interfaces. However, some users found that they were unable to install the Chinese language pack after upgrading, which troubled their experience. In this article, we will discuss the reasons why Win11 system cannot install the Chinese language pack and provide some solutions to help users solve this problem. Cause Analysis First, let us analyze the inability of Win11 system to

Unable to install guest additions in VirtualBox Unable to install guest additions in VirtualBox Mar 10, 2024 am 09:34 AM

You may not be able to install guest additions to a virtual machine in OracleVirtualBox. When we click on Devices>InstallGuestAdditionsCDImage, it just throws an error as shown below: VirtualBox - Error: Unable to insert virtual disc C: Programming FilesOracleVirtualBoxVBoxGuestAdditions.iso into ubuntu machine In this post we will understand what happens when you What to do when you can't install guest additions in VirtualBox. Unable to install guest additions in VirtualBox If you can't install it in Virtua

What should I do if Baidu Netdisk is downloaded successfully but cannot be installed? What should I do if Baidu Netdisk is downloaded successfully but cannot be installed? Mar 13, 2024 pm 10:22 PM

If you have successfully downloaded the installation file of Baidu Netdisk, but cannot install it normally, it may be that there is an error in the integrity of the software file or there is a problem with the residual files and registry entries. Let this site take care of it for users. Let’s introduce the analysis of the problem that Baidu Netdisk is successfully downloaded but cannot be installed. Analysis of the problem that Baidu Netdisk downloaded successfully but could not be installed 1. Check the integrity of the installation file: Make sure that the downloaded installation file is complete and not damaged. You can download it again, or try to download the installation file from another trusted source. 2. Turn off anti-virus software and firewall: Some anti-virus software or firewall programs may prevent the installation program from running properly. Try disabling or exiting the anti-virus software and firewall, then re-run the installation

How to install Android apps on Linux? How to install Android apps on Linux? Mar 19, 2024 am 11:15 AM

Installing Android applications on Linux has always been a concern for many users. Especially for Linux users who like to use Android applications, it is very important to master how to install Android applications on Linux systems. Although running Android applications directly on Linux is not as simple as on the Android platform, by using emulators or third-party tools, we can still happily enjoy Android applications on Linux. The following will introduce how to install Android applications on Linux systems.

How to install Podman on Ubuntu 24.04 How to install Podman on Ubuntu 24.04 Mar 22, 2024 am 11:26 AM

If you have used Docker, you must understand daemons, containers, and their functions. A daemon is a service that runs in the background when a container is already in use in any system. Podman is a free management tool for managing and creating containers without relying on any daemon such as Docker. Therefore, it has advantages in managing containers without the need for long-term backend services. Additionally, Podman does not require root-level permissions to be used. This guide discusses in detail how to install Podman on Ubuntu24. To update the system, we first need to update the system and open the Terminal shell of Ubuntu24. During both installation and upgrade processes, we need to use the command line. a simple

How to install creo-creo installation tutorial How to install creo-creo installation tutorial Mar 04, 2024 pm 10:30 PM

Many novice friends still don’t know how to install creo, so the editor below brings relevant tutorials on creo installation. Friends in need should take a look at it. I hope it can help you. 1. Open the downloaded installation package and find the License folder, as shown in the figure below: 2. Then copy it to the directory on the C drive, as shown in the figure below: 3. Double-click to enter and see if there is a license file, as shown below As shown in the picture: 4. Then copy the license file to this file, as shown in the following picture: 5. In the PROGRAMFILES file of the C drive, create a new PLC folder, as shown in the following picture: 6. Copy the license file as well Click in, as shown in the figure below: 7. Double-click the installation file of the main program. To install, check the box to install new software.

Detailed steps to install Go language on Win7 computer Detailed steps to install Go language on Win7 computer Mar 27, 2024 pm 02:00 PM

Detailed steps to install Go language on Win7 computer Go (also known as Golang) is an open source programming language developed by Google. It is simple, efficient and has excellent concurrency performance. It is suitable for the development of cloud services, network applications and back-end systems. . Installing the Go language on a Win7 computer allows you to quickly get started with the language and start writing Go programs. The following will introduce in detail the steps to install the Go language on a Win7 computer, and attach specific code examples. Step 1: Download the Go language installation package and visit the Go official website

How to Install and Run the Ubuntu Notes App on Ubuntu 24.04 How to Install and Run the Ubuntu Notes App on Ubuntu 24.04 Mar 22, 2024 pm 04:40 PM

While studying in high school, some students take very clear and accurate notes, taking more notes than others in the same class. For some, note-taking is a hobby, while for others, it is a necessity when they easily forget small information about anything important. Microsoft's NTFS application is particularly useful for students who wish to save important notes beyond regular lectures. In this article, we will describe the installation of Ubuntu applications on Ubuntu24. Updating the Ubuntu System Before installing the Ubuntu installer, on Ubuntu24 we need to ensure that the newly configured system has been updated. We can use the most famous &quot;a&quot; in Ubuntu system

See all articles