CentOS 6.5: setting up Hadoop 2.6.0 in pseudo-distributed mode  http://www.centoscn.com/hadoop/2015/0118/4525.html
Passwordless SSH from the local machine into CentOS:
ssh-copy-id root@192.168.0.50
ssh root@192.168.0.50
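If ssh-copy-id is not available on the client, a manual equivalent (assuming the key is the default ~/.ssh/id_rsa.pub) is:

cat ~/.ssh/id_rsa.pub | ssh root@192.168.0.50 "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"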
Disable IPv6
vim /etc/sysctl.conf
# Add the following line to disable IPv6 on every interface system-wide:
net.ipv6.conf.all.disable_ipv6 = 1
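The setting only loads at boot; to apply it without rebooting, something like this should work:

sysctl -p                                      # reload /etc/sysctl.conf
cat /proc/sys/net/ipv6/conf/all/disable_ipv6   # should now print 1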
Edit the hosts file
vim /etc/hosts
# Add a 127.0.0.1 mapping for h50:
127.0.0.1 h50
192.168.0.50 h50
# Comment out the ::1 line; that one is the IPv6 mapping
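A quick way to confirm the mapping took effect (my own check, not in the original notes):

getent hosts h50   # should print the first matching entry, 127.0.0.1 h50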
Disable the firewall entirely:
# Stop iptables
service iptables stop
chkconfig ip6tables off
chkconfig iptables off
service iptables status
# Disable SELinux
vim /etc/selinux/config
SELINUX=disabled
Then reboot.
1. Create the group and user
Switch to root with su
groupadd hadoop
useradd -g hadoop hadoop
passwd hadoop   # Set a password for the user (optional)  p*********2
2. Install SSH
rpm -qa | grep ssh               # Check whether the SSH packages are installed
yum install openssh-server       # Install sshd
chkconfig --list sshd            # Check whether sshd is set to start on boot
chkconfig --level 2345 sshd on   # Enable it if it is not
service sshd restart             # Restart the service
3. Configure passwordless SSH login
Switch to the hadoop user and generate a key pair
su hadoop
ssh-keygen -t rsa
cd /home/hadoop/.ssh
cat id_rsa.pub >> authorized_keys
chmod 600 authorized_keys   # Tighten the file permissions
Test that login works
ssh localhost   # You will be prompted yes/no; type yes. If it then prints the last login time, the setup succeeded.
4. Install Hadoop
Unpack the downloaded Hadoop tarball, move it to the target install directory, and fix its permissions.
Copy it to the CentOS box: scp hadoop-2.6.0.tar.gz root@192.168.0.50:/home/hadoop
su root
mkdir -p /usr/opt/hadoop
chmod -R 775 /usr/opt/hadoop
chown -R hadoop:hadoop /usr/opt/hadoop
su hadoop
cd /home/hadoop
tar -xvf hadoop-2.6.0.tar.gz
mv hadoop-2.6.0/* /usr/opt/hadoop
Next, create the directories Hadoop needs
mkdir -p /usr/opt/hadoop/dfs/tmp
mkdir -p /usr/opt/hadoop/dfs/name
mkdir -p /usr/opt/hadoop/dfs/data
Configure Hadoop's environment variables (every one of them is required)
su root
vim /etc/profile
export HADOOP_INSTALL=/usr/opt/hadoop
export HADOOP_MAPRED_HOME=${HADOOP_INSTALL}
export HADOOP_COMMON_HOME=${HADOOP_INSTALL}
export HADOOP_HDFS_HOME=${HADOOP_INSTALL}
export YARN_HOME=${HADOOP_INSTALL}
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_INSTALL}/lib/native
export HADOOP_OPTS="-Djava.library.path=${HADOOP_INSTALL}/lib:${HADOOP_INSTALL}/lib/native"
export PATH=${HADOOP_INSTALL}/bin:${HADOOP_INSTALL}/sbin:${PATH}
Apply it immediately:
source /etc/profile
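As a sanity check (my addition), confirm the variables took effect; note that hadoop version itself will only run once Java is installed in the next step:

echo $HADOOP_INSTALL   # should print /usr/opt/hadoop
which hadoop           # should resolve to /usr/opt/hadoop/bin/hadoop
hadoop version         # should report Hadoop 2.6.0 (requires Java)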
Install Java:
Copy it to the CentOS box: scp jdk-7u67-linux-x64.rpm root@192.168.0.50:/home/hadoop
cd /home/hadoop
rpm -ivh jdk-7u67-linux-x64.rpm
java -version
Set the Java environment variable in hadoop-env.sh
su hadoop
vim /usr/opt/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_67
export HADOOP_PREFIX=/usr/opt/hadoop
export HADOOP_OPTS="-Djava.library.path=/usr/opt/hadoop/lib:/usr/opt/hadoop/lib/native"
5. Configure pseudo-distributed mode
The Hadoop configuration files to edit are core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml.
vim /usr/opt/hadoop/etc/hadoop/core-site.xml

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/opt/hadoop/dfs/tmp</value>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.native.lib</name>
        <value>true</value>
    </property>
</configuration>

vim /usr/opt/hadoop/etc/hadoop/hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/usr/opt/hadoop/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/usr/opt/hadoop/dfs/data</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>

cp /usr/opt/hadoop/etc/hadoop/mapred-site.xml.template /usr/opt/hadoop/etc/hadoop/mapred-site.xml
vim /usr/opt/hadoop/etc/hadoop/mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

vim /usr/opt/hadoop/etc/hadoop/yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
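Before moving on it is worth checking that each edited file is still well-formed XML; xmllint (from libxml2, normally present on CentOS) stays silent when a file is valid:

xmllint --noout /usr/opt/hadoop/etc/hadoop/core-site.xml
xmllint --noout /usr/opt/hadoop/etc/hadoop/hdfs-site.xml
xmllint --noout /usr/opt/hadoop/etc/hadoop/mapred-site.xml
xmllint --noout /usr/opt/hadoop/etc/hadoop/yarn-site.xml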
At this point all of the configuration is done.
6. Run
First, format the filesystem
sh /usr/opt/hadoop/bin/hdfs namenode -format
sh /usr/opt/hadoop/bin/hdfs datanode -format
# or: hadoop namenode -format, hadoop datanode -format
Output:
……
16/11/21 13:19:34 INFO namenode.NNConf: Maximum size of an xattr: 16384
16/11/21 13:19:35 INFO namenode.FSImage: Allocated new BlockPoolId: BP-872202057-127.0.0.1-1479705574923
16/11/21 13:19:35 INFO common.Storage: Storage directory /usr/opt/hadoop/dfs/name has been successfully formatted.
16/11/21 13:19:35 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
16/11/21 13:19:35 INFO util.ExitUtil: Exiting with status 0
16/11/21 13:19:35 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/
Start DFS: sh /usr/opt/hadoop/sbin/start-dfs.sh
Output:
……
Are you sure you want to continue connecting (yes/no)? y
Please type 'yes' or 'no': yes
……
Start YARN: sh /usr/opt/hadoop/sbin/start-yarn.sh
Output:
……
Output like the following means it worked.
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/opt/hadoop-2.6.0/logs/hadoop-hadoop-namenode-.out
localhost: starting datanode, logging to /usr/opt/hadoop-2.6.0/logs/hadoop-hadoop-datanode-.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/opt/hadoop-2.6.0/logs/hadoop-hadoop-secondarynamenode-.out
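An extra verification (not part of the original log): after both scripts finish, jps should show all five daemons, and a trivial HDFS round trip should succeed:

jps                                # expect NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager
hadoop fs -mkdir -p /user/hadoop
hadoop fs -ls /                    # the new directory should appear in the listing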
Visit http://192.168.0.50:50070 and you should see the Hadoop web page.
Install HBase 1.1.0
Reference: HBase 1.1.1 pseudo-distributed setup on Ubuntu 14.04: http://www.jianshu.com/p/27c385800da8
Copy it to the server: scp hbase-1.1.0-bin.tar.gz root@192.168.0.50:/home/hadoop
su root
mkdir -p /usr/opt/hbase
chmod -R 775 /usr/opt/hbase
chown -R hadoop:hadoop /usr/opt/hbase
mkdir -p /usr/opt/zk_data
chmod -R 775 /usr/opt/zk_data
chown -R hadoop:hadoop /usr/opt/zk_data

su hadoop
tar xzvf hbase-1.1.0-bin.tar.gz
mv /home/hadoop/hbase-1.1.0/* /usr/opt/hbase

vim /usr/opt/hbase/conf/hbase-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_67
#export HBASE_MANAGES_ZK=false
vim /usr/opt/hbase/conf/hbase-site.xml

<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://localhost:9000/hbase</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <!-- When using HBase's bundled ZooKeeper, this sets where ZooKeeper stores its files. Note: not stored in HDFS. -->
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/usr/opt/zk_data</value>
    </property>
    <property>
        <name>zookeeper.znode.parent</name>
        <value>/hbase-unsecure</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
Because we run in pseudo-distributed mode, HBase must store its data on the HDFS instance set up earlier. hbase.rootdir is the HDFS location where HBase keeps its data; the hostname and port in its value must match the fs.defaultFS (formerly fs.default.name) value in Hadoop's core-site.xml.
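Once HBase has started against this rootdir, one way to confirm the wiring (an extra check, not in the original notes) is to look for the /hbase directory HBase creates in HDFS:

hadoop fs -ls /hbase   # should list HBase's internal directories, e.g. WALs and data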
su root
vim /etc/profile
# Edit it to read:
export HADOOP_INSTALL=/usr/opt/hadoop
export HADOOP_MAPRED_HOME=${HADOOP_INSTALL}
export HADOOP_COMMON_HOME=${HADOOP_INSTALL}
export HADOOP_HDFS_HOME=${HADOOP_INSTALL}
export YARN_HOME=${HADOOP_INSTALL}
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_INSTALL}/lib/native
export HADOOP_OPTS="-Djava.library.path=${HADOOP_INSTALL}/lib:${HADOOP_INSTALL}/lib/native"
export HBASE_HOME=/usr/opt/hbase
export HBASE_CONF_DIR=$HBASE_HOME/conf
export HBASE_CLASS_PATH=$HBASE_CONF_DIR
export JAVA_HOME=/usr/java/jdk1.7.0_67
export PATH=${HADOOP_INSTALL}/bin:${HADOOP_INSTALL}/sbin:$HBASE_HOME/bin:$JAVA_HOME/bin:${PATH}
Apply it immediately: source /etc/profile
su hadoop
# Start HBase:
sh /usr/opt/hbase/bin/start-hbase.sh
Output:
localhost: starting zookeeper, logging to /usr/opt/hbase/bin/../logs/hbase-hadoop-zookeeper-localhost.localdomain.out
starting master, logging to /usr/opt/hbase/logs/hbase-hadoop-master-localhost.localdomain.out
starting regionserver, logging to /usr/opt/hbase/logs/hbase-hadoop-1-regionserver-localhost.localdomain.out
Install ZooKeeper (http://blog.csdn.net/dream_an/article/details/52089883). HBase's bundled ZooKeeper is used here, so a standalone install is not strictly necessary.
Copy it to the server: scp zookeeper-3.4.8.tar.gz root@192.168.0.50:/home/hadoop
su root
mkdir -p /usr/opt/zookeeper
chmod -R 775 /usr/opt/zookeeper
chown -R hadoop:hadoop /usr/opt/zookeeper

su hadoop
cd /home/hadoop
tar -xvf zookeeper-3.4.8.tar.gz
mv /home/hadoop/zookeeper-3.4.8/* /usr/opt/zookeeper
cd /usr/opt/zookeeper
mkdir -p /usr/opt/zk_data/tmp
cp /usr/opt/zookeeper/conf/zoo_sample.cfg /usr/opt/zookeeper/conf/zoo.cfg
vim /usr/opt/zookeeper/conf/zoo.cfg
dataDir=/usr/opt/zk_data/tmp   # Change this setting to avoid problems after a reboot

# Add the following to the HBase config, otherwise spring-data cannot connect:
vim /usr/opt/hbase/conf/hbase-site.xml
<property>
    <name>zookeeper.znode.parent</name>
    <value>/hbase-unsecure</value>
</property>
<property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/usr/opt/zookeeper</value>
    <description>Property from ZooKeeper's config zoo.cfg. The directory where the snapshot is stored.</description>
</property>

# Stop HBase first
sh /usr/opt/hbase/bin/stop-hbase.sh
Commands:
su hadoop

# Stop everything:
sh /usr/opt/hadoop/sbin/stop-all.sh

# HDFS:
sh /usr/opt/hadoop/sbin/start-dfs.sh    # start
sh /usr/opt/hadoop/sbin/stop-dfs.sh     # stop

# YARN:
sh /usr/opt/hadoop/sbin/start-yarn.sh   # start
sh /usr/opt/hadoop/sbin/stop-yarn.sh    # stop

# HBase:
sh /usr/opt/hbase/bin/start-hbase.sh                 # start
sh /usr/opt/hbase/bin/hbase-daemon.sh start thrift   # start the Thrift server
sh /usr/opt/hbase/bin/stop-hbase.sh                  # stop
# Other:
sh /usr/opt/hbase/bin/local-regionservers.sh start 1
sh /usr/opt/hbase/bin/local-regionservers.sh start 2 3 4 5
sh /usr/opt/hbase/bin/local-regionservers.sh stop 1 2 3 4 5

# ZooKeeper:
cd /usr/opt/zookeeper/bin/ && sh ./zkServer.sh start   # start
cd /usr/opt/zookeeper/bin/ && sh ./zkServer.sh stop    # stop
# Test the connection:
cd /usr/opt/zookeeper/bin/ && sh ./zkCli.sh -server 127.0.0.1:2181
# Type: ls /
# Output: [hbase-unsecure, zookeeper]

# Clear the logs:
rm -rf /usr/opt/zookeeper/bin/zookeeper.out && rm -rf /usr/opt/hbase/logs/*

# A few useful log files:
find / -name hbase*.log
tail -f -n3000 /usr/opt/hbase/logs/hbase-hadoop-zookeeper-h50.log
tail -f -n3000 /usr/opt/hbase/logs/hbase-hadoop-1-regionserver-h50.log
tail -f -n3000 /usr/opt/hbase/logs/hbase-hadoop-master-h50.log
tail -f -n3000 /usr/opt/zookeeper/bin/zookeeper.out
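Another quick ZooKeeper health check (my addition) is the four-letter ruok command, which works for both the bundled HQuorumPeer and a standalone server; nc may need installing first (yum install nc):

echo ruok | nc 127.0.0.1 2181   # a healthy ZooKeeper replies: imok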
Verify HBase:
(1) Run jps; three new Java processes should appear: HMaster, HRegionServer, and HQuorumPeer. Their presence proves the install succeeded. If HRegionServer or HQuorumPeer did not start, you can use:
sh /usr/opt/hbase/bin/local-regionservers.sh stop 1 2 3 4 5
(2) Open http://192.168.0.50:60010 in a browser (note: in HBase 1.x the master web UI moved to port 16010, listed below).
Web UIs:
Master: http://192.168.0.50:16010/master-status
HBase Region Server: http://192.168.0.50:16301/rs-status
http://192.168.0.50:50030 : the Map/Reduce administration page (50030 is the MRv1 JobTracker port; with YARN the ResourceManager UI is normally at port 8088).
http://192.168.0.50:50070 : the HDFS administration page.
Why can Java reach the fully distributed HBase but not the pseudo-distributed one? I could not find the cause and gave up digging further!!
Enter the HBase shell
hbase shell
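A minimal smoke test inside the shell, using a throwaway table t1 with column family cf (my own example, not from the original notes):

create 't1', 'cf'
put 't1', 'row1', 'cf:a', 'value1'
scan 't1'      # should show row1 with column cf:a and value1
disable 't1'
drop 't1'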
Problems:
A: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
Fix: http://blog.csdn.net/lalaguozhe/article/details/10580727
vim /usr/opt/hadoop/etc/hadoop/core-site.xml

<property>
    <name>hadoop.native.lib</name>
    <value>false</value>
</property>

vim /usr/opt/hadoop/etc/hadoop/hadoop-env.sh

export HADOOP_PREFIX=/usr/opt/hadoop
export HADOOP_OPTS="-Djava.library.path=/usr/opt/hadoop/lib:/usr/opt/hadoop/lib/native"
cd /usr/opt/hadoop/lib/native
file libhadoop.so.1.0.0
Output: libhadoop.so.1.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped. So it is already a 64-bit build.
Strangely, that still did not fix it.
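Two more diagnostics that may help narrow this down (my suggestions): Hadoop 2.x ships a checknative command that reports which native libraries actually load, and ldd shows whether libhadoop.so was linked against a glibc newer than the one CentOS 6.5 ships, a common cause of this warning:

hadoop checknative -a                               # reports whether the hadoop/zlib/snappy/... native libs load
ldd /usr/opt/hadoop/lib/native/libhadoop.so.1.0.0   # watch for "GLIBC_x.y not found" errors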
B: Master is initializing
hadoop dfsadmin -safemode leave   # Take HDFS out of safe mode before starting HBase; otherwise HBase may fail to create its files on HDFS
hadoop fs -rm -r /hbase/WALs/
cd /usr/opt/zk_data && ls && rm -rf /usr/opt/zk_data/version-2 && ls
cd /usr/opt/zk_data/tmp && ls && rm -rf /usr/opt/zk_data/tmp/version-2 && ls
hbase hbck -fix   # or: hbase hbck -fixAssignments; then restart HBase
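To check whether HDFS is actually in safe mode before forcing it off (an extra check, not in the original notes):

hdfs dfsadmin -safemode get   # prints "Safe mode is ON" or "Safe mode is OFF"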
D: zookeeper.out reports that port 2181 is already in use. Most likely HBase is already running its own bundled ZooKeeper, so a standalone ZooKeeper cannot run at the same time. Pick one or the other:
vim /usr/opt/hbase/conf/hbase-env.sh
export HBASE_MANAGES_ZK=false
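To see which process is actually holding 2181 before deciding which ZooKeeper to keep (my addition):

netstat -tlnp | grep 2181   # shows the PID/name of the listener, e.g. an HQuorumPeer java process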