8. Hive 3.1.2 installation and configuration (local mode and remote mode)

  • Previous articles
  • Hive 3.1.2 installation and configuration (local mode and remote mode)
  • 1. Download the Hive package
  • 2. Set environment variables
  • 3. MySQL configuration
  • CentOS 7 MySQL installation steps:
  • 1. Set up the MySQL repository
  • 2. Install MySQL Community Server
  • 3. Start the MySQL service
  • 4. Find the MySQL root password
  • 5. Reset the root password
  • 6. Add the MySQL service to startup and start the MySQL process
  • 7. Log in to MySQL
  • 8. Configure MySQL to allow remote access
  • 9. Common MySQL commands
  • 4. Copy the MySQL JDBC driver into Hive's lib directory
  • 5. Configure the Hive files
  • 6. Configure hive-env.sh
  • 7. Configure the metastore to use MySQL
  • Error cause 1: Jar conflicts
  • Error cause 2: special character at line 3278 of hive-site.xml
  • Error cause 3: URISyntaxException (problem and solution)
  • 8. Download the Connector/J driver
  • 9. Remote service

Previous articles

  • 1. CentOS 7 Hadoop 3.3.1 installation (standalone, pseudo-distributed, fully distributed)
  • 2. Implementing HDFS operations with the Java API
  • 3. MapReduce programming examples
  • 4. ZooKeeper 3.7 installation
  • 5. ZooKeeper shell operations
  • 6. Operating ZooKeeper nodes with the Java API
  • 7. Hadoop 3.3.1 HA cluster with QJM (ZooKeeper-based, NameNode HA + YARN HA)

Hive 3.1.2 installation and configuration (local mode and remote mode)

Download

1. Download the Hive package

Index of /dist/hive (apache.org)

wget http://archive.apache.org/dist/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
# Alternatively, use the Tsinghua mirror:
http://mirror.tuna.tsinghua.edu.cn/apache/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
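
Unpack the archive into the install directory assumed by the rest of this guide:

tar -zxvf apache-hive-3.1.2-bin.tar.gz -C /export/servers/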

2. Set environment variables

vim /etc/profile
	# add environment variables
	export HIVE_HOME=/export/servers/apache-hive-3.1.2-bin
	export PATH=$PATH:$HIVE_HOME/bin
source /etc/profile
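
A quick sanity check that the variables are set and that the hive command resolves from the PATH:

echo $HIVE_HOME
which hive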

3. MySQL configuration

Why configure MySQL?

The metastore is the central store for Hive metadata.
By default the metastore uses the embedded Derby database as its storage engine.
Drawback of the Derby engine: only one session can be open at a time.
Using MySQL as an external storage engine allows multiple users to access the metastore concurrently.
Hive installation modes:
Embedded mode: metadata is kept in the embedded Derby database; only one session connection is allowed.
Local (standalone) mode: MySQL is installed locally and the metadata is stored in it.
Remote mode: the metadata is stored in a remote MySQL database.
Hive itself is just a tool: its data analysis relies on MapReduce, and its data management relies on external systems.
This step is not strictly required, because Hive's default metadata store is Derby; the drawback is that only one Hive instance can access it at a time, which is only suitable for local testing during development.
Hive can be configured to replace the embedded database with a relational database such as MySQL, decoupling the metadata store so that it can be shared by multiple service instances.
A consequence of the Derby default is that a metastore_db directory is created in whatever directory you run the hive command from. Building a separate set of database files per working directory is unworkable: if everyone in a company works this way, the metadata becomes fragmented and cannot be shared between colleagues.
Hence the need for a shared MySQL instance.
This is why MySQL is also configured when installing Hive.
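
To see the Derby behaviour described above for yourself (assuming a working Hive install still using the default embedded metastore), run the Hive CLI from any directory and then list that directory:

cd /tmp
hive -e "show databases;"
ls
# a metastore_db/ directory and a derby.log file now exist in the current working directory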

CentOS 7 MySQL installation steps:

1. Set up the MySQL repository

First, enable the MySQL Yum repository on your system. Run one of the following commands depending on your operating system version; here we use the CentOS 7 / RHEL 7 repository.

### On CentOS 7 / RHEL 7 ###
rpm -Uvh https://repo.mysql.com/mysql80-community-release-el7-3.noarch.rpm

### On CentOS 6 / RHEL 6 ###
rpm -Uvh https://repo.mysql.com/mysql80-community-release-el6-3.noarch.rpm

2. Install MySQL Community Server

The MySQL Yum repository contains repository configurations for several MySQL versions, so first disable all repositories in the MySQL repo file:

sed -i 's/enabled=1/enabled=0/' /etc/yum.repos.d/mysql-community.repo

Then install MySQL 8 from the enabled repository:

## CentOS & RedHat
yum --enablerepo=mysql80-community install mysql-community-server
## Fedora
dnf --enablerepo=mysql80-community install mysql-community-server

3. Start the MySQL service

service mysqld start

Or with systemd:

systemctl start mysqld.service

4. Find the MySQL root password

After installing MySQL 8.0, a temporary password is generated for the MySQL root user. It can be found in the log file at /var/log/mysqld.log:

grep "A temporary password" /var/log/mysqld.log
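
The matching log line looks roughly like this (the timestamp and the password itself are placeholders):

2021-11-01T00:00:00.000000Z 6 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: xxxxxxxx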

5. Reset the root password

After the first installation, run the mysql_secure_installation command to secure the MySQL server; it includes a step to reset the root password.

The new password must meet the policy requirements: at least eight characters, with upper- and lower-case letters, digits and special symbols.

mysql_secure_installation
Enter password for user root:

The existing password for the user account root has expired. Please set a new password.

New password:
Re-enter new password:

Remove anonymous users? (Press y|Y for Yes, any other key for No) : y

Disallow root login remotely? (Press y|Y for Yes, any other key for No) : y

Remove test database and access to it? (Press y|Y for Yes, any other key for No) : y

Reload privilege tables now? (Press y|Y for Yes, any other key for No) : y

6. Add the MySQL service to startup and start the MySQL process

### Using Systemd
systemctl enable mysqld.service
systemctl restart mysqld.service

Open port 3306

If the server has a firewall enabled, remember to open port 3306:

systemctl enable iptables
systemctl start iptables
vim /etc/sysconfig/iptables
## add this rule
-A INPUT -p tcp -m state --state NEW -m tcp --dport 3306 -j ACCEPT
## restart the firewall to apply the rule
systemctl restart iptables.service

7. Log in to MySQL

mysql -uroot -p

If your new password does not meet the current policy requirements, you can relax the policy and use a simpler password.

# View the password policy
mysql> SHOW VARIABLES LIKE 'validate_password%';
+--------------------------------------+--------+
| Variable_name                        | Value  |
+--------------------------------------+--------+
| validate_password.check_user_name    | ON     |
| validate_password.dictionary_file    |        |
| validate_password.length             | 8      |
| validate_password.mixed_case_count   | 1      |
| validate_password.number_count       | 1      |
| validate_password.policy             | MEDIUM |
| validate_password.special_char_count | 1      |
+--------------------------------------+--------+

MySQL 8 password policy variables:
1) validate_password.length: minimum total length of the password;
2) validate_password.dictionary_file: path of the dictionary file used for password validation;
3) validate_password.mixed_case_count: minimum number of upper-/lower-case letters the password must contain;
4) validate_password.number_count: minimum number of digits the password must contain;
5) validate_password.policy: password strength validation level, MEDIUM by default.
Values:
0/LOW: checks length only;
1/MEDIUM: checks length, digits, mixed case and special characters;
2/STRONG: checks length, digits, mixed case, special characters and the dictionary file;
6) validate_password.special_char_count: minimum number of special characters the password must contain.

set global validate_password.policy=0;
set global validate_password.length=1;
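
With the policy relaxed, the root password can then be changed to something simpler, for example the 000000 used later in hive-site.xml:

ALTER USER 'root'@'localhost' IDENTIFIED BY '000000';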

8. Configure MySQL to allow remote access

# log in
mysql -u root -p
# view the user table
select user,host,authentication_string from mysql.user;
# switch to the mysql database
use mysql;
# update the user table
update user set host='%' where user='root';
# allow root to connect from any host
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%';
# flush privileges
flush privileges;
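
To verify remote access, try logging in from another node; hadoop1 is assumed here to be the MySQL host, matching the JDBC URL used later:

mysql -h hadoop1 -u root -p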

9. Common MySQL commands:

Log in to MySQL

mysql -u username -p

Start MySQL

systemctl start mysqld.service

Stop MySQL

systemctl stop mysqld.service

Restart MySQL

systemctl restart mysqld.service

Enable MySQL at boot

systemctl enable mysqld.service

4. Copy the MySQL JDBC driver into Hive's lib directory

I use version 8.0.27 here; you can pick another version to download.

Download the driver from the MySQL website (MySQL :: Download Connector/J).

# Platform Independent
wget https://cdn.mysql.com//Downloads/Connector-J/mysql-connector-java-8.0.27.tar.gz
# extract the archive and copy the jar into Hive's lib directory
tar -zxvf mysql-connector-java-8.0.27.tar.gz
cp mysql-connector-java-8.0.27/mysql-connector-java-8.0.27.jar $HIVE_HOME/lib/

5. Configure the Hive files

cd /export/servers/apache-hive-3.1.2-bin/conf

Copy the template configuration files and rename them:

cp hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml
cp hive-log4j2.properties.template hive-log4j2.properties
cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties

6. Configure hive-env.sh

vi hive-env.sh

At the end of the file:

# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME=/export/servers/hadoop-3.3.1 # Hadoop install path

# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/export/servers/apache-hive-3.1.2-bin/conf # Hive conf path

# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/export/servers/apache-hive-3.1.2-bin/lib # Hive jar path

Configure the hive-site.xml file.
First, create three directories in HDFS:

hdfs dfs -mkdir -p /user/hive/warehouse
hadoop fs -mkdir -p /user/hive/tmp
hadoop fs -mkdir -p /user/hive/log

Change the directory permissions:

hadoop fs -chmod -R 777 /user/hive/warehouse
hadoop fs -chmod -R 777 /user/hive/tmp
hadoop fs -chmod -R 777 /user/hive/log

7. Configure the metastore to use MySQL

Add the following settings inside the <configuration> tag:

vim hive-site.xml

Edit (or create) the hive-site.xml file in $HIVE_HOME/conf, changing the host name and the MySQL user name and password to match your environment:

<configuration>
    <!-- JDBC connection URL -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://hadoop1:3306/metastore?createDatabaseIfNotExist=true&amp;useSSL=false&amp;serverTimezone=GMT</value>
    </property>

    <!-- JDBC driver class -->
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
    </property>

    <!-- JDBC connection user name -->
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>

    <!-- JDBC connection password -->
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>000000</value>
    </property>

    <property>
        <name>system:java.io.tmpdir</name>
        <value>/export/servers/data/hive/tmp</value>
        <description/>
    </property>
</configuration>

Note: inside the XML file a literal & must be written as the entity &amp;; the ampersands in the JDBC URL above are escaped accordingly.

<configuration>
    <!-- The following settings are optional; for Hive 3.x it is suggested not to set them unless needed -->
    <!-- Hive's default warehouse directory on HDFS -->
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>

    <!-- Metastore schema verification -->
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>

    <!-- Metastore event DB notification API authorization -->
    <property>
        <name>hive.metastore.event.db.notification.api.auth</name>
        <value>false</value>
    </property>

    <!-- URI of the metastore service to connect to -->
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://hadoop2:9083</value>
    </property>

    <!-- Host that HiveServer2 binds to -->
    <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>hadoop2</value>
    </property>

    <!-- Port that HiveServer2 listens on -->
    <property>
        <name>hive.server2.thrift.port</name>
        <value>10000</value>
    </property>
</configuration>
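
With hive-site.xml in place and the MySQL JDBC driver in Hive's lib directory, the metastore schema is initialized against MySQL with schematool; the errors discussed below came up when running this command:

schematool -dbType mysql -initSchema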

Error cause 1:

1. Jar conflicts

Hive ships jars that conflict with Hadoop's; there are mainly two. The first, log4j-slf4j-impl-2.10.0.jar, conflicts with Hadoop's SLF4J binding and can simply be deleted; here I just rename it to mark it as unused.

mv $HIVE_HOME/lib/log4j-slf4j-impl-2.10.0.jar $HIVE_HOME/lib/log4j-slf4j-impl-2.10.0.bak

The other is guava-19.0.jar, which conflicts with guava-27.0-jre.jar in Hadoop; replace the lower version with the higher one.

cp $HADOOP_HOME/share/hadoop/common/lib/guava-27.0-jre.jar $HIVE_HOME/lib

rm $HIVE_HOME/lib/guava-19.0.jar

Error cause 2:

There is an illegal special character at line 3278 of the hive-site.xml configuration file (see the second line of the error record).

Exception in thread "main" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
at [row,col,system-id]: [3278,96,"file:/export/servers/apache-hive-3.1.2-bin/conf/hive-site.xml"]

[root@hadoop1 apache-hive-3.1.2-bin]# schematool -dbType mysql -initSchema
Exception in thread "main" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
 at [row,col,system-id]: [3278,96,"file:/export/servers/apache-hive-3.1.2-bin/conf/hive-site.xml"]
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3069)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:3018)
        at org.apache.hadoop.conf.Configuration.loadProps(Configuration.java:2893)
        at org.apache.hadoop.conf.Configuration.addResourceObject(Configuration.java:1034)
        at org.apache.hadoop.conf.Configuration.addResource(Configuration.java:939)
        at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5151)
        at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:5104)
        at org.apache.hive.beeline.HiveSchemaTool.<init>(HiveSchemaTool.java:96)
        at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1473)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
 at [row,col,system-id]: [3278,96,"file:/export/servers/apache-hive-3.1.2-bin/conf/hive-site.xml"]
        at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:634)
        at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:504)
        at com.ctc.wstx.sr.StreamScanner.reportIllegalChar(StreamScanner.java:2469)
        at com.ctc.wstx.sr.StreamScanner.validateChar(StreamScanner.java:2416)
        at com.ctc.wstx.sr.StreamScanner.resolveCharEnt(StreamScanner.java:2382)
        at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1528)
        at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2818)
        at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1121)
        at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3373)
        at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3159)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3052)
        ... 14 more

Open hive-site.xml, jump to the reported line, and delete the &#8 special character there.
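
A quick way to locate and edit the offending entity (the paths match the install directory shown in the error message):

grep -n '&#8' /export/servers/apache-hive-3.1.2-bin/conf/hive-site.xml
vim +3278 /export/servers/apache-hive-3.1.2-bin/conf/hive-site.xml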

Error cause 3:

Problem: Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI

1. Problem

Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
        at org.apache.hadoop.fs.Path.initialize(Path.java:254)
        at org.apache.hadoop.fs.Path.<init>(Path.java:212)
        at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:663)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:586)
        at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:553)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:750)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
        at java.net.URI.checkPath(URI.java:1823)
        at java.net.URI.<init>(URI.java:745)
        at org.apache.hadoop.fs.Path.initialize(Path.java:251)

2. Solution

Create a tmp directory under the Hive install directory:
mkdir iotmp
Check hive-site.xml:
Look for the "system:java.io.tmpdir" property in the XML. If it exists, change its value to the directory you just created. If it does not exist, add the property:
 <property>
    <name>system:java.io.tmpdir</name>
    <value>xxxxxx</value>            <!-- the directory you created -->
    <description/>
 </property>
Save and exit, then restart Hive; it now starts successfully.

8. Download the Connector/J driver

Download the Java driver for MySQL, extract it, and put the jar into Hive's lib directory (adjust the target path to your installation):

wget https://downloads.mysql.com/archives/get/p/3/file/mysql-connector-java-8.0.11.tar.gz
tar -zxvf mysql-connector-java-8.0.11.tar.gz
cd mysql-connector-java-8.0.11
chmod 777 mysql-connector-java-8.0.11.jar
cp mysql-connector-java-8.0.11.jar /usr/share/hive/lib/
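
A quick check (adjust the lib path to your installation) that the driver jar is in place:

ls $HIVE_HOME/lib/ | grep mysql-connector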

9. Remote service

Copy the Hive installation from hadoop1 to hadoop2 and hadoop3:

scp -r apache-hive-3.1.2-bin/ hadoop2:/export/servers/
scp -r apache-hive-3.1.2-bin/ hadoop3:/export/servers/

On the hadoop1 node:

Start the HiveServer2 service:

bin/hiveserver2
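
If hive.metastore.uris is configured as in the optional block above, the standalone metastore service also needs to be running before HiveServer2 can serve remote clients. A hedged sketch of running both services in the background (hive --service hiveserver2 is equivalent to bin/hiveserver2 above):

nohup hive --service metastore &
nohup hive --service hiveserver2 &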

On the hadoop2 node (connecting with Beeline over jdbc:hive2):

beeline> !connect jdbc:hive2://hadoop1:10000
# the default port is 10000
Connecting to jdbc:hive2://hadoop1:10000
# enter the user name and password
Enter username for jdbc:hive2://hadoop1:10000: root
Enter password for jdbc:hive2://hadoop1:10000: ******
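
Once connected, a simple query from the Beeline prompt confirms the remote setup works:

0: jdbc:hive2://hadoop1:10000> show databases;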