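Hive is a data warehouse built on Hadoop. It suits workloads that can tolerate high latency; if you need low-latency access, HBase is the better choice.
Prerequisite: Hadoop must already be installed and configured. See: Hadoop 2.7.3 pseudo-distributed environment setup, detailed installation walkthrough.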
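Install MySQL
- Download and install MySQL
yum install mysql-server
- Set the default character set and storage engine
vim /etc/my.cnf
Add the following under [mysqld]. Use character-set-server here; the older default-character-set option is no longer accepted under [mysqld] from MySQL 5.5 on.
character-set-server=utf8
default-storage-engine=INNODB
- Start MySQL
cd /etc/init.d
./mysqld start
- Open the MySQL client
mysql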
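To double-check the edit to /etc/my.cnf, a small helper can print the value a key has inside the [mysqld] section. This is only a sketch: the function name mysqld_setting is hypothetical, and it assumes simple key=value lines with the section header written exactly as [mysqld].

```shell
#!/bin/sh
# Hypothetical helper: print the value of key $2 inside the [mysqld]
# section of the option file $1. Assumes plain "key=value" lines.
mysqld_setting() {
    awk -v key="$2" '
        # A line starting with "[" opens a new section.
        /^\[/ { in_section = ($0 == "[mysqld]"); next }
        in_section {
            split($0, kv, "=")
            # Trim surrounding whitespace from key and value.
            gsub(/^[ \t]+|[ \t]+$/, "", kv[1])
            gsub(/^[ \t]+|[ \t]+$/, "", kv[2])
            if (kv[1] == key) print kv[2]
        }
    ' "$1"
}

# Check the storage engine setting added above, if the file is readable.
if [ -r /etc/my.cnf ]; then
    mysqld_setting /etc/my.cnf default-storage-engine
fi
```

An empty result means the key was not found under [mysqld], which usually indicates it was added to the wrong section.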
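Create and configure the hive database
Create a database named hive for the Hive user, with latin1 as its character set:
mysql> create database hive default character set latin1;
Check that the hive database was created: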
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| hive |
| mysql |
| test |
+--------------------+
4 rows in set (0.00 sec)
- Create the hive user and grant privileges
//Grant the hive user all privileges on the hive database
mysql> grant all privileges on hive.* to hive@'%' identified by '123456';
Query OK, 0 rows affected (0.00 sec)
//Reload the grant tables
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
- Test that the hive user can connect to MySQL
[root@cognos init.d]# mysql -u hive -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
...
mysql> use hive;
Database changed
mysql> show tables;
Empty set (0.00 sec)
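Install Hive
- Download
hive-2.0.1 download
- Extract
tar -xzvf apache-hive-2.0.1-bin.tar.gz
- Rename the extracted directory and move it under the hadoop directory
mv apache-hive-2.0.1-bin hive
mv hive /opt/hadoop/
- Download the MySQL JDBC driver and put it in the lib directory of the Hive installation
I used mysql-connector-java-5.1.36-bin.jar.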
Configuration
1. Set the environment variables
vi /etc/profile
Add the following:
#HIVE
export HIVE_HOME=/opt/hadoop/hive
export PATH=$PATH:$HIVE_HOME/bin
Run source /etc/profile to apply the change.
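A quick way to confirm the change took effect is to check that $HIVE_HOME/bin is one of the entries in PATH. The sketch below is only an illustration; the helper name path_contains is hypothetical and not part of Hive.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 is one of the
# colon-separated entries of the PATH-style string $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Same value as in /etc/profile above.
HIVE_HOME=/opt/hadoop/hive

if path_contains "$HIVE_HOME/bin" "$PATH"; then
    echo "hive bin directory is on PATH"
else
    echo "hive bin directory is missing from PATH; re-run: source /etc/profile"
fi
```

Wrapping both strings in colons makes the match exact, so /opt/hadoop/hive alone does not count as a match for /opt/hadoop/hive/bin.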
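2. Edit the Hive configuration files
- Copy the configuration templates (run these in the conf directory of the Hive installation, /opt/hadoop/hive/conf)
cp hive-default.xml.template hive-default.xml
cp hive-env.sh.template hive-env.sh
cp hive-log4j2.properties.template hive-log4j2.properties
cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
- Edit hive-default.xml
vim hive-default.xml
Use vim's search command to find the properties whose names start with javax. There are four of them; set each one to match the MySQL setup above. Note that the Hive documentation names hive-site.xml as the file for site-specific overrides, so if changes made in hive-default.xml are not picked up, make them in a copy named hive-site.xml instead.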
#JDBC driver class
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
#MySQL connection URL
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://172.16.7.191:3306/hive?createDatabaseIfNotExist=true</value>
#MySQL user name
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
#Password for that user
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
Search commands in the vim editor on Red Hat:
:set hls //turn on search highlighting
/XXX //search downward
?XXX //search upward
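For orientation, each of the four javax.jdo.option settings above sits inside its own <property> element in the configuration file. The sketch below prints a minimal fragment with the values used in this walkthrough; the function name print_metastore_config is an assumption made for illustration and is not something Hive provides.

```shell
#!/bin/sh
# Hypothetical helper: print the four metastore connection properties
# as a complete <configuration> document.
# Arguments: $1 = JDBC URL, $2 = user name, $3 = password.
# Note: a URL containing "&" would need it written as "&amp;" in XML.
print_metastore_config() {
    cat <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>$1</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>$2</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# Values from this walkthrough.
print_metastore_config \
    "jdbc:mysql://172.16.7.191:3306/hive?createDatabaseIfNotExist=true" \
    hive 123456
```

The printed fragment is handy as a reference to compare against the four entries edited in the much larger template file.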
Startup
1. Start the Hive Metastore Server process
hive --service metastore &
2. Initialize the metastore schema before the first Hive login (needed only once)
schematool -dbType mysql -initSchema
3. Log in to Hive
[root@cognos conf]# hive
which: no hbase in (/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/root/bin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/opt/hadoop/hive/bin)
Logging initialized using configuration in file:/opt/hadoop/hive/conf/hive-log4j2.properties
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
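In short: running Hive on MapReduce is deprecated as of Hive 2 and may not be available in future versions, so the message recommends considering another execution engine (such as Tez or Spark). I don't yet know what practical impact this has.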
hive> show databases;
OK
default
Time taken: 0.728 seconds, Fetched: 1 row(s)
4. Verify
Once Hive is configured, the hive database in MySQL contains the metastore tables. Connect with the mysql client to inspect them:
mysql> use hive
Database changed
mysql> show tables;
+---------------------------+
| Tables_in_hive |
+---------------------------+
| AUX_TABLE |
| BUCKETING_COLS |
| CDS |
| COLUMNS_V2 |
| COMPACTION_QUEUE |
| COMPLETED_COMPACTIONS |
| COMPLETED_TXN_COMPONENTS |
| DATABASE_PARAMS |
| DBS |
| DB_PRIVS |
| DELEGATION_TOKENS |
| FUNCS |
| FUNC_RU |
| GLOBAL_PRIVS |
| HIVE_LOCKS |
| IDXS |
| INDEX_PARAMS |
| MASTER_KEYS |
| NEXT_COMPACTION_QUEUE_ID |
| NEXT_LOCK_ID |
| NEXT_TXN_ID |
| NOTIFICATION_LOG |
| NOTIFICATION_SEQUENCE |
| NUCLEUS_TABLES |
| PARTITIONS |
| PARTITION_EVENTS |
| PARTITION_KEYS |
| PARTITION_KEY_VALS |
| PARTITION_PARAMS |
| PART_COL_PRIVS |
| PART_COL_STATS |
| PART_PRIVS |
| ROLES |
| ROLE_MAP |
| SDS |
| SD_PARAMS |
| SEQUENCE_TABLE |
| SERDES |
| SERDE_PARAMS |
| SKEWED_COL_NAMES |
| SKEWED_COL_VALUE_LOC_MAP |
| SKEWED_STRING_LIST |
| SKEWED_STRING_LIST_VALUES |
| SKEWED_VALUES |
| SORT_COLS |
| TABLE_PARAMS |
| TAB_COL_STATS |
| TBLS |
| TBL_COL_PRIVS |
| TBL_PRIVS |
| TXNS |
| TXN_COMPONENTS |
| TYPES |
| TYPE_FIELDS |
| VERSION |
+---------------------------+
55 rows in set (0.01 sec)
Errors and fixes
1. SLF4J multiple bindings
which: no hbase in (/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/root/bin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/opt/hadoop/hive/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
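**Fix**
Both jars listed above provide an SLF4J logger binding. SLF4J reports this as a warning and picks one of them, so Hive still starts; to silence the warning, remove the older of the two jars:
rm -rf /opt/hadoop/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar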
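Before deleting anything, it can help to list which SLF4J binding jars are actually present. The sketch below searches the given directories for the two jar name patterns that appear in the warning above; the function name list_slf4j_bindings is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list SLF4J binding jars under the given directories.
# The two name patterns are taken from the warning text above.
list_slf4j_bindings() {
    for dir in "$@"; do
        # Skip arguments that are not existing directories.
        [ -d "$dir" ] || continue
        find "$dir" -type f \
            \( -name 'slf4j-log4j12-*.jar' -o -name 'log4j-slf4j-impl-*.jar' \)
    done
}

# Directories from this walkthrough's installation layout.
list_slf4j_bindings /opt/hadoop/hive/lib \
    /opt/hadoop/hadoop-2.7.3/share/hadoop/common/lib
```

If more than one path is printed, the warning is expected. Moving the extra jar to a backup location instead of deleting it makes the change easy to undo.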
2. The Hive Metastore Server process was not started properly
Logging initialized using configuration in file:/opt/hadoop/hive/conf/hive-log4j2.properties
Exception in thread "main" java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1550)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3080)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3108)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:543)
at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:516)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:648)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.reflect.InvocationTargetException
**Fix**
Start the Hive Metastore Server process with the following command:
hive --service metastore &
3. MySQL permission problem
```
javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:mysql://172.16.7.191:3306/hive?createDatabaseIfNotExist=true, username = hive. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
java.sql.SQLException: Access denied for user 'hive'@'cognos' (using password: YES)
```
**Fix**
In hive-default.xml, change jdbc:mysql://172.16.7.191:3306 to jdbc:mysql://localhost:3306:
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
4. The schema was not initialized before the first Hive login
javax.jdo.JDODataStoreException: Required table missing : "VERSION" in Catalog "" Schema "". DataNucleus requires this table to perform its persistence operations. Either your MetaData is incorrect, or you need to enable "datanucleus.schema.autoCreateTables"
**Fix**
Before its first use, Hive needs the metastore schema initialized. Run:
schematool -dbType mysql -initSchema
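5. Unresolved path variables: ${system:java.io.tmpdir}/${system:user.name}
Logging initialized using configuration in file:/opt/hadoop/hive/conf/hive-log4j2.properties
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
The cause is that the variable system:java.io.tmpdir cannot be resolved to an actual value in the configuration file, so the path cannot be found. Hive normally creates temporary files and log files at startup; because those files cannot be created, the process fails to start.
**Fix**
In the configuration file (hive-default.xml in this walkthrough), find every occurrence of "system:java.io.tmpdir" and replace it with an absolute path such as /opt/hadoop/hive/iotmp/
Also define a value for system:user.name: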
<property>
<name>system:user.name</name>
<value>user_name</value>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/opt/hadoop/hive/iotmp/${system:user.name}</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/opt/hadoop/hive/iotmp/${hive.session.id}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
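The search-and-replace step can also be scripted. The following is a sketch under the assumption that the placeholder appears literally as ${system:java.io.tmpdir} in the file; the function name fix_tmpdir is hypothetical, and the path in the commented usage line should be whichever configuration file you edited.

```shell
#!/bin/sh
# Hypothetical helper: replace every literal ${system:java.io.tmpdir}
# in the config file $1 with the absolute directory $2.
# The directory is created if missing, and the original file is kept
# as <file>.bak. Assumes $2 contains no "|" or "&" characters.
fix_tmpdir() {
    conf_file=$1
    tmp_dir=$2
    mkdir -p "$tmp_dir" || return 1
    # Single quotes stop the shell from expanding the ${...} placeholder;
    # \$ and \. make sed match "$" and "." literally.
    sed -i.bak 's|\${system:java\.io\.tmpdir}|'"$tmp_dir"'|g' "$conf_file"
}

# Usage with the paths from this walkthrough (uncomment to run):
# fix_tmpdir /opt/hadoop/hive/conf/hive-default.xml /opt/hadoop/hive/iotmp
```

After the replacement, a value such as ${system:java.io.tmpdir}/${system:user.name} reads /opt/hadoop/hive/iotmp/${system:user.name}, which matches the hive.exec.local.scratchdir entry shown above.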