Table of Contents
步骤1:在PG中为元数据增加用户的DB
步骤2:下载PGJDBC驱动
步骤3:修改HIVE配置文件
步骤4:初始化元数据表
小结
Home Database Mysql Tutorial 配置apache HIVE元数据DB为PostgreSQL

配置apache HIVE元数据DB为PostgreSQL

Jun 07, 2016 pm 03:15 PM
apache hive postgresql data Configuration

本文出处:http://amutu.com/blog/2013/06/hive-metastore-db-postgresql/ HIVE 的元数据默认使用 derby 作为存储 DB , derby 作为轻量级的 DB ,在开发、测试过程中使用比较方便,但是在实际的生产环境中,还需要考虑易用性、容灾、稳定性以及各种监控、运

本文出处:http://amutu.com/blog/2013/06/hive-metastore-db-postgresql/

HIVE的元数据默认使用derby作为存储DBderby作为轻量级的DB,在开发、测试过程中使用比较方便,但是在实际的生产环境中,还需要考虑易用性、容灾、稳定性以及各种监控、运维工具等,这些都是derby缺乏的。MySQLPostgreSQL是两个比较常用的开源数据库系统,在生产环境中比较多的用来替换derby。配置MySQL在网上的文章比较多,这里不再赘述,本文主要描述配置HIVE元数据DBPostgreSQL的方法。

HIVE版本:HIVE 0.7-snapshotHIVE 0.8-snapshot

步骤1:在PG中为元数据增加用户的DB

首先在PostgreSQL中为HIVE的元数据建立帐号和DB

 

--以管理员身份登入PG

psql postgres -U postgres

 

--创建用户hive_user:

Create user hive_user;

 

--创建DB metastore_dbownerhive_user:

Create database metastore_db with owner=hive_user;

 

--设置hive_user的密码:

/password hive_user

 

完成以上步骤以后,还要确保PostgreSQLpg_hba.conf中的配置允许HIVE所在的机器ip可以访问PG

步骤2:下载PGJDBC驱动

HIVE_HOME目录下创建auxlib目录:

mkdir auxlib

此时HIVE_HOME目录中应该有binlibauxlibconf等目录。

 

下载PGJDBC驱动

Wget http://jdbc.postgresql.org/download/postgresql-9.0-801.jdbc4.jar

 

将下载到的postgresql-9.0-801.jdbc4.jar放到auxlib中。

步骤3:修改HIVE配置文件

HIVE_HOME中新建hive-site.xml 文件,内容如下,蓝色字体按照PG server的相关信息进行修改。

 

  javax.jdo.option.ConnectionURL

  jdbc:postgresql://pg_server_ip:pg_server_port/metastore_db?

  JDBC connect string for a JDBC metastore

 

  javax.jdo.option.ConnectionDriverName

  org.postgresql.Driver

  Driver class name for a JDBC metastore

 

  javax.jdo.option.ConnectionUserName

  hive_user

  username to use against metastore database

 

  javax.jdo.option.ConnectionPassword

  hive_user_pass

  password to use against metastore database

 

 

步骤4:初始化元数据表

元数据库metastore中默认没有表,当HIVE第一次使用某个表的时候,如果发现该表不存在就会自动创建。对derbymysql,这个过程没有问题,因此derbymysql作为元数据库不需要这一步。

PostgreSQL在初始化的时候,会遇到一些问题,导致PG数据库死锁。例如执行以下HIVE语句:

>Create table kv (key,int,value string) partitioned by (ds string);

OK

>Alter table kv add partition (ds = '20110101');

执行这一句的时候,HIVE会一直停在这。

 

查看PG数据库,发现有两个连接在进行事务操作,其中一个是:

 in transaction

此时处于事务中空闲,另外一个是:

ALTER TABLE "PARTITIONS" ADD CONSTRAINT "PARTITIONS_FK1" FOREIGN KEY ("SD_ID") REFERENCES "SDS" ("SD_ID") INITIALLY DEFERRED 

处于等待状态。

 

进一步查看日志,发现大致的过程是这样的:

 

HIVE发起Alter table kv add partition (ds = '20110101')语句,此时DataNucleus接口发起第一个isolationSERIALIZABLE的事务,锁定了TBLS等元数据表。在这个的事务进行过程中,DataNucleu发现PARTITIONS等表没有,则要自动创建。于是又发起了另外一个isolationSERIALIZABLE的事务,第一个事务变为 in transaction。第二个事务创建了PARTITIONS的表后,还要给它增加约束条件,这时,它需要获得它引用的表SDS的排他锁,但这个锁已经被第一个事务拿到了,因此需要等待第一个事务结束。而第一个事务也在等待第二个事务结束。这样就造成了死锁。

 

类似的情况出现在:

>create test(key int);

OK

>drop table test;

drop table时会去drop它的index,而此时没有index元数据表,它去键,然后产生死锁。

 

有三种方法可以解决这个死锁问题:

 

第一种方法:

使用PGpg_terminate_backend()将第一个事务结束掉,这样可以保证第二个事务完成下去,将元数据表键成功。

第二种方法:

使HIVE将创建元数据表的过程和向元数据表中添加数据的过程分离:

>Create table kv (key,int,value string) partitioned by (ds string);

OK

>show partitions kv;

OK

>Alter table kv add partition (ds = '20110101');

OK

执行以上语句时就不会发生死锁,因为在执行show partitions kv语句时,它是只读语句,不会加锁。当这个语句发现PARTITIONS等表不在时,创建这些表不会发生死锁。

同样对于index表,使用

>Show index on kv;

可以将IDXS表建好。

第三种方法:

使用DataNucleu提供的SchemaTool,将HIVEmetastore/src/model/package.jdo文件作为输入,这个工具可以自动创建元数据中的表。具体的使用方法见:

http://www.datanucleus.org/products/accessplatform_2_0/rdbms/schematool.html

小结

本文给出了使用PostgreSQL作为HIVE元数据DB的配置方法,以及遇到的死锁问题的解决办法,希望对使用HIVEPostgreSQL的朋友有帮助。

 

 

 

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Roblox: Bubble Gum Simulator Infinity - How To Get And Use Royal Keys
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Nordhold: Fusion System, Explained
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Mandragora: Whispers Of The Witch Tree - How To Unlock The Grappling Hook
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Java Tutorial
1672
14
PHP Tutorial
1276
29
C# Tutorial
1256
24
How to set the cgi directory in apache How to set the cgi directory in apache Apr 13, 2025 pm 01:18 PM

To set up a CGI directory in Apache, you need to perform the following steps: Create a CGI directory such as "cgi-bin", and grant Apache write permissions. Add the "ScriptAlias" directive block in the Apache configuration file to map the CGI directory to the "/cgi-bin" URL. Restart Apache.

How to view your apache version How to view your apache version Apr 13, 2025 pm 01:15 PM

There are 3 ways to view the version on the Apache server: via the command line (apachectl -v or apache2ctl -v), check the server status page (http://<server IP or domain name>/server-status), or view the Apache configuration file (ServerVersion: Apache/<version number>).

What to do if the apache80 port is occupied What to do if the apache80 port is occupied Apr 13, 2025 pm 01:24 PM

When the Apache 80 port is occupied, the solution is as follows: find out the process that occupies the port and close it. Check the firewall settings to make sure Apache is not blocked. If the above method does not work, please reconfigure Apache to use a different port. Restart the Apache service.

How to connect to the database of apache How to connect to the database of apache Apr 13, 2025 pm 01:03 PM

Apache connects to a database requires the following steps: Install the database driver. Configure the web.xml file to create a connection pool. Create a JDBC data source and specify the connection settings. Use the JDBC API to access the database from Java code, including getting connections, creating statements, binding parameters, executing queries or updates, and processing results.

How to view the apache version How to view the apache version Apr 13, 2025 pm 01:00 PM

How to view the Apache version? Start the Apache server: Use sudo service apache2 start to start the server. View version number: Use one of the following methods to view version: Command line: Run the apache2 -v command. Server Status Page: Access the default port of the Apache server (usually 80) in a web browser, and the version information is displayed at the bottom of the page.

How to configure zend for apache How to configure zend for apache Apr 13, 2025 pm 12:57 PM

How to configure Zend in Apache? The steps to configure Zend Framework in an Apache Web Server are as follows: Install Zend Framework and extract it into the Web Server directory. Create a .htaccess file. Create the Zend application directory and add the index.php file. Configure the Zend application (application.ini). Restart the Apache Web server.

How to solve the problem that apache cannot be started How to solve the problem that apache cannot be started Apr 13, 2025 pm 01:21 PM

Apache cannot start because the following reasons may be: Configuration file syntax error. Conflict with other application ports. Permissions issue. Out of memory. Process deadlock. Daemon failure. SELinux permissions issues. Firewall problem. Software conflict.

How to delete more than server names of apache How to delete more than server names of apache Apr 13, 2025 pm 01:09 PM

To delete an extra ServerName directive from Apache, you can take the following steps: Identify and delete the extra ServerName directive. Restart Apache to make the changes take effect. Check the configuration file to verify changes. Test the server to make sure the problem is resolved.

See all articles