Hive: Compiling from Source to Add a Built-in UDF

2024-04-21

Download the Hive source

[root@hadoop001 ~]# cd /opt
[root@hadoop001 opt]# mkdir sourcecode
[root@hadoop001 opt]# cd sourcecode
[root@hadoop001 sourcecode]# wget http://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.7.0-src.tar.gz
[root@hadoop001 sourcecode]# ll
-rw-r--r-- 1 root root 14652104 Apr 21 10:23 hive-1.1.0-cdh5.7.0-src.tar.gz

 

Extract the source

[root@hadoop001 sourcecode]# tar -xzf hive-1.1.0-cdh5.7.0-src.tar.gz
[root@hadoop001 sourcecode]# ll
total 14316
drwxrwxr-x 31 root root 4096 Mar 24 2016 hive-1.1.0-cdh5.7.0
-rw-r--r-- 1 root root 14652104 Apr 21 10:23 hive-1.1.0-cdh5.7.0-src.tar.gz

 

Add the UDF class

HelloUDF.java

[root@hadoop001 udf]# pwd
/opt/sourcecode/hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/udf
[root@hadoop001 udf]# rz    # upload the UDF source file you wrote yourself

 

 

 

[root@hadoop001 udf]# vim HelloUDF.java

Change the first line so that the class's package declaration reads:

package org.apache.hadoop.hive.ql.udf;

(The package name mirrors the directory that holds HelloUDF.java: org/apache/hadoop/hive/ql/udf.)
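If you would rather not open an editor, the package line can also be rewritten from the shell. The following is a minimal sketch, not part of Hive: set_java_package is a hypothetical helper name, and it assumes the uploaded file already contains a single line that starts with "package " and ends with ";".

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive): rewrite the package declaration of a
# Java source file so that it matches the directory the file now lives in.
set_java_package() {
  file="$1"   # path to the .java file
  pkg="$2"    # desired package name, e.g. org.apache.hadoop.hive.ql.udf
  # Replace the line that starts with "package " and ends with ";",
  # writing to a temporary file first so the original is only swapped on success.
  sed "s/^package .*;/package ${pkg};/" "$file" > "${file}.tmp" &&
    mv "${file}.tmp" "$file"
}

# Example call, run from the udf directory:
#   set_java_package HelloUDF.java org.apache.hadoop.hive.ql.udf
```

Afterwards, head -n 5 HelloUDF.java is a quick way to confirm that the declaration changed and the imports are untouched.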

 

 

Register the function

[root@hadoop001 exec]# pwd
/opt/sourcecode/hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/exec/
[root@hadoop001 exec]# vim FunctionRegistry.java

 

At line 135, add the import:

import org.apache.hadoop.hive.ql.udf.HelloUDF;

 

 

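At line 176, add the registration:

system.registerUDF("HelloUDF", HelloUDF.class, false);

The first argument, "HelloUDF", is the function name and can be anything you like; the second argument, HelloUDF.class, is the UDF class.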
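Before starting a long build it can be worth confirming that both edits actually landed in FunctionRegistry.java. A minimal sketch follows; check_udf_registered is a hypothetical helper name, and the patterns assume the import and registerUDF lines are written in the same form as the ones quoted in this post.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given file contains both the import
# for the UDF class and a registerUDF(...) call that references that class.
check_udf_registered() {
  file="$1"    # path to FunctionRegistry.java
  class="$2"   # simple class name, e.g. HelloUDF
  grep -q "^[[:space:]]*import org\.apache\.hadoop\.hive\.ql\.udf\.${class};" "$file" &&
    grep -q "registerUDF(\"[^\"]*\", *${class}\.class" "$file"
}

# Example call, run from the exec directory:
#   check_udf_registered FunctionRegistry.java HelloUDF && echo "registration found"
```

The check is purely textual, so it will not catch a line placed in the wrong block of the file; the compiler remains the real test.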

 

Build Hive

[root@hadoop001 hive-1.1.0-cdh5.7.0]# pwd
/opt/sourcecode/hive-1.1.0-cdh5.7.0
[root@hadoop001 hive-1.1.0-cdh5.7.0]# mvn clean package -DskipTests -Phadoop-2 -Pdist

 

Now wait for the build to succeed, or to fail with assorted errors. The errors almost always come down to configuration. I fought them for two days, which was exhausting, so here is what I learned:

1. Check your Maven version and preferably use the latest. The latest at the time of writing was apache-maven-3.6.1; with apache-maven-3.3.9 the build failed with errors.
2. After switching versions, check that the environment has been reconfigured as well. If it still points at the old installation, the build fails.
3. Keep the local (per-user) and global environment settings consistent, or configure only the local one. Otherwise the build fails.
4. Settings file configuration: back up your existing configuration, delete all of it, and paste in the following:

<repositories>
  <!-- This needs to be removed before checking in-->
  <repository>
    <id>alimaven</id>
    <name>aliyun maven</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
    <releases>
      <enabled>true</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>cdh.releases.repo</id>
    <url>https://repository.cloudera.com/content/groups/cdh-releases-rcs</url>
    <name>CDH Releases Repository</name>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>cdh.snapshots.repo</id>
    <url>https://repository.cloudera.com/content/repositories/snapshots</url>
    <name>CDH Snapshots Repository</name>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>datanucleus</id>
    <name>datanucleus maven repository</name>
    <url>http://www.datanucleus.org/downloads/maven2</url>
    <layout>default</layout>
    <releases>
      <enabled>true</enabled>
      <checksumPolicy>warn</checksumPolicy>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>glassfish-repository</id>
    <url>http://maven.glassfish.org/content/groups/glassfish</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>glassfish-repo-archive</id>
    <url>http://maven.glassfish.org/content/groups/glassfish</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>sonatype-snapshot</id>
    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>
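Tips 2 and 3 describe a stale environment: the shell still resolves mvn to an old installation after a new one has been unpacked. A minimal sketch for spotting that mismatch follows. check_maven_env is a hypothetical helper, and MAVEN_HOME is an assumed variable name; substitute whatever variable your profile actually exports.

```shell
#!/bin/sh
# Hypothetical helper: verify that the mvn found on PATH is the one that
# MAVEN_HOME points to. MAVEN_HOME is an assumed variable name.
check_maven_env() {
  expected="${MAVEN_HOME:-}/bin/mvn"
  actual="$(command -v mvn || true)"
  if [ -z "$actual" ]; then
    echo "mvn is not on PATH" >&2
    return 1
  fi
  # A plain string comparison: symlinked installations will report a mismatch.
  if [ "$actual" != "$expected" ]; then
    echo "PATH resolves mvn to $actual but MAVEN_HOME expects $expected" >&2
    return 1
  fi
  echo "mvn resolves to $actual"
}

# Example call:
#   check_maven_env
```

Running mvn -v afterwards prints the Maven version and home directory actually in use, which confirms the result from a second angle.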

 

Build succeeded

 

(Note: the build may run into this situation partway through. Don't worry; just keep waiting.)

 

The package we need is apache-hive-1.1.0-cdh5.7.0-bin.tar.gz

[root@hadoop001 target]# pwd
/opt/sourcecode/hive-1.1.0-cdh5.7.0/packaging/target
[root@hadoop001 target]# ll
total 129260
drwxr-xr-x 2 root root      4096 Apr 22 21:17 antrun
drwxr-xr-x 3 root root      4096 Apr 22 21:17 apache-hive-1.1.0-cdh5.7.0-bin
-rw-r--r-- 1 root root 105854885 Apr 22 21:17 apache-hive-1.1.0-cdh5.7.0-bin.tar.gz
-rw-r--r-- 1 root root  12656493 Apr 22 21:18 apache-hive-1.1.0-cdh5.7.0-jdbc.jar
-rw-r--r-- 1 root root  13823053 Apr 22 21:18 apache-hive-1.1.0-cdh5.7.0-src.tar.gz
drwxr-xr-x 2 root root      4096 Apr 22 21:17 archive-tmp
drwxr-xr-x 3 root root      4096 Apr 22 21:17 maven-shared-archive-resources
drwxr-xr-x 3 root root      4096 Apr 22 21:17 tmp
drwxr-xr-x 2 root root      4096 Apr 22 21:17 warehouse
[root@hadoop001 lib]# pwd
/opt/sourcecode/hive-1.1.0-cdh5.7.0/packaging/target/apache-hive-1.1.0-cdh5.7.0-bin/apache-hive-1.1.0-cdh5.7.0-bin/lib
[root@hadoop001 lib]# ll hive-exec-1.1.0-cdh5.7.0.jar
-rw-r--r-- 1 root root 19272399 Apr 22 21:17 hive-exec-1.1.0-cdh5.7.0.jar

  

Copy hive-exec-1.1.0-cdh5.7.0.jar into the directory where the installed Hive keeps this jar, and move the original out of the way (here it is renamed rather than deleted):

[root@hadoop001 lib]# su - hadoop
[hadoop@hadoop001 lib]$ pwd
/home/hadoop/app/hive-1.1.0-cdh5.7.0/lib
[hadoop@hadoop001 lib]$ ll hive-exec-1.1.0-cdh5.7.0.jar
-rw-r--r-- 1 hadoop hadoop 19274557 Apr 21 18:54 hive-exec-1.1.0-cdh5.7.0.jar
[hadoop@hadoop001 lib]$ mv hive-exec-1.1.0-cdh5.7.0.jar hive-exec-1.1.0-cdh5.7.0.jar_yuan    # original jar renamed

  

Copy the newly built jar into /home/hadoop/app/hive-1.1.0-cdh5.7.0/lib/:

[root@hadoop001 lib]# cp hive-exec-1.1.0-cdh5.7.0.jar /home/hadoop/app/hive-1.1.0-cdh5.7.0/lib/
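The rename and copy steps can be combined into one function that also verifies the result. This is a minimal sketch: replace_jar is a hypothetical helper name, and it reuses the _yuan suffix from the session above for the backup copy.

```shell
#!/bin/sh
# Hypothetical helper: back up the jar currently in the lib directory, copy the
# newly built jar in, and confirm the copy is byte-identical to the build output.
replace_jar() {
  new_jar="$1"   # path to the freshly built jar
  lib_dir="$2"   # lib directory of the Hive installation
  name="$(basename "$new_jar")"
  # Keep the original under the same "_yuan" suffix used in this post.
  if [ -f "$lib_dir/$name" ]; then
    mv "$lib_dir/$name" "$lib_dir/${name}_yuan" || return 1
  fi
  cp "$new_jar" "$lib_dir/" && cmp -s "$new_jar" "$lib_dir/$name"
}

# Example call, run from the lib directory of the build output:
#   replace_jar hive-exec-1.1.0-cdh5.7.0.jar /home/hadoop/app/hive-1.1.0-cdh5.7.0/lib
```

Keeping the backup makes it easy to roll back: rename the _yuan file to its original name if the new jar misbehaves.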

 

Test

hive (default)> show functions;

helloudf

hive (default)> select helloudf('zz') from dual;
OK
Hello:zz
Time taken: 0.922 seconds, Fetched: 1 row(s)

 

Success.

Reposted from: https://www.cnblogs.com/xuziyu/p/10753506.html
