Hive生产中常用的一些操作

it2022-05-05  163

一,Create Table 1,拷贝表结构,不拷贝表数据

create table newTable like oldTable;

2,Create Table As Select (CTAS)

create table newTable as select * from table

二,Alter Table 1,修改表名

Alter Table oldTable rename to newTable

2,表新加一列

alter table table_name add columns (column_name dataType);

三,Drop Table 1,删表跑路

drop table table_name

2,删表的数据,表结构还存在

Truncate Table table_name

四,数据导入: 1,

1.load data local inpath '/home/hadoop/data/test.txt' into table test; 2.load data inpath 'hdfs://hadoop01:80020/test/test.txt' into table test; 3.Inserting data into Hive Tables from queries

LOAD DATA [LOCAL] INPATH ‘’ [OVERWRITE] INTO TABLE XXX; LOCAL:从本地系统 linux 不带LOCAL: 从Hadoop文件系统 HDFS

OVERWRITE 数据覆盖 不带OVERWRITE 追加 五,HQL带条件查询

select ename,sal, case when sal>1 and sal<=1000 then "lower" when sal>1000 and sal<=2000 then "just so so" when sal>2000 and sal<=4000 then "ok" else "high" end from xxx

六,order by vs sort by vs distribute by vs cluster by ①,order by 做全局排序,一个reduce ②,sort by 局部排序,每一个reduce内是有序的 一旦 set.mapred.reduce.tasks = 3,总的排序会乱 ③, distribute by 按照一定的规则把数据分散到某个reducer ④,cluster by = distribute by xxx sort by xxx 七,hive 内置函数build-in ①查看所有的内置函数:show functions; ②显示某个函数的使用,和使用案例:desc function extended length;

八,HQL实现wordcount案例 主要用到两个内置函数:split 和 explode(行转列)

select word ,count(1) as cnt from (select explode(split(statment,' '))as word from test) a group by word order by cnt desc;


最新回复(0)