Hive生产中常用的一些操作

it2022-05-05 208

一，Create Table 1，拷贝表结构，不拷贝表数据

create table newTable like oldTable;

2,Create Table As Select (CTAS)

create table newTable as select * from table

二，Alter Table 1,修改表名

Alter Table oldTable rename to newTable

2，表新加一列

alter table table_name add columns (column_name dataType);

三，Drop Table 1,删表跑路

drop table table_name

2,删表的数据，表结构还存在

Truncate Table table_name

四，数据导入： 1，

1.load data local inpath '/home/hadoop/data/test.txt' into table test; 2.load data inpath 'hdfs://hadoop01:80020/test/test.txt' into table test; 3.Inserting data into Hive Tables from queries

LOAD DATA [LOCAL] INPATH ‘’ [OVERWRITE] INTO TABLE XXX; LOCAL：从本地系统 linux 不带LOCAL: 从Hadoop文件系统 HDFS

OVERWRITE 数据覆盖不带OVERWRITE 追加五，HQL带条件查询

select ename,sal, case when sal>1 and sal<=1000 then "lower" when sal>1000 and sal<=2000 then "just so so" when sal>2000 and sal<=4000 then "ok" else "high" end from xxx

六，order by vs sort by vs distribute by vs cluster by ①，order by 做全局排序，一个reduce ②，sort by 局部排序，每一个reduce内是有序的一旦 set.mapred.reduce.tasks = 3，总的排序会乱 ③， distribute by 按照一定的规则把数据分散到某个reducer ④，cluster by = distribute by xxx sort by xxx 七，hive 内置函数build-in ①查看所有的内置函数：show functions; ②显示某个函数的使用，和使用案例：desc function extended length;

八，HQL实现wordcount案例主要用到两个内置函数：split 和 explode(行转列）

select word ,count(1) as cnt from (select explode(split(statment,' '))as word from test) a group by word order by cnt desc;

专利

最新回复(0)