HQL语法与SQL类似,不区分大小写,语句以分号(;)结束。

Hive基本操作

  • 查看数据库:show databases;
  • 创建数据库:create database test;
  • 切换数据库:use test;
  • 查看表:show tables;
  • 创建表:create table t_user(id string, name string);
  • 导入数据:insert into table t_user values ('1','lihua'), ('2','laoluo');
  • 查询数据:select * from t_user;
  • 删除表:drop table t_user
  • 删除数据库:drop database test

HQL建表

完整语法:

1
2
3
4
5
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.] table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[ROW FORMAT row_format]
[STORED AS file_format]

  • 指定表字段与数据类型:
    1
    2
    CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.] table_name
    [(col_name data_type [COMMENT col_comment], ...)]
  • 指定注释为Employee detailsCOMMENT ‘Employee details’
  • 指定字段终止段为\tFIELDS TERMINATED BY ‘\t’
  • 指定行终止符为\nLINES TERMINATED BY ‘\n’
  • 指定保存的文件类型:STORED AS TEXTFILE

实例

1
2
3
4
5
6
7
hive> CREATE TABLE IF NOT EXISTS employee ( eid int, name String,
> salary String, destination String)
> COMMENT ‘Employee details’
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ‘\t’
> LINES TERMINATED BY ‘\n’
> STORED AS TEXTFILE;

数据导入

对于大量数据的处理,建议使用LOAD DATA来加载。
完整语法:

1
2
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename 
[PARTITION (partcol1=val1, partcol2=val2 ...)]

  • 从本地磁盘加载:
    1
    2
    hive> LOAD DATA LOCAL INPATH 'sample.txt'
    > OVERWRITE INTO TABLE employee;
  • 从HDFS加载:
    1
    2
    hive> LOAD DATA INPATH '\sample.txt'
    > OVERWRITE INTO TABLE employee;

表修改

完整语法:

1
2
3
4
5
ALTER TABLE name RENAME TO new_name //修改表名称
ALTER TABLE name ADD COLUMNS (col_spec[, col_spec ...]) //在表中增加一列
ALTER TABLE name DROP [COLUMN] column_name //删除表中的一列
ALTER TABLE name CHANGE column_name new_name new_type //修改某一列的名称与类型
ALTER TABLE name REPLACE COLUMNS (col_spec[, col_spec ...])

实例

1
2
3
4
5
6
hive> ALTER TABLE employee RENAME TO emp;

hive> ALTER TABLE employee ADD COLUMNS (
> dept STRING COMMENT 'Department name');

hive> ALTER TABLE employee CHANGE salary salary Double;

条件查询

完整语法:

1
2
3
4
5
6
7
SELECT [ALL | DISTINCT] select_expr, select_expr, ... 
FROM table_reference
[WHERE where_condition]
[GROUP BY col_list]
[HAVING having_condition]
[CLUSTER BY col_list | [DISTRIBUTE BY col_list] [SORT BY col_list]]
[LIMIT number];

  • WHERE:与SQL中的语法类似,后跟查询条件,如SELECT * FROM employee WHERE age>30;
  • GROUP BY:用于分类所有结果,如SELECT Dept,count(*) FROM employee GROUP BY DEPT;
  • ORDER BY:全局排序,如SELECT ID,Name,Dept FROM employee ORDER BY DEPT

表关联

  • 等值关联:SELECT a.* FROM tableA a JOIN tableB b ON a.id=b.id
  • 左连接:SELECT a.* FROM tableA a LEFT UNION JOIN tableB b ON a.id=b.id,左连接中如果右表没有对应的记录,查询返回时对应的值为NULL。
    LEFT UNION JOIN
    RIGHT UNION JOIN

数据导出

  • 导出到本地文件系统:insert overwrite local directory '/root/' slect * from loan
  • 导出到HDFS:insert overwrite directory '/root select * from loan'
  • 导出到另外一张表:insert into table loan_copy select * from loan