https://github.com/kornev/kornev-gin-app
A JSON-RPC 2.0 operation-based API for the Hive Metastore, using PostgreSQL, Hive JDBC, and HDFS.
- Host: GitHub
- URL: https://github.com/kornev/kornev-gin-app
- Owner: kornev
- License: mit
- Created: 2022-01-25T06:38:05.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2022-05-04T05:21:57.000Z (over 3 years ago)
- Last Synced: 2024-01-26T11:36:32.015Z (about 2 years ago)
- Topics: big-data, clojure, data-catalog, data-discovery, hdfs, hive, metadata
- Language: Clojure
- Homepage:
- Size: 69.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Metadata Files:
- Readme: readme.md
- License: license
README
# Gin
## Using Gin
Method **table/describe**:
```shell
curl -s -X POST 'http://127.0.0.1:8080/rpc' \
  -d '{"id": 1, "jsonrpc": "2.0", "method": "table/describe", "params": ["processing", "user_agg"]}' \
  -H 'content-type: application/json'
```
```json
{
"id": 1,
"jsonrpc": "2.0",
"result": {
"DB_ID": 16,
"TBL_ID": 59,
"SD_ID": 59,
"CD_ID": 59,
"DB_NAME": "processing",
"TBL_NAME": "user_agg",
"PART_SPEC": [
{
"INTEGER_IDX": 0,
"PKEY_NAME": "dt",
"PKEY_TYPE": "string"
}
],
"DB_LOCATION_URI": "hdfs://ensime/warehouse/tablespace/managed/hive/processing.db",
"LOCATION": "hdfs://ensime/warehouse/tablespace/external/hive/processing.db/user_agg"
}
}
```
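Every method uses the same JSON-RPC envelope, so the per-method `curl` calls can be wrapped in a small helper. This is a sketch, not part of the project; `GIN_URL` and the `rpc`/`rpc_body` names are assumptions:

```shell
# Hypothetical helper to avoid repeating the curl boilerplate for each method.
GIN_URL="${GIN_URL:-http://127.0.0.1:8080/rpc}"

rpc_body() {
  # $1 = request id, $2 = method, $3 = params as a JSON array
  printf '{"id": %s, "jsonrpc": "2.0", "method": "%s", "params": %s}' "$1" "$2" "$3"
}

rpc() {
  curl -s -X POST "$GIN_URL" \
    -d "$(rpc_body "$@")" \
    -H 'content-type: application/json'
}

# Usage: rpc 1 table/describe '["processing", "user_agg"]'
```

With the helper loaded, the request above becomes `rpc 1 table/describe '["processing", "user_agg"]'`.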
Method **partition/list**:
```shell
curl -s -X POST 'http://127.0.0.1:8080/rpc' \
  -d '{"id": 3, "jsonrpc": "2.0", "method": "partition/list", "params": [198, 2]}' \
  -H 'content-type: application/json'
```
```json
{
"id": 3,
"jsonrpc": "2.0",
"result": [
{
"TBL_ID": 198,
"SD_ID": 13928,
"PART_ID": 13717,
"CREATE_TIME": 1608131750,
"CD_ID": 198,
"PART_NAME": "dt=2020-11-22",
"LOCATION": "hdfs://ensime/warehouse/tablespace/external/hive/incoming_data.db/mediahills/dt=2020-11-22",
"PART_SPEC_VAL": [
{
"INTEGER_IDX": 0,
"PKEY_NAME": "dt",
"PKEY_TYPE": "date",
"PART_KEY_VAL": "2020-11-22"
}
]
},
{
"TBL_ID": 198,
"SD_ID": 40501,
"PART_ID": 40258,
"CREATE_TIME": 0,
"CD_ID": 198,
"PART_NAME": "dt=2021-02-13",
"LOCATION": "hdfs://ensime/warehouse/tablespace/external/hive/incoming_data.db/mediahills/dt=2021-02-13",
"PART_SPEC_VAL": [
{
"INTEGER_IDX": 0,
"PKEY_NAME": "dt",
"PKEY_TYPE": "date",
"PART_KEY_VAL": "2021-02-13"
}
]
}
]
}
```
Method **partition/parent**:
```shell
curl -s -X POST 'http://127.0.0.1:8080/rpc' \
  -d '{"id": 4, "jsonrpc": "2.0", "method": "partition/parent", "params": [59, 1626746198]}' \
  -H 'content-type: application/json'
```
```json
{
"id": 4,
"jsonrpc": "2.0",
"result": {
"TBL_ID": 59,
"SD_ID": 52731,
"PART_ID": 52481,
"CREATE_TIME": 1626659658,
"CD_ID": 59,
"PART_NAME": "dt=2021-07-18",
"LOCATION": "hdfs://ensime/warehouse/tablespace/external/hive/processing.db/user_agg/dt=2021-07-18",
"PART_SPEC_VAL": [
{
"INTEGER_IDX": 0,
"PKEY_NAME": "dt",
"PKEY_TYPE": "string",
"PART_KEY_VAL": "2021-07-18"
}
]
}
}
```
Method **partition/find**:
```shell
curl -s -X POST 'http://127.0.0.1:8080/rpc' \
  -d '{"id": 61966, "jsonrpc": "2.0", "method": "partition/find", "params": [248, ["2021", "08", "09", "15"]]}' \
  -H 'content-type: application/json'
```
```json
{
"id": 61966,
"jsonrpc": "2.0",
"result": {
"TBL_ID": 248,
"SD_ID": 54309,
"PART_ID": 54059,
"CREATE_TIME": 1628514173,
"CD_ID": 248,
"PART_NAME": "year=2021/month=08/day=09/hour=15",
"LOCATION": "hdfs://ensime/data/logs/siren/year=2021/month=08/day=09/hour=15",
"PART_SPEC_VAL": [
{
"INTEGER_IDX": 0,
"PKEY_NAME": "year",
"PKEY_TYPE": "string",
"PART_KEY_VAL": "2021"
},
{
"INTEGER_IDX": 1,
"PKEY_NAME": "month",
"PKEY_TYPE": "string",
"PART_KEY_VAL": "08"
},
{
"INTEGER_IDX": 2,
"PKEY_NAME": "day",
"PKEY_TYPE": "string",
"PART_KEY_VAL": "09"
},
{
"INTEGER_IDX": 3,
"PKEY_NAME": "hour",
"PKEY_TYPE": "string",
"PART_KEY_VAL": "15"
}
]
}
}
```
Method **partition/attach**:
```clojure
(echo
(rpc :partition/attach [#_(TBL_ID) 261
#_(PART_KEY_VALS) ["2019-11-22" "moscow" 13]
#_(DATA_LOCATION) "/user/pmp/input"]))
```
```clojure
{:id 52445,
:jsonrpc "2.0",
:result {:PART_COLS ["dt" "city" "num"],
:PART_KEY_VALS ["2019-11-22" "moscow" 13],
:TBL_NAME "test01",
:CD_ID 261,
:ADD_PARTITION_SQL "ALTER TABLE processing.test01 ADD IF NOT EXISTS PARTITION(dt=date '2019-11-22', city='moscow', num=13)",
:DATA_LOCATION "/user/pmp/input",
:DB_NAME "processing",
:PART_LOCATION "/user/pmp/hive/dt=2019-11-22/city=moscow/num=13",
:DB_ID 16,
:TBL_ID 261,
:LOCATION "hdfs://ensime/user/pmp/hive",
:SD_ID 64654,
:PART_TYPES ["date" "string" "int"]}}
```
```text
+-----------------------------------+
| partition |
+-----------------------------------+
| dt=2019-11-22/city=moscow/num=13 |
+-----------------------------------+
1 rows selected (0.278 seconds)
+------------+--------------+-------------+--------------+-------------+
| test01.id | test01.name | test01.dt | test01.city | test01.num |
+------------+--------------+-------------+--------------+-------------+
| 1 | "ya.ru" | 2019-11-22 | moscow | 13 |
| 2 | "yandex.ru" | 2019-11-22 | moscow | 13 |
+------------+--------------+-------------+--------------+-------------+
```
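The Clojure snippets use a REPL helper; assuming that helper passes its vector straight through as the JSON-RPC `params`, the same attach call over plain HTTP would presumably look like the earlier curl examples (the `id` here is arbitrary):

```shell
# Hypothetical HTTP equivalent of the Clojure partition/attach call above.
BODY='{"id": 52445, "jsonrpc": "2.0", "method": "partition/attach", "params": [261, ["2019-11-22", "moscow", 13], "/user/pmp/input"]}'
# "|| true" only keeps the snippet from failing when no server is listening.
curl -s -X POST 'http://127.0.0.1:8080/rpc' \
  -d "$BODY" \
  -H 'content-type: application/json' || true
```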
Method **partition/detach**:
```clojure
(echo
(rpc :partition/detach [#_(TBL_ID) 261
#_(PART_KEY_VALS) ["2019-11-22" "moscow" 13]]))
```
```clojure
{:id 95792,
:jsonrpc "2.0",
:result {:TBL_NAME "test01",
:CD_ID 261,
:PART_KEY_VAL_STR "2019-11-22moscow13",
:CREATE_TIME 1645776871,
:PART_SPEC_VAL [{:INTEGER_IDX 0,
:PKEY_NAME "dt",
:PKEY_TYPE "date",
:PART_KEY_VAL "2019-11-22"}
{:INTEGER_IDX 1,
:PKEY_NAME "city",
:PKEY_TYPE "string",
:PART_KEY_VAL "moscow"}
{:INTEGER_IDX 2,
:PKEY_NAME "num",
:PKEY_TYPE "int",
:PART_KEY_VAL "13"}],
:DB_NAME "processing",
:PART_NAME "dt=2019-11-22/city=moscow/num=13",
:PART_ID 64453,
:PART_LOCATION "/user/pmp/hive/dt=2019-11-22/city=moscow/num=13",
:TBL_ID 261,
:DROP_PARTITION_SQL "ALTER TABLE processing.test01 DROP IF EXISTS PARTITION(dt=date '2019-11-22', city='moscow', num=13)",
:LOCATION "hdfs://ensime/user/pmp/hive/dt=2019-11-22/city=moscow/num=13",
:SD_ID 64704}}
```
## Requirements
* Java 8 or 11
* Hadoop authentication method: **SIMPLE**
* Supported Hive data types in partition queries: **INT**, **DATE**, **STRING**
Build:
```shell
make jar
```
Required environment variables:
```text
GIN_HOST
GIN_PORT
HADOOP_NAMENODE_ACTIVE
HADOOP_NAMENODE_PASSIVE
HADOOP_PROXY_USER (example: spark)
HIVE_METASTORE_HOST
HIVE_METASTORE_PORT
HIVE_METASTORE_USER (example: hive)
HIVE_METASTORE_PASS (example: hive)
HIVE_HOST
HIVE_PORT (example: 10000)
HIVE_USER
HIVE_PASS
```
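A development shell might set them along these lines; every value below is a placeholder, and the metastore port assumes a direct connection to its PostgreSQL backing store, per the project description:

```shell
# Placeholder environment for a local/dev run; adjust all hosts and secrets.
export GIN_HOST=0.0.0.0
export GIN_PORT=8080
export HADOOP_NAMENODE_ACTIVE=nn1.example.org
export HADOOP_NAMENODE_PASSIVE=nn2.example.org
export HADOOP_PROXY_USER=spark
export HIVE_METASTORE_HOST=metastore-db.example.org
export HIVE_METASTORE_PORT=5432   # assumption: PostgreSQL backing store
export HIVE_METASTORE_USER=hive
export HIVE_METASTORE_PASS=hive
export HIVE_HOST=hive.example.org
export HIVE_PORT=10000
export HIVE_USER=hive
export HIVE_PASS=hive
```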
Available endpoints:
```text
/rpc
/health
```
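Once the service is up, the `/health` endpoint gives a quick smoke test; host and port here assume the defaults used in the examples above:

```shell
# Hypothetical liveness probe; a healthy instance should answer on /health.
HEALTH_URL='http://127.0.0.1:8080/health'
# "|| true" only keeps the snippet from failing when no server is listening.
curl -s -o /dev/null -w '%{http_code}\n' "$HEALTH_URL" || true
```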
## License
Copyright (C) 2022 Vadim Kornev.
Distributed under the MIT License.