https://github.com/aphp/hiveqlkernel

HiveQL Jupyter Kernel

Host: GitHub
URL: https://github.com/aphp/hiveqlkernel
Owner: aphp
License: mit
Created: 2018-09-22T14:46:10.000Z (almost 7 years ago)
Default Branch: master
Last Pushed: 2022-08-05T14:33:17.000Z (almost 3 years ago)
Last Synced: 2025-04-12T00:33:10.455Z (3 months ago)
Topics: hive, hiveql, jupyter, kernel, notebook
Language: Python
Homepage:
Size: 61.5 KB
Stars: 10
Watchers: 4
Forks: 5
Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE


# HiveQL Kernel

### Requirements

If you plan to connect using Kerberos, install the SASL development packages first:

```
sudo apt-get install python3-dev libsasl2-dev libsasl2-2 libsasl2-modules-gssapi-mit
```

### Installation

To install the kernel:

```
pip install --upgrade hiveqlKernel
jupyter hiveql install --user
```

### Connection configuration

Two methods are available to connect to a Hive server:

* Directly inside the notebook
* Using a configuration file

If the configuration file is present, every new HiveQL kernel uses it; otherwise you must configure the connection inside the notebook. Configuration set in the notebook overrides the configuration file, if present.

#### Configure directly in the notebook cells

Inside a notebook cell, copy and paste the following, change the configuration to match your needs, and run it.

```
$$ url=hive://@:/
$$ connect_args={"auth": "KERBEROS", "kerberos_service_name": "hive", "configuration": {"tez.queue.name": "myqueue"}}
$$ pool_size=5
$$ max_overflow=10
```

These arguments are passed to SQLAlchemy, which uses PyHive as the `hive` SQL back end.
See [github.com/dropbox/PyHive](https://github.com/dropbox/PyHive/#sqlalchemy).
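
To illustrate how settings like these could reach SQLAlchemy, here is a minimal sketch. The helper names `parse_magic_lines`, `engine_kwargs` and `make_engine` are hypothetical and are not the kernel's actual code. The sketch assumes that each `$$ key=value` line carries either a JSON value or a bare string. `connect_args`, `pool_size` and `max_overflow` are keyword arguments accepted by SQLAlchemy's `create_engine`.

```python
import json

# Settings forwarded to SQLAlchemy's create_engine as keyword arguments.
ENGINE_KEYS = ("connect_args", "pool_size", "max_overflow")


def parse_magic_lines(cell):
    """Collect `$$ key=value` lines from a cell into a dict (hypothetical helper)."""
    settings = {}
    for line in cell.splitlines():
        line = line.strip()
        if not line.startswith("$$"):
            continue
        key, sep, raw = line[2:].partition("=")
        if not sep:
            continue  # no '=' on the line, nothing to record
        raw = raw.strip()
        try:
            value = json.loads(raw)  # numbers and JSON objects
        except json.JSONDecodeError:
            value = raw  # bare strings such as the URL
        settings[key.strip()] = value
    return settings


def engine_kwargs(settings):
    """Keep only the keys that create_engine should receive."""
    return {k: settings[k] for k in ENGINE_KEYS if k in settings}


def make_engine(settings):
    """Build an engine; imported lazily because it needs SQLAlchemy and PyHive."""
    from sqlalchemy import create_engine

    return create_engine(settings["url"], **engine_kwargs(settings))


cell = '''$$ url=hive://@:/
$$ connect_args={"auth": "KERBEROS", "kerberos_service_name": "hive"}
$$ pool_size=5
$$ max_overflow=10'''
settings = parse_magic_lines(cell)
```

Calling `make_engine(settings)` would then create the engine, provided SQLAlchemy and PyHive are installed and the URL points at a reachable Hive server.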

#### Configure using a configuration file

By default, the HiveQL kernel looks for the configuration file at `~/.hiveql_kernel.conf`. You can specify another path by setting `HIVE_KERNEL_CONF_FILE`.

The file must contain JSON like this:

```
{
  "url": "hive://@:/",
  "connect_args": {
    "auth": "KERBEROS",
    "kerberos_service_name": "hive",
    "configuration": {"tez.queue.name": "myqueue"}
  },
  "pool_size": 5,
  "max_overflow": 10,
  "default_limit": 20,
  "display_mode": "be"
}
```
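
The lookup and precedence rules described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the kernel's implementation: the function names are hypothetical, `HIVE_KERNEL_CONF_FILE` is assumed to be read as an environment variable, and notebook settings are assumed to override file settings key by key.

```python
import json
import os

DEFAULT_CONF = "~/.hiveql_kernel.conf"


def conf_path(environ=None):
    """Return the configuration path, honoring HIVE_KERNEL_CONF_FILE if set.

    Assumption: the variable is an environment variable.
    """
    environ = os.environ if environ is None else environ
    return os.path.expanduser(environ.get("HIVE_KERNEL_CONF_FILE", DEFAULT_CONF))


def load_file_settings(path):
    """Load the JSON configuration file, or return {} when it is absent."""
    if not os.path.isfile(path):
        return {}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def effective_settings(file_settings, notebook_settings):
    """Merge both sources; notebook values win (assumed per-key override)."""
    return {**file_settings, **notebook_settings}
```

For example, `effective_settings(load_file_settings(conf_path()), {"default_limit": 50})` would keep the URL from the file while raising the display limit for the current notebook.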

### Usage

Inside a HiveQL kernel you can type HiveQL directly in the cells, and the kernel displays an HTML table with the results.

You can also change other options, such as the default display limit (20), like this:

```
$$ default_limit=50
```

Some Hive statements are extended so that their results can be filtered by a pattern:

```
SHOW TABLES
SHOW DATABASES
```

### Run tests

```
python -m pytest
```

Have fun!
