https://github.com/nihaljn/datahawk

Viewer for text datasets in formats like HuggingFace, JSONL, etc.
data-analysis data-browser nlp research

---

A lightweight app that makes browsing and analyzing text data a breeze.

#### Key Features

🔍 **Intuitive Navigation**: Effortlessly browse local (or remote) data in formats such as HuggingFace and JSONL.

⚡ **Efficient Browsing**: Stream large datasets without loading local files fully into memory or downloading remote ones in full.

🚀 **Powerful Analysis**: Easily filter and sort data for better insights.

💻 **Pretty-Print Code**: Human-friendly visualization of code embedded in your data.

Experience seamless data browsing and analysis with Datahawk 🦅!
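
To make the streaming, filtering, and sorting ideas above concrete, here is a minimal Python sketch. It is not Datahawk's implementation: the function names `stream_jsonl` and `top_k` are hypothetical, and it assumes a local JSONL file with one JSON object per line. It only illustrates how records can be read lazily and then filtered and ranked without holding the whole file in memory.

```python
import heapq
import json
from typing import Callable, Iterable, Iterator, List


def stream_jsonl(path: str) -> Iterator[dict]:
    """Yield one record at a time; the file is never read into memory whole."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def top_k(
    records: Iterable[dict],
    key: Callable[[dict], float],
    k: int,
    keep: Callable[[dict], bool] = lambda record: True,
) -> List[dict]:
    """Return the k records with the largest `key` among those passing `keep`.

    heapq.nlargest keeps at most k records at a time, so the input
    can stay a lazy stream.
    """
    return heapq.nlargest(k, (r for r in records if keep(r)), key=key)
```

For example, `top_k(stream_jsonl("data.jsonl"), key=lambda r: len(r["text"]), k=10)` would return the ten records with the longest `text` field, assuming a hypothetical file `data.jsonl` whose records have that field.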



Alternatives include [Lilac](https://www.lilacml.com/) and the [HuggingFace Dataset Viewer](https://huggingface.co/docs/datasets-server/).

## Instructions

#### Install

Installation requires `python>=3.8`.

```shell
pip install datahawk
```

#### Run

Launch the app from anywhere as:

```shell
datahawk
```

This will start the application at `localhost:5009`.

Specify a custom port number as:

```shell
datahawk -p PORT
```

This will start the application at `localhost:PORT`.
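
If you are unsure which port is available, the following sketch asks the operating system for a free one. It is a convenience helper and not part of Datahawk; the name `find_free_port` is hypothetical. Another process could claim the port between this check and the launch, so treat the result as a best guess.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Bind to port 0 so the OS picks an unused port, then report its number."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    print(find_free_port())
```

Pass the printed number as `PORT` in `datahawk -p PORT`.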

#### Usage

Usage is quite intuitive! You can find on-screen instructions by hovering over the information icons ℹ️.

## License

Datahawk is released under the MIT license; see the [LICENSE](LICENSE) file.

## Acknowledgements

* [Hawk icon](https://www.flaticon.com/free-icon/eagleemblem_14733103) made by [IconBaandar](https://www.flaticon.com/authors/iconbaandar) from [Flaticon](https://www.flaticon.com).