https://github.com/alienobserver/datengeist

Application for easy understanding of unstructured data

feature-engineering machine-learning streamlit


# Datengeist
## Application for easy understanding of unstructured data

Datengeist is a Streamlit application for understanding unstructured data by visualizing its components.
It works with **.csv** files and offers these key features:

1. Categorization of features
2. Visualization of distributions
3. Convenient handling of missing data
4. Tools for feature comparison

To run Datengeist, install it via pip and start it:

```
$ pip install datengeist
$ datengeist start
```

Alternatively, create a virtual environment first and run it from there (recommended):

```
$ python3 -m venv datengeist_env
$ source datengeist_env/bin/activate

$ pip install datengeist
$ datengeist start
```

### 1. Sample the Dataset
Sample the Dataset is where you load your data, sample it, and get a first overview.

screenshot
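
To give a rough idea of what sampling a **.csv** file involves, here is a minimal sketch in plain Python. It is not Datengeist's own code: the function name `sample_csv_rows`, the seed parameter, and the inline example data are all assumptions made for illustration.

```python
import csv
import io
import random


def sample_csv_rows(csv_text, n, seed=0):
    """Return the header and a random sample of up to n data rows.

    Hypothetical helper for illustration; not part of Datengeist.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = list(reader)
    rng = random.Random(seed)  # a fixed seed keeps the sample reproducible
    return header, rng.sample(rows, min(n, len(rows)))


# Made-up example data with a few missing values.
CSV_TEXT = (
    "age,city,income\n"
    "34,Berlin,52000\n"
    "29,Hamburg,\n"
    "41,Munich,61000\n"
    ",Berlin,48000\n"
)

header, sample = sample_csv_rows(CSV_TEXT, n=2)
print(header)
for row in sample:
    print(row)
```

Sampling before loading keeps the first overview fast on large files. A fixed seed makes the same sample come back on every run.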

### 2. General Info
General Info is where you assign your features to their corresponding categories and view the
missing values in each feature.

screenshot
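
The idea of categorizing features and counting missing values can be sketched as follows. This is an illustrative approximation, not Datengeist's implementation. It assumes two categories only ("numeric" and "categorical"), and the helper names `is_number` and `summarize_columns` are hypothetical. It also assumes that an empty cell means a missing value.

```python
import csv
import io


def is_number(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def summarize_columns(csv_text):
    """Map each column to a guessed category and its missing-value count.

    Hypothetical helper: empty cells are treated as missing (an assumption).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    summary = {}
    for column in reader.fieldnames:
        values = [row[column] for row in rows]
        present = [v for v in values if v != ""]
        numeric = bool(present) and all(is_number(v) for v in present)
        summary[column] = {
            "category": "numeric" if numeric else "categorical",
            "missing": len(values) - len(present),
        }
    return summary


# Made-up example data.
CSV_TEXT = (
    "age,city,income\n"
    "34,Berlin,52000\n"
    "29,Hamburg,\n"
    "41,Munich,61000\n"
    ",Berlin,48000\n"
)

for name, info in summarize_columns(CSV_TEXT).items():
    print(name, info)
```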

### 3. Feature Info
Feature Info is where you can inspect each feature more closely, including its distribution and its percentage of missing values.

screenshot
screenshot
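
As a rough sketch of what a per-feature view computes, the snippet below derives the missing-value percentage and a simple value distribution for one column. The function name `feature_profile` is hypothetical, and the example treats empty strings and `None` as missing; Datengeist's actual rules may differ.

```python
from collections import Counter


def feature_profile(values):
    """Return the missing-value percentage and value counts for one feature.

    Hypothetical helper: "" and None count as missing (an assumption).
    """
    present = [v for v in values if v not in ("", None)]
    if values:
        missing_pct = 100.0 * (len(values) - len(present)) / len(values)
    else:
        missing_pct = 0.0
    return {"missing_pct": missing_pct, "counts": Counter(present)}


# Made-up example column.
profile = feature_profile(["Berlin", "Hamburg", "", "Berlin"])
print(f"missing: {profile['missing_pct']:.1f}%")
for value, count in profile["counts"].most_common():
    # One '#' per occurrence gives a crude text histogram.
    print(f"{value:<10} {'#' * count}")
```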

### 4. Relate Features
Relate Features is where you can view the correlations between your features and compare them using box plots.

screenshot
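
The two ideas in this section, correlation and box plots, can be approximated in plain Python as shown below. This is a sketch under assumptions, not Datengeist's code. It uses the Pearson coefficient, although the README does not say which correlation measure the app uses, and it computes the five-number summary that a box plot draws. The names `pearson` and `box_stats_by_group` are hypothetical.

```python
import math
import statistics
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def box_stats_by_group(groups, values):
    """Five-number summary of values per group (each group needs >= 2 values)."""
    grouped = defaultdict(list)
    for group, value in zip(groups, values):
        grouped[group].append(value)
    stats = {}
    for group, vals in grouped.items():
        q1, median, q3 = statistics.quantiles(vals, n=4, method="inclusive")
        stats[group] = {
            "min": min(vals), "q1": q1, "median": median,
            "q3": q3, "max": max(vals),
        }
    return stats


# Made-up example data.
ages = [25, 32, 41, 47, 53, 60]
incomes = [40, 50, 60, 55, 65, 75]
cities = ["Berlin", "Berlin", "Berlin", "Munich", "Munich", "Munich"]

print("correlation(age, income):", round(pearson(ages, incomes), 3))
for city, summary in box_stats_by_group(cities, incomes).items():
    print(city, summary)
```

Relating a numeric feature to a categorical one through per-group quartiles is what a grouped box plot visualizes. The correlation coefficient covers the numeric-to-numeric case.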

### License

Apache 2.0