Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/wiz-craft/wiz-craft
A CLI-based dataset preprocessing tool for machine learning tasks. Features include data exploration, null value handling, one-hot encoding, feature scaling, and downloading the modified dataset.
cli cli-app dataset machine-learning preprocessing
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/wiz-craft/wiz-craft
- Owner: wiz-craft
- License: mit
- Created: 2024-04-17T04:16:29.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-04-17T04:16:38.000Z (9 months ago)
- Last Synced: 2024-08-08T00:43:07.876Z (5 months ago)
- Topics: cli, cli-app, dataset, machine-learning, preprocessing
- Language: Python
- Homepage: https://pypi.org/project/wiz-craft/
- Size: 72.3 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- project-awesome - wiz-craft/wiz-craft (Python)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![Downloads](https://static.pepy.tech/personalized-badge/wiz-craft?period=total&units=international_system&left_color=brightgreen&right_color=orange&left_text=Downloads)](https://pepy.tech/project/wiz-craft)
![PyPI - Version](https://img.shields.io/pypi/v/wiz-craft)
# WizCraft - CLI-Based Dataset Preprocessing Tool
WizCraft is a Command Line Interface (CLI) tool that simplifies dataset preprocessing for machine learning tasks. It aims to give data scientists of all levels an efficient way to prepare data for machine-learning applications.
**[Try the tool online here](https://replit.com/@PinakDatta/DataWiz)**
**Check out the [Contribution Guide](https://github.com/Pinak-Datta/wiz-craft/blob/main/CONTRIBUTING.md) if you want to contribute to this project**
## Table of Contents
- [Features](#features)
- [Getting Started](#getting-started)
  - [Installation](#installation)
- [Features Available](#features-available)
  - [Data Description](#data-description)
  - [Handle Null Values](#handle-null-values)
  - [Encode Categorical Values](#encode-categorical-values)
  - [Feature Scaling](#feature-scaling)
  - [Save Preprocessed Dataset](#save-preprocessed-dataset)
- [Future Works](#future-works)
- [Contributing to the Project](#contributing-to-the-project)
## Features
- Load and preprocess your dataset effortlessly through a Command Line Interface (CLI).
- View dataset statistics, null value counts, and perform data imputation.
- Encode categorical variables using one-hot encoding.
- Normalize and standardize numerical features for better model performance.
- Download the preprocessed dataset with your desired modifications.
## Getting Started
### Installation
1. Run the pip command:
```bash
pip install wiz-craft
```
2. To use the module, run:
```python
from wizcraft.preprocess import Preprocess

wiz_obj = Preprocess()
wiz_obj.start()  # launches the interactive prompts
```
3. Follow the on-screen prompts to load your dataset, select target variables, and perform preprocessing tasks.
## Features Available
### Data Description
1. View statistics and properties of numeric columns.
2. Explore unique values and statistics of categorical columns.
3. Display a snapshot of the dataset.
### Handle Null Values
1. Show NULL value counts in each column.
2. Remove specific columns or fill NULL values with mean, median, or mode.
### Encode Categorical Values
1. Identify and list categorical columns.
2. Perform one-hot encoding on categorical columns.
### Feature Scaling
1. Normalize (Min-Max scaling) or standardize (Standard Scaler) numerical columns.
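The two scaling options can be illustrated with a small, dependency-free sketch. This is not WizCraft's own code; the function names `min_max_scale` and `standardize` and the `ages` column are hypothetical and exist only for illustration. Standardization here uses the population standard deviation, which is an assumption about the convention rather than something the project documents.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every entry to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # Avoid division by zero for a constant column.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical numeric column.
ages = [20, 30, 40, 50, 60]
print(min_max_scale(ages))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(standardize(ages))
```

Min-Max scaling keeps the original shape of the distribution within fixed bounds, while standardization centers the column at zero, which tends to suit models that are sensitive to feature magnitude.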
### Save Preprocessed Dataset
1. Download the modified dataset with applied preprocessing steps.
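The null-handling, encoding, and saving tasks described in this section can be sketched with the Python standard library alone. This is a minimal illustration of the underlying techniques, not WizCraft's implementation; the toy `rows` dataset and the helpers `null_counts`, `fill_with_mean`, `one_hot`, and `to_csv` are hypothetical names introduced here.

```python
import csv
import io
from statistics import mean

# Hypothetical toy dataset; None marks a missing (NULL) value.
rows = [
    {"age": 25, "city": "Pune"},
    {"age": None, "city": "Delhi"},
    {"age": 35, "city": "Pune"},
]


def null_counts(rows):
    """Count missing values per column."""
    return {col: sum(r[col] is None for r in rows) for col in rows[0]}


def fill_with_mean(rows, col):
    """Replace missing entries in a numeric column with the column mean."""
    fill = mean(r[col] for r in rows if r[col] is not None)
    return [{**r, col: fill if r[col] is None else r[col]} for r in rows]


def one_hot(rows, col):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[col] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != col}
        for cat in categories:
            new_row[f"{col}_{cat}"] = int(r[col] == cat)
        encoded.append(new_row)
    return encoded


def to_csv(rows):
    """Serialize rows to CSV text, ready to be written to a file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(null_counts(rows))  # {'age': 1, 'city': 0}
cleaned = one_hot(fill_with_mean(rows, "age"), "city")
print(to_csv(cleaned))
```

Filling with the median or mode follows the same pattern as `fill_with_mean`, swapping in `statistics.median` or `statistics.mode` for `mean`.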
## Future Works
- [x] Advanced Data Imputation Techniques: Adding support for advanced data imputation techniques, such as K-nearest neighbours (KNN) imputation.
- [x] Improved UI and UX using Rich
- [ ] Undo/Redo Option for each step
- [ ] Extension for NLP tasks (like tokenization, stemming)
- [ ] User-Friendly Interface: Improving the user interface to provide more interactive and user-friendly features, such as progress bars, error handling, and clear instructions.
- [ ] Using curses for terminal manipulation.
## Contributing to the Project
**Check out the [Contribution Guide](https://github.com/Pinak-Datta/wiz-craft/blob/main/CONTRIBUTING.md) if you want to contribute to this project**