https://github.com/voidful/nlprep
🍳 NLPrep - a dataset tool for many natural language processing tasks
dataset nlp prepare pytorch tfkit
- Host: GitHub
- URL: https://github.com/voidful/nlprep
- Owner: voidful
- License: apache-2.0
- Created: 2020-02-18T13:40:53.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2021-07-30T15:14:56.000Z (almost 5 years ago)
- Last Synced: 2025-03-26T08:37:28.660Z (about 1 year ago)
- Topics: dataset, nlp, prepare, pytorch, tfkit
- Language: Python
- Homepage: https://voidful.github.io/NLPrep/
- Size: 30.8 MB
- Stars: 28
- Watchers: 3
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
## Features
- Handles over 100 datasets
- Generates a statistics report for each processed dataset
- Supports many pre-processing methods
- Provides a panel for entering your parameters at runtime
- Easy to adapt to your own datasets and pre-processing utilities
# Online Explorer
[https://voidful.github.io/NLPrep-Datasets/](https://voidful.github.io/NLPrep-Datasets/)
# Documentation
Learn more from the [docs](https://voidful.github.io/NLPrep/).
## Quick Start
### Installing via pip
```bash
pip install nlprep
```
### Get one of the datasets
```bash
nlprep --dataset clas_udicstm --outdir sentiment
```
**You can also try nlprep in Google Colab: [Open in Colab](https://colab.research.google.com/drive/1EfVXa0O1gtTZ1xEAPDyvXMnyjcHxO7Jk?usp=sharing)**
## Overview
```
$ nlprep
arguments:
--dataset which dataset to use
--outdir processed result output directory
optional arguments:
-h, --help show this help message and exit
--util data preprocessing utility, multiple utility are supported
--cachedir dir for caching raw dataset
--infile local dataset path
--report generate a html statistics report
```
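If you want to drive the CLI from a script, the flags above can be assembled programmatically. The following is a minimal sketch, not part of nlprep itself: `build_nlprep_command` and `run_nlprep` are hypothetical helper names, and it assumes that `--report` is a boolean switch that takes no value, which the help text suggests but does not state.

```python
import shlex
import shutil
import subprocess


def build_nlprep_command(dataset, outdir, cachedir=None, report=False):
    """Assemble an nlprep command line from flags listed in the help text.

    Hypothetical helper for illustration; it is not provided by nlprep.
    """
    cmd = ["nlprep", "--dataset", dataset, "--outdir", outdir]
    if cachedir is not None:
        cmd += ["--cachedir", cachedir]
    if report:
        # Assumption: --report is a switch without an argument.
        cmd.append("--report")
    return cmd


def run_nlprep(cmd):
    """Run the command only if the nlprep executable is on PATH."""
    if shutil.which("nlprep") is None:
        raise RuntimeError("nlprep is not installed; try `pip install nlprep`")
    return subprocess.run(cmd, check=True)


# Same dataset and output directory as the Quick Start example.
cmd = build_nlprep_command(
    "clas_udicstm", "sentiment", cachedir="cache", report=True
)
print(shlex.join(cmd))
```

Calling `run_nlprep(cmd)` would then execute the printed command. Building the argument list separately from running it makes it easy to inspect or log the exact invocation first.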
## Contributing
Thanks for your interest. There are many ways to contribute to this project. Get started [here](https://github.com/voidful/nlprep/blob/master/CONTRIBUTING.md).
## License 
* [License](https://github.com/voidful/nlprep/blob/master/LICENSE)
## Icons reference
Icons modified from Darius Dan, from www.flaticon.com
Icons modified from Freepik, from www.flaticon.com