https://github.com/lyeoni/nlp-cheatsheets
- Host: GitHub
- URL: https://github.com/lyeoni/nlp-cheatsheets
- Owner: lyeoni
- License: apache-2.0
- Created: 2020-07-16T08:06:06.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2020-08-26T06:34:32.000Z (almost 6 years ago)
- Last Synced: 2025-02-16T02:44:08.858Z (over 1 year ago)
- Language: Python
- Size: 9.77 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# nlp-cheatsheets
## Usage
#### split_corpus.sh
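Splits a given corpus into random train and test subsets. `TRAIN_RATIO` is the proportion of the dataset to include in the train split (for example, `0.8` puts 80% of the corpus in the train subset and the remaining 20% in the test subset).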
```shell
$ scripts/split_corpus.sh [INPUT_FILE] [OUTPUT_FILE] [TRAIN_RATIO]
# example
$ scripts/split_corpus.sh sample/zero_to_hundred.txt sample/zero_to_hundred 0.8
Split corpus into train(#80) and test(#20) corpus, respectively.
```
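
For readers who want the same behaviour from Python, the split described above can be sketched as follows. This is a minimal illustration, not the implementation in `scripts/split_corpus.sh`: the function name `split_corpus`, the `seed` parameter, the one-example-per-line assumption, and the `.train` / `.test` output suffixes are all hypothetical choices made for this example.

```python
import random
from pathlib import Path


def split_corpus(input_file, output_prefix, train_ratio, seed=None):
    """Shuffle the lines of a corpus and write train/test subsets.

    Assumes one example per line. The output names
    `<output_prefix>.train` and `<output_prefix>.test` are hypothetical;
    the original script may name its outputs differently.
    Returns the tuple (n_train, n_test).
    """
    if not 0.0 <= train_ratio <= 1.0:
        raise ValueError("train_ratio must be between 0 and 1")

    lines = Path(input_file).read_text(encoding="utf-8").splitlines()

    # A seeded generator makes the random split reproducible.
    rng = random.Random(seed)
    rng.shuffle(lines)

    # round() avoids losing a line to floating-point error,
    # e.g. 100 * 0.29 evaluates to 28.999999999999996.
    n_train = round(len(lines) * train_ratio)
    train, test = lines[:n_train], lines[n_train:]

    for suffix, subset in ((".train", train), (".test", test)):
        text = "".join(line + "\n" for line in subset)
        Path(f"{output_prefix}{suffix}").write_text(text, encoding="utf-8")

    return len(train), len(test)
```

Under these assumptions, calling `split_corpus("sample/zero_to_hundred.txt", "sample/zero_to_hundred", 0.8)` on a 100-line corpus would put 80 lines in the train file and 20 in the test file, matching the counts reported by the shell example.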