https://github.com/jamnicki/split-corpus
split-corpus is a package for dividing text corpora into meaningful parts, each as close to a specified size as possible.
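
To make the description concrete, here is a minimal sketch of one way such splitting could work. It is not the package's actual API or algorithm: the function name `split_corpus`, the `target_size` parameter (measured in characters), and the choice of blank-line-separated paragraphs as the "meaningful" unit are all assumptions made for illustration.

```python
import re


def split_corpus(text: str, target_size: int) -> list[str]:
    """Greedily pack paragraphs into parts whose length is near target_size.

    Hypothetical helper: the name, signature, and strategy are assumptions,
    not taken from the split-corpus repository.
    """
    # Assumption: blank-line-separated paragraphs are the "meaningful" units.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

    parts: list[str] = []
    current: list[str] = []
    current_len = 0

    for para in paragraphs:
        # Length added by this paragraph, plus 2 characters for the "\n\n"
        # separator when the current part already has content.
        added = len(para) + (2 if current else 0)

        # Close the current part if stopping here is at least as close to
        # the target size as it would be after adding this paragraph.
        if current and abs(target_size - current_len) <= abs(
            target_size - (current_len + added)
        ):
            parts.append("\n\n".join(current))
            current, current_len = [], 0
            added = len(para)

        current.append(para)
        current_len += added

    # Flush whatever remains as the final part.
    if current:
        parts.append("\n\n".join(current))
    return parts
```

A paragraph is never cut in half, so a part can overshoot or undershoot `target_size`; the greedy rule only picks whichever paragraph boundary lands nearer to the target.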
- Host: GitHub
- URL: https://github.com/jamnicki/split-corpus
- Owner: jamnicki
- Created: 2021-09-25T11:15:49.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2022-02-08T22:33:23.000Z (over 3 years ago)
- Last Synced: 2025-04-07T15:54:42.378Z (7 months ago)
- Topics: corpora, corpus-processing, large-files, natural-language-processing, nlp, processing
- Language: Python
- Homepage:
- Size: 12.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files: