Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/trinker/textshape
Tools for reshaping text data
data-reshaping manipulation r sentence-boundary-detection text-data text-formating tidy
Last synced: 20 May 2024
https://github.com/trinker/textreadr
Tools to uniformly read in text data including semi-structured transcripts
doc docx pdf-reading r read-transcripts text-data text-mining
Last synced: 20 May 2024
https://github.com/microsoft/DialoGPT
Large-scale pretraining for dialogue
data-processing dialogpt dialogue gpt-2 machine-learning pytorch text-data text-generation transformer
Last synced: 27 Apr 2024
https://github.com/PedroBarcha/old-books-dataset
Old book pages (with ground truth), formerly used for OCR studies. The set comes in several versions that differ in resolution and binarization. Noisy and denoised sets (produced by several methods) will eventually be uploaded.
binarization binarized-dataset books-dataset dataset ground-truth groundtruth ocr-database ocr-dataset old-books old-documents text text-data text-database
Last synced: 21 Apr 2024
https://github.com/asyml/texar-pytorch
Integrating the best of TensorFlow into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/
bert casl-project data-processing deep-learning dialog-systems gpt-2 machine-learning machine-translation natural-language-processing python pytorch roberta texar texar-pytorch text-data text-generation xlnet
Last synced: 19 Apr 2024
https://github.com/asyml/texar
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
bert casl-project data-processing deep-learning dialog-systems gpt-2 machine-learning machine-translation natural-language-processing python tensorflow texar text-data text-generation xlnet
Last synced: 11 Apr 2024
https://github.com/microsoft/GODEL
Large-scale pretrained models for goal-directed dialog
conversational-ai data-processing dialogpt dialogue dialogue-systems grounded-generation language-grounding language-model machine-learning pretrained-model pytorch text-data text-generation transformer transformers
Last synced: 28 Mar 2024