https://github.com/proprietary/msoffice-document-tokenizer
Creates a wordlist dictionary from the words/tokens in DOCX and PPTX (Microsoft Office Open XML) documents
- Host: GitHub
- URL: https://github.com/proprietary/msoffice-document-tokenizer
- Owner: proprietary
- License: other
- Created: 2019-12-16T05:09:03.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2019-12-16T05:09:26.000Z (over 5 years ago)
- Last Synced: 2025-02-10T07:24:52.142Z (5 months ago)
- Topics: microsoft-office, nlp, nlp-parsing, python, tokenizer
- Language: Python
- Size: 1000 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
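
The listing above does not show the repository's code, so the following is only a minimal sketch of how a wordlist could be built from DOCX and PPTX files, not the project's actual implementation. It assumes the standard Open XML package layout (`word/document.xml` for DOCX, `ppt/slides/slideN.xml` for PPTX) and uses only the Python standard library. The function name `extract_words` and the letters-only tokenization rule are hypothetical choices made for illustration.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# Text-run elements: w:t in WordprocessingML (DOCX), a:t in DrawingML (PPTX).
W_T = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}t"
A_T = "{http://schemas.openxmlformats.org/drawingml/2006/main}t"

# Assumed part names inside the package; headers, footers, and notes are ignored.
SLIDE_PART = re.compile(r"ppt/slides/slide\d+\.xml")


def extract_words(source):
    """Return a sorted list of unique lowercase words from a DOCX or PPTX file.

    `source` may be a path or a binary file-like object (hypothetical helper).
    """
    words = set()
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            if name != "word/document.xml" and not SLIDE_PART.fullmatch(name):
                continue
            root = ET.fromstring(archive.read(name))
            for element in root.iter():
                if element.tag in (W_T, A_T) and element.text:
                    # Illustrative tokenization: runs of letters only.
                    words.update(re.findall(r"[^\W\d_]+", element.text.lower()))
    return sorted(words)
```

Calling `extract_words("report.docx")` with a hypothetical file path would return the de-duplicated, sorted wordlist. Words that an editor has split across several text runs would come out as separate fragments in this sketch, so a more careful tokenizer would join the runs of each paragraph first.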