Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/deanmalmgren/textract
extract text from any document. no muss. no fuss.
- Host: GitHub
- URL: https://github.com/deanmalmgren/textract
- Owner: deanmalmgren
- License: MIT
- Created: 2014-07-03T20:36:59.000Z (almost 10 years ago)
- Default Branch: master
- Last Pushed: 2024-04-01T14:04:15.000Z (3 months ago)
- Last Synced: 2024-04-01T15:27:34.518Z (3 months ago)
- Topics: data-mining, natural-language-processing, python, text-mining
- Language: HTML
- Homepage: http://textract.readthedocs.io
- Size: 4.32 MB
- Stars: 3,754
- Watchers: 83
- Forks: 567
- Open Issues: 130
Metadata Files:
- Readme: README.rst
- Contributing: CONTRIBUTING.md
- License: LICENSE
Lists
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- Python-Awesome - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - extract text from any document. no muss. no fuss. (Awesome Python / Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- Awesome-Python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python-resources - GitHub - 39% open · ⏱️ 10.03.2022): (Network)
- awesome-stars - deanmalmgren/textract - extract text from any document. no muss. no fuss. (HTML)
- python-awesome-case1 - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- fucking-awesome-python - :octocat: textract - :star: 3574 :fork_and_knife: 535 - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python-master - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- my-awesome-stars - deanmalmgren/textract - extract text from any document. no muss. no fuss. (HTML)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-stars - deanmalmgren/textract - extract text from any document. no muss. no fuss. (HTML)
- awesome_python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-stars - deanmalmgren/textract - extract text from any document. no muss. no fuss. (HTML)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- join-awesome-python-interview-topics - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python-cn - textract
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python-cn - Official site
- awesome-python-clone - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-stars - deanmalmgren/textract - extract text from any document. no muss. no fuss. (HTML)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python4 - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-stars - deanmalmgren/textract - extract text from any document. no muss. no fuss. (HTML)
- awesome-python-resources-all - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- fucking-awesome-python - :octocat: textract - :star: 2901 :fork_and_knife: 413 - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-stars - deanmalmgren/textract - extract text from any document. no muss. no fuss. (HTML)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python-machine-learning-resources - GitHub - 39% open · ⏱️ 10.03.2022): (Data Reading, Writing, and Extraction)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-projects - textract - extract text from any document. no muss. no fuss. (HTML)
- awesome-stars - deanmalmgren/textract - `★3817` extract text from any document. no muss. no fuss. (HTML)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- starred-awesome - textract - extract text from any document. no muss. no fuss. (HTML)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome_python_with_star - deanmalmgren/textract
- awesome-python-data-science - textract - Extract text from any document. (Feature Extraction / Text/NLP)
- awesome-stars - textract - extract text from any document. no muss. no fuss. (HTML)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python-cn - Official site
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- git-github.com-vinta-awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python-master - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-stars - deanmalmgren/textract - extract text from any document. no muss. no fuss. (HTML)
- python-awesome - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- my-awesome - deanmalmgren/textract - extract text from any document. no muss. no fuss. (HTML)
- awesome-stars - textract
- awesomePython - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-stars - deanmalmgren/textract - extract text from any document. no muss. no fuss. (HTML)
- fucking_awesome_python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- Mpaperlee-awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- project-awesome - deanmalmgren/textract - extract text from any document. no muss. no fuss. (HTML)
- awesome_python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- best-of-awesome - textract
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- awesome-python - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- my-awesome-stars - deanmalmgren/textract - extract text from any document. no muss. no fuss. (HTML)
- awesome-python-cn - textract
- awesome-python-zh - textract - Extract text from any document, Word, PowerPoint, PDFs, etc. (Web Content Extracting)
- best-of-python - GitHub - 50% open · ⏱️ 10.03.2024): (Data Loading & Extraction)
- awesome-python - textract - extract text from any document. no muss. no fuss. ` 📝 2 years ago ` (Web Content Extracting)
README
.. NOTES FOR CREATING A RELEASE:
..
.. * bumpversion {major|minor|patch}
.. * git push && git push --tags
.. * twine upload -r textract dist/*
.. * convert into release https://github.com/deanmalmgren/textract/releases

textract
========

Extract text from any document. No muss. No fuss.

`Full documentation <http://textract.readthedocs.io>`__.

|Build Status| |Version| |Downloads| |Test Coverage| |Documentation Status|
|Updates| |Stars| |Forks|

.. |Build Status| image:: https://travis-ci.org/deanmalmgren/textract.svg?branch=master
   :target: https://travis-ci.org/deanmalmgren/textract

.. |Version| image:: https://img.shields.io/pypi/v/textract.svg
   :target: https://warehouse.python.org/project/textract/

.. |Downloads| image:: https://img.shields.io/pypi/dm/textract.svg
   :target: https://warehouse.python.org/project/textract/

.. |Test Coverage| image:: https://coveralls.io/repos/github/deanmalmgren/textract/badge.svg?branch=master
   :target: https://coveralls.io/github/deanmalmgren/textract?branch=master

.. |Documentation Status| image:: https://readthedocs.org/projects/textract/badge/?version=latest
   :target: https://readthedocs.org/projects/textract/?badge=latest

.. |Updates| image:: https://pyup.io/repos/github/deanmalmgren/textract/shield.svg
   :target: https://pyup.io/repos/github/deanmalmgren/textract/

.. |Stars| image:: https://img.shields.io/github/stars/deanmalmgren/textract.svg
   :target: https://github.com/deanmalmgren/textract/stargazers

.. |Forks| image:: https://img.shields.io/github/forks/deanmalmgren/textract.svg
   :target: https://github.com/deanmalmgren/textract/network
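
The README describes textract as a way to extract text from any document. As a rough illustration of what that looks like from Python, the sketch below wraps ``textract.process`` in a small helper. The helper name ``extract_text`` and the UTF-8 default are assumptions made for this example, not part of the library, and the call only works when textract and the system tools for the given file type are installed.

```python
def extract_text(path, encoding="utf-8"):
    """Return the text content of the document at `path` as a str.

    Hypothetical convenience wrapper for illustration; not part of textract.
    """
    # Imported lazily so this module still loads where textract is missing.
    import textract

    # textract.process selects a parser from the file extension and
    # returns the extracted text as bytes.
    raw = textract.process(path)

    # Assumption: the returned bytes are UTF-8 encoded unless told otherwise.
    return raw.decode(encoding)
```

Calling ``extract_text("report.docx")`` on a real file would return its text as a string; the file name here is only a placeholder.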