Projects in Awesome Lists tagged with tika-python
A curated list of projects in awesome lists tagged with tika-python.
https://github.com/chrismattmann/tika-python
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
buffer covid-19 detection extraction memex mime nlp nlp-library nlp-machine-learning parse parser-interface python recognition text-extraction text-recognition tika-python tika-server tika-server-jar translation-interface usc
Last synced: 14 May 2025
https://github.com/chrismattmann/tika-similarity
Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.
clustering cosine-distance cosine-similarity information-retrieval jaccard-similarity machine-learning metadata-features python similarity-score tika tika-python tika-similarity
Last synced: 02 May 2025
https://github.com/stumpylog/tika-client
A modern Python REST client for Apache Tika server
api hacktoberfest python python3 tika tika-python
Last synced: 25 Mar 2025
https://github.com/chrismattmann/drat
The Distributed Release Audit Tool (DRAT) for code analysis and verification.
license-checking mime oss rat tika tika-python
Last synced: 26 Mar 2025
https://github.com/kimtth/pyspark-tika-text-extraction
🚴♂️⛷ Data Lake, performance tuning for text extraction from a huge number of files.
apache-spark apache-tika data-pipeline datalake multithreading pyspark spark tika-python
Last synced: 17 Jul 2025
https://github.com/opensemanticsearch/tika-python.deb
tika-python as a Debian GNU/Linux and Ubuntu Linux package
debian debian-packaging tika-api tika-python tika-wrapper ubuntu ubuntu-packages
Last synced: 24 Feb 2025