An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with tika-python

A curated list of projects in awesome lists tagged with tika-python.

https://github.com/chrismattmann/tika-python

Tika-Python is a Python binding to the Apache Tika™ REST services, allowing Tika to be called natively from Python.

buffer covid-19 detection extraction memex mime nlp nlp-library nlp-machine-learning parse parser-interface python recognition text-extraction text-recognition tika-python tika-server tika-server-jar translation-interface usc

Last synced: 14 May 2025

https://github.com/chrismattmann/tika-similarity

Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on metadata features.

clustering cosine-distance cosine-similarity information-retrieval jaccard-similarity machine-learning metadata-features python similarity-score tika tika-python tika-similarity

Last synced: 02 May 2025

https://github.com/stumpylog/tika-client

A modern Python REST client for the Apache Tika server.

api hacktoberfest python python3 tika tika-python

Last synced: 25 Mar 2025

https://github.com/chrismattmann/drat

The Distributed Release Audit Tool (DRAT) for code analysis and verification.

license-checking mime oss rat tika tika-python

Last synced: 26 Mar 2025

https://github.com/kimtth/pyspark-tika-text-extraction

🚴‍♂️⛷ Data Lake: performance tuning for text extraction from a huge number of files.

apache-spark apache-tika data-pipeline datalake multithreading pyspark spark tika-python

Last synced: 17 Jul 2025

https://github.com/opensemanticsearch/tika-python.deb

tika-python as a Debian GNU/Linux and Ubuntu Linux package.

debian debian-packaging tika-api tika-python tika-wrapper ubuntu ubuntu-packages

Last synced: 24 Feb 2025