Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/apache/tika
The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
- Host: GitHub
- URL: https://github.com/apache/tika
- Owner: apache
- License: apache-2.0
- Created: 2009-05-21T02:12:11.000Z (over 15 years ago)
- Default Branch: main
- Last Pushed: 2024-12-30T13:15:34.000Z (12 days ago)
- Last Synced: 2025-01-04T11:09:18.144Z (7 days ago)
- Topics: content, extraction, java, metadata, tika
- Language: Java
- Homepage: https://tika.apache.org/
- Size: 235 MB
- Stars: 2,641
- Watchers: 98
- Forks: 795
- Open Issues: 53
Metadata Files:
- Readme: README.md
- Changelog: CHANGES.txt
- License: LICENSE.txt
Awesome Lists containing this project
- awesome - apache/tika - The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). (Java)
- awesome-ccamel - apache/tika - The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). (Java)