Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/apache/tika
The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
- Host: GitHub
- URL: https://github.com/apache/tika
- Owner: apache
- License: apache-2.0
- Created: 2009-05-21T02:12:11.000Z (over 15 years ago)
- Default Branch: main
- Last Pushed: 2025-02-04T11:29:33.000Z (7 days ago)
- Last Synced: 2025-02-05T21:13:10.038Z (5 days ago)
- Topics: content, extraction, java, metadata, tika
- Language: Java
- Homepage: https://tika.apache.org/
- Size: 235 MB
- Stars: 2,704
- Watchers: 98
- Forks: 796
- Open Issues: 52
Metadata Files:
- Readme: README.md
- Changelog: CHANGES.txt
- License: LICENSE.txt
Awesome Lists containing this project
- awesome - apache/tika - The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). (Java)
- awesome-ccamel - apache/tika - The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). (Java)