Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/lloydzhou/tika-parser
parse server based on tika
- Host: GitHub
- URL: https://github.com/lloydzhou/tika-parser
- Owner: lloydzhou
- License: mit
- Created: 2024-11-13T11:12:27.000Z (2 months ago)
- Default Branch: master
- Last Pushed: 2024-11-15T06:00:07.000Z (2 months ago)
- Last Synced: 2025-01-11T02:53:17.876Z (7 days ago)
- Language: Python
- Size: 5.86 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# tika-parser
parse server based on tika

## usage
1. start server
```
pip install regex tika
python main.py
```

2. http client
```
curl http://127.0.0.1:8888 -F 'file=@/path/to/file'

-->
[
  {
    "page_content": "xxxx",
    "metadata": {
      "offset": 0,
      "length": 1,
      "strip": 1,
      "source": "xx filename"
    }
  }
]
```
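
The same request can be sketched from Python using only the standard library. This is a minimal sketch, not part of the project: the helper names `build_multipart`, `parse_file`, and `summarize` are hypothetical. It assumes the server accepts the same `multipart/form-data` POST with a `file` field that `curl -F` sends, and returns the JSON list shown above. The meaning of `offset`, `length`, and `strip` is not documented in this README, so the sketch only passes them through.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data, as curl's -F 'file=@...' does.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def parse_file(path: str, url: str = "http://127.0.0.1:8888") -> list[dict]:
    """POST a local file to the parse server and decode the JSON response.

    Assumption: the server answers with a JSON list of objects that each
    carry "page_content" and "metadata", as in the README example.
    """
    p = Path(path)
    body, content_type = build_multipart("file", p.name, p.read_bytes())
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize(chunks: list[dict]) -> list[tuple[str, int, int]]:
    """Collect (source, offset, length) from each chunk's metadata."""
    return [
        (c["metadata"]["source"], c["metadata"]["offset"], c["metadata"]["length"])
        for c in chunks
    ]
```

With the server from step 1 running, `summarize(parse_file("/path/to/file"))` would list the metadata of each returned chunk.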