https://github.com/arc53/doc2md
Convert pdf and image files into markdown
- Host: GitHub
- URL: https://github.com/arc53/doc2md
- Owner: arc53
- License: MIT
- Created: 2024-11-20T11:42:04.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-01-22T06:31:37.000Z (9 months ago)
- Last Synced: 2025-06-02T01:16:09.734Z (4 months ago)
- Topics: doc, doc2md, llm, visual, vllm, vlm
- Language: TypeScript
- Homepage: https://doc2md.arc53.com/
- Size: 361 KB
- Stars: 7
- Watchers: 1
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
This project helps users convert documents (.pdf, .png, .jpg, .jpeg) into Markdown for easy ingestion into LLM workflows.
It uses a public LLM endpoint (doc2md), available [here](https://llm.arc53.com/docs#/).
This endpoint simply passes images, or PDFs converted to images, to a vision model and asks it to convert them into Markdown. Here is a quick Python snippet that performs this task:
```python
# `client` is your OpenAI-compatible client.
# `base64_image` is the page image in a form the endpoint accepts
# (assumption: for base64 input, a data URL such as "data:image/png;base64,...").
model = 'meta-llama/Llama-3.2-11B-Vision-Instruct'
prompt = "Convert the following image to just the markdown text, respond only with text and description of it if relevant."
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": prompt,
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": f"{base64_image}"
                },
            },
        ]
    }
]
# `max_new_tokens` caps the response length; `kwargs` is a dict of extra request options.
response = client.chat.completions.create(
    model=model,
    messages=messages,
    stream=False,
    max_tokens=int(max_new_tokens),
    **kwargs,
)
```
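
The snippet above leaves `base64_image` undefined. The following is a minimal sketch of one way to produce it from a local image file, assuming the endpoint accepts base64 input as a data URL, as OpenAI-compatible servers commonly do. The helper names `image_to_data_url` and `build_messages` are hypothetical and not part of doc2md.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path: str) -> str:
    """Read an image file and return it as a base64 data URL (assumed input format)."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Unsupported or unknown image type: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_messages(prompt: str, data_url: str) -> list:
    """Assemble a single user message holding the prompt text and one image."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }
    ]
```

With these helpers, `base64_image = image_to_data_url("page.png")` could feed the snippet above, and `build_messages(prompt, base64_image)` would yield the same `messages` structure. PDFs would first need to be rendered to page images, which this sketch does not cover.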