https://github.com/arc53/doc2md

Convert pdf and image files into markdown


Host: GitHub
URL: https://github.com/arc53/doc2md
Owner: arc53
License: mit
Created: 2024-11-20T11:42:04.000Z (11 months ago)
Default Branch: main
Last Pushed: 2025-01-22T06:31:37.000Z (9 months ago)
Last Synced: 2025-06-02T01:16:09.734Z (4 months ago)
Topics: doc, doc2md, llm, visual, vllm, vlm
Language: TypeScript
Homepage: https://doc2md.arc53.com/
Size: 361 KB
Stars: 7
Watchers: 1
Forks: 1
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE

README

This project converts documents (.pdf, .png, .jpg, .jpeg) into Markdown for easy ingestion into LLM workflows.

It uses a public LLM endpoint (doc2md), documented [here](https://llm.arc53.com/docs#/).

The endpoint simply passes images, or PDFs converted to images, to a vision model and asks it to convert them into Markdown.

Here is a quick Python snippet that performs this task:

```python
# `client` is your OpenAI-compatible client.
# `base64_image`, `max_new_tokens`, and `kwargs` are assumed to be
# defined by the caller; they are not set in this snippet.

model = 'meta-llama/Llama-3.2-11B-Vision-Instruct'

prompt = "Convert the following image to just the markdown text, respond only with text and description of it if relevant."

# A single user message carrying the instruction and the image.
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": prompt,
            },
            {
                "type": "image_url",
                "image_url": {
                    # The encoded image, passed as a string.
                    "url": f"{base64_image}"
                },
            },
        ]
    }
]

response = client.chat.completions.create(
    model=model,
    messages=messages,
    stream=False,
    max_tokens=int(max_new_tokens),
    **kwargs,
)
```
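The snippet above leaves open how `base64_image` is produced. The following is a minimal sketch of one way to build it, assuming the endpoint accepts base64 data URLs in the `image_url` field, as OpenAI-compatible vision APIs commonly do. The helper name `image_to_data_url` is hypothetical and not part of doc2md.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path: str) -> str:
    """Read an image file and return it as a base64 data URL.

    Hypothetical helper: the data-URL format is an assumption about
    what the endpoint expects, not something doc2md documents.
    """
    # Guess the MIME type from the file extension (.png, .jpg, .jpeg, ...).
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Unsupported or unknown image type: {path}")

    # Base64-encode the raw bytes and wrap them in a data URL.
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

With this in place, `base64_image = image_to_data_url("page.png")` could be passed to the request above. PDFs would first need to be rendered to images, one per page, which this sketch does not cover.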
