{"id":33185510,"url":"https://github.com/shoryasethia/markdrop","last_synced_at":"2026-01-14T08:20:01.994Z","repository":{"id":269592816,"uuid":"907939145","full_name":"shoryasethia/markdrop","owner":"shoryasethia","description":"A Python package for converting PDFs to markdown while extracting images and tables, generate descriptive text descriptions for extracted tables/images using several LLM clients. And many more functionalities. Markdrop is available on PyPI.","archived":false,"fork":false,"pushed_at":"2025-07-05T17:42:43.000Z","size":162,"stargazers_count":123,"open_issues_count":4,"forks_count":5,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-07-05T17:48:36.081Z","etag":null,"topics":["agents","docling","image-to-text","llm","markdrop","marker","markitdown","open-source","pdf-to-markdown","pdf-to-text","pypi-package","table-to-text"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/markdrop","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shoryasethia.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-24T16:47:00.000Z","updated_at":"2025-07-05T17:42:45.000Z","dependencies_parsed_at":"2025-03-01T22:45:10.645Z","dependency_job_id":null,"html_url":"https://github.com/shoryasethia/markdrop","commit_stats":null,"previous_names":["shoryasethia/markdrop"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/shoryasethia/markdrop","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shoryasethia%2F
markdrop","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shoryasethia%2Fmarkdrop/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shoryasethia%2Fmarkdrop/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shoryasethia%2Fmarkdrop/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shoryasethia","download_url":"https://codeload.github.com/shoryasethia/markdrop/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shoryasethia%2Fmarkdrop/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28413779,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-14T08:16:59.381Z","status":"ssl_error","status_checked_at":"2026-01-14T08:13:45.490Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agents","docling","image-to-text","llm","markdrop","marker","markitdown","open-source","pdf-to-markdown","pdf-to-text","pypi-package","table-to-text"],"created_at":"2025-11-16T05:00:20.093Z","updated_at":"2026-01-14T08:20:01.987Z","avatar_url":"https://github.com/shoryasethia.png","language":"Python","funding_links":[],"categories":["\u003ca name=\"conversion\"\u003e\u003c/a\u003eConversion"],"sub_categories":[],"readme":"\u003cp 
align=\"left\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/shoryasethia/markdrop/main/markdrop/src/markdrop-logo.png\" alt=\"Markdrop Logo\" width=\"200\" height=\"200\"/\u003e\n  \u003ch1 style=\"display: inline; font-size: 2em; vertical-align: middle; padding-left: 10px; margin: 0;\"\u003eMarkdrop\u003c/h1\u003e\n\u003c/p\u003e\n\n[![Downloads](https://static.pepy.tech/badge/markdrop)](https://pepy.tech/projects/markdrop)\n[![PyPI Version](https://img.shields.io/pypi/v/markdrop)](https://pypi.org/project/markdrop/)\n[![License](https://img.shields.io/github/license/shoryasethia/markdrop)](https://github.com/shoryasethia/markdrop/blob/main/LICENSE)\n[![Stars](https://img.shields.io/github/stars/shoryasethia/markdrop?style=social)](https://github.com/shoryasethia/markdrop/stargazers)\n[![Issues](https://img.shields.io/github/issues/shoryasethia/markdrop)](https://github.com/shoryasethia/markdrop/issues)\n[![Forks](https://img.shields.io/github/forks/shoryasethia/markdrop?style=social)](https://github.com/shoryasethia/markdrop/network/members)\n\nA Python package that converts PDFs to Markdown, extracts images and tables, and generates text descriptions of the extracted tables and images using several LLM clients, among other features. 
Markdrop is available on PyPI.\n\n## Features  \n\n- [x] PDF to Markdown conversion with formatting preservation using Docling\n- [x] Automatic image extraction with quality preservation using XRef Id\n- [x] Table detection using Microsoft's Table Transformer\n- [x] PDF URL support for core functionalities\n- [x] AI-powered image and table descriptions using multiple LLM providers\n- [x] Interactive HTML output with downloadable Excel tables\n- [x] Customizable image resolution and UI elements\n- [x] Comprehensive logging system\n- [ ] Support for other files\n- [ ] Streamlit/web interface\n\n## Installation  \n\n```bash  \npip install markdrop  \n```  \n\nIf you are using the CLI, you can install the package in editable mode:\n```bash\npython -m pip install -e .\n```\n\n#### Python Package Index (PyPI) Page: https://pypi.org/project/markdrop\n\n## Quick Start  \n\n[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1oApTrP_kjNn0s1tpE0SIWRyGzYfflQsi?usp=sharing)\n[![Watch the demo](https://img.shields.io/badge/YouTube-Demo-red?logo=youtube\u0026logoColor=white)](https://youtu.be/2xg7W0-oiw0)\n\n### Using the MarkDrop CLI\n\nAfter installing the package, you can use the `markdrop` command-line interface.\n\n**1. Convert PDF to Markdown and HTML:**\n\n```bash\nmarkdrop convert \u003cinput_path\u003e --output_dir \u003coutput_directory\u003e [--add_tables]\n```\n*   `\u003cinput_path\u003e`: Path or URL to the input PDF file.\n*   `\u003coutput_directory\u003e`: Directory to save output files (default: `output`).\n*   `--add_tables`: (Optional) Add downloadable tables to the HTML output.\n\n**Example:**\n```bash\nmarkdrop convert my_document.pdf --output_dir processed_docs --add_tables\n```\n\n**2. 
Generate Descriptions for Images and Tables in a Markdown File:**\n\n```bash\nmarkdrop describe \u003cinput_path\u003e --output_dir \u003coutput_directory\u003e --ai_provider \u003cprovider\u003e [--remove_images] [--remove_tables]\n```\n*   `\u003cinput_path\u003e`: Path to the markdown file.\n*   `\u003coutput_directory\u003e`: Directory to save the processed file (default: `output`).\n*   `\u003cprovider\u003e`: AI provider to use (`gemini` or `openai`).\n*   `--remove_images`: (Optional) Remove images from the markdown file.\n*   `--remove_tables`: (Optional) Remove tables from the markdown file.\n\n**Example:**\n```bash\nmarkdrop describe my_markdown.md --output_dir described_content --ai_provider gemini --remove_images\n```\n\n**3. Analyze Images in a PDF File:**\n\n```bash\nmarkdrop analyze \u003cinput_path\u003e --output_dir \u003coutput_directory\u003e [--save_images]\n```\n*   `\u003cinput_path\u003e`: Path or URL to the PDF file.\n*   `\u003coutput_directory\u003e`: Directory to save analysis results (default: `output/analysis`).\n*   `--save_images`: (Optional) Save extracted images.\n\n**Example:**\n```bash\nmarkdrop analyze report.pdf --output_dir pdf_analysis --save_images\n```\n\n**4. Set Up API Keys for AI Providers:**\n\n```bash\nmarkdrop setup \u003cprovider\u003e\n```\n*   `\u003cprovider\u003e`: The AI provider to set up (`gemini` or `openai`).\n\n**Example:**\n```bash\nmarkdrop setup gemini\n```\n\n**5. 
Generate Descriptions for Images (Standalone):**\n\n```bash\nmarkdrop generate \u003cinput_path\u003e --output_dir \u003coutput_directory\u003e [--prompt \u003cprompt_text\u003e] [--llm_client \u003cclient1\u003e \u003cclient2\u003e ...]\n```\n*   `\u003cinput_path\u003e`: Path to an image file or a directory of images.\n*   `\u003coutput_directory\u003e`: Directory to save the descriptions CSV (default: `output/descriptions`).\n*   `--prompt`: (Optional) Prompt for the AI model (default: \"Describe the image in detail.\").\n*   `--llm_client`: (Optional) List of LLM clients to use (default: `gemini`). Available: `qwen`, `gemini`, `openai`, `llama-vision`, `molmo`, `pixtral`.\n\n**Example:**\n```bash\nmarkdrop generate my_images/ --output_dir image_descriptions --prompt \"What is in this picture?\" --llm_client gemini openai\n```\n\n### Advanced PDF Processing with MarkDrop (Python API)\n\n```python\nfrom markdrop import markdrop, MarkDropConfig, add_downloadable_tables\nfrom pathlib import Path\nimport logging\n\n# Configure processing options\nconfig = MarkDropConfig(\n    image_resolution_scale=2.0,        # Scale factor for image resolution\n    download_button_color='#444444',   # Color for download buttons in HTML\n    log_level=logging.INFO,           # Logging detail level\n    log_dir='logs',                   # Directory for log files\n    excel_dir='markdropped-excel-tables'  # Directory for Excel table exports\n)\n\n# Process PDF document\ninput_doc_path = \"path/to/input.pdf\"\noutput_dir = Path('output_directory')\n\n# Convert PDF and generate HTML with images and tables\nhtml_path = markdrop(input_doc_path, str(output_dir), config)\n\n# Add interactive table download functionality\ndownloadable_html = add_downloadable_tables(html_path, config)\n```\n\n### AI-Powered Content Analysis (Python API)\n\n```python\nfrom markdrop import setup_keys, process_markdown, ProcessorConfig, AIProvider, logger\nfrom pathlib import Path\n\n# Set up API keys for AI 
providers\nsetup_keys(key='gemini')  # or setup_keys(key='openai')\n\n# Configure AI processing options\nconfig = ProcessorConfig(\n    input_path=\"path/to/markdown/file.md\",    # Input markdown file path\n    output_dir=Path(\"output_directory\"),      # Output directory\n    ai_provider=AIProvider.GEMINI,            # AI provider (GEMINI or OPENAI)\n    remove_images=False,                      # Keep or remove original images\n    remove_tables=False,                      # Keep or remove original tables\n    table_descriptions=True,                  # Generate table descriptions\n    image_descriptions=True,                  # Generate image descriptions\n    max_retries=3,                           # Number of API call retries\n    retry_delay=2,                           # Delay between retries in seconds\n    gemini_model_name=\"gemini-2.5-flash\",    # Gemini model for images\n    gemini_text_model_name=\"gemini-2.5-flash\",      # Gemini model for text\n    image_prompt=DEFAULT_IMAGE_PROMPT,        # Custom prompt for image analysis\n    table_prompt=DEFAULT_TABLE_PROMPT         # Custom prompt for table analysis\n)\n\n# Process markdown with AI descriptions\noutput_path = process_markdown(config)\n```\n\n### Image Description Generation (Python API)\n\n```python\nfrom markdrop import generate_descriptions\n\nprompt = \"Give textual highly detailed descriptions from this image ONLY, nothing else.\"\ninput_path = 'path/to/img_file/or/dir'\noutput_dir = 'data/output'\nllm_clients = ['gemini', 'llama-vision']  # Available: ['qwen', 'gemini', 'openai', 'llama-vision', 'molmo', 'pixtral']\n\ngenerate_descriptions(\n    input_path=input_path,\n    output_dir=output_dir,\n    prompt=prompt,\n    llm_client=llm_clients\n)\n```\n\n## API Reference  \n\n### Core Functions\n\n#### markdrop(input_doc_path: str, output_dir: str, config: Optional[MarkDropConfig] = None) -\u003e Path\nConverts PDF to markdown and HTML with enhanced features.\n\nParameters:\n- 
`input_doc_path` (str): Path to input PDF file\n- `output_dir` (str): Output directory path\n- `config` (MarkDropConfig, optional): Configuration options for processing\n\n#### add_downloadable_tables(html_path: Path, config: Optional[MarkDropConfig] = None) -\u003e Path\nAdds interactive table download functionality to HTML output.\n\nParameters:\n- `html_path` (Path): Path to HTML file\n- `config` (MarkDropConfig, optional): Configuration options\n\n### Configuration Classes\n\n#### MarkDropConfig\nConfiguration for PDF processing:\n- `image_resolution_scale` (float): Scale factor for image resolution (default: 2.0)\n- `download_button_color` (str): HTML color code for download buttons (default: '#444444')\n- `log_level` (int): Logging level (default: logging.INFO)\n- `log_dir` (str): Directory for log files (default: 'logs')\n- `excel_dir` (str): Directory for Excel table exports (default: 'markdropped-excel-tables')\n\n#### ProcessorConfig\nConfiguration for AI processing:\n- `input_path` (str): Path to markdown file\n- `output_dir` (str): Output directory path\n- `ai_provider` (AIProvider): AI provider selection (GEMINI or OPENAI)\n- `remove_images` (bool): Whether to remove original images\n- `remove_tables` (bool): Whether to remove original tables\n- `table_descriptions` (bool): Generate table descriptions\n- `image_descriptions` (bool): Generate image descriptions\n- `max_retries` (int): Maximum API call retries\n- `retry_delay` (int): Delay between retries in seconds\n- `gemini_model_name` (str): Gemini model for image processing\n- `gemini_text_model_name` (str): Gemini model for text processing\n- `image_prompt` (str): Custom prompt for image analysis\n- `table_prompt` (str): Custom prompt for table analysis\n\n## Contributing  \n\nWe welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details.  \n\n### Development Setup  \n\n1. 
Clone the repository:  \n```bash  \ngit clone https://github.com/shoryasethia/markdrop.git  \ncd markdrop  \n```  \n\n2. Create a virtual environment:  \n```bash  \npython -m venv venv  \nsource venv/bin/activate  # On Windows: venv\\Scripts\\activate  \n```  \n\n3. Install development dependencies:  \n```bash  \npip install -r requirements.txt  \n```  \n\n## Project Structure  \n\n```bash  \nmarkdrop/  \n├── LICENSE  \n├── README.md  \n├── CONTRIBUTING.md  \n├── CHANGELOG.md  \n├── requirements.txt  \n├── setup.py  \n└── markdrop/ \n    ├── __init__.py \n    ├── src\n    │   └── markdrop-logo.png\n    ├── main.py\n    ├── process.py\n    ├── api_setup.py\n    ├── parse.py\n    ├── utils.py  \n    ├── helper.py\n    ├── ignore_warnings.py\n    ├── run.py\n    └── models/\n        ├── __init__.py\n        ├── .env\n        ├── img_descriptions.py\n        ├── logger.py\n        ├── model_loader.py\n        ├── responder.py\n        └── setup_keys.py  \n```  \n## Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=shoryasethia/markdrop\u0026type=Timeline)](https://star-history.com/#shoryasethia/markdrop\u0026Timeline)\n\n## License  \n\nThis project is licensed under the GNU General Public License v3.0 (GPL-3.0) - see the [LICENSE](LICENSE) file for details.  \n\n## Changelog  \n\nSee [CHANGELOG.md](CHANGELOG.md) for version history.  \n\n## Code of Conduct  \n\nPlease note that this project follows our [Code of Conduct](CODE_OF_CONDUCT.md).  \n\n## Support  \n\n- [Open an issue](https://github.com/shoryasethia/markdrop/issues)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshoryasethia%2Fmarkdrop","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshoryasethia%2Fmarkdrop","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshoryasethia%2Fmarkdrop/lists"}