{"id":23186768,"url":"https://github.com/9bow/markitdown-api-fly-io","last_synced_at":"2026-02-06T05:03:15.219Z","repository":{"id":268460967,"uuid":"904062462","full_name":"9bow/markitdown-api-fly-io","owner":"9bow","description":"Simple FastAPI wrapper for Document-to-Markdown conversion using Microsoft's MarkItDown library.","archived":false,"fork":false,"pushed_at":"2024-12-26T07:22:05.000Z","size":17,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-10-20T15:53:46.727Z","etag":null,"topics":["doc-to-md","document-to-markdown","markitdown","markitdown-api"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/9bow.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-12-16T07:22:38.000Z","updated_at":"2025-05-02T04:36:11.000Z","dependencies_parsed_at":"2025-06-11T06:06:16.650Z","dependency_job_id":"f003e8d2-b231-4593-abd5-0fe3cf7983ac","html_url":"https://github.com/9bow/markitdown-api-fly-io","commit_stats":null,"previous_names":["9bow/markitdown-api-fly-io"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/9bow/markitdown-api-fly-io","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/9bow%2Fmarkitdown-api-fly-io","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/9bow%2Fmarkitdown-api-fly-io/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/9bow%2Fmarkitdown-api-fly-io/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/9bow%2Fmarkitdown-api-fly-io/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/9bow","download_url":"https://codeload.github.com/9bow/markitdown-api-fly-io/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/9bow%2Fmarkitdown-api-fly-io/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29151590,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-06T02:39:25.012Z","status":"ssl_error","status_checked_at":"2026-02-06T02:37:22.784Z","response_time":59,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["doc-to-md","document-to-markdown","markitdown","markitdown-api"],"created_at":"2024-12-18T10:17:29.833Z","updated_at":"2026-02-06T05:03:15.186Z","avatar_url":"https://github.com/9bow.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MarkItDown API\n\nA REST API service that converts documents and web content to Markdown. Supports various file formats using [Microsoft's MarkItDown](https://github.com/microsoft/markitdown) and web content extraction using [Trafilatura](https://github.com/adbar/trafilatura) and [python-readability](https://github.com/buriy/python-readability).\n\n## Features\n\n- Document-to-Markdown conversion via file upload or URL\n  - Office documents (DOCX, XLSX, PPTX)\n  - PDF files\n  - Images (PNG, JPEG, GIF, WebP)\n  - Data files (CSV, JSON, XML)\n- Web content extraction and conversion\n  - Primary extraction using Trafilatura\n  - Fallback to python-readability for robust content extraction\n  - Intelligent character encoding detection\n  - Clean Markdown output with preserved formatting\n- Rich metadata for conversion results\n- API Key-based authentication and OpenAPI documentation\n- Robust content type detection and handling\n\n## Installation \u0026 Development\n\n### Prerequisites\n\n- Python 3.8 or higher\n- pip (Python package installer)\n- Virtual environment (recommended)\n\n### Clone \u0026 Local Setup\n\n1. Clone the repository\n```bash\ngit clone https://github.com/9bow/markitdown-api-fly-io.git\ncd markitdown-api-fly-io\n```\n\n2. Create and activate virtual environment (recommended)\n```bash\npython -m venv venv\nsource venv/bin/activate  # On Windows: venv\\Scripts\\activate\n```\n\n3. Install dependencies\n```bash\npip install -r requirements.txt\n```\n\n4. Configure environment variables\n```bash\n# Create .env file with the following variables (via .env.template)\ncp .env.example .env\n# Update the following variables in .env\nVERSION=0.0.1\nMAX_DOWNLOAD_SIZE=52428800  # 50MB in bytes\nTIMEOUT_SECONDS=30\n```\n\n5. Run development server\n```bash\ncd app/\nuvicorn main:app --reload\n```\n\n### Deployment (via Fly.io)\n\n1. Install Fly.io CLI\n```bash\ncurl -L https://fly.io/install.sh | sh\n```\n\n2. Login and deploy\n```bash\nflyctl auth login\nflyctl launch\nflyctl secrets set API_KEY=\"your-secure-api-key\"\nflyctl deploy\n```\n\n## API Usage\n\n### Authentication\n\nAll API endpoints require authentication using either:\n- API key in the `X-API-Key` header\n- Bearer token in the `Authorization` header\n\n### Endpoints\n\n#### Health Check\n```bash\ncurl -X GET \\\n  -H \"X-API-Key: your-secure-api-key\" \\\n  http://localhost:8000/health\n```\n\n#### Convert Document\n```bash\n# via file upload\ncurl -X POST \\\n  -H \"X-API-Key: your-secure-api-key\" \\\n  -F \"file=@document.pdf\" \\\n  http://localhost:8000/convert\n\n# via file URL\ncurl -X POST \\\n  -H \"X-API-Key: your-secure-api-key\" \\\n  -F \"url=https://example.com/document.pdf\" \\\n  http://localhost:8000/convert\n```\n\n### Response Format\n\nSuccessful conversions return a JSON object with the following structure:\n```json\n{\n  \"result\": \"# Converted Markdown Content...\",\n  \"metadata\": {\n    \"content_type\": \"application/pdf\",\n    \"file_size\": 12345,\n    \"processing_time\": 0.532,\n    \"original_url\": \"https://example.com/document.pdf\",\n    \"conversion_method\": \"markitdown\"\n  }\n}\n```\n\n## Error Handling\n\nThe API returns appropriate HTTP status codes and error messages:\n- 400: Bad Request (invalid input)\n  - Unsupported file format\n  - Invalid URL\n  - Missing file/URL\n- 401: Unauthorized (invalid API key)\n- 408: Request Timeout\n- 413: Payload Too Large (file size exceeds limit)\n- 500: Internal Server Error\n\n## Content Type Support\n\n### Documents\n- PDF (`.pdf`)\n- Microsoft Word (`.docx`)\n- Microsoft Excel (`.xlsx`)\n- Microsoft PowerPoint (`.pptx`)\n\n### Web Content\n- HTML pages (`.html`, `.htm`)\n- XML documents (`.xml`)\n\n### Data Files\n- CSV (`.csv`)\n- JSON (`.json`)\n- XML (`.xml`)\n\n### Images\n- JPEG (`.jpg`, `.jpeg`)\n- PNG (`.png`)\n- GIF (`.gif`)\n- WebP (`.webp`)\n\n## Development\n\n### Running Tests\n```bash\npytest\n```\n\n### Type Checking\n```bash\nmypy app/\n```\n\n## License\n\nThis project is licensed under the MIT License - see the LICENSE file for details.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F9bow%2Fmarkitdown-api-fly-io","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F9bow%2Fmarkitdown-api-fly-io","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F9bow%2Fmarkitdown-api-fly-io/lists"}