{"id":24481067,"url":"https://github.com/uhobnil/markitdown-rs","last_synced_at":"2025-10-16T21:38:20.787Z","repository":{"id":272903138,"uuid":"916557372","full_name":"uhobnil/markitdown-rs","owner":"uhobnil","description":"A Rust library designed to facilitate the conversion of various document formats into markdown text.","archived":false,"fork":false,"pushed_at":"2025-06-23T02:17:08.000Z","size":2129,"stargazers_count":10,"open_issues_count":2,"forks_count":3,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-07-07T02:47:09.483Z","etag":null,"topics":["deepseek","docx","excel","image","langchain","markdown","openai","pdf","xml"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/uhobnil.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-01-14T10:31:29.000Z","updated_at":"2025-06-27T02:42:55.000Z","dependencies_parsed_at":null,"dependency_job_id":"b4b8e441-0bd8-447a-b28c-8d31e7a2705b","html_url":"https://github.com/uhobnil/markitdown-rs","commit_stats":null,"previous_names":["uhobnil/markitdown-rs"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/uhobnil/markitdown-rs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uhobnil%2Fmarkitdown-rs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uhobnil%2Fmarkitdown-rs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uhobnil%2Fmarkitdown-rs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uhobnil%2Fmarkitdown-rs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/uhobnil","download_url":"https://codeload.github.com/uhobnil/markitdown-rs/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uhobnil%2Fmarkitdown-rs/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265672332,"owners_count":23808842,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deepseek","docx","excel","image","langchain","markdown","openai","pdf","xml"],"created_at":"2025-01-21T11:18:51.696Z","updated_at":"2025-10-16T21:38:20.759Z","avatar_url":"https://github.com/uhobnil.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# markitdown-rs\n\nmarkitdown-rs is a Rust library designed to facilitate the conversion of various document formats into markdown text. It is a Rust implementation of the original [markitdown](https://github.com/microsoft/markitdown) Python library.\n\n## Features\n\nIt supports:\n\n- [x] Excel(.xlsx)\n- [x] Word(.docx)\n- [x] PowerPoint\n- [x] PDF\n- [x] Images\n- [ ] Audio\n- [x] HTML\n- [x] CSV(UTF-8)\n- [x] Text-based formats (.xml, .rss, .atom)\n- [x] ZIP\n\n## Usage\n\n### Command-Line\n\n#### Installation\n\n```\ncargo install markitdown\n```\n\n#### Convert a File\n\n```\nmarkitdown path-to-file.pdf\n```\n\nOr use -o to specify the output file:\n\n```\nmarkitdown path-to-file.pdf -o document.md\n```\n\n### Rust API\n\n#### Installation\n\nAdd the following to your `Cargo.toml`:\n\n```toml\n[dependencies]\nmarkitdown = \"0.1.10\"\n```\n\n#### Initialize MarkItDown\n\n```rust\nuse markitdown::MarkItDown;\n\nlet mut md = MarkItDown::new();\n```\n\n#### Convert a File\n\n```rust\nuse markitdown::{ConversionOptions, DocumentConverterResult, MarkItDown};\n\n// Basic conversion - file type is auto-detected\nlet result = md.convert(\"path/to/file.xlsx\", None)?;\n\n// Or explicitly specify options\nlet options = ConversionOptions {\n    file_extension: Some(\".xlsx\".to_string()),\n    url: None,\n    llm_client: None,\n    llm_model: None,\n};\n\nlet result = md.convert(\"path/to/file.xlsx\", Some(options))?;\n\n// To use Large Language Models for image descriptions\nlet options = ConversionOptions {\n    file_extension: Some(\".jpg\".to_string()),\n    url: None,\n    llm_client: Some(\"gemini\".to_string()),\n    llm_model: Some(\"gemini-2.0-flash\".to_string()),\n};\n\nlet result = md.convert(\"path/to/file.jpg\", Some(options))?;\n\nif let Some(conversion_result) = result {\n    println!(\"Converted Text: {}\", conversion_result.text_content);\n} else {\n    println!(\"Conversion failed or unsupported file type.\");\n}\n```\n\n#### Convert from Bytes\n\n```rust\nuse markitdown::{ConversionOptions, MarkItDown};\n\nlet file_bytes = std::fs::read(\"path/to/file.pdf\")?;\n\n// Auto-detect file type from bytes\nlet result = md.convert_bytes(\u0026file_bytes, None)?;\n\n// Or specify options explicitly\nlet options = ConversionOptions {\n    file_extension: Some(\".pdf\".to_string()),\n    url: None,\n    llm_client: None,\n    llm_model: None,\n};\n\nlet result = md.convert_bytes(\u0026file_bytes, Some(options))?;\n\nif let Some(conversion_result) = result {\n    println!(\"Converted Text: {}\", conversion_result.text_content);\n}\n```\n\n#### Register a Custom Converter\n\nYou can extend MarkItDown by implementing the `DocumentConverter` trait for your custom converters and registering them:\n\n```rust\nuse markitdown::{DocumentConverter, DocumentConverterResult, ConversionOptions, MarkItDown};\nuse markitdown::error::MarkitdownError;\n\nstruct MyCustomConverter;\n\nimpl DocumentConverter for MyCustomConverter {\n    fn convert(\n        \u0026self,\n        local_path: \u0026str,\n        args: Option\u003cConversionOptions\u003e,\n    ) -\u003e Result\u003cDocumentConverterResult, MarkitdownError\u003e {\n        // Implement file conversion logic\n        todo!()\n    }\n\n    fn convert_bytes(\n        \u0026self,\n        bytes: \u0026[u8],\n        args: Option\u003cConversionOptions\u003e,\n    ) -\u003e Result\u003cDocumentConverterResult, MarkitdownError\u003e {\n        // Implement bytes conversion logic\n        todo!()\n    }\n}\n\nlet mut md = MarkItDown::new();\nmd.register_converter(Box::new(MyCustomConverter));\n```\n\n## License\n\nMarkItDown is licensed under the MIT License. See `LICENSE` for more details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fuhobnil%2Fmarkitdown-rs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fuhobnil%2Fmarkitdown-rs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fuhobnil%2Fmarkitdown-rs/lists"}