{"id":38146515,"url":"https://github.com/fleetingbytes/rtfparse","last_synced_at":"2026-01-16T22:56:35.006Z","repository":{"id":57463110,"uuid":"323251306","full_name":"fleetingbytes/rtfparse","owner":"fleetingbytes","description":"RTF Parser","archived":false,"fork":false,"pushed_at":"2025-09-10T16:07:24.000Z","size":92,"stargazers_count":18,"open_issues_count":12,"forks_count":10,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-11-27T10:59:22.752Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fleetingbytes.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2020-12-21T06:33:36.000Z","updated_at":"2025-10-30T13:55:17.000Z","dependencies_parsed_at":"2024-05-17T05:32:16.543Z","dependency_job_id":"89b16dd0-08c9-4250-b8f3-ea44e1a72779","html_url":"https://github.com/fleetingbytes/rtfparse","commit_stats":null,"previous_names":["nagidal/rtfparse"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/fleetingbytes/rtfparse","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fleetingbytes%2Frtfparse","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fleetingbytes%2Frtfparse/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fleetingbytes%2Frtfparse/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/repositories/fleetingbytes%2Frtfparse/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fleetingbytes","download_url":"https://codeload.github.com/fleetingbytes/rtfparse/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fleetingbytes%2Frtfparse/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28486949,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T22:54:02.790Z","status":"ssl_error","status_checked_at":"2026-01-16T22:50:10.344Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-16T22:56:34.890Z","updated_at":"2026-01-16T22:56:34.998Z","avatar_url":"https://github.com/fleetingbytes.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# rtfparse\n\nParses Microsoft's Rich Text Format (RTF) documents. It creates an in-memory object which represents the tree structure of the RTF document. This object can in turn be rendered by using one of the renderers.\nSo far, rtfparse provides only one renderer (`HTML_Decapsulator`) which liberates the HTML code encapsulated in RTF. 
This will come in handy, for example, if you ever need to extract the HTML from an HTML-formatted email message saved by Microsoft Outlook.\n\nMS Outlook also tends to use RTF compression, so the CLI of rtfparse can optionally decompress that, too.\n\nYou can of course write your own renderers of parsed RTF documents and consider contributing them to this project.\n\n\n# Installation\n\nInstall rtfparse with pip:\n\n    pip install rtfparse\n\nInstallation creates an executable file `rtfparse` in your Python scripts folder, which should be on your `$PATH`.\n\n# Usage From Command Line\n\nUse the `rtfparse` executable from the command line. Run `rtfparse --help` to see the available options.\n\nrtfparse writes its logs to the following files in `~/rtfparse/`:\n\n```\nrtfparse.debug.log\nrtfparse.info.log\nrtfparse.errors.log\n```\n\n## Example: Decapsulate HTML from an uncompressed RTF file\n\n    rtfparse --rtf-file \"path/to/rtf_file.rtf\" --decapsulate-html --output-file \"path/to/extracted.html\"\n\n## Example: Decapsulate HTML from an MS Outlook email file\n\nFor this, the CLI of rtfparse uses [extract_msg](https://github.com/TeamMsgExtractor/msg-extractor) and [compressed_rtf](https://github.com/delimitry/compressed_rtf).\n\n    rtfparse --msg-file \"path/to/email.msg\" --decapsulate-html --output-file \"path/to/extracted.html\"\n\n## Example: Only decompress the RTF from an MS Outlook email file\n\n    rtfparse --msg-file \"path/to/email.msg\" --output-file \"path/to/extracted.rtf\"\n\n## Example: Decapsulate HTML from an MS Outlook email file and save (and later embed) the attachments\n\nWhen extracting the RTF from the `.msg` file, you can save the attachments (including images embedded in the email text) in a directory:\n\n    rtfparse --msg-file \"path/to/email.msg\" --output-file \"path/to/extracted.rtf\" --attachments-dir \"path/to/dir\"\n\nIn `rtfparse` version 1.x you will be able to embed these images in the decapsulated HTML. 
This functionality will be provided by the package [embedimg](https://github.com/fleetingbytes/embedimg).\n\n    rtfparse --msg-file \"path/to/email.msg\" --output-file \"path/to/extracted.rtf\" --attachments-dir \"path/to/dir\" --embed-img\n\nIn the current version, the option `--embed-img` does nothing.\n\n# Programmatic usage in a Python module\n\n## Decapsulate HTML from an uncompressed RTF file\n\n```py\nfrom pathlib import Path\nfrom rtfparse.parser import Rtf_Parser\nfrom rtfparse.renderers.html_decapsulator import HTML_Decapsulator\n\nsource_path = Path(r\"path/to/your/rtf/document.rtf\")\ntarget_path = Path(r\"path/to/your/html/decapsulated.html\")\n# Create parent directory of `target_path` if it does not already exist:\ntarget_path.parent.mkdir(parents=True, exist_ok=True)\n\nparser = Rtf_Parser(rtf_path=source_path)\nparsed = parser.parse_file()\n\nrenderer = HTML_Decapsulator()\n\nwith open(target_path, mode=\"w\", encoding=\"utf-8\") as html_file:\n    renderer.render(parsed, html_file)\n```\n\n## Decapsulate HTML from an MS Outlook msg file\n\n```py\nfrom pathlib import Path\nfrom extract_msg import openMsg\nfrom compressed_rtf import decompress\nfrom io import BytesIO\nfrom rtfparse.parser import Rtf_Parser\nfrom rtfparse.renderers.html_decapsulator import HTML_Decapsulator\n\n\nsource_file = Path(\"path/to/your/source.msg\")\ntarget_file = Path(r\"path/to/your/target.html\")\n# Create parent directory of `target_file` if it does not already exist:\ntarget_file.parent.mkdir(parents=True, exist_ok=True)\n\n# Get a decompressed RTF bytes buffer from the MS Outlook message\nmsg = openMsg(source_file)\ndecompressed_rtf = decompress(msg.compressedRtf)\nrtf_buffer = BytesIO(decompressed_rtf)\n\n# Parse the rtf buffer\nparser = Rtf_Parser(rtf_file=rtf_buffer)\nparsed = parser.parse_file()\n\n# Decapsulate the HTML from the parsed RTF\ndecapsulator = HTML_Decapsulator()\nwith open(target_file, mode=\"w\", encoding=\"utf-8\") as html_file:\n    
decapsulator.render(parsed, html_file)\n```\n\n# RTF Specification Links\n\n* [RTF Informative References](https://learn.microsoft.com/en-us/openspecs/exchange_server_protocols/ms-oxrtfcp/85c0b884-a960-4d1a-874e-53eeee527ca6)\n* [RTF Specification 1.9.1](https://go.microsoft.com/fwlink/?LinkId=120924)\n* [RTF Extensions, MS-OXRTFEX](https://docs.microsoft.com/en-us/openspecs/exchange_server_protocols/ms-oxrtfex/411d0d58-49f7-496c-b8c3-5859b045f6cf)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffleetingbytes%2Frtfparse","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffleetingbytes%2Frtfparse","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffleetingbytes%2Frtfparse/lists"}