{"id":13910701,"url":"https://github.com/artiomn/markdown_articles_tool","last_synced_at":"2025-04-09T09:05:49.001Z","repository":{"id":37237409,"uuid":"213034544","full_name":"artiomn/markdown_articles_tool","owner":"artiomn","description":"Parse markdown article, download images and replace images URL's with local paths","archived":false,"fork":false,"pushed_at":"2024-05-22T20:05:43.000Z","size":309,"stargazers_count":122,"open_issues_count":7,"forks_count":25,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-02T08:11:31.029Z","etag":null,"topics":["article","article-extracting","article-extractor","articles","downloader","html","image-manipulation","images","markdown","markdown-articles","markdown-converter","markdown-parser","markdown-to-html","markdown-to-pdf","md","pdf","python-library","toolset"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/artiomn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-10-05T16:40:51.000Z","updated_at":"2025-03-29T12:00:52.000Z","dependencies_parsed_at":"2023-11-07T04:05:32.085Z","dependency_job_id":"325963e5-c759-4dcd-a119-5cb6aa9b2e39","html_url":"https://github.com/artiomn/markdown_articles_tool","commit_stats":{"total_commits":126,"total_committers":3,"mean_commits":42.0,"dds":0.3492063492063492,"last_synced_commit":"87440aef96e21f33ba4ec2692c69c24150374e32"},"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/artiomn%2Fmarkdown_articles_tool","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/artiomn%2Fmarkdown_articles_tool/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/artiomn%2Fmarkdown_articles_tool/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/artiomn%2Fmarkdown_articles_tool/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/artiomn","download_url":"https://codeload.github.com/artiomn/markdown_articles_tool/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248008629,"owners_count":21032556,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["article","article-extracting","article-extractor","articles","downloader","html","image-manipulation","images","markdown","markdown-articles","markdown-converter","markdown-parser","markdown-to-html","markdown-to-pdf","md","pdf","python-library","toolset"],"created_at":"2024-08-07T00:01:42.908Z","updated_at":"2025-04-09T09:05:48.976Z","avatar_url":"https://github.com/artiomn.png","language":"Python","readme":"[![Python 
package](https://github.com/artiomn/markdown_images_downloader/workflows/Python%20package/badge.svg)](https://github.com/artiomn/markdown_articles_tool/actions/)\n[![License](https://img.shields.io/badge/license-MIT-brightgreen.svg)](https://opensource.org/licenses/MIT)\n[![Stargazers](https://img.shields.io/github/stars/artiomn/markdown_images_downloader.svg)](https://github.com/artiomn/markdown_images_downloader/stargazers)\n[![Forks](https://img.shields.io/github/forks/artiomn/markdown_images_downloader.svg)](https://github.com/artiomn/markdown_images_downloader/network/members)\n[![Latest Release](https://img.shields.io/github/v/release/artiomn/markdown_images_downloader.svg)](https://github.com/artiomn/markdown_images_downloader/releases)\n\n\n# Markdown articles tool 0.1.3\n\nA free command-line utility, written in Python, that helps you manage online and downloaded Markdown documents (e.g., articles).\nThe Markdown Articles Tool is available for macOS, Windows, and Linux.\n\nThe tool can:\n\n- Download Markdown documents with images and:\n  * Find all image links, download the images, and fix the links in the document.\n  * Skip broken links.\n  * Deduplicate similar images by content hash or by using a hash as the file name.\n- Process images linked with the HTML `\u003cimg\u003e` tag.\n- Process local image files.\n- Convert Markdown documents to:\n  * HTML.\n  * PDF.\n  * Plain Markdown.\n\nYou can also import the package to use individual functions.\n\n\n## Installation\n\n### From the repository\n\n**You need Python 3.9+.**\n\nRun:\n\n```\ngit clone \"https://github.com/artiomn/markdown_articles_tool\"\npip3 install -r markdown_articles_tool/requirements.txt\n```\n\n### From [PyPI](https://pypi.org/project/markdown-tool/)\n\n```\npip3 install markdown-tool\n```\n\n\n## Usage\n\nSyntax:\n\n```\nmarkdown_tool [options] \u003carticle_file_path_or_url\u003e\n\noptions:\n  -h, --help            show this help message and exit\n  -D 
{disabled,names_hashing,content_hash}, --deduplication-type {disabled,names_hashing,content_hash}\n                        Deduplicate images, using content hash or SHA1(image_name) (default: disabled)\n  -d IMAGES_DIRNAME, --images-dirname IMAGES_DIRNAME\n                        Folder in which to download images (possible variables: $article_name, $time, $date, $dt, $base_url) (default: images)\n  -a, --skip-all-incorrect\n                        skip all incorrect images (default: False)\n  -E, --download-incorrect-mime\n                        download \"images\" with unrecognized MIME type (default: False)\n  -s SKIP_LIST, --skip-list SKIP_LIST\n                        skip URL's from the comma-separated list (or file with a leading '@') (default: None)\n  -i {md,html,md+html,html+md}, --input-format {md,html,md+html,html+md}\n                        input format (default: md)\n  -l, --process-local-images\n                        [DEPRECATED] Process local images (default: False)\n  -n, --replace-image-names\n                        Replace image names, using content hash (default: False)\n  -o {md,html}, --output-format {md,html}\n                        output format (default: md)\n  -p IMAGES_PUBLIC_PATH, --images-public-path IMAGES_PUBLIC_PATH\n                        Public path to the folder of downloaded images (possible variables: $article_name, $time, $date, $dt, $base_url)\n  -P, --prepend-images-with-path\n                        Save relative images paths (default: False)\n  -R, --remove-source   Remove or replace source file (default: False)\n  -t DOWNLOADING_TIMEOUT, --downloading-timeout DOWNLOADING_TIMEOUT\n                        how many seconds to wait before downloading will be failed (default: -1)\n  -O OUTPUT_PATH, --output-path OUTPUT_PATH\n                        article output file name or path\n  --verbose, -v         More verbose logging (default: False)\n  --version             return version number\n```\n\nRun example 
1:\n\n```\n./markdown_tool.py nc-1-zfs/article.md\n```\n\nRun example 2:\n\n```\n./markdown_tool.py not-nas/sov/article.md -o html -s \"http://www.ossec.net/_images/ossec-arch.jpg\" -a\n```\n\nRun example 3 (run on a folder):\n\n```\nfind content/ -name \"*.md\" | xargs -n1 ./markdown_tool.py\n```\n\n\n## Changes\n\n### 0.1.3\n\n- Mostly technical fixes, necessary for the GUI tool to work.\n- The tool now has a [Qt-based GUI](https://github.com/artiomn/mat_gui).\n\n\n### 0.1.2\n\n- `-l, --process-local-images` is deprecated as of version 0.1.2 and no longer has any effect: local images are always processed.\n- Images with an unrecognized MIME type are not downloaded by default (use `-E` to download them anyway).\n- The new option `-P, --prepend-images-with-path` changes the image output path structure. If this option is enabled,\n  the \"remote\" image path is preserved in the local directory structure.\n- The code was significantly refactored.\n- Some automated tests were added.\n\n\n### 0.0.8\n\nThe `-D` (deduplication) option changed in version 0.0.8. It is no longer boolean and accepts one of several values: \"disabled\", \"names_hashing\", \"content_hash\".\n  The long option name changed too: it is now `deduplication-type`.\n\n\n# Internals\n\nThe tool is a pipeline that gets Markdown from the source and processes it using blocks:\n\n- The source downloads the article.\n- `ImageDownloader` downloads every image.\n  Internally, image deduplicator blocks may be applied to each image.\n- Transform the article file, i.e. 
fix image URLs.\n- Format the article into the target format (Markdown, HTML, PDF, etc.) using the selected formatters.\n\nThe `ArticleProcessor` class is a strategy that applies blocks based on the parameters (from the CLI, for example).\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fartiomn%2Fmarkdown_articles_tool","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fartiomn%2Fmarkdown_articles_tool","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fartiomn%2Fmarkdown_articles_tool/lists"}