https://github.com/stn1slv/md-fetch
Python library that extracts article content from Medium and dev.to and returns it as clean, well-structured Markdown.
article-extraction devto markdown medium python scraping
- Host: GitHub
- URL: https://github.com/stn1slv/md-fetch
- Owner: stn1slv
- License: mit
- Created: 2026-05-13T20:09:40.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2026-06-02T12:18:58.000Z (20 days ago)
- Last Synced: 2026-06-02T12:27:50.923Z (20 days ago)
- Topics: article-extraction, devto, markdown, medium, python, scraping
- Language: Python
- Size: 526 KB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# mdfetch
A Python library that extracts article content from web platforms and returns it as clean Markdown.
## Install
### pip
```bash
pip install mdfetch
```
### Homebrew (macOS / Linux)
```bash
brew install stn1slv/tap/md-fetch
```
The Homebrew install requires no separate Python environment setup.
## CLI Usage
You can use the built-in `md-fetch` command directly from your terminal:
```bash
# Fetch and print Markdown to standard output
md-fetch https://medium.com/example/article
# Fetch and save Markdown to a file
md-fetch https://dev.to/example/article --output article.md
# List all supported platforms (no URL or network needed)
md-fetch --list-platforms
```
## Python Usage
```python
from mdfetch import extract
# Works with any supported platform — just pass the URL
markdown = extract("https://medium.com/some-publication/article-slug-abc123")
markdown = extract("https://dev.to/username/article-slug")
markdown = extract("https://example.substack.com/p/article-slug")
markdown = extract("https://thenewstack.io/article-slug")
markdown = extract("https://dzone.com/articles/article-slug")
markdown = extract("https://boomi.com/blog/article-slug")
markdown = extract("https://konghq.com/blog/category/article-slug")
print(markdown)
```
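Since `extract` returns the Markdown as a string, saving it is left to the caller. The following is a minimal sketch of one way to derive a file name from the article URL. `output_path_for` and the `articles` directory are hypothetical names chosen for illustration and are not part of mdfetch.

```python
from pathlib import Path
from urllib.parse import urlparse


def output_path_for(url: str, out_dir: str = "articles") -> Path:
    """Build a .md file path from the last path segment of the URL.

    Hypothetical helper, not part of mdfetch.
    """
    # Take the final path segment as the slug; fall back to "index" for bare domains.
    slug = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1] or "index"
    return Path(out_dir) / f"{slug}.md"
```

With mdfetch installed, a call such as `output_path_for(url).write_text(extract(url), encoding="utf-8")` would then write the article to disk, provided the output directory already exists.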
## Error handling
```python
from mdfetch import (
    extract,
    InvalidURLError,
    UnsupportedPlatformError,
    UnsupportedContentTypeError,
    FetchError,
    HTTPStatusError,
    EmptyContentError,
)

url = "https://medium.com/some-publication/article-slug-abc123"

try:
    markdown = extract(url)
except InvalidURLError as e:
    print(f"Bad URL: {e.message}")
except UnsupportedPlatformError as e:
    print(f"Platform not supported: {e.message}")
except UnsupportedContentTypeError as e:
    print(f"Not an article page: {e.message}")
except HTTPStatusError as e:
    print(f"HTTP {e.status_code}: {e.message}")
except FetchError as e:
    print(f"Network error: {e.message}")
except EmptyContentError as e:
    print(f"No content: {e.message}")
```
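When processing several URLs, it can be useful to record failures per URL instead of stopping at the first error. The sketch below shows one possible shape for that. `fetch_many` and its `extract_fn` parameter are hypothetical names, not part of mdfetch; the extractor is passed in as an argument so the helper does not depend on the library itself, and it catches `Exception` broadly rather than the specific mdfetch error types.

```python
from typing import Callable, Iterable


def fetch_many(
    urls: Iterable[str],
    extract_fn: Callable[[str], str],
) -> tuple[dict[str, str], dict[str, Exception]]:
    """Run extract_fn on each URL, collecting results and failures separately.

    Hypothetical helper, not part of mdfetch.
    """
    results: dict[str, str] = {}
    failures: dict[str, Exception] = {}
    for url in urls:
        try:
            results[url] = extract_fn(url)
        except Exception as exc:  # broad on purpose: this sketch does not import mdfetch's error types
            failures[url] = exc
    return results, failures
```

In real use, `extract_fn` would be `mdfetch.extract`, and the `failures` mapping could be inspected with `isinstance` checks against the exception classes listed above.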
## Supported platforms
| Platform | Domains |
|----------|---------|
| Medium | `medium.com`, `*.medium.com` |
| dev.to | `dev.to` |
| Substack | `substack.com`, `*.substack.com` |
| The New Stack | `thenewstack.io` |
| DZone | `dzone.com` |
| Boomi | `boomi.com` |
| Kong | `konghq.com` |
## Development
Requires [uv](https://docs.astral.sh/uv/).
```bash
make setup # install dependencies
make test # run unit tests
make integration # run integration tests (requires network access)
make lint # ruff check
make format # ruff format
make build # build wheel + sdist
make upgrade-deps # upgrade all dependencies
make clean # remove build artifacts
```
## Requirements
- Python 3.12+