https://github.com/dmitriiweb/article-scraper-mcp
MCP server to get an article's text from a URL
ai mcp scraping
- Host: GitHub
- URL: https://github.com/dmitriiweb/article-scraper-mcp
- Owner: dmitriiweb
- License: mit
- Created: 2025-08-17T10:19:55.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2025-08-20T21:33:24.000Z (10 months ago)
- Last Synced: 2025-08-20T21:52:25.166Z (10 months ago)
- Topics: ai, mcp, scraping
- Language: Python
- Homepage:
- Size: 60.5 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
# Article Scraper MCP
A Model Context Protocol (MCP) server that fetches article data from URLs using newspaper3k.
## Features
- Extract article title, text, author, and publication date
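- URL validation and error handling: invalid URLs and unparseable articles raise `ValueError`, and failed HTTP requests raise `requests.RequestException`
- Structured output: a dictionary with `title`, `text`, `author`, and `date` keys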
- Built with FastMCP for easy integration
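
The URL validation mentioned above could look roughly like the following. This is a minimal sketch using only the standard library, not the project's actual code; the function name `validate_url` and the specific rules (HTTP or HTTPS scheme, non-empty host) are assumptions made for illustration.

```python
from urllib.parse import urlparse


def validate_url(url: str) -> str:
    """Return the stripped URL, or raise ValueError if it is not an absolute HTTP(S) URL.

    Hypothetical helper: the real server may apply different checks.
    """
    if not isinstance(url, str) or not url.strip():
        raise ValueError("URL must be a non-empty string")
    cleaned = url.strip()
    parsed = urlparse(cleaned)
    # Assumption: only web URLs are accepted.
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"Unsupported URL scheme: {parsed.scheme!r}")
    if not parsed.netloc:
        raise ValueError("URL has no host")
    return cleaned
```

Rejecting bad input before any network call keeps the failure in the `ValueError` category rather than surfacing later as a request error.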
## Installation
Run directly from PyPI with `uvx`, which downloads the package and launches the server without a separate install step:
```bash
uvx article-scraper-mcp
```
## Usage
Add to your MCP client configuration:
```json
{
"mcpServers": {
"article-scraper": {
"command": "uvx",
"args": ["article-scraper-mcp"]
}
}
}
```
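
If the client keeps its configuration in a JSON file that already lists other servers, the entry above can be merged in with a script instead of by hand. The following is a sketch under stated assumptions: the helper names `add_article_scraper` and `update_config_file` are hypothetical, and the location of the config file depends on the MCP client, so it is left as a parameter.

```python
import json
from pathlib import Path


def add_article_scraper(config: dict) -> dict:
    """Return a copy of `config` with the article-scraper server registered."""
    merged = dict(config)
    # Copy the existing server map so the caller's dictionary is not mutated.
    servers = dict(merged.get("mcpServers", {}))
    # Same entry as in the JSON snippet above.
    servers["article-scraper"] = {"command": "uvx", "args": ["article-scraper-mcp"]}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read a JSON config file (if present), add the entry, and write it back."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.write_text(json.dumps(add_article_scraper(config), indent=2), encoding="utf-8")
```

Existing servers under `mcpServers` are preserved; only the `article-scraper` key is added or overwritten.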
## API
### `fetch_article(url: str) -> dict[str, Any]`
Fetches and parses a news article from the given URL.
**Parameters:**
- `url`: The URL of the news article to fetch
**Returns:**
A dictionary containing:
- `title`: Article title
- `text`: Article content text
- `author`: Author name(s) (may be `None`)
- `date`: Publication date in ISO format (may be `None`)
**Raises:**
- `ValueError`: If URL is invalid or article cannot be parsed
- `requests.RequestException`: If HTTP request fails
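
Because `author` and `date` may be `None`, code that consumes the result should handle both cases. The sketch below works on a plain dictionary shaped like the documented return value; `summarize_article` and the sample data are hypothetical and not part of the package.

```python
from datetime import datetime
from typing import Any


def summarize_article(article: dict[str, Any], max_chars: int = 200) -> str:
    """Build a short plain-text summary from a fetch_article-style dictionary."""
    # `author` may be None, so fall back to a placeholder.
    author = article.get("author") or "Unknown author"
    raw_date = article.get("date")
    # `date` is documented as an ISO-format string or None.
    date = datetime.fromisoformat(raw_date).date().isoformat() if raw_date else "undated"
    text = article.get("text", "")
    excerpt = text if len(text) <= max_chars else text[:max_chars].rstrip() + "..."
    return f"{article.get('title', '')} ({author}, {date})\n{excerpt}"


# Hypothetical sample shaped like the documented return value.
sample = {
    "title": "Example headline",
    "text": "Body text.",
    "author": None,
    "date": "2025-08-17T10:19:55",
}
print(summarize_article(sample))
```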
## Requirements
- Python 3.11+
- newspaper3k
- requests
- loguru
- mcp[cli]
## License
MIT