https://github.com/NimbleBrainInc/mcp-pdfco
PDFCo MCP Server
mcp mcpb nimblebrain nimbletools pdfco
- Host: GitHub
- URL: https://github.com/NimbleBrainInc/mcp-pdfco
- Owner: NimbleBrainInc
- License: mit
- Created: 2025-10-09T17:04:16.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2026-02-14T02:49:59.000Z (4 months ago)
- Last Synced: 2026-02-14T09:22:59.043Z (4 months ago)
- Topics: mcp, mcpb, nimblebrain, nimbletools, pdfco
- Language: Python
- Homepage:
- Size: 62.5 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- awesome-mcpb - PDF.co - PDF manipulation, conversion, OCR, and text extraction. (Utilities)
README
# MCP Server PDF.co
## About
MCP server for PDF.co API. Comprehensive PDF manipulation, conversion, OCR,
text extraction, and document automation with support for barcodes,
watermarks, and security features.
## Features
- **Full API Coverage**: Complete implementation of PDF.co API endpoints
- **Strongly Typed**: All responses use Pydantic models for type safety
- **S-Tier Architecture**: Production-ready with separated concerns (API client, models, server)
- **HTTP Transport**: Supports streamable-http with health endpoint
- **Async/Await**: Built on aiohttp for high performance
- **Type Safe**: Full mypy strict mode compliance
- **Comprehensive Testing**: Unit tests with pytest and AsyncMock
- **Docker Ready**: Production Dockerfile included
- **Built-in Skill Resource**: Serves a `skill://pdfco/usage` resource that teaches LLMs HTML best practices, tool selection, and error recovery
## Available Tools
### PDF Conversion Tools
- `pdf_to_text` - Extract text content from PDF documents
- `pdf_to_json` - Extract structured data from PDFs
- `pdf_to_html` - Convert PDF to HTML format
- `pdf_to_csv` - Extract tables from PDF to CSV
### PDF Manipulation Tools
- `pdf_merge` - Combine multiple PDFs into one
- `pdf_split` - Split PDF into separate pages or ranges
- `pdf_rotate` - Rotate pages in a PDF document
- `pdf_compress` - Reduce PDF file size with configurable compression
- `pdf_add_watermark` - Add text watermarks to PDFs
### PDF Security Tools
- `pdf_protect` - Add password protection to PDFs
- `pdf_unlock` - Remove password protection from PDFs
### PDF Information
- `pdf_info` - Get PDF metadata (pages, size, dimensions, etc.)
### Document Creation Tools
- `html_to_pdf` - Convert HTML content to PDF
- `url_to_pdf` - Convert web pages to PDF
- `image_to_pdf` - Convert images to PDF documents
### Barcode Tools
- `barcode_generate` - Generate QR codes and barcodes
- `barcode_read` - Read and decode barcodes from images
### OCR Tools
- `ocr_pdf` - OCR scanned PDFs to make them searchable
## Installation
### Using uv (recommended)
```bash
# Clone the repository
git clone https://github.com/NimbleBrainInc/mcp-pdfco.git
cd mcp-pdfco
# Install with uv
uv pip install -e .
# Install with development dependencies
uv pip install -e ".[dev]"
```
### Using pip
```bash
pip install -e .
```
## Configuration
### API Key
Get your free API key from [PDF.co Dashboard](https://app.pdf.co/dashboard) and set it as an environment variable:
```bash
export PDFCO_API_KEY=your_api_key_here
```
Or create a `.env` file:
```env
PDFCO_API_KEY=your_api_key_here
```
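As a minimal sketch of how a script might read the key and fail early when it is missing: the helper name `load_api_key` is hypothetical and not part of this project. It reads only the process environment and does not parse a `.env` file.

```python
import os


def load_api_key(env_var: str = "PDFCO_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    # Strip whitespace so a stray space or newline does not end up in the key.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before starting the server")
    return key
```

Failing at startup is usually easier to diagnose than a `401` on the first tool call.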
### Claude Desktop Configuration
Add to your Claude Desktop configuration file:
**macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
**Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
```json
{
  "mcpServers": {
    "pdfco": {
      "command": "uvx",
      "args": ["mcp-pdfco"],
      "env": {
        "PDFCO_API_KEY": "your_api_key_here"
      }
    }
  }
}
```
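If the configuration file already lists other servers, the `pdfco` entry has to be merged in rather than pasted over them. The sketch below shows one way to do that on an in-memory dictionary. The function name `add_pdfco_server` and the `other` server entry are hypothetical; reading and writing the actual file is left out.

```python
import json
from typing import Any


def add_pdfco_server(config: dict[str, Any], api_key: str) -> dict[str, Any]:
    """Return a copy of `config` with the pdfco entry added under mcpServers."""
    updated = dict(config)
    # Copy the existing server map so other entries are preserved untouched.
    servers = dict(updated.get("mcpServers", {}))
    servers["pdfco"] = {
        "command": "uvx",
        "args": ["mcp-pdfco"],
        "env": {"PDFCO_API_KEY": api_key},
    }
    updated["mcpServers"] = servers
    return updated


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_pdfco_server(existing, "your_api_key_here"), indent=2))
```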
## Running the Server
### Development Mode
```bash
# Using Python module
uv run python -m mcp_pdfco.server
# Using the Makefile
make run
```
### Production Mode (Docker)
```bash
# Build the Docker image
docker build -t mcp-pdfco .
# Run with Docker
docker run -e PDFCO_API_KEY=your_key -p 8000:8000 mcp-pdfco
# Run with Docker Compose
docker-compose up
```
### HTTP Transport
The server supports HTTP transport with a health check endpoint:
```bash
# Start with uvicorn
uvicorn mcp_pdfco.server:app --host 0.0.0.0 --port 8000
# Check health
curl http://localhost:8000/health
```
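The same health check can be scripted, for example as a readiness probe in a deployment script. This is a sketch using only the standard library. The helper name `is_healthy` is hypothetical, and it assumes the endpoint answers with HTTP 200 when the server is up; the response body is not inspected.

```python
import http.client
import urllib.error
import urllib.request


def is_healthy(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, non-2xx status, or malformed response.
        return False
```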
## Usage Examples
### Extract Text from PDF
```python
result = await pdf_to_text(
    url="https://example.com/document.pdf",
    pages="1-5"
)
print(result.text)
```
### Merge Multiple PDFs
```python
result = await pdf_merge(
    urls=[
        "https://example.com/doc1.pdf",
        "https://example.com/doc2.pdf"
    ],
    name="merged_document.pdf"
)
print(f"Merged PDF: {result.url}")
```
### Convert HTML to PDF
```python
result = await html_to_pdf(
    html="<h1>Hello World</h1><p>This is a PDF</p>",
    name="hello.pdf",
    page_size="A4",
    orientation="Portrait"
)
print(f"Generated PDF: {result.url}")
```
### Add Watermark
```python
result = await pdf_add_watermark(
    url="https://example.com/document.pdf",
    text="CONFIDENTIAL",
    x=200,
    y=400,
    font_size=48,
    color="FF0000",
    opacity=0.3,
    pages="0-",  # Apply to all pages
    name="watermarked_document.pdf"
)
print(f"Watermarked PDF: {result.url}")
```
### Generate QR Code
```python
result = await barcode_generate(
    value="https://example.com",
    barcode_type="QRCode",
    format="png"
)
print(f"QR Code: {result.url}")
```
### OCR a Scanned PDF
```python
result = await ocr_pdf(
    url="https://example.com/scanned.pdf",
    pages="1-10",
    lang="eng"
)
print(f"OCR'd PDF: {result.url}")
print(f"Extracted text: {result.text}")
```
## Development
### Quick Start
```bash
make help # Show all available commands
make install # Install dependencies
make dev-install # Install with dev dependencies
make format # Format code with ruff
make lint # Lint code with ruff
make typecheck # Type check with mypy
make test # Run tests with pytest
make test-cov # Run tests with coverage
make check # Run all checks (lint + typecheck + test)
make clean # Clean up artifacts
```
### Project Structure
```
.
├── src/
│   └── mcp_pdfco/
│       ├── __init__.py
│       ├── server.py          # FastMCP server with tool definitions
│       ├── api_client.py      # Async PDF.co API client
│       └── api_models.py      # Pydantic models for type safety
├── tests/
│   ├── __init__.py
│   ├── test_server.py         # Server tool tests
│   └── test_api_client.py     # API client tests
├── pyproject.toml             # Project configuration
├── Makefile                   # Development commands
├── Dockerfile                 # Container deployment
└── README.md                  # This file
```
### Running Tests
```bash
# Run all tests
pytest
# Run with coverage
pytest --cov=src/mcp_pdfco --cov-report=term-missing
# Run specific test file
pytest tests/test_server.py -v
```
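For orientation, a test in the pytest and `AsyncMock` style mentioned under Features might look roughly like the sketch below. The coroutine `extract_text` and the shape of the mocked response are hypothetical stand-ins, not the project's actual tool or client code. It uses `asyncio.run` so it works without any async pytest plugin.

```python
import asyncio
from unittest.mock import AsyncMock


async def extract_text(client, url: str) -> str:
    """Hypothetical tool body: delegate to the client and return the text."""
    response = await client.pdf_to_text(url)
    return response["text"]


def test_extract_text_uses_client() -> None:
    # AsyncMock makes every attribute an awaitable mock, so no network is touched.
    client = AsyncMock()
    client.pdf_to_text.return_value = {"text": "hello"}

    result = asyncio.run(extract_text(client, "https://example.com/document.pdf"))

    assert result == "hello"
    client.pdf_to_text.assert_awaited_once_with("https://example.com/document.pdf")
```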
### Code Quality
This project uses:
- **ruff**: Fast Python linter and formatter
- **mypy**: Static type checker (strict mode)
- **pytest**: Testing framework with async support
All code must pass:
```bash
make check # Runs lint + typecheck + test
```
## Architecture
This server follows S-Tier MCP architecture principles:
1. **Separation of Concerns**
   - `api_client.py`: HTTP communication layer
   - `api_models.py`: Data models and type definitions
   - `server.py`: MCP tool definitions and routing
2. **Type Safety**
   - Full type hints on all functions
   - Pydantic models for API responses
   - Mypy strict mode compliance
3. **Async All the Way**
   - aiohttp for HTTP requests
   - Async/await throughout
   - Context managers for resource cleanup
4. **Error Handling**
   - Custom `PDFcoAPIError` exception
   - Context logging via `ctx.error()` and `ctx.warning()`
   - Graceful error messages
5. **Production Ready**
   - Docker support
   - Health check endpoint
   - Environment-based configuration
   - Comprehensive logging
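
The error-handling layer above can be pictured with a small sketch. The class below is a stand-in that only borrows the name `PDFcoAPIError`; the real constructor and attributes may differ. The `error` and `message` payload fields and the helper `check_response` are assumptions made for illustration. The message format mirrors the `API Error 401: Unauthorized` text shown under Troubleshooting.

```python
from typing import Any


class PDFcoAPIError(Exception):
    """Stand-in for the project's exception; the real signature may differ."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"API Error {status}: {message}")
        self.status = status
        self.message = message


def check_response(status: int, payload: dict[str, Any]) -> dict[str, Any]:
    """Return the payload unchanged, or raise if it signals a failure."""
    # Assumed convention: failures carry an HTTP error status or a truthy "error" field.
    if status >= 400 or payload.get("error"):
        raise PDFcoAPIError(status, str(payload.get("message", "unknown error")))
    return payload
```

Keeping this check in the client layer means the tool functions in `server.py` only need to catch one exception type and report it through the context logger.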
## Requirements
- Python 3.13+
- aiohttp >= 3.12.15
- fastmcp >= 2.14.0
- pydantic >= 2.0.0
## API Documentation
For detailed API documentation, visit [PDF.co API Documentation](https://apidocs.pdf.co/).
### Supported Input Formats
- **PDF**: URL or base64 encoded
- **Images**: PNG, JPG, GIF, BMP, TIFF
- **HTML**: Raw HTML string or URL
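
For the base64 input option, a local file first has to be encoded. The sketch below does that with the standard library. The helper name `pdf_to_base64` is hypothetical, and which tool parameter accepts the resulting string is not shown here, so check the tool schema before relying on it.

```python
import base64
from pathlib import Path


def pdf_to_base64(path: Path) -> str:
    """Read a local PDF and return its contents as a base64 ASCII string."""
    data = path.read_bytes()
    # Cheap sanity check: PDF files start with the "%PDF-" magic bytes.
    if not data.startswith(b"%PDF-"):
        raise ValueError(f"{path} does not look like a PDF")
    return base64.b64encode(data).decode("ascii")
```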
### Supported Output Formats
- **PDF**: High-quality PDF generation
- **Text**: Plain text extraction
- **JSON**: Structured data extraction
- **HTML**: Formatted HTML output
- **CSV**: Table data extraction
- **Images**: PNG, JPG, SVG for barcodes
### Rate Limits
PDF.co has rate limits based on your subscription plan. Free plans include:
- 100 API calls per month
- 10 API calls per minute
Check your [dashboard](https://app.pdf.co/dashboard) for current usage.
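
To stay under a per-minute limit, a caller can throttle itself on the client side. The following is a minimal sliding-window sketch, not something this server is known to ship. The class name `SlidingWindowLimiter` is hypothetical, and the defaults simply reuse the 10 calls per minute figure quoted above.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Track recent call times and report how long to wait before the next one."""

    def __init__(
        self,
        max_calls: int = 10,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock  # Injectable so the limiter can be tested without sleeping.
        self._calls: deque[float] = deque()

    def wait_time(self) -> float:
        """Seconds until another call fits in the window (0.0 if it fits now)."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def record(self) -> None:
        """Register that a call was just made."""
        self._calls.append(self._clock())
```

A caller would sleep for `limiter.wait_time()` seconds, make the request, then call `limiter.record()`.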
## Troubleshooting
### Common Issues
**Issue**: `PDFCO_API_KEY is not set` warning
**Solution**: Set the environment variable:
```bash
export PDFCO_API_KEY=your_key_here
```
**Issue**: `Network error` or timeout
**Solution**: Check your internet connection and increase the client timeout:
```python
client = PDFcoClient(timeout=180.0) # 3 minutes
```
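For transient network failures, retrying with exponential backoff is a common complement to a longer timeout. The helper below is a generic sketch and is not part of this project. The name `retry_async` and its parameters are hypothetical. With aiohttp you would probably add `aiohttp.ClientError` to `retry_on`.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_async(
    call: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: tuple[type[BaseException], ...] = (asyncio.TimeoutError, OSError),
) -> T:
    """Await `call()` up to `attempts` times, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return await call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            await asyncio.sleep(base_delay * 2**attempt)
    raise ValueError("attempts must be at least 1")
```

Only errors listed in `retry_on` are retried, so an authentication failure still surfaces immediately.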
**Issue**: `API Error 401: Unauthorized`
**Solution**: Verify your API key is valid at https://app.pdf.co/dashboard
**Issue**: Docker container won't start
**Solution**: Ensure the API key is passed correctly:
```bash
docker run -e PDFCO_API_KEY=your_key_here -p 8000:8000 mcp-pdfco
```
## Contributing
Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests: `make check`
5. Submit a pull request
Issue Tracker: [GitHub Issues](https://github.com/NimbleBrainInc/mcp-pdfco/issues)
## License
MIT
## Links
Part of the [NimbleTools Registry](https://github.com/nimbletoolsinc/mcp-registry) - an open source collection of production-ready MCP servers. For enterprise deployment, check out [NimbleBrain](https://www.nimblebrain.ai).
### API Documentation
- [PDF.co Documentation](https://apidocs.pdf.co/)
- [PDF Operations](https://apidocs.pdf.co/02-pdf-to-text)
- [Conversion APIs](https://apidocs.pdf.co/07-html-to-pdf)
- [OCR API](https://apidocs.pdf.co/12-pdf-ocr)
- [Barcode API](https://apidocs.pdf.co/24-barcode-generate)
### Support
- [Help Center](https://pdf.co/support)
- [API Documentation](https://apidocs.pdf.co/)
- [Contact Support](https://pdf.co/contact)
- [Status Page](https://status.pdf.co/)
- Built with [FastMCP](https://github.com/jlowin/fastmcp)