https://github.com/cyanheads/crossref-mcp-server
Resolve DOIs, search ~155M scholarly works, and fetch references via the Crossref REST API. STDIO or Streamable HTTP.
academic-search ai-agent bibliometrics bun citation crossref cyanheads doi llm mcp mcp-server model-context-protocol research scholarly-metadata stdio streamable-http typescript
- Host: GitHub
- URL: https://github.com/cyanheads/crossref-mcp-server
- Owner: cyanheads
- License: other
- Created: 2026-05-22T03:20:28.000Z (27 days ago)
- Default Branch: main
- Last Pushed: 2026-05-23T17:00:37.000Z (25 days ago)
- Last Synced: 2026-05-23T17:04:28.998Z (25 days ago)
- Topics: academic-search, ai-agent, bibliometrics, bun, citation, crossref, cyanheads, doi, llm, mcp, mcp-server, model-context-protocol, research, scholarly-metadata, stdio, streamable-http, typescript
- Language: TypeScript
- Homepage: https://www.npmjs.com/package/@cyanheads/crossref-mcp-server
- Size: 471 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 9
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Agents: AGENTS.md
# @cyanheads/crossref-mcp-server
Resolve DOIs, search ~155M scholarly works, and fetch references via the Crossref REST API. STDIO or Streamable HTTP.
5 Tools
[Download MCP bundle (.mcpb)](https://github.com/cyanheads/crossref-mcp-server/releases/latest/download/crossref-mcp-server.mcpb) [Install in Cursor](https://cursor.com/en/install-mcp?name=crossref-mcp-server&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBjeWFuaGVhZHMvY3Jvc3NyZWYtbWNwLXNlcnZlciJdfQ==) [Install in VS Code](https://vscode.dev/redirect?url=vscode:mcp/install?%7B%22name%22%3A%22crossref-mcp-server%22%2C%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40cyanheads/crossref-mcp-server%22%5D%7D)
---
## Tools
Five tools for working with Crossref data: DOI resolution, free-text and filtered search across all registered works, outgoing reference lists, and journal and funder lookup.
| Tool | Description |
|:-----|:------------|
| `crossref_get_work` | Resolve a DOI to its full Crossref metadata record: title, authors, affiliations, abstract (when deposited), journal, publication date, type, license, full-text links, funder acknowledgements, and outgoing reference list |
| `crossref_search_works` | Search the Crossref works index by free text and/or structured filters. Supports sort, field selection, and cursor-based deep paging. Large result sets spill to a DataCanvas table for SQL querying when `CANVAS_PROVIDER_TYPE=duckdb` is set. |
| `crossref_get_references` | Return the outgoing reference list for a DOI — the works cited by this paper, with raw citation strings and resolved DOIs where available |
| `crossref_search_journals` | Find Crossref journal records by ISSN or title query; optionally retrieve the journal's most recent works |
| `crossref_search_funders` | Find funders registered in the Crossref Funder Registry by name or funder DOI; optionally retrieve funded works |
### `crossref_get_work`
Resolve a DOI to its canonical Crossref record.
- DOI validated against `10.NNNN/suffix` regex before the upstream call
- Returns title, authors with affiliations, abstract (when deposited), container/journal, publication date, work type, ISSN, license URLs, full-text link URLs, and funder acknowledgements
- The incoming citation count (`is-referenced-by-count`) is included, but the list of citing works is not, because Crossref does not expose that data. Use OpenAlex for citation graphs.
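
The pre-flight DOI check described above could look roughly like the following. This is a minimal sketch, not the server's code: the helper name `normalizeDoi`, the exact pattern, and the prefix stripping are all assumptions, since the README only states that input is matched against a `10.NNNN/suffix` shape.

```typescript
// Assumed pattern: "10.", a 4-9 digit registrant code, "/", then a non-empty suffix.
// The server's actual regex is not shown in this README.
const DOI_PATTERN = /^10\.\d{4,9}\/\S+$/;

/** Strip common resolver prefixes, then check the `10.NNNN/suffix` shape. */
function normalizeDoi(input: string): string {
  const candidate = input
    .trim()
    .replace(/^https?:\/\/(dx\.)?doi\.org\//i, "") // "https://doi.org/10.x/y" -> "10.x/y"
    .replace(/^doi:\s*/i, ""); // "doi:10.x/y" -> "10.x/y"
  if (!DOI_PATTERN.test(candidate)) {
    throw new Error(`Not a valid DOI: ${input}`);
  }
  return candidate;
}
```

Rejecting malformed input locally avoids spending an upstream request, and a rate-limit slot, on a lookup that cannot succeed.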
---
### `crossref_search_works`
Search across ~155M Crossref-registered works.
- Free-text `query` plus a structured `filter` object using Crossref's hyphen-separated key syntax: `from-pub-date`, `until-pub-date`, `type`, `funder`, `issn`, `member`, `has-abstract`, `has-references`, `has-full-text`, `directory` (use `DOAJ` to restrict to open-access content)
- Sort by `relevance`, `is-referenced-by-count`, `published`, `deposited`, or `score`
- `fields` parameter narrows response payload — useful for large result sets
- Offset paging covers roughly the first 10K results. For anything deeper, pass `cursor=*` on the first call, then pass the returned `next-cursor` token on each following call. Cursor and offset cannot be combined.
- When `CANVAS_PROVIDER_TYPE=duckdb` is set, large result sets spill automatically to a DataCanvas table for SQL querying (not available on Workers)
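
To make the filter and cursor mechanics concrete, here is a small sketch. It assumes Crossref's `key:value,key:value` filter serialization and a hypothetical `fetchPage` callback standing in for the real HTTP call; `buildFilterParam`, `collectWithCursor`, and the `WorksPage` shape are illustrative names, not part of this server.

```typescript
type FilterValue = string | number | boolean;

/** Serialize a filter object into Crossref's comma-separated `key:value` form. */
function buildFilterParam(filter: Record<string, FilterValue>): string {
  return Object.entries(filter)
    .map(([key, value]) => `${key}:${value}`)
    .join(",");
}

interface WorksPage {
  items: unknown[];
  nextCursor?: string; // the `next-cursor` token from the response, if any
}

/** Hypothetical page fetcher; a real one would call `/works` with these params. */
type FetchPage = (params: Record<string, string>) => Promise<WorksPage>;

/** Walk a result set with cursor paging, stopping at `maxItems` or an empty page. */
async function collectWithCursor(
  fetchPage: FetchPage,
  baseParams: Record<string, string>,
  maxItems: number,
): Promise<unknown[]> {
  const collected: unknown[] = [];
  let cursor = "*"; // the first call must send cursor=*
  while (collected.length < maxItems) {
    const page = await fetchPage({ ...baseParams, cursor });
    if (page.items.length === 0) break; // an empty page marks the end of the set
    collected.push(...page.items);
    if (!page.nextCursor) break;
    cursor = page.nextCursor;
  }
  return collected.slice(0, maxItems);
}
```

A caller would pass something like `{ query: "graphene", filter: buildFilterParam({ "from-pub-date": "2020-01-01", type: "journal-article" }) }` as `baseParams`. No offset is ever sent alongside the cursor.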
---
### `crossref_get_references`
Fetch the outgoing reference list for a DOI.
- Each reference includes its raw citation string and, where Crossref has resolved it, a DOI for follow-up lookup
- Coverage varies by publisher — pre-2000 literature and non-participating publishers may have no reference list
- Single-hop only; an agent that needs N-hop traversal must chain the calls itself
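
Because the tool is single-hop, multi-hop traversal is left to the caller. One way a client might chain calls is a bounded breadth-first walk, sketched below. The `getReferenceDois` callback is hypothetical: it stands in for a call to `crossref_get_references` that keeps only references with a resolved DOI.

```typescript
/** Hypothetical single-hop lookup: DOI in, resolved reference DOIs out. */
type GetReferenceDois = (doi: string) => Promise<string[]>;

/** Breadth-first walk of outgoing references, at most `maxHops` levels deep. */
async function collectReferences(
  rootDoi: string,
  maxHops: number,
  getReferenceDois: GetReferenceDois,
): Promise<Map<string, number>> {
  // Maps each discovered DOI to the hop at which it was first seen.
  const hopByDoi = new Map<string, number>([[rootDoi, 0]]);
  let frontier = [rootDoi];
  for (let hop = 1; hop <= maxHops && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const doi of frontier) {
      // Sequential on purpose: one upstream call at a time is gentler on rate limits.
      for (const ref of await getReferenceDois(doi)) {
        if (hopByDoi.has(ref)) continue; // already seen, which also breaks cycles
        hopByDoi.set(ref, hop);
        next.push(ref);
      }
    }
    frontier = next;
  }
  return hopByDoi;
}
```

Keeping `maxHops` small matters in practice, since reference lists fan out quickly and every node costs one upstream request.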
---
### `crossref_search_journals`
Find journal records by ISSN or title.
- `include_works: true` triggers a second upstream call to fetch the journal's most recent works
- Returns journal title, publisher, ISSN-L, subject areas, and DOI prefix
---
### `crossref_search_funders`
Find funders in the Crossref Funder Registry.
- Accepts a name query or a direct funder DOI
- `include_works: true` retrieves funded works for the matched funder
- Returns funder name, DOI, country, and alternate names
## Features
Built on [`@cyanheads/mcp-ts-core`](https://github.com/cyanheads/mcp-ts-core):
- Declarative tool definitions — single file per tool, framework handles registration and validation
- Unified error handling across all tools
- Pluggable auth (`none`, `jwt`, `oauth`)
- Swappable storage backends: `in-memory`, `filesystem`, `Supabase`, `Cloudflare KV/R2/D1`
- Structured logging with optional OpenTelemetry tracing
- STDIO and Streamable HTTP transports
Crossref-specific:
- Polite-pool `User-Agent` header sent on every request when `CROSSREF_MAILTO` is set. The email address alone grants priority access; no API token is required
- `withRetry`: 3 attempts, exponential backoff, handles both 429 and 503 responses
- Cursor-based deep paging for result sets beyond the ~10K offset cap
- DataCanvas spillover for large `crossref_search_works` result sets (opt-in via `CANVAS_PROVIDER_TYPE=duckdb`)
- Filter key validation: Crossref uses hyphens (`has-abstract`, `has-references`, `from-pub-date`); the server enforces correct syntax and surfaces API validation errors with actionable recovery hints
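
The retry behaviour listed above (3 attempts, exponential backoff, 429 and 503 treated as transient) can be illustrated with a short sketch. This is not the server's `withRetry`: the name `retryWithBackoff`, the option names, the doubling schedule, and the convention of reading a numeric `status` off the thrown value are all assumptions made for the example.

```typescript
interface RetryOptions {
  attempts: number; // total tries, including the first
  baseDelayMs: number; // wait before the second try; doubles on each later try
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

// Statuses treated as transient, matching the two named in the list above.
const RETRYABLE_STATUSES = new Set<number>([429, 503]);

/** Read a numeric `status` off a thrown value, if it has one. */
function statusOf(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null && "status" in err) {
    const status = (err as { status: unknown }).status;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

async function retryWithBackoff<T>(fn: () => Promise<T>, opts: RetryOptions): Promise<T> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = statusOf(err);
      const transient = status !== undefined && RETRYABLE_STATUSES.has(status);
      // Give up on non-transient errors or once the attempt budget is spent.
      if (!transient || attempt >= opts.attempts) throw err;
      await sleep(opts.baseDelayMs * 2 ** (attempt - 1)); // 1x, 2x, 4x, ...
    }
  }
}
```

With `{ attempts: 3, baseDelayMs: 500 }`, a request that keeps returning 503 is tried three times, waiting 500 ms and then 1000 ms in between, before the last error is rethrown.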
## Getting started
Add the following to your MCP client configuration file. `CROSSREF_MAILTO` is optional but recommended — without it the server uses Crossref's anonymous pool with stricter rate limits.
```json
{
"mcpServers": {
"crossref": {
"type": "stdio",
"command": "bunx",
"args": ["@cyanheads/crossref-mcp-server@latest"],
"env": {
"MCP_TRANSPORT_TYPE": "stdio",
"MCP_LOG_LEVEL": "info",
"CROSSREF_MAILTO": "your-email@example.com"
}
}
}
}
```
Or with npx (no Bun required):
```json
{
"mcpServers": {
"crossref": {
"type": "stdio",
"command": "npx",
"args": ["-y", "@cyanheads/crossref-mcp-server@latest"],
"env": {
"MCP_TRANSPORT_TYPE": "stdio",
"MCP_LOG_LEVEL": "info",
"CROSSREF_MAILTO": "your-email@example.com"
}
}
}
}
```
Or with Docker:
```json
{
"mcpServers": {
"crossref": {
"type": "stdio",
"command": "docker",
"args": [
"run", "-i", "--rm",
"-e", "MCP_TRANSPORT_TYPE=stdio",
"-e", "CROSSREF_MAILTO=your-email@example.com",
"ghcr.io/cyanheads/crossref-mcp-server:latest"
]
}
}
}
```
For Streamable HTTP, set the transport and start the server:
```sh
MCP_TRANSPORT_TYPE=http MCP_HTTP_PORT=3010 CROSSREF_MAILTO=your-email@example.com bun run start:http
# Server listens at http://localhost:3010/mcp
```
### Prerequisites
- [Bun v1.3.2](https://bun.sh/) or higher (or Node.js v24+).
- An email address for `CROSSREF_MAILTO` is optional but recommended — Crossref's polite pool grants priority access to clients that identify themselves. No account or token is required.
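
For readers unfamiliar with the polite pool: Crossref identifies polite clients by a `mailto:` contact in the `User-Agent` header (or a `mailto` query parameter). A sketch of how such a header value might be assembled is below. The function name and the `0.0.0` version in the usage note are placeholders, and the exact string this server sends is not documented here.

```typescript
/**
 * Build a User-Agent value. The `mailto:` part is what marks a client as
 * polite-pool; without it the request falls into the anonymous pool.
 */
function buildUserAgent(appName: string, version: string, mailto?: string): string {
  const product = `${appName}/${version}`;
  return mailto ? `${product} (mailto:${mailto})` : product;
}
```

For example, `buildUserAgent("crossref-mcp-server", "0.0.0", "your-email@example.com")` yields `crossref-mcp-server/0.0.0 (mailto:your-email@example.com)`.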
### Installation
1. **Clone the repository:**
```sh
git clone https://github.com/cyanheads/crossref-mcp-server.git
```
2. **Navigate into the directory:**
```sh
cd crossref-mcp-server
```
3. **Install dependencies:**
```sh
bun install
```
4. **Configure environment:**
```sh
cp .env.example .env
# edit .env and optionally set CROSSREF_MAILTO for polite-pool access
```
## Configuration
All configuration is validated at startup via Zod schemas in `src/config/server-config.ts`.
| Variable | Description | Default |
|:---------|:------------|:--------|
| `CROSSREF_MAILTO` | Email address embedded in the polite-pool `User-Agent` header. Optional — server starts without it but logs a warning and uses the anonymous pool with stricter rate limits. | — |
| `CROSSREF_BASE_URL` | Crossref API base URL. Override for testing against a local proxy. | `https://api.crossref.org` |
| `CROSSREF_TIMEOUT_MS` | Per-request timeout in milliseconds. | `10000` |
| `CANVAS_PROVIDER_TYPE` | Set to `duckdb` to enable DataCanvas spillover for large `crossref_search_works` result sets. Node only; omit on Workers deployments. | — |
| `MCP_TRANSPORT_TYPE` | Transport: `stdio` or `http`. | `stdio` |
| `MCP_HTTP_PORT` | Port for the HTTP server. | `3010` |
| `MCP_AUTH_MODE` | Auth mode: `none`, `jwt`, or `oauth`. | `none` |
| `MCP_LOG_LEVEL` | Log level (RFC 5424). | `info` |
| `LOGS_DIR` | Directory for log files (Node.js only). | `/logs` |
| `OTEL_ENABLED` | Enable [OpenTelemetry instrumentation](https://github.com/cyanheads/mcp-ts-core/tree/main/docs/telemetry). | `false` |
See [`.env.example`](./.env.example) for the full list of optional overrides.
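
As an illustration of what startup validation does with the table above, here is a dependency-free sketch. It deliberately uses plain TypeScript rather than Zod so it runs on its own, and it covers only a few variables; `parseConfig`, `positiveInt`, and the `ServerConfig` shape are hypothetical and do not reflect `src/config/server-config.ts`.

```typescript
interface ServerConfig {
  mailto?: string;
  baseUrl: string;
  timeoutMs: number;
  transport: "stdio" | "http";
  httpPort: number;
}

/** Parse a positive integer variable, falling back to the documented default. */
function positiveInt(name: string, raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

/** Validate an environment map and apply the defaults from the table above. */
function parseConfig(env: Record<string, string | undefined>): ServerConfig {
  const transport = env.MCP_TRANSPORT_TYPE ?? "stdio";
  if (transport !== "stdio" && transport !== "http") {
    throw new Error(`MCP_TRANSPORT_TYPE must be "stdio" or "http", got "${transport}"`);
  }
  return {
    mailto: env.CROSSREF_MAILTO || undefined, // optional: absent means anonymous pool
    baseUrl: env.CROSSREF_BASE_URL ?? "https://api.crossref.org",
    timeoutMs: positiveInt("CROSSREF_TIMEOUT_MS", env.CROSSREF_TIMEOUT_MS, 10000),
    transport,
    httpPort: positiveInt("MCP_HTTP_PORT", env.MCP_HTTP_PORT, 3010),
  };
}
```

Failing fast at startup, rather than on the first request, surfaces a mistyped port or transport before any client connects.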
## Running the server
### Local development
- **Build and run:**
```sh
# One-time build
bun run rebuild
# Run the built server
bun run start:stdio
# or
bun run start:http
```
- **Run checks and tests:**
```sh
bun run devcheck # Lint, format, typecheck, security
bun run test # Vitest test suite
bun run lint:mcp # Validate MCP definitions against spec
```
## Project structure
| Path | Purpose |
|:-----|:--------|
| `src/index.ts` | `createApp()` entry point. Registers tools and initializes services. |
| `src/config` | Server-specific environment variable parsing and validation with Zod. |
| `src/mcp-server/tools` | Tool definitions (`*.tool.ts`). Five tools for Crossref data access. |
| `src/services/crossref` | CrossrefService — HTTP client, polite-pool header, retry, pagination helpers. |
| `tests/` | Unit and integration tests mirroring `src/`. |
## Development guide
See [`CLAUDE.md`](./CLAUDE.md) for development guidelines and architectural rules. The short version:
- Handlers throw, framework catches — no `try/catch` in tool logic
- Use `ctx.log` for request-scoped logging, `ctx.state` for tenant-scoped storage
- Register new tools via the barrel in `src/mcp-server/tools/definitions/index.ts`
- Wrap external API calls: validate raw → normalize to domain type → return output schema; never fabricate missing fields (abstracts, reference lists, and affiliations are frequently absent in Crossref records)
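
The last rule, normalize without fabricating, might look like this in practice. The sketch assumes a small subset of the raw Crossref work shape (`DOI`, `title`, `abstract`, `author`, `is-referenced-by-count`); the `RawWork` and `Work` types and `normalizeWork` are hypothetical and are not the project's domain types.

```typescript
/** Subset of a raw Crossref work record; any field may be missing. */
interface RawWork {
  DOI?: string;
  title?: string[];
  abstract?: string;
  author?: {
    given?: string;
    family?: string;
    name?: string;
    affiliation?: { name?: string }[];
  }[];
  "is-referenced-by-count"?: number;
}

interface Author {
  name: string;
  affiliations: string[];
}

interface Work {
  doi: string;
  title?: string; // left undefined when not deposited
  abstract?: string; // left undefined when not deposited
  authors: Author[];
  citedByCount?: number;
}

function normalizeWork(raw: RawWork): Work {
  // Validate: a record without a DOI cannot be identified, so fail loudly.
  if (!raw.DOI) throw new Error("Crossref record has no DOI");

  const authors: Author[] = [];
  for (const a of raw.author ?? []) {
    const name = a.name ?? [a.given, a.family].filter(Boolean).join(" ");
    if (!name) continue; // no usable name deposited; skip rather than invent one
    const affiliations = (a.affiliation ?? [])
      .map((aff) => aff.name)
      .filter((n): n is string => typeof n === "string" && n.length > 0);
    authors.push({ name, affiliations });
  }

  // Normalize: absent fields stay undefined instead of becoming placeholder text.
  return {
    doi: raw.DOI,
    title: raw.title?.[0],
    abstract: raw.abstract,
    authors,
    citedByCount: raw["is-referenced-by-count"],
  };
}
```

Leaving `abstract` undefined, rather than substituting an empty string or a summary, lets a calling agent tell "not deposited" apart from "deposited but empty".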
## Contributing
Issues and pull requests are welcome. Run checks and tests before submitting:
```sh
bun run devcheck
bun run test
```
## License
Apache-2.0 — see [LICENSE](LICENSE) for details.