https://github.com/rainyheart/jcrawl4ai-mcp-server
Java implementation of MCP Server for Crawl4ai
- Host: GitHub
- URL: https://github.com/rainyheart/jcrawl4ai-mcp-server
- Owner: rainyheart
- License: mit
- Created: 2025-04-21T00:56:36.000Z (6 months ago)
- Default Branch: master
- Last Pushed: 2025-06-27T00:47:56.000Z (3 months ago)
- Last Synced: 2025-06-27T01:35:17.465Z (3 months ago)
- Language: Java
- Homepage:
- Size: 12.7 KB
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-mcp-servers - **jcrawl4ai-mcp-server** - Java implementation of MCP Server for Crawl4ai `java` `mcp` `server` `ai` `git clone https://github.com/rainyheart/jcrawl4ai-mcp-server` (API)
README
# jcrawl4ai-mcp-server
- Java implementation of MCP Server for interacting with Crawl4ai API.
- Certified by [mcpreview](https://mcpreview.com/mcp-servers/rainyheart/jcrawl4ai-mcp-server)
## Project Overview
jcrawl4ai-mcp-server is a Spring Boot-based MCP server that calls the Crawl4ai API to perform web crawling. It provides two main functions:
- Crawling specified URLs using a given strategy, maximum depth, and output format.
- Getting the crawl result by a given task ID.
## Configuration
### application.properties
Configure the following properties in the `src/main/resources/application.properties` file:
- `cawl4ai.base-url`: Base URL of the Crawl4ai server.
- `cawl4ai.api-token`: API token for the Crawl4ai server.
Example configuration:
```properties
cawl4ai.base-url=http://your-crawl4ai-server-url:11235
cawl4ai.api-token=your-api-token
```
## Dependencies
The project depends on the following libraries:
- Spring AI MCP Server
- Spring Boot
- Hutool
## Running the Project
Build and run the project using Maven:
```sh
mvn clean install
java -jar target/jcawl4ai-mcp-server-1.0.0.jar
```
You can download the jar file from this [link](https://github-registry-files.githubusercontent.com/969807736/78982980-2371-11f0-9074-8f75756ab435?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAVCODYLSA53PQK4ZA%2F20250427%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250427T061314Z&X-Amz-Expires=300&X-Amz-Signature=4069fd1d4dca48407ba3bfd5ee4a236eeaa67e284c97eda6bff4f3490fafdfc8&X-Amz-SignedHeaders=host&response-content-disposition=filename%3Djcrawl4ai-mcp-server-1.0.0.jar&response-content-type=application%2Foctet-stream) directly.
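Because the server uses the stdio transport, an MCP client normally starts the jar as a child process and exchanges messages over its stdin and stdout. The following is a minimal sketch of that launch step only, not of the MCP protocol itself. The class `StdioLauncher` and its method are hypothetical and not part of this project.

```java
import java.util.List;

// Hypothetical helper, not part of the project: prepares the child process
// that an MCP client would talk to over stdin/stdout.
class StdioLauncher {

    // Builds the same command line as `java -jar <jarPath>`.
    static ProcessBuilder forJar(String jarPath) {
        ProcessBuilder pb = new ProcessBuilder(List.of("java", "-jar", jarPath));
        // stdin/stdout stay as pipes (the default) so the client can exchange
        // messages with the server; stderr is passed through to the parent.
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        return pb;
    }

    public static void main(String[] args) {
        ProcessBuilder pb = forJar("target/jcawl4ai-mcp-server-1.0.0.jar");
        // Print the command instead of starting it, so the sketch runs without the jar.
        System.out.println(String.join(" ", pb.command()));
    }
}
```

Calling `pb.start()` on the returned builder would launch the server; the sketch stops short of that so it can run before the jar has been built.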
## APIs
### Crawl4aiApi
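As a rough illustration of the two methods documented in this section, the sketch below builds the HTTP requests they might translate to. Several things here are assumptions that this README does not confirm: that the Crawl4ai server exposes `POST /crawl` and `GET /task/{taskId}`, that it accepts the token as a bearer `Authorization` header, that `max_depth` is an integer, and that the JSON body uses the parameter names as field names. The class `Crawl4aiRequests` is hypothetical, and the project itself may build its requests differently (for example with Hutool).

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch, not the project's code. The endpoint paths and the
// bearer-token header are assumptions about the Crawl4ai server.
class Crawl4aiRequests {
    private final String baseUrl;   // value of cawl4ai.base-url
    private final String apiToken;  // value of cawl4ai.api-token

    Crawl4aiRequests(String baseUrl, String apiToken) {
        // Drop a trailing slash so paths can be appended safely.
        this.baseUrl = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        this.apiToken = apiToken;
    }

    // Serializes the four documented `crawl` parameters into a JSON body.
    static String crawlBody(List<String> urls, String strategy,
                            int maxDepth, String outputFormat) {
        String urlArray = urls.stream()
                .map(u -> "\"" + escape(u) + "\"")
                .collect(Collectors.joining(",", "[", "]"));
        return "{\"urls\":" + urlArray
                + ",\"strategy\":\"" + escape(strategy) + "\""
                + ",\"max_depth\":" + maxDepth
                + ",\"output_format\":\"" + escape(outputFormat) + "\"}";
    }

    // Minimal JSON string escaping (backslash and double quote only).
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Request for the `crawl` method (assumed endpoint: POST /crawl).
    HttpRequest crawl(List<String> urls, String strategy,
                      int maxDepth, String outputFormat) {
        String body = crawlBody(urls, strategy, maxDepth, outputFormat);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/crawl"))
                .header("Authorization", "Bearer " + apiToken)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // Request for the `task` method (assumed endpoint: GET /task/{taskId}).
    HttpRequest task(String taskId) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/task/" + taskId))
                .header("Authorization", "Bearer " + apiToken)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        Crawl4aiRequests api =
                new Crawl4aiRequests("http://localhost:11235", "your-api-token");
        // "bfs" and "markdown" are placeholder values, not documented options.
        System.out.println(api.crawl(List.of("https://example.com"), "bfs", 2, "markdown").uri());
        System.out.println(api.task("some-task-id").uri());
    }
}
```

The sketch only constructs the requests; sending them with `java.net.http.HttpClient` would return the JSON string that both methods are described as returning.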
#### `crawl` Method
- **Description**: Call the Crawl4ai API to crawl the specified URLs.
- **Parameters**:
  - `urls`: Array of target website URLs.
  - `strategy`: Crawl strategy.
  - `max_depth`: Maximum depth.
  - `output_format`: Output format.
- **Return Value**: JSON string of the crawl result.
#### `task` Method
- **Description**: Get the crawl result by a given task ID.
- **Parameters**:
  - `taskId`: Task ID.
- **Return Value**: JSON string of the crawl result.
## Logging
Log file path: `./target/mcp-stdio-server.log`.
## MCP Server Configuration
```json
{
  "mcpServers": {
    "jcawl4ai-mcp-server": {
      "autoApprove": [
        "crawl",
        "task"
      ],
      "disabled": false,
      "timeout": 60,
      "command": "java",
      "args": [
        "-jar",
        "/path/to/your/jar/file/jcawl4ai-mcp-server-1.0.0.jar"
      ],
      "transportType": "stdio"
    }
  }
}
```
## Contact
If you have any questions or suggestions, please contact [Ken Ye](mailto:yjz_work@126.com).