https://github.com/mrasu/dataharpoon
An MCP-ready query engine that connects to your data — wherever it lives
database datafusion mcp mcp-client mcp-server query-engine
- Host: GitHub
- URL: https://github.com/mrasu/dataharpoon
- Owner: mrasu
- Created: 2025-05-25T16:14:19.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-07-06T13:10:47.000Z (9 months ago)
- Last Synced: 2025-07-06T14:24:40.764Z (9 months ago)
- Topics: database, datafusion, mcp, mcp-client, mcp-server, query-engine
- Language: Rust
- Homepage:
- Size: 139 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# DataHarpoon
An MCP-ready query engine that connects to your data — wherever it lives
DataHarpoon lets you query both raw data files and MCP-generated results with:
* Natural language via MCP
* Raw SQL for precise control
# Examples
## Natural Language via MCP
You can write in plain English, and MCP automatically generates and runs the SQL.
When you ask:
> With DataHarpoon, please tell me the number of users (written in user.csv) with the role 'member' in each organization (written in org.json).
MCP will query:
```sql
SELECT o.name AS organization_name, COUNT(u.id) AS member_count
FROM 'user.csv' u
JOIN 'org.json' o ON u.organization_id = o.id
WHERE u.role = 'member'
GROUP BY o.id, o.name
ORDER BY o.name;
```
and shows:
```
Member Count by Organization
| Organization Name | Number of Members |
|-------------------|-------------------|
| EduCore | 1 |
| GreenFuture | 1 |
| HealthPlus | 2 |
| Tech Innovators | 2 |
```
https://github.com/user-attachments/assets/46089765-b3d5-4c58-83e2-71a0e21a8db9
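To make the semantics of the generated query above concrete, here is a minimal Python sketch that performs the same join, filter, and per-organization count on in-memory data. The sample rows and the column layout of `user.csv` and `org.json` are assumptions for illustration only; the real files are not shown in this README, and DataHarpoon itself runs the SQL rather than code like this.

```python
import csv
import io
import json
from collections import Counter

# Hypothetical sample data; the real user.csv and org.json are not shown here.
USER_CSV = """id,name,role,organization_id
1,Ann,member,10
2,Bob,admin,10
3,Cy,member,20
4,Di,member,10
"""
ORG_JSON = '[{"id": 10, "name": "Tech Innovators"}, {"id": 20, "name": "EduCore"}]'


def member_count_by_org(user_csv: str, org_json: str) -> list[tuple[str, int]]:
    """Mirror the SQL: join users to orgs, keep role = 'member', count per org."""
    # CSV values are strings, so organization ids are compared as strings.
    orgs = {str(o["id"]): o["name"] for o in json.loads(org_json)}
    counts = Counter()
    for user in csv.DictReader(io.StringIO(user_csv)):
        org_id = user["organization_id"]
        # Inner join plus WHERE: skip non-members and users with an unknown org.
        if user["role"] == "member" and org_id in orgs:
            counts[org_id] += 1
    # GROUP BY o.id, o.name ... ORDER BY o.name
    return sorted(((orgs[oid], n) for oid, n in counts.items()), key=lambda r: r[0])


if __name__ == "__main__":
    for name, n in member_count_by_org(USER_CSV, ORG_JSON):
        print(f"{name}: {n}")
```

As in the SQL, organizations with no matching members do not appear in the result, because the filter is applied to an inner join.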
## Call MCP
You can also use SQL directly.
The following SQL query retrieves issues from GitHub and asks Claude to classify them as either bugs or feature requests based on their titles and bodies.
```sql
SELECT
    html_url,
    title,
    exec_mcp(
        'claude',
        'chat-with-claude',
        {'content': 'Classify the following GitHub issue as "bug", "feature request" or "other". Reply with the classification only. ' || title || '' || body || ''}
    ) AS category,
    "user"['login'] AS user
FROM
    call_mcp('github', 'list_issues', {'owner': 'github', 'repo': 'github-mcp-server'})
WHERE pull_request IS NULL
LIMIT 5;
#=>
+--------------------------------------------------------+------------------------------------------------------------------------------------------------------------+-----------------+----------------+
| html_url | title | category | user |
+--------------------------------------------------------+------------------------------------------------------------------------------------------------------------+-----------------+----------------+
| https://github.com/github/github-mcp-server/issues/520 | 64-Character Limit on Tool Names Conflicts with MCP Spec — Should Be Removed or Configurable | feature request | jlwainwright |
| https://github.com/github/github-mcp-server/issues/519 | [Visual Studio] cannot connect to remote MCP server `"Invalid content type: must be 'application/json'\n"` | bug | xperiandri |
| https://github.com/github/github-mcp-server/issues/517 | Add cursor install info to README.md | other | maxs10-creator |
| https://github.com/github/github-mcp-server/issues/507 | Regression in `get_file_contents` making it return `nil` in latest image `3e32f75` | bug | monotykamary |
| https://github.com/github/github-mcp-server/issues/504 | git blame tool (to get the latest contributors of the class) | feature request | ismurygin |
+--------------------------------------------------------+------------------------------------------------------------------------------------------------------------+-----------------+----------------+
```
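The query above can be read as a row-by-row pipeline: take the rows returned by one MCP tool, drop pull requests, keep the first five, and call a second MCP tool once per row. The Python sketch below illustrates that reading under stated assumptions. `classify_issues`, `fake_classifier`, and `SAMPLE_ISSUES` are hypothetical names, the classifier is a stub rather than a real model call, and nothing here reflects how DataHarpoon actually evaluates `call_mcp` or `exec_mcp`.

```python
from typing import Callable, Iterable

PROMPT = (
    'Classify the following GitHub issue as "bug", "feature request" or "other". '
    "Reply with the classification only. "
)

# Hypothetical rows shaped like the columns the query reads.
SAMPLE_ISSUES = [
    {"html_url": "https://example.com/1", "title": "Crash on start", "body": None,
     "user": {"login": "alice"}, "pull_request": None},
    {"html_url": "https://example.com/2", "title": "Fix typo", "body": "docs",
     "user": {"login": "bob"}, "pull_request": {"url": "https://example.com/pr/2"}},
]


def classify_issues(
    issues: Iterable[dict],
    classify: Callable[[str], str],
    limit: int = 5,
) -> list[dict]:
    """Filter, limit, and call a classifier once per remaining row."""
    rows = []
    for issue in issues:
        # WHERE pull_request IS NULL: drop rows that are pull requests.
        if issue.get("pull_request") is not None:
            continue
        # The || operator concatenates prompt, title, and body. This sketch
        # substitutes an empty string for a missing body, which the SQL
        # version does not necessarily do.
        prompt = PROMPT + issue["title"] + (issue.get("body") or "")
        rows.append({
            "html_url": issue["html_url"],
            "title": issue["title"],
            "category": classify(prompt),  # stands in for exec_mcp('claude', ...)
            "user": issue["user"]["login"],
        })
        if len(rows) == limit:  # LIMIT 5
            break
    return rows


def fake_classifier(prompt: str) -> str:
    """Hypothetical stand-in for the chat-with-claude tool; not a real model call."""
    return "bug" if "crash" in prompt.lower() else "other"


if __name__ == "__main__":
    for row in classify_issues(SAMPLE_ISSUES, fake_classifier):
        print(row)
```

Passing the classifier as a parameter keeps the filtering and limiting logic testable without any network access.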
Refer to [example/README.md](./example/README.md) for more examples.
# How to use
After building, you can use DataHarpoon as an MCP server or run it via the CLI.
## Build
```shell
git clone git@github.com:mrasu/dataharpoon.git
cd dataharpoon
cargo build
# the binary is written to ./target/debug/dataharpoon
```
## Run via CLI
```shell
cd example
./../target/debug/dataharpoon
```
## Run with your own configuration
1. Create a `data_harpoon.toml` file.
2. Run the DataHarpoon binary with that configuration file.
## Run as an MCP Server
Add DataHarpoon to your agent's MCP server configuration:
```json
{
  "mcpServers": {
    "dataharpoon": {
      "command": "/path/to/dataharpoon",
      "args": [
        "serve",
        "mcp",
        "-c",
        "/path/to/data_harpoon.toml"
      ]
    }
  }
}
```
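Because the configuration above needs the real locations of the binary and the config file, a small script can fill them in. The following Python sketch builds the same `mcpServers` entry with absolute paths. `mcp_server_config` is a hypothetical helper, and the two relative paths in the usage section are assumptions about where your build output and `data_harpoon.toml` live. Where the resulting JSON has to be saved depends on your agent.

```python
import json
from pathlib import Path


def mcp_server_config(binary: Path, config: Path) -> dict:
    """Build the mcpServers entry shown above, using absolute paths."""
    return {
        "mcpServers": {
            "dataharpoon": {
                "command": str(binary.resolve()),
                "args": ["serve", "mcp", "-c", str(config.resolve())],
            }
        }
    }


if __name__ == "__main__":
    # Assumed locations; adjust to your checkout and config file.
    cfg = mcp_server_config(
        Path("target/debug/dataharpoon"),
        Path("example/data_harpoon.toml"),
    )
    print(json.dumps(cfg, indent=2))
```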