{"id":29449740,"url":"https://github.com/mrasu/dataharpoon","last_synced_at":"2025-08-01T10:07:31.143Z","repository":{"id":297894894,"uuid":"990159629","full_name":"mrasu/dataharpoon","owner":"mrasu","description":"An MCP-ready query engine that connects to your data — wherever it lives","archived":false,"fork":false,"pushed_at":"2025-07-06T13:10:47.000Z","size":142,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-06T14:24:40.764Z","etag":null,"topics":["database","datafusion","mcp","mcp-client","mcp-server","query-engine"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mrasu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-25T16:14:19.000Z","updated_at":"2025-07-06T13:10:49.000Z","dependencies_parsed_at":"2025-06-08T07:36:39.172Z","dependency_job_id":null,"html_url":"https://github.com/mrasu/dataharpoon","commit_stats":null,"previous_names":["mrasu/dataharpoon"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mrasu/dataharpoon","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrasu%2Fdataharpoon","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrasu%2Fdataharpoon/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrasu%2Fdataharpoon/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrasu%2Fdataharpoon/manifests","owner_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/owners/mrasu","download_url":"https://codeload.github.com/mrasu/dataharpoon/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrasu%2Fdataharpoon/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265198361,"owners_count":23726449,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","datafusion","mcp","mcp-client","mcp-server","query-engine"],"created_at":"2025-07-13T20:08:42.984Z","updated_at":"2025-07-13T20:08:56.264Z","avatar_url":"https://github.com/mrasu.png","language":"Rust","readme":"# DataHarpoon\n\nAn MCP-ready query engine that connects to your data — wherever it lives\n\nDataHarpoon lets you query both raw data files and MCP-generated results with:\n* Natural-language via MCP\n* Raw SQL for precise control\n\n# Examples\n\n## Natural Language via MCP\n\nYou can write in plain English, and MCP automatically generates and runs the SQL.\n\nWhen you ask:\n\n\u003e With DataHarpoon, please tell me the number of users (written in user.csv) with the role 'member' in each organization (written in org.json).\n\nMCP will query:\n\n```sql\nSELECT o.name AS organization_name, COUNT(u.id) AS member_count\nFROM 'user.csv' u\nJOIN 'org.json' o ON u.organization_id = o.id\nWHERE u.role = 'member'\nGROUP BY o.id, o.name\nORDER BY o.name;\n```\n\nand shows:\n\n```\nMember Count by Organization\n| Organization Name | Number of Members |\n|-------------------|-------------------|\n| EduCore | 1 |\n| GreenFuture | 1 |\n| HealthPlus | 2 |\n| Tech 
Innovators | 2 |\n```\n\nhttps://github.com/user-attachments/assets/46089765-b3d5-4c58-83e2-71a0e21a8db9\n\n### Call MCP\n\nYou can also use SQL directly.\n\nThe following SQL query retrieves issues from GitHub and asks Claude to classify them as either bugs or feature requests based on their titles and bodies.\n\n```sql\nSELECT\n  html_url,\n  title,\n  exec_mcp(\n    'claude',\n    'chat-with-claude', \n    {'content': 'Classify the following GitHub issue as \"bug\", \"feature request\" or \"other\". Reply with the classification only.\u003ctitle\u003e ' || title || '\u003c/title\u003e\u003cbody\u003e' || body || '\u003c/body\u003e'}\n  ) AS category,\n  \"user\"['login'] AS user\nFROM\n  call_mcp('github', 'list_issues', {'owner': 'github', 'repo': 'github-mcp-server'})\nWHERE pull_request IS NULL\nLIMIT 5;\n\n#=\u003e\n+--------------------------------------------------------+------------------------------------------------------------------------------------------------------------+-----------------+----------------+\n| html_url                                               | title                                                                                                      | category        | user           |\n+--------------------------------------------------------+------------------------------------------------------------------------------------------------------------+-----------------+----------------+\n| https://github.com/github/github-mcp-server/issues/520 | 64-Character Limit on Tool Names Conflicts with MCP Spec — Should Be Removed or Configurable               | feature request | jlwainwright   |\n| https://github.com/github/github-mcp-server/issues/519 | [Visual Studio] cannot connect to remote MCP server `\"Invalid content type: must be 'application/json'\\n\"` | bug             | xperiandri     |\n| https://github.com/github/github-mcp-server/issues/517 | Add cursor install info to README.md                                               
                        | other           | maxs10-creator |\n| https://github.com/github/github-mcp-server/issues/507 | Regression in `get_file_contents` making it return `nil` in latest image `3e32f75`                         | bug             | monotykamary   |\n| https://github.com/github/github-mcp-server/issues/504 | git blame tool (to get the latest contributors of the class)                                               | feature request | ismurygin      |\n+--------------------------------------------------------+------------------------------------------------------------------------------------------------------------+-----------------+----------------+\n```\n\nRefer to [example/README.md](./example/README.md) for more examples.\n\n# How to use\n\nAfter building, you can use DataHarpoon as an MCP server or run it via the CLI.\n\n## Build\n\n```shell\ngit clone git@github.com:mrasu/dataharpoon.git\ncd dataharpoon\ncargo build\n\n# file will be in ./target/debug/dataharpoon\n```\n\n## Run via CLI\n\n```shell\ncd example\n./../target/debug/dataharpoon\n```\n\n## Run with your own configuration\n\n1. Create a `data_harpoon.toml`\n2. Run the DataHarpoon binary with your config.\n\n## Run as an MCP Server\n\nAdd DataHarpoon to your agent's MCP server configuration.\n\n```json\n{\n  \"mcpServers\": {\n    \"dataharpoon\": {\n      \"command\": \"/path/to/dataharpoon\",\n      \"args\": [\n        \"serve\",\n        \"mcp\",\n        \"-c\",\n        \"/path/to/data_harpoon.toml\"\n      ]\n    }\n  }\n}\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmrasu%2Fdataharpoon","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmrasu%2Fdataharpoon","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmrasu%2Fdataharpoon/lists"}