{"id":26160705,"url":"https://github.com/sammcj/mcp-data-extractor","last_synced_at":"2026-03-08T18:38:03.422Z","repository":{"id":279168203,"uuid":"937920107","full_name":"sammcj/mcp-data-extractor","owner":"sammcj","description":"A model context protocol server to migrate data out of code (ts/js) into config (json)","archived":false,"fork":false,"pushed_at":"2025-02-27T01:34:01.000Z","size":154,"stargazers_count":8,"open_issues_count":1,"forks_count":4,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-30T13:58:13.251Z","etag":null,"topics":["data","javascript","js","json","llm","mcp","tool","ts","typescript"],"latest_commit_sha":null,"homepage":"https://smcleod.net","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sammcj.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":["sammcj"],"buy_me_a_coffee":"sam.mcleod"}},"created_at":"2025-02-24T06:01:32.000Z","updated_at":"2025-09-03T22:32:57.000Z","dependencies_parsed_at":"2025-02-24T06:39:40.289Z","dependency_job_id":"fc409998-0774-44ed-82d0-09de5485c6e5","html_url":"https://github.com/sammcj/mcp-data-extractor","commit_stats":null,"previous_names":["sammcj/mcp-data-extractor"],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/sammcj/mcp-data-extractor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sammcj%2Fmcp-data-extractor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sammcj%2Fmcp-data-extractor/tags","releases_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sammcj%2Fmcp-data-extractor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sammcj%2Fmcp-data-extractor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sammcj","download_url":"https://codeload.github.com/sammcj/mcp-data-extractor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sammcj%2Fmcp-data-extractor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30269178,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-08T17:53:40.517Z","status":"ssl_error","status_checked_at":"2026-03-08T17:53:40.101Z","response_time":56,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data","javascript","js","json","llm","mcp","tool","ts","typescript"],"created_at":"2025-03-11T12:19:22.884Z","updated_at":"2026-03-08T18:38:03.398Z","avatar_url":"https://github.com/sammcj.png","language":"JavaScript","funding_links":["https://github.com/sponsors/sammcj","https://buymeacoffee.com/sam.mcleod"],"categories":["Search \u0026 Data Extraction","Document Processing","🌐 Web Development","JavaScript"],"sub_categories":["How to Submit"],"readme":"# mcp-data-extractor MCP Server\n\nA Model Context Protocol server that extracts embedded data 
(such as i18n translations or key/value configurations) from TypeScript/JavaScript source code into structured JSON configuration files.\n\n[![smithery badge](https://smithery.ai/badge/mcp-data-extractor)](https://smithery.ai/server/mcp-data-extractor)\n\n\u003ca href=\"https://glama.ai/mcp/servers/40c3iyazm5\"\u003e\u003cimg width=\"380\" height=\"200\" src=\"https://glama.ai/mcp/servers/40c3iyazm5/badge\" alt=\"MCP Data Extractor MCP server\" /\u003e\u003c/a\u003e\n\n## Features\n\n- Data Extraction:\n  - Extracts string literals, template literals, and complex nested objects\n  - Preserves template variables (e.g., `Hello, {{name}}!`)\n  - Supports nested object structures and arrays\n  - Maintains hierarchical key structure using dot notation\n  - Handles both TypeScript and JavaScript files with JSX support\n  - Replaces source file content with \"MIGRATED TO \u003ctarget absolute path\u003e\" after successful extraction (configurable)\n\n- SVG Extraction:\n  - Extracts SVG components from React/TypeScript/JavaScript files\n  - Preserves SVG structure and attributes\n  - Removes React-specific code and props\n  - Creates individual .svg files named after their component\n  - Replaces source file content with \"MIGRATED TO \u003ctarget absolute path\u003e\" after successful extraction (configurable)\n\n## Usage\n\nAdd to your MCP Client configuration:\n\n```json\n{\n  \"mcpServers\": {\n    \"data-extractor\": {\n      \"command\": \"npx\",\n      \"args\": [\n        \"-y\",\n        \"mcp-data-extractor\"\n      ],\n      \"disabled\": false,\n      \"autoApprove\": [\n        \"extract_data\",\n        \"extract_svg\"\n      ]\n    }\n  }\n}\n```\n\n### Basic Usage\n\nThe server provides two tools:\n\n#### 1. 
Data Extraction\n\nUse `extract_data` to extract data (like i18n translations) from source files:\n\n```typescript\n\u003cuse_mcp_tool\u003e\n\u003cserver_name\u003edata-extractor\u003c/server_name\u003e\n\u003ctool_name\u003eextract_data\u003c/tool_name\u003e\n\u003carguments\u003e\n{\n  \"sourcePath\": \"src/translations.ts\",\n  \"targetPath\": \"src/translations.json\"\n}\n\u003c/arguments\u003e\n\u003c/use_mcp_tool\u003e\n```\n\n#### 2. SVG Extraction\n\nUse `extract_svg` to extract SVG components into individual files:\n\n```typescript\n\u003cuse_mcp_tool\u003e\n\u003cserver_name\u003edata-extractor\u003c/server_name\u003e\n\u003ctool_name\u003eextract_svg\u003c/tool_name\u003e\n\u003carguments\u003e\n{\n  \"sourcePath\": \"src/components/icons/InspectionIcon.tsx\",\n  \"targetDir\": \"src/assets/icons\"\n}\n\u003c/arguments\u003e\n\u003c/use_mcp_tool\u003e\n```\n\n### Source File Replacement\n\nBy default, after successful extraction, the server will replace the content of the source file with:\n- \"MIGRATED TO \u003ctarget path\u003e\" for data extraction\n- \"MIGRATED TO \u003ctarget directory\u003e\" for SVG extraction\n\nThis helps track which files have already been processed and prevents duplicate extraction. 
It also makes it easy for LLMs and developers to see where the extracted data now lives when they encounter the source file later.\n\nTo disable this behavior, set the `DISABLE_SOURCE_REPLACEMENT` environment variable to `true` in your MCP configuration:\n\n```json\n{\n  \"mcpServers\": {\n    \"data-extractor\": {\n      \"command\": \"npx\",\n      \"args\": [\n        \"-y\",\n        \"mcp-data-extractor\"\n      ],\n      \"env\": {\n        \"DISABLE_SOURCE_REPLACEMENT\": \"true\"\n      },\n      \"disabled\": false,\n      \"autoApprove\": [\n        \"extract_data\",\n        \"extract_svg\"\n      ]\n    }\n  }\n}\n```\n\n### Supported Patterns\n\n#### Data Extraction Patterns\n\nThe data extractor supports various patterns commonly used in TypeScript/JavaScript applications:\n\n1. Simple Object Exports:\n```typescript\nexport default {\n  welcome: \"Welcome to our app\",\n  greeting: \"Hello, {name}!\",\n  submit: \"Submit form\"\n};\n```\n\n2. Nested Objects:\n```typescript\nexport default {\n  header: {\n    title: \"Book Your Flight\",\n    subtitle: \"Find the best deals\"\n  },\n  footer: {\n    content: [\n      \"Please refer to {{privacyPolicyUrl}} for details\",\n      \"© {{year}} {{companyName}}\"\n    ]\n  }\n};\n```\n\n3. Complex Structures with Arrays:\n```typescript\nexport default {\n  faq: {\n    heading: \"Common questions\",\n    content: [\n      {\n        heading: \"What if I need to change my flight?\",\n        content: \"You can change your flight online if:\",\n        list: [\n          \"You have a flexible fare type\",\n          \"Your flight is more than 24 hours away\"\n        ]\n      }\n    ]\n  }\n};\n```\n\n4. 
Template Literals with Variables:\n```typescript\nexport default {\n  greeting: `Hello, {{username}}!`,\n  message: `Welcome to {{appName}}`\n};\n```\n\n### Output Formats\n\n#### Data Extraction Output\n\nThe extracted data is saved as a JSON file with dot notation for nested structures:\n\n```json\n{\n  \"welcome\": \"Welcome to our app\",\n  \"header.title\": \"Book Your Flight\",\n  \"footer.content.0\": \"Please refer to {{privacyPolicyUrl}} for details\",\n  \"footer.content.1\": \"© {{year}} {{companyName}}\",\n  \"faq.content.0.heading\": \"What if I need to change my flight?\"\n}\n```\n\n#### SVG Extraction Output\n\nSVG components are extracted into individual .svg files, with React-specific code removed. For example:\n\nInput (React component):\n```tsx\nconst InspectionIcon: React.FC\u003cInspectionIconProps\u003e = ({ title }) =\u003e (\n  \u003csvg className=\"c-tab__icon\" width=\"40px\" id=\"Layer_1\" data-name=\"Layer 1\" xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 32 32\"\u003e\n    \u003ctitle\u003e{title}\u003c/title\u003e\n    \u003cpath className=\"cls-1\" d=\"M18.89,12.74a3.18,3.18,0,0,1-3.24-3.11...\" /\u003e\n  \u003c/svg\u003e\n);\n```\n\nOutput (InspectionIcon.svg):\n```svg\n\u003csvg width=\"40px\" id=\"Layer_1\" data-name=\"Layer 1\" xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 32 32\"\u003e\n    \u003cpath class=\"cls-1\" d=\"M18.89,12.74a3.18,3.18,0,0,1-3.24-3.11...\" /\u003e\n\u003c/svg\u003e\n```\n\n## Extending Supported Patterns\n\nThe extractor uses Babel to parse and traverse the AST (Abstract Syntax Tree) of your source files. You can extend the supported patterns by modifying the source code:\n\n1. **Add New Node Types**: The `extractStringValue` method in `src/index.ts` handles different types of string values. 
Extend it to support new node types:\n\n```typescript\nprivate extractStringValue(node: t.Node): string | null {\n  if (t.isStringLiteral(node)) {\n    return node.value;\n  } else if (t.isTemplateLiteral(node)) {\n    return node.quasis.map(quasi =\u003e quasi.value.raw).join('{{}}');\n  }\n  // Add support for new node types here\n  return null;\n}\n```\n\n2. **Custom Value Processing**: The `processValue` method handles different value types (strings, arrays, objects). Extend it to support new value types or custom processing:\n\n```typescript\nprivate processValue(value: t.Node, currentPath: string[]): void {\n  if (t.isStringLiteral(value) || t.isTemplateLiteral(value)) {\n    // Process string values\n  } else if (t.isArrayExpression(value)) {\n    // Process arrays\n  } else if (t.isObjectExpression(value)) {\n    // Process objects\n  }\n  // Add support for new value types here\n}\n```\n\n3. **Custom AST Traversal**: The server uses Babel's traverse to walk the AST. You can add new visitors to handle different node types:\n\n```typescript\ntraverse(ast, {\n  ExportDefaultDeclaration(path: NodePath\u003ct.ExportDefaultDeclaration\u003e) {\n    // Handle default exports\n  },\n  // Add new visitors here\n});\n```\n\n## Development\n\nInstall dependencies:\n```bash\nnpm install\n```\n\nBuild the server:\n```bash\nnpm run build\n```\n\nFor development with auto-rebuild:\n```bash\nnpm run watch\n```\n\n### Debugging\n\nSince MCP servers communicate over stdio, debugging can be challenging. 
We recommend using the [MCP Inspector](https://github.com/modelcontextprotocol/inspector), which is available as a package script:\n\n```bash\nnpm run inspector\n```\n\nThe Inspector will provide a URL to access debugging tools in your browser.\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsammcj%2Fmcp-data-extractor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsammcj%2Fmcp-data-extractor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsammcj%2Fmcp-data-extractor/lists"}