{"id":19660228,"url":"https://github.com/takumakanari/embulk-parser-xml","last_synced_at":"2025-04-28T20:32:09.267Z","repository":{"id":28688870,"uuid":"32208946","full_name":"takumakanari/embulk-parser-xml","owner":"takumakanari","description":"Embulk parser plugin for xml","archived":false,"fork":false,"pushed_at":"2019-11-19T00:26:34.000Z","size":24,"stargazers_count":11,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-16T13:15:59.778Z","etag":null,"topics":["embulk","embulk-parser-plugin","ruby","xml","xpath"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/takumakanari.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-03-14T11:52:54.000Z","updated_at":"2022-11-07T02:28:15.000Z","dependencies_parsed_at":"2022-08-29T01:10:40.151Z","dependency_job_id":null,"html_url":"https://github.com/takumakanari/embulk-parser-xml","commit_stats":null,"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/takumakanari%2Fembulk-parser-xml","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/takumakanari%2Fembulk-parser-xml/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/takumakanari%2Fembulk-parser-xml/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/takumakanari%2Fembulk-parser-xml/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/takumakanari","download_url":"https://codeload.github.com/takumakana
ri/embulk-parser-xml/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251383886,"owners_count":21580959,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},
"keywords":["embulk","embulk-parser-plugin","ruby","xml","xpath"],"created_at":"2024-11-11T15:45:43.332Z","updated_at":"2025-04-28T20:32:09.041Z","avatar_url":"https://github.com/takumakanari.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# XML parser plugin for Embulk\n\nParser plugin for [Embulk](https://github.com/embulk/embulk).\n\nReads the input as XML and outputs each entry as a record.\n\n## Overview\n\n* **Plugin type**: parser\n* **Load all or nothing**: yes\n* **Resume supported**: no\n\n## Types\n\n- **xml**: Finds rows with SAX.\n- **xpath**: Finds rows with XPath, so you can select rows using more complex conditions than the *xml* type allows.\n\n## Configuration\n\n### XML\n\n```yaml\nparser:\n  type: xml\n  root: data/students/student\n  schema:\n    - {name: name, type: string}\n    - {name: age, type: long}\n```\n\n- **type**: specify this plugin as `xml`.\n- **root**: root node from which to start fetching entries, specified in *path/to/node* style, required.\n- **schema**: the columns of the table and their data types, required.\n\nTo parse a column as a timestamp, *schema* supports 2 additional parameters:\n\n```yaml\nschema:\n  - {name: timestamp_column, type: timestamp, format: \"%Y-%m-%d\", timezone: \"+0000\"}\n```\n\n- **format**: timestamp format to parse, required.\n- **timezone**: timezone in which the timestamp is parsed, `\"+0900\"` is used by default.\n\n\n### Xpath\n\n```yaml\nparser:\n  type: xpath\n  root: //data/students/student\n  schema:\n    - {path: name, type: string, name: name}\n    - {path: age, type: long, name: age}\n    - {path: hobbies/hobby, type: json, name: hobbies}\n```\n\n- **type**: specify this plugin as `xpath`.\n- **root**: root node from which to start fetching entries, specified in XPath, *'/'* is used by default.\n- **schema**: the columns of the table and their data types, required.\n- **namespaces**: XML namespaces.\n\n\nTo parse a column as a timestamp, *schema* supports 2 additional parameters:\n\n```yaml\nschema:\n  - {name: timestamp_column, type: timestamp, format: \"%Y-%m-%d\", timezone: \"+0000\"}\n```\n\n- **format**: timestamp format to parse, required.\n- **timezone**: timezone in which the timestamp is parsed, `\"+0900\"` is used by default.\n\n\nHere is an example XML document:\n\n```xml\n\u003cdata\u003e\n  \u003cresult\u003etrue\u003c/result\u003e\n  \u003cstudents\u003e\n    \u003cstudent\u003e\n      \u003cname\u003eJohn\u003c/name\u003e\n      \u003cage\u003e10\u003c/age\u003e\n      \u003chobbies\u003e\n        \u003chobby\u003emusic\u003c/hobby\u003e\n        \u003chobby\u003emovie\u003c/hobby\u003e\n      \u003c/hobbies\u003e\n    \u003c/student\u003e\n    \u003cstudent\u003e\n      \u003cname\u003ePaul\u003c/name\u003e\n      \u003cage\u003e16\u003c/age\u003e\n      \u003chobbies\u003e\n        \u003chobby\u003egame\u003c/hobby\u003e\n      \u003c/hobbies\u003e\n    \u003c/student\u003e\n    \u003cstudent\u003e\n      \u003cname\u003eGeorge\u003c/name\u003e\n      \u003cage\u003e17\u003c/age\u003e\n    \u003c/student\u003e\n    \u003cstudent\u003e\n      \u003cname\u003eRingo\u003c/name\u003e\n      \u003cage\u003e18\u003c/age\u003e\n    \u003c/student\u003e\n  
\u003c/students\u003e\n\u003c/data\u003e\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftakumakanari%2Fembulk-parser-xml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftakumakanari%2Fembulk-parser-xml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftakumakanari%2Fembulk-parser-xml/lists"}