{"id":32206744,"url":"https://github.com/jsugarelli/xml2relational","last_synced_at":"2026-02-21T03:32:33.652Z","repository":{"id":44295327,"uuid":"259912932","full_name":"jsugarelli/xml2relational","owner":"jsugarelli","description":"Converting XML documents into relational data models","archived":false,"fork":false,"pushed_at":"2022-02-10T19:03:31.000Z","size":28,"stargazers_count":11,"open_issues_count":2,"forks_count":4,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-12-09T18:16:01.593Z","etag":null,"topics":["r-package","relational-database","relational-model","sql","xml","xml-serialization"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jsugarelli.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-04-29T11:58:40.000Z","updated_at":"2025-07-17T18:28:58.000Z","dependencies_parsed_at":"2022-09-15T03:25:19.839Z","dependency_job_id":null,"html_url":"https://github.com/jsugarelli/xml2relational","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jsugarelli/xml2relational","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jsugarelli%2Fxml2relational","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jsugarelli%2Fxml2relational/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jsugarelli%2Fxml2relational/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jsugarelli%2Fxml2relational/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jsugarelli","down
load_url":"https://codeload.github.com/jsugarelli/xml2relational/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jsugarelli%2Fxml2relational/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29672704,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-21T03:11:15.450Z","status":"ssl_error","status_checked_at":"2026-02-21T03:10:34.920Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["r-package","relational-database","relational-model","sql","xml","xml-serialization"],"created_at":"2025-10-22T05:35:07.782Z","updated_at":"2026-02-21T03:32:33.641Z","avatar_url":"https://github.com/jsugarelli.png","language":"R","readme":"The xml2relational Package\n================\nJoachim Zuckarelli\n\n## What `xml2relational` does\n\n`xml2relational` is designed to convert XML documents with nested object\nhierarchies into a set of R dataframes. These dataframes represent the\ndifferent tables in a relational data model and are connected amongst\neach other by foreign keys. 
Essentially, `xml2relational` flattens an\nobject-oriented data structure into a relational data structure.\n\nOnce the relational structure is created (and that is basically a list\nof dataframes representing the different tables) you can export both the\ndata model (as SQL `CREATE` statements) and the data (either as SQL\n`INSERT` statements or as CSV files) to get the data easily into a\nrelational database.\n\n## Getting started\n\n### Installing and loading `xml2relational`\n\nYou can install the `xml2relational` package from CRAN by executing the\nfollowing code in your script or in the R console:\n\n``` r\ninstall.packages(\"xml2relational\", dependencies = TRUE)\n```\n\nAfter installing the package, you need to load it (attach it to the\nsearch path) by calling `library()`:\n\n``` r\nlibrary(xml2relational)\n```\n\n### Our example data set\n\nTo demonstrate how `xml2relational` works, we will use a small sample\ndataset that is shipped together with the `xml2relational` package: the\n`customer` dataset.\n\nHere is what it looks like:\n\n    \u003cxml\u003e\n        \u003ccustomer\u003e \n            \u003ccustomerno\u003eC0023751\u003c/customerno\u003e\n            \u003cgivenname\u003eSarah\u003c/givenname\u003e\n            \u003csurname\u003eDurbin\u003c/surname\u003e\n            \u003cemail\u003esarah.durbin@absolutelynowhere.com\u003c/email\u003e\n            \u003caddress\u003e\n                \u003cstreet\u003e139 W Jackson Blvd\u003c/street\u003e\n                \u003cpostalcode\u003e60604\u003c/postalcode\u003e\n                \u003ccity\u003e\n                    \u003cname\u003eChicago\u003c/name\u003e\n                    \u003cstate\u003eIllinois\u003c/state\u003e\n                \u003c/city\u003e\n                \u003ccountry\u003e\n                    \u003cname\u003eUnited States of America\u003c/name\u003e\n                    \u003cisocode\u003eUS\u003c/isocode\u003e\n                \u003c/country\u003e\n            
\u003c/address\u003e\n            \u003cusername\u003equeenofqueens\u003c/username\u003e\n        \u003c/customer\u003e\n    \n        \u003ccustomer\u003e\n            \u003ccustomerno\u003eC0017439\u003c/customerno\u003e\n            \u003cgivenname\u003eMark\u003c/givenname\u003e\n            \u003csurname\u003eDurbin\u003c/surname\u003e\n            \u003cemail\u003emark@durbinshome.net\u003c/email\u003e\n            \u003caddress\u003e\n                \u003cstreet\u003e139 W Jackson Blvd\u003c/street\u003e\n                \u003cpostalcode\u003e60604\u003c/postalcode\u003e\n                \u003ccity\u003e\n                    \u003cname\u003eChicago\u003c/name\u003e\n                    \u003cstate\u003eIllinois\u003c/state\u003e\n                \u003c/city\u003e\n                \u003ccountry\u003e\n                    \u003cname\u003eUnited States of America\u003c/name\u003e\n                    \u003cisocode\u003eUS\u003c/isocode\u003e\n                \u003c/country\u003e\n            \u003c/address\u003e\n            \u003cusername\u003edurby82\u003c/username\u003e    \n        \u003c/customer\u003e\n    \n        \u003ccustomer\u003e\n            \u003ccustomerno\u003eC0248538\u003c/customerno\u003e \n            \u003cgivenname\u003eMax\u003c/givenname\u003e\n            \u003csurname\u003eBrunner\u003c/surname\u003e\n            \u003cemail\u003embrunner@winetasting-brunner.de\u003c/email\u003e\n            \u003caddress\u003e\n                \u003cstreet\u003eRotkreuzplatz 5\u003c/street\u003e\n                \u003cpostalcode\u003e80634\u003c/postalcode\u003e\n                \u003ccity\u003e\n                    \u003cname\u003eMunich\u003c/name\u003e\n                    \u003cstate\u003eBavaria\u003c/state\u003e\n                \u003c/city\u003e     \n                \u003ccountry\u003e\n                    \u003cname\u003eGermany\u003c/name\u003e\n                    \u003cisocode\u003eDE\u003c/isocode\u003e\n                
\u003c/country\u003e\n            \u003c/address\u003e\n            \u003cusername\u003ebrunnermax_69\u003c/username\u003e  \n        \u003c/customer\u003e\n    \n        \u003ccustomer\u003e\n            \u003ccustomerno\u003eC0271182\u003c/customerno\u003e\n            \u003cgivenname\u003eUrs\u003c/givenname\u003e\n            \u003csurname\u003eRichli\u003c/surname\u003e\n            \u003cemail\u003eurs.richli@richli-design.ch\u003c/email\u003e\n            \u003caddress\u003e\n                \u003cstreet\u003eSeestrasse 43\u003c/street\u003e\n                \u003cpostalcode\u003e6052\u003c/postalcode\u003e\n                \u003ccity\u003e\n                    \u003cname\u003eHergiswil\u003c/name\u003e\n                    \u003cstate\u003eLuzern\u003c/state\u003e\n                \u003c/city\u003e\n                \u003ccountry\u003e\n                    \u003cname\u003eSwitzerland\u003c/name\u003e\n                    \u003cisocode\u003eCH\u003c/isocode\u003e\n                \u003c/country\u003e\n            \u003c/address\u003e\n            \u003cusername\u003eursrichli\u003c/username\u003e\n        \u003c/customer\u003e\n    \n        \u003ccustomer\u003e\n            \u003ccustomerno\u003eC0019935\u003c/customerno\u003e \n            \u003cgivenname\u003eClara-Sophie\u003c/givenname\u003e\n            \u003csurname\u003eDr. 
Hellmann\u003c/surname\u003e\n            \u003cemail\u003eclara-sophie@ginternetpost.de\u003c/email\u003e\n            \u003caddress\u003e\n                \u003cstreet\u003eBrienner Strasse 11\u003c/street\u003e\n                \u003cpostalcode\u003e80333\u003c/postalcode\u003e\n                \u003ccity\u003e\n                    \u003cname\u003eMunich\u003c/name\u003e\n                    \u003cstate\u003eBavaria\u003c/state\u003e\n                \u003c/city\u003e     \n                \u003ccountry\u003e\n                    \u003cname\u003eGermany\u003c/name\u003e\n                    \u003cisocode\u003eDE\u003c/isocode\u003e\n                \u003c/country\u003e\n            \u003c/address\u003e\n            \u003cusername\u003ehelli\u003c/username\u003e  \n        \u003c/customer\u003e\n    \n        \u003ccustomer\u003e\n            \u003ccustomerno\u003eC0019935\u003c/customerno\u003e \n            \u003cgivenname\u003eThomas\u003c/givenname\u003e\n            \u003csurname\u003eChang\u003c/surname\u003e\n            \u003cemail\u003echang-thomas@sf-foryou.com\u003c/email\u003e\n            \u003caddress\u003e\n                \u003cstreet\u003e539 Lombard St\u003c/street\u003e\n                \u003cpostalcode\u003e94133\u003c/postalcode\u003e\n                \u003ccity\u003e\n                    \u003cname\u003eSan Francisco\u003c/name\u003e\n                    \u003cstate\u003eCalifornia\u003c/state\u003e\n                \u003c/city\u003e     \n                \u003ccountry\u003e\n                    \u003cname\u003eUnited States of America\u003c/name\u003e\n                    \u003cisocode\u003eUS\u003c/isocode\u003e\n                \u003c/country\u003e\n            \u003c/address\u003e\n            \u003cusername\u003etchango123\u003c/username\u003e \n        \u003c/customer\u003e\n    \u003c/xml\u003e\n\nIn this dataset we have a nested object structure. Specifically, each\ncustomer has an address consisting of several elements. 
Among those\nelements is the city which is again an object of its own, with a city\nname and state. The same applies to the country which is included with\nits name and its ISO country code. When you look at the (completely\nmade-up) customers here, you will notice that the customers Sarah Durbin\nand Mark Durbin (the first two customers) share the same address. Also,\nMax Brunner and Clara-Sophie Hellmann both live in Munich, Germany\n(although at different addresses). Thomas Chang of San Francisco lives\nin the USA, as do the Durbins.\n\nWhen we now process the data and derive the relational data model,\n`xml2relational` will take care of these ‘duplicates’.\n\n### Processing the data\n\nDeriving the relational data model from this XML data is fairly simple:\n\n``` r\ncustomer.data \u003c- toRelational(\"customers.xml\")\n```\n\nThe `toRelational()` function flattens the hierarchical structure of the\nXML data and distributes the data to a set of dataframes representing\nthe tables of our relational data model. It returns these dataframes as\na list (`customer.data`). We can now inspect this list to see the tables\nthat have been generated:\n\n``` r\nclass(customer.data)\n```\n\n    ## [1] \"list\"\n\n``` r\nnames(customer.data)\n```\n\n    ## [1] \"xml\"      \"customer\" \"address\"  \"city\"     \"country\"\n\n``` r\nclass(customer.data$customer)\n```\n\n    ## [1] \"data.frame\"\n\nLet us have a closer look at the `customer` dataframe:\n\n``` r\ncustomer.data$customer\n```\n\n    ##   ID_customer customerno    givenname      surname\n    ## 1      263023   C0023751        Sarah       Durbin\n    ## 2      597336   C0017439         Mark       Durbin\n    ## 3       59960   C0248538          Max      Brunner\n    ## 4      159381   C0271182          Urs       Richli\n    ## 5       83969   C0019935 Clara-Sophie Dr. 
Hellmann\n    ## 6      465004   C0019935       Thomas        Chang\n    ##                                email FKID_address      username\n    ## 1 sarah.durbin@absolutelynowhere.com       674038 queenofqueens\n    ## 2               mark@durbinshome.net       674038       durby82\n    ## 3    mbrunner@winetasting-brunner.de       149765 brunnermax_69\n    ## 4        urs.richli@richli-design.ch       718252     ursrichli\n    ## 5      clara-sophie@ginternetpost.de       977313         helli\n    ## 6         chang-thomas@sf-foryou.com       112551    tchango123\n\nAs you can see, each customer record has been assigned a primary key,\n`ID_customer`. The argument `prefix.primary` of the `toRelational()`\nfunction lets you change the prefix that is used to identify primary key\nfields. Its default value is `\"ID_\"`. Similarly, using the\n`prefix.foreign` argument you can change the prefix used for the names\nof foreign key fields from its default value `\"FKID_\"` to whatever you\nlike. The name of a key field always consists of the prefix and the\nname of the table.\n\nIn the `customer` table we have a foreign key that relates to the\naddress. You may have noticed that, as expected, the data records of\nSarah and Mark Durbin point to the same `address` record as they live in\nthe same place.\n\nLet us now look into the address table:\n\n``` r\ncustomer.data$address\n```\n\n    ##   ID_address              street postalcode FKID_city FKID_country\n    ## 1     674038  139 W Jackson Blvd      60604    735977       495268\n    ## 2     149765     Rotkreuzplatz 5      80634      2299       352009\n    ## 3     718252       Seestrasse 43       6052    448761       817914\n    ## 4     977313 Brienner Strasse 11      80333      2299       352009\n    ## 5     112551      539 Lombard St      94133     70561       495268\n\nAgain, the address points to other tables, namely the `city` and the\n`country` table. 
As we would have expected, the two Munich addresses\npoint to the same city and the same country, and the two US addresses\npoint to the same record in the `country` table.\n\nYou see how easy it is to flatten a hierarchical, object-oriented XML\ndata structure to a relational data model using the `toRelational()`\nfunction.\n\n### Saving the results\n\nIn the next step, we want to export our results. That can mean two\nthings:\n\n  - exporting the data model (i.e. the structure of the tables)\n  - exporting the data, the content of the tables.\n\nFor the first task, `xml2relational` provides the `getCreateSQL()`\nfunction. This function returns ready-to-execute SQL `CREATE`\nstatements. It supports three built-in SQL flavors, `MySQL`,\n`TransactSQL` and `Oracle`. You can add additional SQL flavors if you like.\nIn that case, use the `sql.style` argument to provide a special\ndataframe containing the required definitions for the new SQL dialect.\nPlease consult the online help texts for more information on how this is\ndone.\n\nIn order to generate proper SQL `CREATE` statements, `getCreateSQL()`\nguesses the data types of the table fields from the data. If you do not\nlike the results, you can provide your own function to derive the data\ntypes as the `datatype.func` argument. This function would need to accept\nexactly one argument, a vector with the field values of the field for\nwhich a datatype needs to be guessed. 
It then must return the datatype\nas a one-element character vector.\n\nIf you are not going to change the behavior of `getCreateSQL()` using\nthese options, generating the SQL `CREATE` statements is very\nstraightforward:\n\n``` r\ncreate.sql \u003c- getCreateSQL(customer.data, \"MySQL\")\ncat(create.sql, sep=\"\\n\\n\")\n```\n\n    ## CREATE TABLE xml (\n    ## PRIMARY KEY (ID_xml)\n    ## , ID_xml BIGINT\n    ## , FOREIGN KEY (FKID_customer) REFERENCES customer(ID_customer)\n    ## , FKID_customer BIGINT\n    ## );\n    ## \n    ## CREATE TABLE customer (\n    ## PRIMARY KEY (ID_customer)\n    ## , ID_customer BIGINT\n    ## , customerno VARCHAR(8) NOT NULL\n    ## , givenname VARCHAR(12) NOT NULL\n    ## , surname VARCHAR(12) NOT NULL\n    ## , email VARCHAR(34) NOT NULL\n    ## , FOREIGN KEY (FKID_address) REFERENCES address(ID_address)\n    ## , FKID_address BIGINT\n    ## , username VARCHAR(13) NOT NULL\n    ## );\n    ## \n    ## CREATE TABLE address (\n    ## PRIMARY KEY (ID_address)\n    ## , ID_address BIGINT\n    ## , street VARCHAR(19) NOT NULL\n    ## , postalcode BIGINT NOT NULL\n    ## , FOREIGN KEY (FKID_city) REFERENCES city(ID_city)\n    ## , FKID_city BIGINT\n    ## , FOREIGN KEY (FKID_country) REFERENCES country(ID_country)\n    ## , FKID_country BIGINT\n    ## );\n    ## \n    ## CREATE TABLE city (\n    ## PRIMARY KEY (ID_city)\n    ## , ID_city BIGINT\n    ## , name VARCHAR(13) NOT NULL\n    ## , state VARCHAR(10) NOT NULL\n    ## );\n    ## \n    ## CREATE TABLE country (\n    ## PRIMARY KEY (ID_country)\n    ## , ID_country BIGINT\n    ## , name VARCHAR(24) NOT NULL\n    ## , isocode VARCHAR(2) NOT NULL\n    ## );\n\n`xml2relational` tries to guess the datatype from the actual data. When\nyou are working with the `MySQL`, `Transact SQL` (`T-SQL`) and `Oracle`\ndialects/flavors of SQL, this should be alright. 
Nevertheless, using the\n`datatype.func` argument of `getCreateSQL()` you can also provide your\nown function to determine the data type. This function would need to\ntake exactly one argument, a data vector from a data table, and return\nthe appropriate SQL data type as a one-element character vector.\nAlternatively, you can also use the built-in mechanism for determining\nthe data type and just supply additional information on the SQL flavor\nthat you use. Please consult the online help with `?getCreateSQL` to\nlearn more on providing the necessary information.\n\nBy setting the logical `one.statement` argument to `TRUE` you can let\n`getCreateSQL()` return the `CREATE` statements in one character value\ninstead of a vector with one element per `CREATE` statement. In this\ncase you can use the `line.break` argument to define how the different\n`CREATE` statements are to be separated (apart from a semicolon that is\nadded by default).\n\nTo export the data itself you have two options:\n\n  - you export ready-to-execute SQL `INSERT` statements using the\n    `getInsertSQL()` function\n  - you save the data to CSV files using `savetofiles()`.\n\nProducing SQL `INSERT` statements for the data in one of the tables is\nvery easy with `getInsertSQL()`:\n\n``` r\ninsert.sql \u003c- getInsertSQL(customer.data, table.name = \"city\")\ncat(insert.sql, sep=\"\\n\")\n```\n\n    ## INSERT INTO city(ID_city, name, state) VALUES (735977, 'Chicago', 'Illinois');\n    ## INSERT INTO city(ID_city, name, state) VALUES (2299, 'Munich', 'Bavaria');\n    ## INSERT INTO city(ID_city, name, state) VALUES (448761, 'Hergiswil', 'Luzern');\n    ## INSERT INTO city(ID_city, name, state) VALUES (70561, 'San Francisco', 'California');\n\nYou can also export all the tables of your relational model with\n`savetofiles()`:\n\n``` r\nsavetofiles(customer.data)\n```\n\nThis will save as many CSV files to your current working directory as\nyou have tables in your model (`customer.data`). 
Each file is named after\nthe dataframe that holds the respective table, so\n`city.csv` will store the data from the `city` table.\n\nMore optional arguments for most of the functions discussed here are\navailable. Please check the online help for more details.\n\n## Contact the author\n\nI appreciate your questions, issues and feature requests. Contact me at\n\u003cjoachim@zuckarelli.de\u003e, visit the GitHub repository on\n\u003chttps://github.com/jsugarelli/xml2relational\u003e for the package source\nand [follow me on Twitter](https://twitter.com/jsugarelli) to stay\nup to date!\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjsugarelli%2Fxml2relational","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjsugarelli%2Fxml2relational","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjsugarelli%2Fxml2relational/lists"}