{"id":15266402,"url":"https://github.com/jkminder/data2neo","last_synced_at":"2025-04-12T04:44:13.523Z","repository":{"id":242872647,"uuid":"437839060","full_name":"jkminder/data2neo","owner":"jkminder","description":"Data2Neo is a library that simplifies the conversion of data in relational format to a graph knowledge database.","archived":false,"fork":false,"pushed_at":"2024-08-25T20:47:36.000Z","size":5864,"stargazers_count":21,"open_issues_count":3,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-26T00:13:01.661Z","etag":null,"topics":["data-cleaning","data-conversion","data-engineering","data2neo","database-migrations","graphs","neo4j","relational-databases","remodeling"],"latest_commit_sha":null,"homepage":"https://data2neo.jkminder.ch","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jkminder.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-12-13T11:02:27.000Z","updated_at":"2025-03-01T02:12:28.000Z","dependencies_parsed_at":"2024-06-05T13:31:16.485Z","dependency_job_id":"4da22168-37a1-4c18-b220-ce200b57a846","html_url":"https://github.com/jkminder/data2neo","commit_stats":null,"previous_names":["jkminder/data2neo"],"tags_count":28,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jkminder%2Fdata2neo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jkminder%2Fdata2neo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jkminder
%2Fdata2neo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jkminder%2Fdata2neo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jkminder","download_url":"https://codeload.github.com/jkminder/data2neo/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248519468,"owners_count":21117757,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-cleaning","data-conversion","data-engineering","data2neo","database-migrations","graphs","neo4j","relational-databases","remodeling"],"created_at":"2024-09-30T05:09:02.119Z","updated_at":"2025-04-12T04:44:13.502Z","avatar_url":"https://github.com/jkminder.png","language":"Python","readme":"[![Tests Neo4j 5.13](https://github.com/jkminder/data2neo/actions/workflows/tests_neo4j5.yaml/badge.svg)](https://github.com/jkminder/data2neo/actions/workflows/tests_neo4j5.yaml)\n[![Python Versions](https://img.shields.io/badge/python-3.8%20%7C%C2%A03.9%C2%A0%7C%C2%A03.10%C2%A0%7C%203.11%C2%A0%7C%203.12-orange)](https://github.com/jkminder/data2neo/actions/workflows) \n\n---\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/source/assets/images/data2neo_banner.png\" alt=\"Data2Neo banner\"/\u003e\n\u003c/p\u003e\n\n---\n**Data2Neo** is a library that simplifies the conversion of data in relational format to a graph knowledge database. It relieves you of the cumbersome manual work of writing the conversion code and lets you focus on the conversion schema and data processing. 
\n\nThe library is built specifically for converting data into a [neo4j](https://neo4j.com/) graph (minimum version 5.2). The library further supports extensive customization capabilities to clean and remodel data. To connect to the database, it uses the native [neo4j Python client](https://neo4j.com/docs/getting-started/languages-guides/neo4j-python/).\n\n\n - [Documentation](https://Data2Neo.jkminder.ch)\n - [Reference Manual](https://data2neo.jkminder.ch/api/core.html)\n\nThis library has been developed at the [Chair of Systems Design at ETH Zürich](https://www.sg.ethz.ch). Please check out our accompanying paper: [Data2Neo - A Tool for Complex Neo4j Data Integration](https://arxiv.org/abs/2406.04995)\n\n## Installation\n```\npip install data2neo\n```\nThe Data2Neo library supports Python 3.8+.\n\n## Quick Start\nThe following quick example converts data in a [Pandas](https://pandas.pydata.org) dataframe into a graph. The full example code can be found under [examples](/examples). For more details, please check out the [full documentation][wiki]. We first define a *conversion schema* in a YAML-style config file. In this config file, we specify which entities are converted into which nodes and relationships. 
\n##### **`schema.yaml`**\n```yaml\nENTITY(\"Flower\"):\n    NODE(\"Flower\") flower:\n        - sepal_length = Flower.sepal_length\n        - sepal_width = Flower.sepal_width\n        - petal_length = Flower.petal_length\n        - petal_width = append(Flower.petal_width, \" milimeters\")\n    NODE(\"Species\", \"BioEntity\") species:\n        + Name = Flower.species\n    RELATIONSHIP(flower, \"is\", species):\n    \nENTITY(\"Person\"):\n    NODE(\"Person\") person:\n        + ID = Person.ID\n        - FirstName = Person.FirstName\n        - LastName = Person.LastName\n    RELATIONSHIP(person, \"likes\", MATCH(\"Species\", Name=Person.FavoriteFlower)):\n        - Since = \"4ever\"\n```\nThe library has two basic elements that are required for the conversion: the `Converter`, which handles the conversion itself, and an `Iterator`, which iterates over the relational data. The iterator can be implemented for arbitrary data in relational format. Data2Neo currently ships pre-implemented iterators under:\n- `data2neo.relational_modules.sqlite` for [SQLite](https://www.sqlite.org/index.html) databases\n- `data2neo.relational_modules.pandas` for [Pandas](https://pandas.pydata.org) dataframes\n\nWe will use the `PandasDataFrameIterator` from `data2neo.relational_modules.pandas`. We will also use the `IteratorIterator`, which can wrap multiple iterators to handle multiple dataframes. Since a pandas dataframe has no type/table name associated with it, we need to specify the name when creating a `PandasDataFrameIterator`. We also define a custom function `append` that can be referred to in the schema file and that appends a string to the attribute value. 
For an entity with `Flower[\"petal_width\"] = 5`, the output node will have the attribute `petal_width = \"5 milimeters\"`.\n```python\nimport neo4j\nimport pandas as pd \nfrom data2neo.relational_modules.pandas import PandasDataFrameIterator \nfrom data2neo import IteratorIterator, Converter, Attribute, register_attribute_postprocessor\nfrom data2neo.utils import load_file\n\n# Set up the neo4j uri and credentials\nuri = \"bolt://localhost:7687\"\nauth = neo4j.basic_auth(\"neo4j\", \"password\")\n\npeople = ... # a dataframe with people's data (ID, FirstName, LastName, FavoriteFlower)\npeople_iterator = PandasDataFrameIterator(people, \"Person\")\niris = ... # a dataframe with the iris dataset\niris_iterator = PandasDataFrameIterator(iris, \"Flower\")\n\n# Register a custom data processing function\n@register_attribute_postprocessor\ndef append(attribute, append_string):\n    new_attribute = Attribute(attribute.key, str(attribute.value) + append_string)\n    return new_attribute\n\n# Create IteratorIterator\niterator = IteratorIterator([people_iterator, iris_iterator])\n\n# Create converter instance with the schema, the final iterator and the neo4j connection details\nconverter = Converter(load_file(\"schema.yaml\"), iterator, uri, auth)\n# Start the conversion\nconverter()\n```\n## Known issues\nIf you encounter a bug or unexplained behavior, please check the [known issues](https://github.com/jkminder/Data2Neo/labels/bug) list. If your issue is not listed, please submit a new one.\n\n[wiki]: https://data2neo.jkminder.ch/index.html\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjkminder%2Fdata2neo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjkminder%2Fdata2neo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjkminder%2Fdata2neo/lists"}