{"id":13781008,"url":"https://github.com/morph-kgc/morph-kgc","last_synced_at":"2025-05-11T14:34:32.181Z","repository":{"id":37838280,"uuid":"311956260","full_name":"morph-kgc/morph-kgc","owner":"morph-kgc","description":"Powerful RDF Knowledge Graph Generation with RML Mappings","archived":false,"fork":false,"pushed_at":"2024-10-21T11:00:01.000Z","size":34346,"stargazers_count":189,"open_issues_count":26,"forks_count":34,"subscribers_count":13,"default_branch":"main","last_synced_at":"2024-11-09T05:05:53.681Z","etag":null,"topics":["data-engineering","data-integration","database","etl","knowledge-graph","python","r2rml","rdf","rdf-star","rml"],"latest_commit_sha":null,"homepage":"https://morph-kgc.readthedocs.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/morph-kgc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-11-11T11:54:10.000Z","updated_at":"2024-10-21T11:00:06.000Z","dependencies_parsed_at":"2023-02-19T08:46:28.192Z","dependency_job_id":"fe3a3c5c-e9ca-41d6-b7dd-21a690293558","html_url":"https://github.com/morph-kgc/morph-kgc","commit_stats":{"total_commits":1050,"total_committers":12,"mean_commits":87.5,"dds":0.3447619047619047,"last_synced_commit":"8eeda353df9e1fbe64685a74fd10bb2e8ee52958"},"previous_names":[],"tags_count":34,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/morph-kgc%2Fmorph-kgc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/morph-kgc%2Fm
orph-kgc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/morph-kgc%2Fmorph-kgc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/morph-kgc%2Fmorph-kgc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/morph-kgc","download_url":"https://codeload.github.com/morph-kgc/morph-kgc/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225064439,"owners_count":17415246,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-engineering","data-integration","database","etl","knowledge-graph","python","r2rml","rdf","rdf-star","rml"],"created_at":"2024-08-03T18:01:22.070Z","updated_at":"2025-05-11T14:34:32.156Z","avatar_url":"https://github.com/morph-kgc.png","language":"Python","funding_links":[],"categories":["KGC Materializers"],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n\u003cimg src=\"https://raw.githubusercontent.com/morph-kgc/morph-kgc/main/logo/logo.png\" height=\"100\" alt=\"morph\"\u003e\n\u003c/p\u003e\n\n[![License](https://img.shields.io/pypi/l/morph-kgc.svg)](https://github.com/morph-kgc/morph-kgc/blob/main/LICENSE)\n[![DOI](https://zenodo.org/badge/311956260.svg?style=flat)](https://zenodo.org/badge/latestdoi/311956260)\n[![Latest PyPI version](https://img.shields.io/pypi/v/morph-kgc?style=flat)](https://pypi.python.org/pypi/morph-kgc)\n[![Python Version](https://img.shields.io/pypi/pyversions/morph-kgc.svg)](https://pypi.python.org/pypi/morph-kgc)\n[![PyPI 
status](https://img.shields.io:/pypi/status/morph-kgc?)](https://pypi.python.org/pypi/morph-kgc)\n[![build](https://github.com/morph-kgc/morph-kgc/actions/workflows/ci.yml/badge.svg)](https://github.com/morph-kgc/morph-kgc/actions/workflows/ci.yml)\n[![Documentation Status](https://readthedocs.org/projects/morph-kgc/badge/?version=stable)](https://morph-kgc.readthedocs.io)\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1ByFx_NOEfTZeaJ1Wtw3UwTH3H3-Sye2O?usp=sharing)\n\n**Morph-KGC** is an engine that constructs **[RDF](https://www.w3.org/TR/rdf11-concepts/)** knowledge graphs from heterogeneous data sources with the **[R2RML](https://www.w3.org/TR/r2rml/)** and **[RML](https://w3id.org/rml/core/spec)** mapping languages. Morph-KGC is built on top of [pandas](https://pandas.pydata.org/) and it leverages *mapping partitions* to significantly reduce execution times and memory consumption for large data sources.\n\n## Features :sparkles:\n\n- User-friendly mappings with **[YARRRML](https://rml.io/yarrrml/spec/)**.\n- Transformation functions with **[RML-FNML](https://w3id.org/rml/fnml/spec)**, including **Python UDFs**.\n- [RDF-star](https://w3c.github.io/rdf-star/cg-spec/2021-12-17.html) generation with **[RML-star](https://w3id.org/rml/star/spec)**.\n- **[RML views](https://2023.eswc-conferences.org/wp-content/uploads/2023/05/paper_Arenas-Guerrero_2023_Boosting.pdf)** over tabular data sources and [JSON](https://www.json.org) files.\n- Integration with **[RDFLib](https://rdflib.readthedocs.io)**, **[Oxigraph](https://pyoxigraph.readthedocs.io/en)** and [Kafka](https://kafka-python.readthedocs.io).\n- **Optimized** to materialize large knowledge graphs.\n- **Remote** data and mapping files.\n- Input data formats:\n    - **Relational databases**: [MySQL](https://www.mysql.com/), [PostgreSQL](https://www.postgresql.org/), [Oracle](https://www.oracle.com/database/), [Microsoft SQL 
Server](https://www.microsoft.com/sql-server), [MariaDB](https://mariadb.org/), [SQLite](https://www.sqlite.org).\n    - **Tabular files**: [CSV](https://en.wikipedia.org/wiki/Comma-separated_values), [TSV](https://en.wikipedia.org/wiki/Tab-separated_values), [Excel](https://www.microsoft.com/en-us/microsoft-365/excel), [Parquet](https://parquet.apache.org/documentation), [Feather](https://arrow.apache.org/docs/python/feather.html), [ORC](https://orc.apache.org/), [Stata](https://www.stata.com/), [SAS](https://www.sas.com), [SPSS](https://www.ibm.com/analytics/spss-statistics-software), [ODS](https://en.wikipedia.org/wiki/OpenDocument).\n    - **Hierarchical files**: [JSON](https://www.json.org), [XML](https://www.w3.org/TR/xml/).\n    - **In-memory data structures**: [Python Dictionaries](https://docs.python.org/3/tutorial/datastructures.html#dictionaries), [DataFrames](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html).\n    - **Cloud data lake solutions**: [Databricks](https://www.databricks.com/), [Snowflake](https://www.snowflake.com/).\n    - **Property graph databases**: [Neo4j](https://neo4j.com/), [Kùzu](https://kuzudb.com).\n\n## Documentation :bookmark_tabs:\n\n**[Read the documentation](https://morph-kgc.readthedocs.io)**.\n\n## Tutorial :woman_teacher:\n\nLearn quickly with the tutorial in **[Google Colaboratory](https://colab.research.google.com/drive/1ByFx_NOEfTZeaJ1Wtw3UwTH3H3-Sye2O?usp=sharing)**!\n\n## Getting Started :rocket:\n\n**[PyPI](https://pypi.org/project/morph-kgc/)** is the fastest way to install Morph-KGC:\n```bash\npip install morph-kgc\n```\n\nWe recommend installing Morph-KGC in a [virtual environment](https://docs.python.org/3/library/venv.html#).\n\nTo run the engine from the **command line**, execute the following:\n```bash\npython3 -m morph_kgc config.ini\n```\n\nCheck the **[documentation](https://morph-kgc.readthedocs.io/endocumentation/#configuration)** to see how to generate the configuration 
**INI file**. **[Here](https://github.com/morph-kgc/morph-kgc/blob/main/examples/configuration-file/default_config.ini)** you can also see an example INI file.\n\nIt is also possible to run Morph-KGC as a **library** with **[RDFLib](https://rdflib.readthedocs.io)** and **[Oxigraph](https://pyoxigraph.readthedocs.io/en)**:\n```python\nimport morph_kgc\n\n# generate the triples and load them to an RDFLib graph\ng_rdflib = morph_kgc.materialize('/path/to/config.ini')\n# work with the RDFLib graph\nq_res = g_rdflib.query('SELECT DISTINCT ?classes WHERE { ?s a ?classes }')\n\n# generate the triples and load them to Oxigraph\ng_oxigraph = morph_kgc.materialize_oxigraph('/path/to/config.ini')\n# work with Oxigraph\nq_res = g_oxigraph.query('SELECT DISTINCT ?classes WHERE { ?s a ?classes }')\n\n# the methods above also accept the config as a string\nconfig = \"\"\"\n            [DataSource1]\n            mappings: /path/to/mapping/mapping_file.rml.ttl\n            db_url: mysql+pymysql://user:password@localhost:3306/db_name\n         \"\"\"\ng_rdflib = morph_kgc.materialize(config)\n```\n\n## License :unlock:\n\nMorph-KGC is available under the **[Apache License 2.0](https://github.com/morph-kgc/morph-kgc/blob/main/LICENSE)**.\n\n## Author \u0026 Contact :mailbox_with_mail:\n\n- **[Julián Arenas-Guerrero](https://github.com/arenas-guerrero-julian/) - [julian.arenas.guerrero@upm.es](mailto:julian.arenas.guerrero@upm.es)**\n\n*[Ontology Engineering Group](https://oeg.fi.upm.es)*, *[Universidad Politécnica de Madrid](https://www.upm.es/internacional)*.\n\n## Citing :speech_balloon:\n\nIf you used Morph-KGC in your work, please cite the **[SWJ paper](https://www.doi.org/10.3233/SW-223135)**:\n\n```bib\n@article{arenas2024morph,\n  title     = {{Morph-KGC: Scalable knowledge graph materialization with mapping partitions}},\n  author    = {Arenas-Guerrero, Julián and Chaves-Fraga, David and Toledo, Jhon and Pérez, María S. 
and Corcho, Oscar},\n  journal   = {Semantic Web},\n  year      = {2024},\n  volume    = {15},\n  number    = {1},\n  pages     = {1-20},\n  issn      = {2210-4968},\n  publisher = {IOS Press},\n  doi       = {10.3233/SW-223135}\n}\n```\n\n## Sponsor :shield:\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"https://github.com/morph-kgc/morph-kgc-docs/blob/main/docs/assets/BASF.png\" height=\"100\" alt=\"BASF\"\u003e\n\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmorph-kgc%2Fmorph-kgc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmorph-kgc%2Fmorph-kgc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmorph-kgc%2Fmorph-kgc/lists"}