{"id":17648082,"url":"https://github.com/kiran94/dgraphpandas","last_synced_at":"2025-05-07T05:50:08.017Z","repository":{"id":52248202,"uuid":"352042304","full_name":"kiran94/dgraphpandas","owner":"kiran94","description":"Transform Pandas DataFrames into RDF Exports to be sent to DGraph","archived":false,"fork":false,"pushed_at":"2024-12-18T11:33:17.000Z","size":2761,"stargazers_count":2,"open_issues_count":1,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-07T05:49:56.419Z","etag":null,"topics":["dataframe","dgraph","pandas","rdf"],"latest_commit_sha":null,"homepage":"https://kiran94.github.io/dgraphpandas/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kiran94.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-03-27T10:21:45.000Z","updated_at":"2024-12-18T11:31:44.000Z","dependencies_parsed_at":"2025-03-10T19:37:28.800Z","dependency_job_id":"43e69ee5-cdb7-4dfb-b5e5-3362d5a49fef","html_url":"https://github.com/kiran94/dgraphpandas","commit_stats":{"total_commits":60,"total_committers":2,"mean_commits":30.0,"dds":"0.44999999999999996","last_synced_commit":"f9a0e1f7697014b6e032df918ce3120b31befe9b"},"previous_names":[],"tags_count":19,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kiran94%2Fdgraphpandas","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kiran94%2Fdgraphpandas/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kiran94%2Fdgraphpandas/
releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kiran94%2Fdgraphpandas/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kiran94","download_url":"https://codeload.github.com/kiran94/dgraphpandas/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252823693,"owners_count":21809709,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataframe","dgraph","pandas","rdf"],"created_at":"2024-10-23T11:16:13.638Z","updated_at":"2025-05-07T05:50:07.947Z","avatar_url":"https://github.com/kiran94.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# dgraphpandas\n\n[![Build](https://github.com/kiran94/dgraphpandas/actions/workflows/python-package.yml/badge.svg)](https://github.com/kiran94/dgraphpandas/actions/workflows/python-package.yml) ![PyPI](https://img.shields.io/pypi/v/dgraphpandas?color=blue\u0026style=flat-square) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Coverage Status](https://coveralls.io/repos/github/kiran94/dgraphpandas/badge.svg)](https://coveralls.io/github/kiran94/dgraphpandas) [![Codacy Badge](https://app.codacy.com/project/badge/Grade/3484574402e0408c97849301b354be8d)](https://www.codacy.com/gh/kiran94/dgraphpandas/dashboard?utm_source=github.com\u0026amp;utm_medium=referral\u0026amp;utm_content=kiran94/dgraphpandas\u0026amp;utm_campaign=Badge_Grade)\n\nA Library (with accompanying cli tool) to transform 
[Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/index.html#user-guide) DataFrames into Exports ([RDF](https://en.wikipedia.org/wiki/Resource_Description_Framework)) to be sent to [DGraph Live Loader](https://dgraph.io/docs/deploy/fast-data-loading/live-loader/).\n\n```sh\npython -m pip install dgraphpandas\n```\n\n- [Documentation](https://kiran94.github.io/dgraphpandas/)\n- [PyPI](https://pypi.org/project/dgraphpandas/)\n\n## Usage\n\n### Command Line\n\n```sh\n❯ dgraphpandas --help\nusage: dgraphpandas [-h] [-x {upserts,schema,types}] [-f FILE] -c CONFIG\n                    [-ck CONFIG_FILE_KEY] [-o OUTPUT_DIR] [--console]\n                    [--export_csv] [--encoding ENCODING]\n                    [--chunk_size CHUNK_SIZE]\n                    [--gz_compression_level GZ_COMPRESSION_LEVEL]\n                    [--key_separator KEY_SEPARATOR]\n                    [--add_dgraph_type_records ADD_DGRAPH_TYPE_RECORDS]\n                    [--drop_na_intrinsic_objects DROP_NA_INTRINSIC_OBJECTS]\n                    [--drop_na_edge_objects DROP_NA_EDGE_OBJECTS]\n                    [--illegal_characters ILLEGAL_CHARACTERS]\n                    [--illegal_characters_intrinsic_object ILLEGAL_CHARACTERS_INTRINSIC_OBJECT]\n                    [--version] [-v {DEBUG,INFO,WARNING,ERROR,NOTSET}]\n```\n\nThis is a real example which you can find in the [samples folder](https://github.com/kiran94/dgraphpandas/tree/main/samples) and run from the root of this repository:\n\n```sh\ndgraphpandas \\\n  --config samples/planets/dgraphpandas.json \\\n  --config_file_key planet \\\n  --file samples/planets/solar_system.csv \\\n  --output samples/planets/output\n```\n\n### Module\n\nThis example can also be found in [Notebook](docs/notebooks/) form.\n\n```py\nimport dgraphpandas as dpd\n\n# Define a Configuration for your data file(s). 
Explained further in the Configuration section.\nconfig = {\n  \"transform\": \"horizontal\",\n  \"files\": {\n    \"planet\": {\n      \"subject_fields\": [\"id\"],\n      \"edge_fields\": [\"type\"],\n      \"type_overrides\": {\n        \"order_from_sun\": \"int32\",\n        \"diameter_earth_relative\": \"float32\",\n        \"diameter_km\": \"float32\",\n        \"mass_earth_relative\": \"float32\",\n        \"mean_distance_from_sun_au\": \"float32\",\n        \"orbital_period_years\": \"float32\",\n        \"orbital_eccentricity\": \"float32\",\n        \"mean_orbital_velocity_km_sec\": \"float32\",\n        \"rotation_period_days\": \"float32\",\n        \"inclination_axis_degrees\": \"float32\",\n        \"mean_temperature_surface_c\": \"float32\",\n        \"gravity_equator_earth_relative\": \"float32\",\n        \"escape_velocity_km_sec\": \"float32\",\n        \"mean_density\": \"float32\",\n        \"number_moons\": \"int32\",\n        \"rings\": \"bool\"\n      },\n      \"ignore_fields\": [\"image\", \"parent\"]\n    }\n  }\n}\n\n# Perform a Horizontal Transform on the passed file using the config/key\n# Generate RDF Upsert statements\nintrinsic, edges = dpd.to_rdf('solar_system.csv', config, 'planet', output_dir='.', export_rdf=True)\n\n# Do something with these statements, e.g. write to zip and ship to DGraph\n# The cli will zip this output automatically\n# In module mode, when you provide output_dir and export_rdf, it will automatically zip and write to disk\nprint(intrinsic)\nprint(edges)\n```\n\nAlternatively, you could call the underlying methods directly:\n\n```py\n# Perform a Horizontal Transform on the passed file using the config/key\nintrinsic, edges = horizontal_transform('solar_system.csv', config, \"planet\")\n# Generate RDF Upsert statements\nintrinsic_upserts, edges_upserts = generate_upserts(intrinsic, 
edges)\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkiran94%2Fdgraphpandas","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkiran94%2Fdgraphpandas","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkiran94%2Fdgraphpandas/lists"}