{"id":15159708,"url":"https://github.com/gorgonun/data-to-graph","last_synced_at":"2026-01-20T05:32:00.332Z","repository":{"id":254021083,"uuid":"834162394","full_name":"gorgonun/data-to-graph","owner":"gorgonun","description":"Data2Graph - a tool to automatically migrate data from semi-structured sources to graph databases","archived":false,"fork":false,"pushed_at":"2024-09-23T22:53:18.000Z","size":21149,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-07T17:45:03.496Z","etag":null,"topics":["graph","json","migration","mongodb","neo4j","parallel","pipeline","python","ray","streaming","turbo-c2"],"latest_commit_sha":null,"homepage":"https://gorgonun.github.io/data-to-graph/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gorgonun.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-26T14:53:26.000Z","updated_at":"2024-09-23T22:53:21.000Z","dependencies_parsed_at":"2024-11-03T08:41:17.514Z","dependency_job_id":"f615ac37-5090-4870-8902-7b88bd49ce93","html_url":"https://github.com/gorgonun/data-to-graph","commit_stats":{"total_commits":33,"total_committers":1,"mean_commits":33.0,"dds":0.0,"last_synced_commit":"cc6f734145850646de9aadd301c4274f98d69351"},"previous_names":["gorgonun/data-to-graph"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/gorgonun/data-to-graph","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gorgonun%2Fdata-to-graph","tags_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/repositories/gorgonun%2Fdata-to-graph/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gorgonun%2Fdata-to-graph/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gorgonun%2Fdata-to-graph/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gorgonun","download_url":"https://codeload.github.com/gorgonun/data-to-graph/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gorgonun%2Fdata-to-graph/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28596418,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-20T02:08:49.799Z","status":"ssl_error","status_checked_at":"2026-01-20T02:08:44.148Z","response_time":117,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["graph","json","migration","mongodb","neo4j","parallel","pipeline","python","ray","streaming","turbo-c2"],"created_at":"2024-09-26T21:41:39.168Z","updated_at":"2026-01-20T05:32:00.310Z","avatar_url":"https://github.com/gorgonun.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data2Graph\n\nData2Graph is a tool to automatically migrate data from semi-structured sources to graph databases, with current support for MongoDB and Neo4J.\n\n## Getting Started\n\nTo get started with the project, you can use the 
docker images provided in the GitHub Packages or build the images yourself using the Dockerfiles provided in the repository or the docker compose file. Also, you can run the project locally using the Makefile provided in the repository.\n\n### Prerequisites\n\n#### To Use Docker Images\n\n- Docker [Docker Documentation](https://docs.docker.com/get-docker/).\n\n#### To Build Docker Images\n\n- Docker [Docker Documentation](https://docs.docker.com/get-docker/).\n- Docker Compose [Docker Compose Documentation](https://docs.docker.com/compose/install/).\n\n#### Running Locally Without Docker Compose\n\n- Python 3.10 or higher [Python Documentation](https://www.python.org/downloads/).\n- Poetry [Poetry Documentation](https://python-poetry.org/docs/).\n- Node.js 20.0.0 or higher [Node.js Documentation](https://nodejs.org/en/download/).\n\n### Using Docker Images\n\nTo use the images provided in the GitHub Packages, you can use the following commands:\n\n```bash\nexport NEO4J_URL=\u003cneo4j url\u003e\nexport NEO4J_USER=\u003cneo4j user\u003e\nexport NEO4J_PASSWORD=\u003cneo4j password\u003e\nexport PROMETHEUS_HOST=\u003cprometheus host\u003e\n\ndocker run -d -p 8000:8000 -p 8265:8265 -p 7475:7475 --name data2graph_backend -e NEO4J_URL=$NEO4J_URL -e NEO4J_USER=$NEO4J_USER -e NEO4J_PASSWORD=$NEO4J_PASSWORD -e PROMETHEUS_HOST=$PROMETHEUS_HOST  ghcr.io/gorgonun/data_to_graph_backend:latest\n\ndocker run -d -p 80:80 --name data2graph_frontend ghcr.io/gorgonun/data_to_graph_frontend:latest\n```\n\nThe frontend image will be available at `http://localhost:80` and will make requests to the backend at `http://localhost:8000`. To change the backend URL, you can use the environment variable `VITE_API_URL` and build the image.\n\n### Building Docker Images\n\nTo build the images yourself, you can use the following command:\n\n```bash\nmake build_images\n```\n\nThis will use docker compose to build the images and make them available for use. 
If you want to build the images individually, you can use the following commands:\n\n```bash\nexport VITE_API_URL=http://localhost:8000\n\ndocker build -t data2graph_frontend --build-arg VITE_API_URL=$VITE_API_URL  ./frontend/data2graph/\ndocker build -t data2graph_backend .\n```\n\n### Running Locally with Docker Compose\n\nTo run the project locally, you can use the docker compose file provided in the repository. The docker compose file will start a Ray cluster, the services selected by the chosen profile, and the frontend. You can use the following command to start the project:\n\n```bash\nmake start_infra\n```\n\n#### Makefile\n\n##### Parameters\n\n- **profile**: The profile to be used to run the infrastructure. The options are `basic`, `full` and `load-data`.\n- **clean**: If true, deletes the previous configuration in the environment files and recreates it from the specified values.\n- **mongodb_database**: The name of the MongoDB database. (default: `test`)\n- **mongodb_collection**: The name of the MongoDB collection. (default: `nyt`)\n- **mongodb_url**: The connection URL for MongoDB. (default: `mongodb://mongodb:27017/`)\n- **mongodb_data_folder**: The folder to save the MongoDB data. (default: `./data/mongodb/`)\n- **neo4j_user**: The username for Neo4J. (default: `neo4j`)\n- **neo4j_password**: The password for Neo4J. (default: `admin`)\n- **neo4j_port**: The port for Neo4J. (default: `7687`)\n- **neo4j_data_folder**: The folder to save the Neo4J data. (default: `./data/neo4j/`)\n- **neo4j_host**: The host for Neo4J. (default: `neo4j`)\n- **prometheus_host**: The host for Prometheus. 
(default: `http://prometheus`)\n\nTo run the project with the `full` profile and a custom MongoDB URL, you can use the following command:\n\n```bash\nmake start_infra profile=full mongodb_url=mongodb://localhost:27017/\n```\n\n##### Commands\n\n- **write_env_values**: Writes the environment variable values to the .env files.\n- **setup_env_file**: Writes the environment variable values to the .env files, deleting the old .env files if `clean = true`.\n- **start_infra**: Starts the infrastructure with Docker Compose with the defined configurations.\n- **stop_infra**: Stops the infrastructure with Docker Compose.\n- **run**: Starts the execution of the migration script.\n\n##### Profiles\n\nIn the Docker Compose file, three profiles are configured: `basic`, `full` and `load-data`. In the `basic` profile, only the essential services are configured: the source and destination databases and the data for testing. In the `full` profile, all services from the `basic` profile are configured, with the addition of MongoExpress to facilitate the visualization and debugging of the data available in MongoDB. The `load-data` profile is used to load data into MongoDB.\n\n##### .env Files\n\nTo run the script and set up the environment with docker compose, some environment variables are needed. The environment variables are stored in several .env files that are created by the command **write_env_values** and by other commands that need them. The .env files are:\n\n- **.env**: The environment variables for the migration script.\n- **.env.mongo_seed**: The environment variables for the MongoDB seed.\n- **.env.mongo_express**: The environment variables for MongoExpress.\n- **.env.mongodb**: The environment variables for MongoDB.\n- **.env.neo4j**: The environment variables for Neo4J.\n\n### Running Locally without Docker Compose\n\nTo run the project locally, you need to install the dependencies and run the project using the Makefile provided in the repository. 
The Makefile will start a Ray cluster and submit the main job. Also, you will need to run the frontend with Node.js 20.0.0 or higher. You can use the following command to install the Poetry dependencies:\n\n```bash\npoetry install\n```\n\nand then choose one of the following:\n\n```bash\nmake run\n```\n\nor\n\n```bash\nmake start_ray_cluster\nmake submit_main_job\n```\n\nTo start the frontend, you can use the following commands:\n\n```bash\ncd frontend/data2graph\nnpm install\nnpm run dev\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgorgonun%2Fdata-to-graph","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgorgonun%2Fdata-to-graph","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgorgonun%2Fdata-to-graph/lists"}