{"id":32991383,"url":"https://github.com/linkedpipes/etl","last_synced_at":"2025-11-16T03:01:40.513Z","repository":{"id":38318589,"uuid":"49733838","full_name":"linkedpipes/etl","owner":"linkedpipes","description":"LinkedPipes ETL is an RDF based, lightweight ETL tool","archived":false,"fork":false,"pushed_at":"2025-11-07T07:47:36.000Z","size":21953,"stargazers_count":157,"open_issues_count":197,"forks_count":30,"subscribers_count":16,"default_branch":"develop","last_synced_at":"2025-11-07T09:21:27.773Z","etag":null,"topics":["etl","knowledge-graph","linked-data","linkedpipes","rdf","semantic-web"],"latest_commit_sha":null,"homepage":"https://etl.linkedpipes.com","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/linkedpipes.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2016-01-15T17:23:57.000Z","updated_at":"2025-11-07T07:33:12.000Z","dependencies_parsed_at":"2024-01-06T14:09:17.862Z","dependency_job_id":"6462b5c2-e1ed-42a0-ba24-19a32f29f807","html_url":"https://github.com/linkedpipes/etl","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/linkedpipes/etl","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedpipes%2Fetl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedpipes%2Fetl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedpipes%2Fetl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/link
edpipes%2Fetl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/linkedpipes","download_url":"https://codeload.github.com/linkedpipes/etl/tar.gz/refs/heads/develop","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedpipes%2Fetl/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":284654194,"owners_count":27041729,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-16T02:00:05.974Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["etl","knowledge-graph","linked-data","linkedpipes","rdf","semantic-web"],"created_at":"2025-11-13T09:00:33.301Z","updated_at":"2025-11-16T03:01:40.508Z","avatar_url":"https://github.com/linkedpipes.png","language":"Java","funding_links":[],"categories":["Machine Learning","大数据","Interoperability and Digital Infrastructure"],"sub_categories":["BBedit"],"readme":"# LinkedPipes ETL\n[![Travis Status](https://travis-ci.com/linkedpipes/etl.svg?branch=develop)](https://travis-ci.com/linkedpipes/etl)\n\nLinkedPipes ETL is an RDF-based, lightweight ETL tool.\n- [Library of components](https://etl.linkedpipes.com/components) to get you started faster\n- [Sharing of configuration](https://etl.linkedpipes.com/templates/) among individual pipelines using templates\n- RDF configuration of transformation pipelines\n\n## Requirements\n- Linux, Windows, macOS\n- 
[Docker]\n- [Docker Compose] is optional as `docker compose` is supported by modern versions of [Docker] \n\n### For building locally\n- [Java] 21, 22\n- [Git]\n- Optionally [Maven]\n- [Node.js] 18 \u0026 npm\n\n## Installation and startup\nYou can run LP-ETL in Docker or build it from source.\n\n### Docker\nTo start LP-ETL you can use:\n```\ngit clone https://github.com/linkedpipes/etl.git\ncd etl\ndocker compose up\n```\nThis uses pre-built images stored at [GitHub Packages].\nThe images are built from the main branch.\n\nAlternatively, you can use a one-liner.\nFor example, to run LP-ETL from the ```develop``` branch on ```http://localhost:9080```, you can use the following command:\n```\ncurl https://raw.githubusercontent.com/linkedpipes/etl/develop/docker-compose.yml | LP_ETL_PORT=9080 LP_VERSION=develop docker-compose -f - up\n```\n\nYou may need to run the ```docker``` command with ```sudo``` or be in the ```docker``` group.\n\n#### Building Docker images\nYou can build LP-ETL images yourself.\nNote that on Windows, there is an [issue with buildkit](https://github.com/moby/buildkit/issues/1684).\nSee the [temporary workaround](https://github.com/linkedpipes/etl/issues/851#issuecomment-814058925).\n\n#### Configuration\nEnvironment variables:\n- ```LP_VERSION``` - default value ```main```, determines the version of the Docker images.\n- ```LP_ETL_DOMAIN``` - The URL of the instance; this is used instead of the ```domain.uri``` from the configuration. 
\n- ```LP_ETL_PORT``` - Specifies the port mapping for the frontend; this is where you can connect to your instance.\n  This does NOT have to be the same as the port in ```LP_ETL_DOMAIN``` when reverse-proxying.\n\n```docker compose``` utilizes several volumes that can be used to access/provide data.\nSee the comments in ```docker-compose.yml``` for examples and configuration.\nYou may want to create your own ```docker-compose.yml``` for custom configuration.\n\n### From source on Linux\n\n#### Installation\n```sh\n$ git clone https://github.com/linkedpipes/etl.git\n$ cd etl\n$ mvn install\n```\n\n#### Configuration\nThe configuration file ```deploy/configuration.properties``` can be edited, mainly to change the paths to the working, storage, log, and library directories. \n\n#### Startup\n```sh\n$ cd deploy\n$ ./executor.sh \u003e\u003e executor.log \u0026\n$ ./executor-monitor.sh \u003e\u003e executor-monitor.log \u0026\n$ ./storage.sh \u003e\u003e storage.log \u0026\n$ ./frontend.sh \u003e\u003e frontend.log \u0026\n```\n\n#### Running LP-ETL as a systemd service\nSee example service files in the ```deploy/systemd``` folder.\n\n### From source on Windows\nNote that it is also possible to use [Bash on Ubuntu on Windows] or [Cygwin] and proceed as with Linux.\n\n#### Installation\n```sh\ngit clone https://github.com/linkedpipes/etl.git\ncd etl\nmvn install\n```\n\n#### Configuration\nThe configuration file ```deploy/configuration.properties``` can be edited, mainly to change the paths to the working, storage, log, and library directories. \n\n#### Startup\nIn the ```deploy``` folder, run\n- ```executor.bat```\n- ```executor-monitor.bat```\n- ```storage.bat```\n- ```frontend.bat```\n\n## Data import\nYou can copy pipeline and template data from one instance to another directly.\n\nAssume that you have a copy of a data directory ```./data-source``` with ```pipelines``` and ```templates``` subdirectories. 
\nYou can obtain the directory from any running instance; you can even merge the contents of several such directories.\nTo import the data into a new instance, copy the files to the respective directories under ```./data-target```.\nKeep in mind that this preserves the IRIs.\n\nShould you need to change the IRIs, use the import and export functionality available in the frontend.\n\n## Plugins - Components\nThe components live in the ```jars``` directory.\nIf you need to create your own component, you can copy an existing component and change it.\n\n## Update notes\n\u003e Update note 5: 2019-09-03 breaking changes in the configuration file. Remove ```/api/v1``` from the ```executor-monitor.webserver.uri```, so it looks like: ```executor-monitor.webserver.uri = http://localhost:8081```. You can also remove ```executor.execution.uriPrefix``` as the value is derived from ```domain.uri```.\n\n\u003e Update note 4: 2019-07-03 we changed the way the frontend is run. If you do not use our script to run it, you need to update yours. \n\n\u003e Update note 3: When upgrading from develop prior to 2017-02-14, you need to delete ```{deploy}/jars``` and ```{deploy}/osgi```. 
\n\n\u003e Update note 2: When upgrading from master prior to 2016-11-04, you need to move your pipelines folder from e.g., ```/data/lp/etl/pipelines``` to ```/data/lp/etl/storage/pipelines```, update the configuration.properties file and possibly the update/restart scripts as there is a new component, ```storage```.\n\n\u003e Update note 1: When upgrading from master prior to 2016-04-07, you need to delete your old execution data (e.g., in /data/lp/etl/working/data)\n\n[Java]: \u003chttp://www.oracle.com/technetwork/java/javase/downloads/index.html\u003e\n[Git]: \u003chttps://git-scm.com/\u003e\n[Maven]: \u003chttps://maven.apache.org/\u003e\n[Node.js]: \u003chttps://nodejs.org\u003e\n[Cygwin]: \u003chttps://www.cygwin.com/\u003e\n[Bash on Ubuntu on Windows]: \u003chttps://msdn.microsoft.com/en-us/commandline/wsl/about\u003e\n[Docker]: \u003chttps://www.docker.com/\u003e\n[Docker Compose]: \u003chttps://docs.docker.com/compose/\u003e\n[DockerHub]: \u003chttps://hub.docker.com/\u003e\n[GitHub Packages]: \u003chttps://github.com/orgs/linkedpipes/packages\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinkedpipes%2Fetl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flinkedpipes%2Fetl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinkedpipes%2Fetl/lists"}