{"id":22302638,"url":"https://github.com/dataoneorg/slinky","last_synced_at":"2025-10-05T09:40:56.549Z","repository":{"id":38017414,"uuid":"346855310","full_name":"DataONEorg/slinky","owner":"DataONEorg","description":"Slinky, the DataONE Graph Store","archived":false,"fork":false,"pushed_at":"2022-12-18T19:07:37.000Z","size":5898,"stargazers_count":4,"open_issues_count":39,"forks_count":4,"subscribers_count":11,"default_branch":"main","last_synced_at":"2025-04-05T07:01:26.749Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DataONEorg.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-03-11T22:21:58.000Z","updated_at":"2023-08-13T19:30:43.000Z","dependencies_parsed_at":"2023-01-29T19:45:51.907Z","dependency_job_id":null,"html_url":"https://github.com/DataONEorg/slinky","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/DataONEorg/slinky","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataONEorg%2Fslinky","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataONEorg%2Fslinky/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataONEorg%2Fslinky/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataONEorg%2Fslinky/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DataONEorg","download_url":"https://codeload.github.com/DataONEorg/slinky/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/repositories/DataONEorg%2Fslinky/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266111439,"owners_count":23877980,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-03T18:40:34.345Z","updated_at":"2025-10-05T09:40:51.511Z","avatar_url":"https://github.com/DataONEorg.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Slinky, the DataONE Graph Store\n\n## Overview\nService for the DataONE Linked Open Data graph.\n\nThis repository contains a deployable service that continuously updates the [DataOne](https://www.dataone.org/) [Linked Open Data](http://linkeddata.org/) graph. It was originally developed as a provider of data for the [GeoLink](http://www.geolink.org/) project, but now is a core component of the DataONE services. The service uses [Docker Compose](https://docs.docker.com/compose/) to manage a set of [Docker](https://www.docker.com/) containers that run the service. The service is intended to be deployed to a virtual machine and run with [Docker Compose](https://docs.docker.com/compose/).\n\nThe main infrastructure of the service is composed of four [Docker Compose](https://docs.docker.com/compose/) services:\n\n1. `web`: An [Apache httpd](https://httpd.apache.org/) front-end serving static files and also reverse-proxying to an [Apache Tomcat](http://tomcat.apache.org/) server running a [GraphDB](http://graphdb.ontotext.com/display/GraphDB6/Home) Lite instance which is bundled with [OpenRDF Sesame](http://rdf4j.org) Workbench.\n2. 
`scheduler`: An [APScheduler](https://apscheduler.readthedocs.org) process that schedules jobs (e.g., update graph with new datasets) on the `worker` at specified intervals\n3. `worker`: An [RQ](http://python-rq.org/) worker process to run scheduled jobs\n4. `redis`: A [Redis](http://redis.io) instance to act as a persistent store for the `worker` and for saving application state\n\nIn addition to the core infrastructure services (above), a set of monitoring/logging services is spun up by default. As of writing, these are mostly being used for development and testing but they may be useful in production:\n\n1. `elasticsearch`: An [ElasticSearch](https://www.elastic.co/products/elasticsearch) instance to store, index, and support analysis of logs\n2. `logstash`: A [Logstash](https://www.elastic.co/products/logstash) instance to facilitate the log pipeline\n3. `kibana`: A [Kibana](https://www.elastic.co/products/kibana) instance to search and visualize logs\n4. `logspout`: A [Logspout](https://github.com/gliderlabs/logspout) instance to collect logs from the [Docker](https://www.docker.com/) containers\n5. `cadvisor`: A [cAdvisor](https://github.com/google/cadvisor) instance to monitor resource usage on each [Docker](https://www.docker.com/) container\n6. `rqdashboard`: An [RQ Dashboard](https://github.com/nvie/rq-dashboard) instance to monitor jobs.\n\nAs the service runs, the graph store will be continuously updated as datasets are added/updated on [DataOne](https://www.dataone.org/). 
Another scheduled job exports the statements in the graph store and produces a Turtle dump of all statements at [http://dataone.org/d1lod.ttl](http://dataone.org/d1lod.ttl).\n\n### Contents of This Repository\n\n```\n.\n├── d1lod       # Python package which supports other services\n├── docs        # Detailed documentation beyond this file\n├── logspout    # Custom Dockerfile for logspout\n├── logstash    # Custom Dockerfile for logstash\n├── redis       # Custom Dockerfile for Redis\n├── rqdashboard # Custom Dockerfile for RQ Dashboard\n├── scheduler   # Custom Dockerfile for APScheduler process\n├── web         # Apache httpd + Tomcat w/ GraphDB\n├── worker      # Custom Dockerfile for RQWorker process\n└── www         # Local volume holding static files\n```\n\nNote: In order to run the service without modification, you will need to create a 'webapps' directory in the root of this repository containing 'openrdf-workbench.war' and 'openrdf-sesame.war':\n\n```\n.\n├── webapps\n│   ├── openrdf-sesame.war\n└   └── openrdf-workbench.war\n```\n\nThese aren't included in the repository because we're using GraphDB Lite which doesn't have a public download URL. 
These WAR files can just be the base Sesame WAR files, which support a variety of backend graph stores, but the code near https://github.com/ec-geolink/d1lod/blob/master/d1lod/d1lod/sesame/store.py#L90 will need to be modified correspondingly.\n\n\n## What's in the graph?\n\nFor an overview of what concepts the graph contains, see the [mappings](/docs/mappings.md) documentation.\n\n\n## Getting up and running\n\nAssuming you are set up to use [Docker](https://www.docker.com/) (see the [User Guide](https://docs.docker.com/engine/userguide/) to get set up):\n\n```\ngit clone https://github.com/DataONEorg/slinky\ncd slinky\n# Create a webapps folder with openrdf-sesame.war and openrdf-workbench.war (See above note)\ndocker-compose up # May take a while\n```\n\nAfter running the above `docker-compose` command, the above services should be started and available (if appropriate) on their respective ports:\n1. Apache httpd → `$DOCKER_HOST:80`\n2. OpenRDF Workbench → `$DOCKER_HOST:8080/openrdf-workbench/`\n3. Kibana (logs) → `$DOCKER_HOST:5601`\n4. cAdvisor → `$DOCKER_HOST:8888`\n\nWhere `$DOCKER_HOST` is `localhost` if you're running [Docker](https://www.docker.com/) natively or some IP address if you're running [Docker Machine](https://docs.docker.com/machine/). Consult the [Docker Machine](https://docs.docker.com/machine/) documentation to find this IP address. When deployed on a Linux machine, [Docker](https://www.docker.com/) is able to bind to localhost under the default configuration.\n\n\n## Testing\n\nTests are written using [PyTest](http://pytest.org/latest/). 
Install [PyTest](http://pytest.org/latest/) and run the tests with\n\n```\npip install pytest\ncd d1lod\npy.test\n```\n\nAs of writing, only tests for the supporting Python package (in directory './d1lod') have been written.\nNote: The test suite assumes you have an instance of [OpenRDF Sesame](http://rdf4j.org) running on http://localhost:8080, which means the Workbench is located at http://localhost:8080/openrdf-workbench and the Sesame interface is available at http://localhost:8080/openrdf-sesame.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdataoneorg%2Fslinky","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdataoneorg%2Fslinky","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdataoneorg%2Fslinky/lists"}