https://github.com/veupathdb/vdi-service
VEuPathDB Dataset Installer
https://github.com/veupathdb/vdi-service
vdi
Last synced: 3 months ago
JSON representation
VEuPathDB Dataset Installer
- Host: GitHub
- URL: https://github.com/veupathdb/vdi-service
- Owner: VEuPathDB
- License: apache-2.0
- Created: 2022-12-08T18:56:05.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2025-07-30T16:14:03.000Z (3 months ago)
- Last Synced: 2025-07-30T18:55:08.992Z (3 months ago)
- Topics: vdi
- Language: Kotlin
- Homepage:
- Size: 5.43 MB
- Stars: 0
- Watchers: 19
- Forks: 0
- Open Issues: 44
-
Metadata Files:
- Readme: readme.adoc
- License: license
- Codeowners: .github/CODEOWNERS
Awesome Lists containing this project
README
= VEuPathDB Dataset Installer Service (VDI)
:source-highlighter: highlightjs
// :toc::confluence: https://veupathdb.atlassian.net/wiki/spaces
ifdef::env-github[]
:tip-caption: :bulb:
:note-caption: :information_source:
:important-caption: :heavy_exclamation_mark:
:caution-caption: :fire:
:warning-caption: :warning:
endif::[]ifndef::env-github[]
:icons: font
endif::[]== Links
=== Documentation
[unstyled]
* https://veupathdb.github.io/vdi-service/core/[HTTP API]
* https://veupathdb.github.io/vdi-service/schema/config/full-config.html[Full Stack Config]
* https://veupathdb.github.io/vdi-service/schema/config/plugin-config.html[Plugin Config]=== Schema
JSON schema files for use with editor validation & autocomplete.
[unstyled]
* https://veupathdb.github.io/vdi-service/schema/config/full-config.json[Service Stack Configuration Schema]
* https://veupathdb.github.io/vdi-service/schema/data/dataset-characteristics.eda.json[EDA Dataset Characteristics Schema]
* https://veupathdb.github.io/vdi-service/schema/data/dataset-characteristics.metaschema.json[Dataset Characteristics Metaschema]=== Dataset Characteristics Definitions
Editable YAML schema files that define the valid properties that may appear in
dataset characteristics.[unstyled]
* link:schema/data/dataset-characteristics.eda.yml[EDA Dataset Characteristics]== Development
=== Repository Structure
.Stack Management
[unstyled]
* link:config/[Stack Configuration Files]
* link:compose/[Docker Compose Files]
* link:schema/config/[Stack Config JSON Schema].Component Source
[unstyled]
* link:project/conventions/[Gradle Build Conventions]
* link:project/common/[Shared Components]
* link:project/core/[Core VDI Service Components]
* link:project/plugin-server/[VDI Plugin Handler Server]=== Core VDI Service
==== Container Image Build
[source, shell]
----
./gradlew build-image
----////
== Documentation Links
=== API
==== Production
* link:https://veupathdb.github.io/vdi-service/prod/vdi-api.html[REST Service API Doc]
//* Configuration Schema Doc
//* Full Configuration Schema
//* Configuration Schema Root==== QA
* link:https://veupathdb.github.io/vdi-service/qa/vdi-api.html[REST Service API Doc]
//* Configuration Schema Doc
//* Full Configuration Schema
//* Configuration Schema Root==== Dev
* link:https://veupathdb.github.io/vdi-service/dev/vdi-api.html[REST Service API Doc]
* EDA Dataset Characteristics Schema
** link:schema/data/dataset-characteristics.eda.yml[Source YAML Schema]
** link:https://veupathdb.github.io/vdi-service/dev/schema/data/dataset-characteristics.eda.json[Compiled JSON Schema]
* Genomics Dataset Characteristics Schema
** link:schema/data/dataset-characteristics.genomics.yml[Source YAML Schema]
** link:https://veupathdb.github.io/vdi-service/dev/schema/data/dataset-characteristics.genomics.json[Compiled JSON Schema]
* link:https://veupathdb.github.io/vdi-service/dev/schema/data/dataset-characteristics.metaschema.json[Dataset Characteristiscs Metaschema]=== Administration
.Confluence
* link:{confluence}/TECH/folder/1006829569[Administration Docs Folder]
* link:{confluence}/TECH/pages/1006698498/Purge+Broken+Dataset+Folders+from+MinIO[Purge Broken Datasets from Object Store]
* link:{confluence}/TECH/pages/1283817474/Handling+Failed+Dataset+Installs[Handling Failed Dataset Installs]
* link:{confluence}/UI/pages/553680929/VDI+User+and+Administration+Guide[User and General Admin Guide]=== Deployment & Configuration
* link:https://veupathdb.github.io/vdi-service/dev/config-schema.html[Configuration Schema Doc]
* link:https://veupathdb.github.io/vdi-service/dev/schema/config/full-config.json[Full Configuration Schema]
* link:https://veupathdb.github.io/vdi-service/dev/schema/config/stack-config.json[Configuration Schema Root]=== Design
.Document Links
[%collapsible]
====
Initial Design::
+
--
* link:docs/outdated/overview/overview.html[Original Overview]
--Feature Expansion::
+
--
* link:{confluence}/UI/pages/1292599331/VDI+Feature+Dataset+Data+Revisioning[Dataset Revisioning]
--
====== Development
=== Run the Stack Locally
==== Configure Compose Environment
Copy the `./compose/example.local.env` file into the project root with the name
`.env`.[source, shell]
----
cp compose/example.local.env .env
----Edit the `.env` file and fill in the required variable values.
===== Optional: Select Image Versions
If specific docker image versions are desired for running a test, additional
environment variables may be added to the `.env` file to specify image versions.If no image version is specified for an image, `latest` will be assumed.
.Image Env Vars
[%collapsible]
====
[source, dotenv]
----
VDI_CACHE_DB_TAG=latest
VDI_KAFKA_TAG=latestVDI_SERVICE_TAG=latest
VDI_PLUGIN_BIGWIG_TAG=latest
VDI_PLUGIN_BIOM_TAG=latest
VDI_PLUGIN_EXAMPLE_TAG=latest
VDI_PLUGIN_GENELIST_TAG=latest
VDI_PLUGIN_ISASIMPLE_TAG=latest
VDI_PLUGIN_NOOP_TAG=latest
VDI_PLUGIN_WRANGLER_TAG=latest
VDI_PLUGIN_RNASEQ_TAG=latest
----
======== Start the Service Stack
The full service stack can be started and managed locally by using available
`make` commands for stack management.Initial Startup & Image Redeploy::
Use if the stack has never been run, has been previously destroyed via
`compose-down`, or to deploy rebuilt images (may be performed without stopping
the stack).
+
[source, shell]
----
make compose-up
----Shutdown & Destroy Stack::
Erases volumes and container state.
+
[source, shell]
----
make compose-down
----Halt Stack::
Maintains volumes and container state.
+
[source, shell]
----
make compose-stop
----Restart Halted Stack::
+
[source, shell]
----
make compose-start
----===== Optional: Build Local Changes
If local code changes have been made, and you wish to test those changes in the
container stack, a new image may be built using the `make` target `build-image`.[source, shell]
----
make build-image
----This build target requires the environment variables `GITHUB_USERNAME` and
`GITHUB_TOKEN` be available in the running shell. See the
{confluence}/TECH/pages/108560402/Deploy+Containerized+Services+for+Local+Development[Confluence Container Guide]
for additional information.=== Update Dataset Characteristics Schema
.Optional: Lightweight Checkout
[%collapsible]
====
Clones only the dataset characteristics schema files without pulling down the
full repository source.[source, shell]
----
git clone git@github.com:VEuPathDB/vdi-service --depth 1 --filter tree:0 \
&& cd vdi-service \
&& git sparse-checkout set --no-cone /schema/data \
&& git checkout
----
====The dataset characteristics validation schema files are JSON schema, written in
YAML that live in the link:schema/data/[data schema directory].The schema files themselves are validated using the included metaschema JSON
file, which may be plugged into many smart editors to automatically validate
the dataset schema as it is being edited.== Repo Structure
The VDI service repository root directory contains subdirectories for source
code, configuration, documentation, and deployment related files. Most
development tasks will be performed in the subprojects under the `./service`
directory.=== Service Components
==== Lanes
Dataset event handlers. Each lane is a separate process that subscribes to a
Kafka channel and operates on datasets whose information is provided in the
incoming events.* link:module/lane/hard-delete/[Hard Delete]
* link:module/lane/import/[Import]
* link:module/lane/install/[Install Data]
* link:module/lane/reconciliation/[Reconciliation]
* link:module/lane/sharing/[Share]
* link:module/lane/soft-delete/[Soft Delete]
* link:module/lane/update-meta/[Update Meta]==== Rest Service
The rest service is the public API through which users and administrators
communicate with and operate on the VDI system.* link:module/rest-service/[Rest API Service]
==== Daemons
Independent background tasks.
* link:module/daemon/event-router/[MinIO Event Router]
* link:module/daemon/pruner/[Stale Object Pruner]
* link:module/daemon/reconciler/[Dataset Reconciler]==== Bootstrapper
The bootstrapper is responsible for starting up the service modules listed above
and ensuring a full JVM shutdown if any service module crashes.* link:module/bootstrap/[Bootstrapper]
=== Internal Libs
.link:lib/dataset/[Dataset Management]
* link:lib/dataset/pruner[Dataset Pruner Implementation]
* link:lib/dataset/reconciler/[Dataset Reconciler Implementation]
* link:lib/dataset/reinstaller/[Dataset Reinstaller].link:lib/db/[Database Interaction]
* link:lib/db/application/[Application DB Client]
* link:lib/db/internal/[Internal DB Client]
* link:lib/db/common/[Shared DB Components].link:lib/plugin/[Plugin Communication]
* link:lib/plugin/client[Plugin HTTP Client]
* link:lib/plugin/registry/[Enabled Plugin Mapping].link:lib/external[External Service APIs]
* link:lib/external/kafka[Kafka Client]
* link:lib/external/ldap[LDAP Utilities]
* link:lib/external/rabbit[Rabbit Client]
* link:lib/external/s3[MinIO Dataset Management Wrapper].Misc
* link:lib/async/[Async Utilities]
* link:lib/common/[Universal Components]
* link:lib/config/[Dumb Service Config POJOs]
* link:lib/install-target/[Dataset Install Target Registry]
* link:lib/module-core/[Service/Module Core API]
* link:lib/test-utils[Unit Test Utilities]== VDI Project Repository Links
.Services
* https://github.com/VEuPathDB/vdi-service[VDI Core Service]
* https://github.com/VEuPathDB/vdi-plugin-handler-server[VDI Plugin Handler Service].Plugins
* https://github.com/VEuPathDB/vdi-plugin-bigwig[bigWig]
* https://github.com/VEuPathDB/vdi-plugin-biom[BIOM]
* https://github.com/VEuPathDB/vdi-plugin-genelist[Gene List]
* https://github.com/VEuPathDB/vdi-plugin-isasimple[ISA Study]
* https://github.com/VEuPathDB/vdi-plugin-noop[NoOp]
* https://github.com/VEuPathDB/vdi-plugin-wrangler[Phenotype]
* https://github.com/VEuPathDB/vdi-plugin-rnaseq[RNA-Seq].Docker Images
* https://github.com/VEuPathDB/vdi-internal-db[Cache DB Docker Image]
* https://github.com/VEuPathDB/docker-gus-apidb-base[Gus/ApiDB Schema Base] +
[.small]#_Not explicitly part of VDI, but the base image for several plugins_#.Service Libraries
* https://github.com/VEuPathDB/vdi-component-common[Commons Library]
* https://github.com/VEuPathDB/vdi-component-json[JSON Utilities].Plugin Libraries
* https://github.com/VEuPathDB/lib-vdi-plugin-rnaseq[lib-rnaseq]
* https://github.com/VEuPathDB/lib-vdi-plugin-study[lib-study].Misc
* https://github.com/VEuPathDB/vdi-plugin-example[Example Plugin]
* https://github.com/VEuPathDB/VdiSchema[VDI App DB Schema]////
== License
Copyright 2020 VEuPathDB
Licensed under the Apache License, Version 2.0 (the "License"); you may not use
this file except in compliance with the License. You may obtain a copy of the
License athttp://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed
under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.