{"id":23122852,"url":"https://github.com/folio-org/mod-source-record-storage","last_synced_at":"2026-04-15T15:00:47.335Z","repository":{"id":37576432,"uuid":"150270078","full_name":"folio-org/mod-source-record-storage","owner":"folio-org","description":"Persistent source record storage","archived":false,"fork":false,"pushed_at":"2026-01-16T13:43:14.000Z","size":3641,"stargazers_count":5,"open_issues_count":1,"forks_count":5,"subscribers_count":16,"default_branch":"master","last_synced_at":"2026-01-17T02:22:32.106Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/folio-org.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2018-09-25T13:30:41.000Z","updated_at":"2026-01-14T14:04:36.000Z","dependencies_parsed_at":"2023-10-16T22:13:39.979Z","dependency_job_id":"f82c6a4a-127e-4280-8965-7f575d821d6b","html_url":"https://github.com/folio-org/mod-source-record-storage","commit_stats":null,"previous_names":[],"tags_count":141,"template":false,"template_full_name":null,"purl":"pkg:github/folio-org/mod-source-record-storage","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/folio-org%2Fmod-source-record-storage","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/folio-org%2Fmod-source-record-storage/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/folio-org%2Fmod-
source-record-storage/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/folio-org%2Fmod-source-record-storage/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/folio-org","download_url":"https://codeload.github.com/folio-org/mod-source-record-storage/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/folio-org%2Fmod-source-record-storage/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28506593,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-17T10:25:30.148Z","status":"ssl_error","status_checked_at":"2026-01-17T10:25:29.718Z","response_time":85,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-17T07:31:03.686Z","updated_at":"2026-01-17T10:54:59.100Z","avatar_url":"https://github.com/folio-org.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# mod-source-record-storage\n\nCopyright (C) 2018-2025 The Open Library Foundation\n\nThis software is distributed under the terms of the Apache License,\nVersion 2.0. 
See the file \"[LICENSE](LICENSE)\" for more information.\n\n\u003c!-- ../../okapi/doc/md2toc -l 2 -h 4 README.md --\u003e\n* [Introduction](#introduction)\n* [Compiling](#compiling)\n* [Docker](#docker)\n* [Installing the module](#installing-the-module)\n* [Deploying the module](#deploying-the-module)\n* [Interaction with Kafka](#interaction-with-kafka)\n* [Database schemas](#database-schemas)\n* [How to fill module with data for testing purposes](https://wiki.folio.org/x/G6bc)\n\n## Introduction\n\nFOLIO compatible source record storage module.\n\nProvides PostgreSQL based storage to complement the data import module. It is written in Java, uses the raml-module-builder, and is built with Maven.\n\n## Compiling\n\n\u003e [Docker](https://www.docker.com/) is now required to build mod-source-record-storage. [docker-maven-plugin](https://dmp.fabric8.io/) is used to create a Postgres container\nfor running [Liquibase](https://www.liquibase.org/) scripts and generating [jOOQ](https://www.jooq.org/) schema DAOs for type safe SQL query building.\n\n```\n   mvn install\n```\n\nSee that it says \"BUILD SUCCESS\" near the end.\n\n## Docker\n\nBuild the Docker image with:\n\n```\n   docker build -t mod-source-record-storage .\n```\n\nTest that it runs with:\n\n```\n   docker run -t -i -p 8081:8081 mod-source-record-storage\n```\n\n## Installing the module\n\nFollow the\n[Deploying Modules](https://github.com/folio-org/okapi/blob/master/doc/guide.md#example-1-deploying-and-using-a-simple-module)\nsection of the Okapi Guide and Reference, which describes the process in detail.\n\nFirst of all you need a running Okapi instance.\n(Note that [specifying](../README.md#setting-things-up) an explicit 'okapiurl' might be needed.)\n\n```\n   cd .../okapi\n   java -jar okapi-core/target/okapi-core-fat.jar dev\n```\n\nWe need to declare the module to Okapi:\n\n```\ncurl -w '\\n' -X POST -D -   \\\n   -H \"Content-type: application/json\"   \\\n   -d 
@target/ModuleDescriptor.json \\\n   http://localhost:9130/_/proxy/modules\n```\n\nThat ModuleDescriptor tells Okapi what the module is called, what services it\nprovides, and how to deploy it.\n\n## Deploying the module\n\nNext we need to deploy the module. There is a deployment descriptor in\n`target/DeploymentDescriptor.json`. It tells Okapi to start the module on 'localhost'.\n\nDeploy it via Okapi discovery:\n\n```\ncurl -w '\\n' -D - -s \\\n  -X POST \\\n  -H \"Content-type: application/json\" \\\n  -d @target/DeploymentDescriptor.json  \\\n  http://localhost:9130/_/discovery/modules\n```\n\nThen we need to enable the module for the tenant:\n\n```\ncurl -w '\\n' -X POST -D -   \\\n    -H \"Content-type: application/json\"   \\\n    -d @target/TenantModuleDescriptor.json \\\n    http://localhost:9130/_/proxy/tenants/\u003ctenant_name\u003e/modules\n```\n\n\n## Interaction with Kafka\n\nThere are several properties that should be set for modules that interact with Kafka: **KAFKA_HOST, KAFKA_PORT, OKAPI_URL, ENV**(unique env ID).\nAfter setup, it is good to check logs in all related modules for errors. Data import consumers and producers work in separate verticles that are set up in RMB's InitAPI for each module. 
That would be the first place to check deploy/install logs.\n\n**Environment variables** that can be adjusted for this module and default values:\n* Relevant from the **Ramsons** release, module versions from 5.9.0:\n  * \"AUTHORITY_TO_BIB_LINK_CHANGE_HANDLER_RETRY_COUNT\": 5 \n* Relevant from the **Iris** release, module versions from 5.0.0:\n  * \"_srs.kafka.ParsedMarcChunkConsumer.instancesNumber_\": 1\n  * \"_srs.kafka.DataImportConsumer.instancesNumber_\": 1\n  * \"_srs.kafka.ParsedRecordChunksKafkaHandler.maxDistributionNum_\": 100\n  * \"_srs.kafka.DataImportConsumer.loadLimit_\": 5\n  * \"_srs.kafka.DataImportConsumerVerticle.maxDistributionNum_\": 100\n  * \"_srs.kafka.ParsedMarcChunkConsumer.loadLimit_\": 5\n* Relevant from the **Juniper** release, module versions from 5.1.0:\n  * \"_srs.kafka.QuickMarcConsumer.instancesNumber_\": 1\n  * \"_srs.kafka.QuickMarcKafkaHandler.maxDistributionNum_\": 100\n* Relevant from the **Juniper** release(module version from 5.1.0) to **Kiwi** release (module version from 5.2.0)\n  * \"_srs.kafka.cache.cleanup.interval.ms_\": 3600000\n  * \"_srs.kafka.cache.expiration.time.hours_\": 3\n* Relevant from the **Morning Glory** release(module version from 5.4.0):\n  * \"_srs.cleanup.last.updated.days_\": 7\n  * \"_srs.cleanup.limit_\": 100\n  * \"_srs.cleanup.cron.expression_\": 0 0 0 * * ?\n* Relevant from the **Orchid** release, module versions from 5.6.0:\n  * \"_srs.kafka.AuthorityLinkChunkKafkaHandler.maxDistributionNum_\": 100\n  * \"_srs.kafka.AuthorityLinkChunkConsumer.loadLimit_\": 2\n* Relevant from the **Poppy** release, module versions from 5.7.0:\n  * \"_srs.linking-rules-cache.expiration.time.hours_\": 12\n* Relevant from the **Ramsons** release, module versions from 5.9.8:\n  * \"SRS\\_MARCINDEXERS\\_DELETE\\_INTERVAL\\_SECONDS\": 1800\n  * \"SRS\\_MARCINDEXERS\\_DELETE\\_PLANNEDTIME\" (no default, comma separated list of times of the day, for example `01:55,03:55`)\n  * 
\"SRS\\_MARCINDEXERS\\_DELETE\\_DIRTYBATCHSIZE\": 100000\n* Variables for setting the number of partitions of topics:\n  * DI_PARSED_RECORDS_CHUNK_SAVED_PARTITIONS\n  * DI_SRS_MARC_AUTHORITY_RECORD_MATCHED_PARTITIONS\n  * DI_SRS_MARC_AUTHORITY_RECORD_NOT_MATCHED_PARTITIONS\n  * DI_SRS_MARC_AUTHORITY_RECORD_DELETED_PARTITIONS\n  * DI_SRS_MARC_HOLDINGS_HOLDING_HRID_SET_PARTITIONS\n  * DI_SRS_MARC_HOLDINGS_RECORD_MODIFIED_READY_FOR_POST_PROCESSING_PARTITIONS\n  * DI_SRS_MARC_HOLDINGS_RECORD_UPDATED_PARTITIONS\n  * DI_SRS_MARC_BIB_RECORD_UPDATED_PARTITIONS\n  * DI_SRS_MARC_AUTHORITY_RECORD_MODIFIED_READY_FOR_POST_PROCESSING_PARTITIONS\n  * DI_SRS_MARC_BIB_RECORD_MATCHED_READY_FOR_POST_PROCESSING_PARTITIONS\n  * DI_LOG_SRS_MARC_AUTHORITY_RECORD_CREATED_PARTITIONS\n  * DI_LOG_SRS_MARC_AUTHORITY_RECORD_UPDATED_PARTITIONS\n  * DI_SRS_MARC_HOLDINGS_RECORD_MATCHED\n  * DI_SRS_MARC_HOLDINGS_RECORD_NOT_MATCHED\n  * DI_SRS_MARC_AUTHORITY_RECORD_UPDATED\n  * SRS_SOURCE_RECORDS_PARTITIONS\n  The default value for all partitions is 1.\n* The `DOMAIN_EVENTS_ENABLED` environment variable defines whether Source Record Domain Event publishing should occur. True by default.\n\n## Database schemas\n\nThe mod-source-record-storage module uses a relational approach and Liquibase to define database schemas.\n\nDatabase schemas are described in Liquibase scripts using XML syntax.\nEvery script file should contain only one \"databaseChangeLog\" that consists of at least one \"changeset\" describing the operations on tables. \nScripts should be named using the following format:\n`yyyy-MM-dd--hh-mm-schema_change_description`.  \\\n`yyyy-MM-dd--hh-mm` - date of script creation;  \\\n`schema_change_description` - short description of the change.\n\nEach \"changeset\" should be uniquely identified by the `\"author\"` and `\"id\"` attributes. It is advised to use the GitHub username as the `\"author\"` attribute. \nThe `\"id\"` attribute value should be defined in the same format as the script file name.  
\n\nIf needed, the database schema name can be obtained using the Liquibase context property `${database.defaultSchemaName}`.\n\nLiquibase scripts are stored in the `/resources/liquibase/` directory.\nScript files for module and tenant schemas are stored separately in `/resources/liquibase/module/scripts` and `/resources/liquibase/tenant/scripts` respectively.\n\\\nTo simplify the tracking of schema changes, the tenant versioning is reflected in the directory structure:\n```\n/resources/liquibase\n    /module/scripts\n              /v-1.0.0\n                  /2019-08-14--14-00-create-tenant-table.xml\n              /v-2.0.0\n                  /2019-09-03--11-00-change-id-column-type.xml\n    /tenant/scripts\n              /v-1.0.0\n                  /2019-09-06--15-00-create-record-field-table.xml\n```\n\n### Database redesign\n\nThe database has recently been redesigned to use standard relational table design with less usage of JSONB columns and more use of foreign key constraints and default B-tree indexes optimized for single value columns. The rationale was to improve performance of data retrieval and data import. A significant change was the addition of the `leader_record_status` column on the `records` table, which is populated via a trigger on insert and update on the `marc_records` table. This makes it possible to query the status of a MARC record quickly and to filter on the leader record status values that indicate the record has been deleted.\n\n\u003cimg src=\"er-diagram.png\" alt=\"Source Record Storage ER Diagram\" style=\"display:block; float:none; margin-left:auto; margin-right:auto;\" /\u003e\n\n## [jOOQ](https://www.jooq.org/)\n\nDuring the redesign we opted to use jOOQ for type safe fluent SQL building. The jOOQ type safe tables and resources are generated during the `generate-sources` Maven lifecycle phase using the [vertx-jooq](https://github.com/jklingsporn/vertx-jooq) reactive Vert.x generator. The code is generated from the database metadata. 
For this to occur during build, `liquibase-maven-plugin` is used to consume the Liquibase changelog and provision a temporary database started using `embedded-postgresql-maven-plugin`.\n\n\u003e jOOQ allows plain SQL strings, but their use is not recommended. Using the type safe Java abstraction, including variable binding, eliminates SQL injection vulnerabilities.\n\n## REST Client for mod-source-record-storage\n\nTo call the module's endpoints, the module provides a client generated by RMB. This client is packaged into a lightweight jar.\n\n### Maven dependency\n\n```xml\n    \u003cdependency\u003e\n      \u003cgroupId\u003eorg.folio\u003c/groupId\u003e\n      \u003cartifactId\u003emod-source-record-storage-client\u003c/artifactId\u003e\n      \u003cversion\u003ex.y.z\u003c/version\u003e\n      \u003ctype\u003ejar\u003c/type\u003e\n    \u003c/dependency\u003e\n```\nHere `x.y.z` is the version of mod-source-record-storage.\n\n### Usage\n\n`SourceStorageRecordsClient` is generated by RMB and provides methods for the module's endpoints described in the RAML file:\n```java\n    // create records client object with okapi url, tenant id and token\n    SourceStorageRecordsClient client = new SourceStorageRecordsClient(\"localhost\", \"diku\", \"token\");\n```\nClient methods work with data classes generated by RMB from the JSON schemas.\nThe mod-source-record-storage-client jar contains only the DTOs and clients generated by RMB. 
\n```java\n    // create new record entity\n    Record record = new Record();\n    record.setRecordType(Record.RecordType.MARC_BIB);\n    record.setRawRecord(new RawRecord().withContent(\"content\"));\n```\nExample of sending a request to mod-source-record-storage to create a new Record:\n```java\n    // send request to mod-source-record-storage\n    client.postSourceStorageRecords(null, record, response -\u003e {\n      // processing response\n      if (response.statusCode() == 201) {\n        System.out.println(\"Record is successfully created.\");\n      }\n    });\n```\n\n## Load sample data for module testing\n\nTo load sample data after module initialization, POST a [testMarcRecordsCollection](https://github.com/folio-org/data-import-raml-storage/blob/master/schemas/mod-source-record-storage/testMarcRecordsCollection.json) DTO to `/source-storage/populate-test-marc-records`.\n\n```\n{\n  \"rawRecords\": [\n    ...\n  ]\n}\n```\n\n## Issue tracker\n\nSee project [MODSOURCE](https://issues.folio.org/browse/MODSOURCE)\nat the [FOLIO issue tracker](https://dev.folio.org/guidelines/issue-tracker/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffolio-org%2Fmod-source-record-storage","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffolio-org%2Fmod-source-record-storage","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffolio-org%2Fmod-source-record-storage/lists"}