{"id":23122664,"url":"https://github.com/folio-org/mod-erm-usage-harvester","last_synced_at":"2026-04-17T12:01:40.192Z","repository":{"id":36389478,"uuid":"160167785","full_name":"folio-org/mod-erm-usage-harvester","owner":"folio-org","description":"Harvest ERM usage statistics. Funded by European Regional Development Fund (EFRE).","archived":false,"fork":false,"pushed_at":"2026-03-11T19:14:49.000Z","size":1262,"stargazers_count":2,"open_issues_count":2,"forks_count":3,"subscribers_count":17,"default_branch":"master","last_synced_at":"2026-03-12T00:38:52.544Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/folio-org.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2018-12-03T09:44:34.000Z","updated_at":"2026-03-04T13:33:59.000Z","dependencies_parsed_at":"2026-02-18T14:01:38.267Z","dependency_job_id":null,"html_url":"https://github.com/folio-org/mod-erm-usage-harvester","commit_stats":null,"previous_names":[],"tags_count":44,"template":false,"template_full_name":null,"purl":"pkg:github/folio-org/mod-erm-usage-harvester","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/folio-org%2Fmod-erm-usage-harvester","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/folio-org%2Fmod-erm-usage-harvester/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/folio-org%2Fm
od-erm-usage-harvester/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/folio-org%2Fmod-erm-usage-harvester/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/folio-org","download_url":"https://codeload.github.com/folio-org/mod-erm-usage-harvester/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/folio-org%2Fmod-erm-usage-harvester/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31928229,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-17T10:35:34.458Z","status":"ssl_error","status_checked_at":"2026-04-17T10:35:09.472Z","response_time":62,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-17T07:29:24.730Z","updated_at":"2026-04-17T12:01:40.186Z","avatar_url":"https://github.com/folio-org.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# mod-erm-usage-harvester\n\nCopyright (C) 2018-2024 The Open Library Foundation\n\nThis software is distributed under the terms of the Apache License, Version 2.0. 
See the\nfile \"[LICENSE](LICENSE)\" for more information.\n\n![Development funded by European Regional Development Fund (EFRE)](assets/EFRE_2015_quer_RGB_klein.jpg)\n\n## Introduction\n\nModule for harvesting counter reports.\n\n## Requirements\n\n* The module needs to know about the Okapi URL ([see here](#setting-the-okapi-url)).\n* Environment variables for database connectivity need to be\n  provided ([see here](https://github.com/folio-org/raml-module-builder#environment-variables)).\n\n## Installation\n\n```\n$ git clone ...\n$ cd mod-erm-usage-harvester\n$ mvn clean install\n```\n\n### Run plain jar\n\n```\n$ env OKAPI_URL=http://127.0.0.1:9130 java -jar \\\n  mod-erm-usage-harvester-bundle/target/mod-erm-usage-harvester-bundle-fat.jar\n```\n\n### Run via Docker\n\n#### Build docker image\n\n```\n$ docker build -t mod-erm-usage-harvester .\n```\n\n#### Run docker image\n\n```\n$ docker run -e OKAPI_URL=http://127.0.0.1:9130 -p 8081:8081 mod-erm-usage-harvester\n```\n\n## Configuration\n\n### Listening port\n\nThe default listening port is `8081` and can be set by using `-Dhttp.port` parameter when running\nthe jar file or using the `-p` flag when using `docker run`.\n\n### Setting the Okapi URL\n\nUse the environment variable named `OKAPI_URL` to provide the URL to Okapi.\n\n### Proxy configuration\n\nProxy settings are configured via JVM system properties if you are running the plain jar.\n\n* `http.proxyHost`, `http.proxyPort`, `https.proxyHost`, `https.proxyPort`, `http.nonProxyHosts`\n\nAnd via environment variables if you are running the Docker container.\n\n* `HTTP_PROXY`, `HTTPS_PROXY`, `NO_PROXY`  \n  These get translated into JVM system properties by\n  the [base image](https://github.com/folio-org/folio-tools/tree/master/folio-java-docker/openjdk17).\n\n### Quartz scheduler\n\nQuartz configuration is located\nin [quartz.properties](mod-erm-usage-harvester-bundle/src/main/resources/org/quartz/quartz.properties)\n. 
If you wish to use another file, you must define the system property `org.quartz.properties` to\npoint to the file you want. You can also set individual quartz properties using system properties (\ne.g. `-Dorg.quartz.threadPool.threadCount=8`). The `org.quartz.threadPool.threadCount` \nproperty controls how many providers are harvested concurrently.\n\n### Hazelcast\n\nThe default Quartz configuration uses the `HazelcastJobStore` for clustering which relies on \nHazelcast. By default the [standard configuration](https://github.com/hazelcast/hazelcast/blob/master/hazelcast/src/main/resources/hazelcast-default.xml)\nshipped with hazelcast is used. You can supply your own XML or YAML configuration through the \n`hazelcast.config` system property or just put it into the working directory. If you're using \nclustering, make sure that member discovery is working by inspecting the logs. You might want to \ntailor the Hazelcast configuration to suit your particular deployment environment. You can read\nabout Hazelcast discovery mechanisms [here](https://docs.hazelcast.com/hazelcast/5.3/clusters/discovery-mechanisms).\n\n## Periodic harvesting\n\nPeriodic harvesting uses a system user that is automatically created and managed by the platform on\nEureka deployments. The system user is granted the `ermusageharvester.start-all.get` permission as\ndefined in the module descriptor's `metadata.user` section.\n\nPeriodic harvesting is set up through the `erm-usage-harvester/periodic` API. 
Configuration is done\nfor each tenant separately by using the `X-Okapi-Tenant` header.\nSee [PeriodicConfig](ramls/schemas/periodicConfig.json)\nand [periodic.raml](ramls/periodic.raml).\n\nExample:\n\n```\ncurl --request POST \\\n  --url http://localhost:9130/erm-usage-harvester/periodic \\\n  --header 'content-type: application/json' \\\n  --header 'x-okapi-tenant: diku' \\\n  --data '{\n  \"startAt\": \"2019-01-01T08:00:00.000+0000\",\n  \"periodicInterval\": \"daily\"\n}'\n```\n\nThis request will create a schedule which triggers harvesting for tenant `diku` each day at 8am UTC\nstarting on `2019-01-01`.\n\n__Note:__ Using `\"periodicInterval\": \"monthly\"` and `startAt` with days \u003e 28 will result in a _'last\nday of month'_ schedule.\n\nExample 2:\n\n```json\n{\n  \"startAt\": \"2019-01-29T08:00:00.000+0000\",\n  \"periodicInterval\": \"monthly\"\n}\n```\n\nThis configuration will trigger harvesting on the last day of every month at 8am UTC, starting\non `2019-01-31`\nfollowed by `2019-02-28`, `2019-03-31`, `2019-04-30`, ... .\n\n## ServiceEndpoint implementations\n\nThe [ServiceEndpoint](mod-erm-usage-harvester-spi/src/main/java/org/olf/erm/usage/harvester/endpoints/ServiceEndpoint.java)\nimplementation defines how reports are fetched for a provider. 
To provide additional implementations\nyou will need to implement the\n[ServiceEndpointProvider](mod-erm-usage-harvester-spi/src/main/java/org/olf/erm/usage/harvester/endpoints/ServiceEndpointProvider.java)\ninterface and make it available on the classpath.\n\nSo far 3 implementations are provided:\n\n* `mod-erm-usage-harvester-cs41`\n  – [Counter Sushi 4.1](https://www.projectcounter.org/code-of-practice-sections/sushi/)\n* `mod-erm-usage-harvester-cs50`\n  – [Counter Sushi 5.0 API](https://app.swaggerhub.com/apis/COUNTER/counter-sushi_5_0_api/1.0.0)\n* `mod-erm-usage-harvester-nss` – [Germany's National Statistics Server](https://statistik.hebis.de/)\n\nImplementations available at runtime can be listed at `/erm-usage-harvester/impl`.\n\n```\n{\n  \"implementations\": [\n    {\n      \"name\": \"Counter-Sushi 4.1\",\n      \"description\": \"SOAP-based implementation for CounterSushi 4.1\",\n      \"type\": \"cs41\",\n      \"isAggregator\": false\n    },\n    {\n      \"name\": \"Counter 5.0\",\n      \"description\": \"Implementation for Counter/Sushi 5\",\n      \"type\": \"cs50\",\n      \"isAggregator\": false\n    },\n    {\n      \"name\": \"Nationaler Statistikserver\",\n      \"description\": \"Implementation for Germanys National Statistics Server (https://sushi.redi-bw.de).\",\n      \"type\": \"NSS\",\n      \"isAggregator\": true,\n      \"configurationParameters\": [\n        \"apiKey\",\n        \"requestorId\",\n        \"customerId\",\n        \"reportRelease\"\n      ]\n    }\n  ]\n}\n```\n\n### mod-erm-usage-harvester-cs50\n\n#### Request parameters\n\nTo enable the creation of standard views, master reports are retrieved with the following additional parameters:\n\n| Report | Attributes_To_Show                                                                     | Include_Parent_Details |\n| ------ | -------------------------------------------------------------------------------------- | ---------------------- |\n| DR     | 
Data_Type\\|Access_Method                                                               |                        |\n| IR     | Authors\\|Publication_Date\\|Article_Version\\|Data_Type\\|YOP\\|Access_Type\\|Access_Method | True                   |\n| PR     | Data_Type\\|Access_Method                                                               |                        |\n| TR     | Data_Type\\|Section_Type\\|YOP\\|Access_Type\\|Access_Method                               |                        |\n\n_Example:_  \n`/reports/dr?requestor_id=xxx\u0026customer_id=xxx\u0026begin_date=2021-01\u0026end_date=2021-12\u0026attributes_to_show=Data_Type|Access_Method`\n\n#### Additional processing\n\nBecause providers respond in various ways, the provider response is intercepted and adjusted before processing.  \nThis is necessary as some providers use `2xx` status codes to send sushi errors, but the generated client expects `2xx` codes to return counter reports and different codes to return sushi errors.  \nSo if a response with status code `2xx` is received, it is checked whether the response data structure matches one of the 4 counter master reports (`TR`, `PR`, `DR` and `IR`). If it does match, no changes are made to the response. 
If it does not match, the response gets transformed into a `400 - Bad Request` response, preserving the original response body in cases listed below.\n\nSome observations and how they are handled so far:\n\n* Providers use `2xx` status codes to return sushi errors, not reports (gets routed and handled as `400` with original response body)\n* Providers return sushi errors as array instead of object (array makes it into the response body)\n* Providers return `\"null\"` instead of sushi error (returns an `InvalidReportException: null`)\n* Providers return reports with a `Report_Header` that contains an `Exception` object instead of an `Exceptions` array (not handled, will be interpreted as report without `Exceptions`)\n\n## Additional information\n\n### Issue tracker\n\nSee project [MODEUSHARV](https://issues.folio.org/browse/MODEUSHARV)\nat the [FOLIO issue tracker](https://dev.folio.org/guidelines/issue-tracker).\n\n### Other documentation\n\nOther [modules](https://dev.folio.org/source-code/#server-side) are described, with further FOLIO\nDeveloper documentation at [dev.folio.org](https://dev.folio.org/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffolio-org%2Fmod-erm-usage-harvester","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffolio-org%2Fmod-erm-usage-harvester","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffolio-org%2Fmod-erm-usage-harvester/lists"}