{"id":20710998,"url":"https://github.com/entur/antu","last_synced_at":"2026-04-24T10:05:27.794Z","repository":{"id":37090683,"uuid":"426954193","full_name":"entur/antu","owner":"entur","description":"Validation of NeTEx datasets","archived":false,"fork":false,"pushed_at":"2026-04-17T06:52:39.000Z","size":8774,"stargazers_count":5,"open_issues_count":17,"forks_count":1,"subscribers_count":16,"default_branch":"main","last_synced_at":"2026-04-17T08:32:32.621Z","etag":null,"topics":["java-17","ror","spring-boot"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"eupl-1.2","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/entur.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-11-11T10:09:57.000Z","updated_at":"2026-04-17T06:43:59.000Z","dependencies_parsed_at":"2023-09-22T00:13:29.555Z","dependency_job_id":"4e02e38b-be4e-4cc2-98ef-3f7b5b8a162c","html_url":"https://github.com/entur/antu","commit_stats":null,"previous_names":[],"tags_count":2209,"template":false,"template_full_name":null,"purl":"pkg:github/entur/antu","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/entur%2Fantu","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/entur%2Fantu/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/entur%2Fantu/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/entur%2Fantu/manifests","owner_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/owners/entur","download_url":"https://codeload.github.com/entur/antu/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/entur%2Fantu/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32218299,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-24T09:47:08.147Z","status":"ssl_error","status_checked_at":"2026-04-24T09:46:41.165Z","response_time":64,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["java-17","ror","spring-boot"],"created_at":"2024-11-17T02:13:44.740Z","updated_at":"2026-04-24T10:05:27.788Z","avatar_url":"https://github.com/entur.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Antu\n\nValidate NeTEx datasets against\nthe [Nordic NeTEx Profile](https://enturas.atlassian.net/wiki/spaces/PUBLIC/pages/728891481/Nordic+NeTEx+Profile).\n\n# Data flow\n\nAntu receives NeTEx validation requests from [Marduk](https://github.com/entur/marduk).\nThe request refers to a given NeTEx codespace and a NeTEx dataset (zip archive) stored in a Google Cloud Storage\nbucket.  \nAntu extracts the individual PublicationDelivery files from the NeTEx archive and registers a validation job for each of\nthem in a PubSub topic.  
\nThe resulting workload is then split among the running Antu Kubernetes pods and processed asynchronously and in parallel.\nEach validation job produces a JSON-serialized ValidationReport object.  \nWhen all validation jobs are complete, the individual ValidationReports are combined into a single object and stored in\nGCS under a unique report ID.  \nAntu sends a message to a PubSub topic to notify Marduk that the validation is complete.\n\n# Validation rules\n\nAntu uses the [NeTEx validator library](https://github.com/entur/netex-validator-java) to execute a set of validation\nrules on the NeTEx dataset.  \nIn addition to the default rules present in this library, Antu defines a set of rules that are specific to Entur and\nrelevant in a Norwegian context.  \nThis applies to validation against the [National Stop Register](https://stoppested.entur.org/) or against the\nOrganisation Register.\n\n# API\n\nValidation reports can be downloaded through a REST API. Reports are identified by their unique report ID.  \nThe API is OAuth2-protected, and the caller must have sufficient access rights to retrieve a given report.\n\n# Kubernetes integration\n\nAntu is designed so that the validation workload can be split evenly among single-core Kubernetes pods.  \nThis results in smaller pods, both in terms of CPU and memory consumption, which makes the Kubernetes scheduling process\nmore efficient.\nThe number of pods is adjusted dynamically by a Horizontal Pod Autoscaler.\n\n# Deployment\n\nEntur deploys Antu using [Harness](https://app.harness.io/ng/account/8VwWgE0WRK67_PWDpkooNA/all/cd/orgs/entur/projects/ror/services/antu).\n\n# Parallel processing\n\nThe Nordic NeTEx Profile mandates that datasets are delivered as single-line files within a zip archive with,\noptionally, a set of \"common files\" that gather objects shared between lines.  
\nThis means that validation of individual line files can be run in parallel and mostly independently of one another, with\nthe following exceptions:\n\n* **validating references** from a line file to a shared object in a \"common file\" requires that common files are\n  processed first, and line files afterwards, so that all shared ids can be collected before validating the line files.\n* **validating NeTEx id uniqueness** across the dataset needs to be synchronized.\n  Antu uses distributed locks and distributed collections stored in Redis to ensure proper synchronization between\n  concurrent jobs.\n\n# Local environment configuration\n\nA minimal local setup requires a Redis memory store, a Google PubSub emulator and access to the stop place\nregistry ([Tiamat](https://github.com/entur/tiamat)) and the organization registry.\n\n## Redis memory store\n\nAntu uses a memory store to hold the cache of stop places and organizations, as well as temporary files created during\nthe validation process.  \nA Redis instance running in Docker can be used for local testing:\n\n```\ndocker run -p 6379:6379 --name redis-antu redis:6\n```\n\n## Google PubSub emulator\n\nSee https://cloud.google.com/pubsub/docs/emulator for details on how to install the Google PubSub emulator.  
\nThe emulator is started with the following command:\n\n```\ngcloud beta emulators pubsub start\n```\n\nBy default, the emulator listens on port 8085.\n\nThe emulator port must be set in the Spring Boot application.properties file as well:\n\n```\nspring.cloud.gcp.pubsub.emulatorHost=localhost:8085\ncamel.component.google-pubsub.endpoint=localhost:8085\n```\n\n### Additional instructions for Mac OS when validating large datasets\n\nWhen validating large datasets, the Google PubSub emulator may start failing to process messages due to Mac OS limitations\non the number of ephemeral ports available for outgoing connections.\n\nThis limit can be configured by setting the following values in `/etc/sysctl.conf` (create the file if it does not exist):\n\n```\nnet.inet.ip.portrange.first=32768\nnet.inet.ip.portrange.last=65535\n```\n\nThe Mac must be restarted for the changes to take effect.\n\n## Access to the stop place registry\n\nAccess to the stop place registry is configured in the Spring Boot application.properties file:\n\n```\nantu.stop.registry.id.url=https://tiamat\n```\n\n## Access to the organization registry\n\nAccess to the organization registry is configured in the Spring Boot application.properties file:\n\n```\nantu.organisation.registry.url=https://org-reg\n```\n\n## Spring Boot configuration file\n\nThe application.properties file used in unit tests (src/test/resources/application.properties) can be used as a\ntemplate.  
\nThe Kubernetes configmap helm/antu/templates/configmap.yaml can also be used as a template.\n\n## Starting the application locally\n\n- Run `mvn package` to generate the Spring Boot jar.\n- The application can be started with the following command line:  \n  ```java -Xmx500m -Dspring.config.location=/path/to/application.properties -Dfile.encoding=UTF-8 -jar target/antu-0.0.1-SNAPSHOT.jar```\n\n# Antu rule set\n\nAntu comes with the following rule sets, depending on the validation profile used for validation:\n\n### For validation Profile `Timetable`\n\n| Sr. | Rule Code                                                   |                                                      Rule Description                                                       |\n|-----|-------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------:|\n| 1   | NETEX_ID_4                                                  |                                   Use of unapproved codespace. Approved codespaces are %s                                   |\n| 2   | NETEX_ID_4W                                                 |                                   Use of unapproved codespace. Approved codespaces are %s                                   |\n| 3   | NETEX_ID_2                                                  |                                               Invalid id structure on element                                               |\n| 4   | NETEX_ID_3                                                  |                                           Invalid structure on id %s. 
Expected %s                                           |\n| 5   | NETEX_ID_8                                                  |                                   Missing version attribute on elements with id attribute                                   |\n| 6   | NETEX_ID_9                                                  |                                  Missing version attribute on reference to local elements                                   |\n| 7   | NETEX_ID_6                                                  | Reference to %s is not allowed from element %s. Generally an element named XXXXRef may only reference elements if type XXXX |\n| 8   | NETEX_ID_7                                                  |                                               Invalid id structure on element                                               |\n| 9   | NETEX_ID_5                                                  |                                       Unresolved reference to external reference data                                       |\n| 10  | NETEX_ID_1                                                  |                                         Duplicate element identifiers across files                                          |\n| 11  | NETEX_ID_10                                                 |                                      Duplicate element identifiers across common files                                      |\n| 12  | INVALID_TRANSPORT_MODE                                      |                                                   Invalid transport mode                                                    |\n| 13  | TIMETABLED_PASSING_TIME_INCONSISTENT_TIME                   |                                    ServiceJourney has inconsistent TimetabledPassingTime                                    |\n| 14  | TIMETABLED_PASSING_TIME_INCOMPLETE_TIME                     |                                     ServiceJourney has incomplete TimetabledPassingTime             
                        |\n| 15  | TIMETABLED_PASSING_TIME_NON_INCREASING_TIME                 |                                   ServiceJourney has non-increasing TimetabledPassingTime                                   |\n| 16  | HIGH_SPEED                                                  |                                              ServiceJourney has too high speed                                              |\n| 17  | LOW_SPEED                                                   |                                                ServiceJourney has low speed                                                 |\n| 18  | WARNING_SPEED                                               |                                                ServiceJourney has high speed                                                |\n| 19  | SAME_DEPARTURE_ARRIVAL_TIME                                 |                                      Same departure/arrival time for consecutive stops                                      |\n| 20  | CODESPACE                                                   |              Codespace %s is not in the list of valid codespaces for this data space. 
Valid codespaces are %s               |\n| 21  | VERSION_NON_NUMERIC                                         |                                                  Non-numeric NeTEx version                                                  |\n| 22  | JOURNEY_PATTERN_NO_BOARDING_ALLOWED_AT_LAST_STOP            |                                   Last StopPointInJourneyPattern must not allow boarding                                    |\n| 23  | JOURNEY_PATTERN_NO_ALIGHTING_ALLOWED_AT_FIRST_STOP          |                                  First StopPointInJourneyPattern must not allow alighting                                   |\n| 24  | SAME_STOP_POINT_IN_JOURNEY_PATTERNS                         |                                            JourneyPatterns have same StopPoints                                             |\n| 25  | INVALID_NUMBER_OF_SERVICE_LINKS_IN_JOURNEY_PATTERN          |                                      Invalid number of ServiceLinks in JourneyPattern                                       |\n| 26  | SAME_QUAY_REF_IN_CONSECUTIVE_STOP_POINTS_IN_JOURNEY_PATTERN |                                Same quay refs in consecutive stop points in journey pattern                                 |\n\n### For validation Profiles `TimetableFlexibleTransport` and `ImportTimetableFlexibleTransport`\n\n| Sr. | Rule Code             |                                                      Rule Description                                                       |\n|-----|-----------------------|:---------------------------------------------------------------------------------------------------------------------------:|\n| 1   | NETEX_FILE_NAME_1     |                                                      Invalid filename                                                       |\n| 2   | NETEX_ID_4W           |                                   Use of unapproved codespace. 
Approved codespaces are %s                                   |\n| 3   | NETEX_ID_2            |                                               Invalid id structure on element                                               |\n| 4   | NETEX_ID_3            |                                           Invalid structure on id %s. Expected %s                                           |\n| 5   | NETEX_ID_4            |                                   Use of unapproved codespace. Approved codespaces are %s                                   |\n| 6   | NETEX_ID_8            |                                   Missing version attribute on elements with id attribute                                   |\n| 7   | NETEX_ID_9            |                                  Missing version attribute on reference to local elements                                   |\n| 8   | NETEX_ID_6            | Reference to %s is not allowed from element %s. Generally an element named XXXXRef may only reference elements if type XXXX |\n| 9   | NETEX_ID_7            |                                               Invalid id structure on element                                               |\n| 10  | NETEX_ID_5            |                                       Unresolved reference to external reference data                                       |\n| 11  | NETEX_ID_1            |                                         Duplicate element identifiers across files                                          |\n| 12  | NETEX_ID_10           |                                      Duplicate element identifiers across common files                                      |\n| 13  | CODESPACE             |              Codespace %s is not in the list of valid codespaces for this data space. 
Valid codespaces are %s               |\n| 14  | VERSION_NON_NUMERIC   |                                                  Non-numeric NeTEx version                                                  |\n| 15  | INVALID_FLEXIBLE_AREA |                                                    Invalid flexible area                                                    |\n\n### For validation Profile `TimetableFlexibleTransportMerging`\n\n| Sr. | Rule Code   |                 Rule Description                  |\n|-----|-------------|:-------------------------------------------------:|\n| 1   | NETEX_ID_10 | Duplicate element identifiers across common files |\n| 2   | NETEX_ID_1  |    Duplicate element identifiers across files     |\n\n### For validation Profile `Stop`\n\n| Sr. | Rule Code   |                                                      Rule Description                                                       |\n|-----|-------------|:---------------------------------------------------------------------------------------------------------------------------:|\n| 1   | NETEX_ID_4W |                                   Use of unapproved codespace. Approved codespaces are %s                                   |\n| 2   | NETEX_ID_2  |                                               Invalid id structure on element                                               |\n| 3   | NETEX_ID_3  |                                           Invalid structure on id %s. Expected %s                                           |\n| 4   | NETEX_ID_4  |                                   Use of unapproved codespace. 
Approved codespaces are %s                                   |\n| 5   | NETEX_ID_8  |                                   Missing version attribute on elements with id attribute                                   |\n| 6   | NETEX_ID_9  |                                  Missing version attribute on reference to local elements                                   |\n| 7   | NETEX_ID_6  | Reference to %s is not allowed from element %s. Generally an element named XXXXRef may only reference elements if type XXXX |\n| 8   | NETEX_ID_7  |                                               Invalid id structure on element                                               |\n| 9   | NETEX_ID_5  |                                       Unresolved reference to external reference data                                       |\n| 10  | NETEX_ID_1  |                                         Duplicate element identifiers across files                                          |\n| 11  | NETEX_ID_10 |                                      Duplicate element identifiers across common files                                      |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fentur%2Fantu","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fentur%2Fantu","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fentur%2Fantu/lists"}