{"id":23789810,"url":"https://github.com/officiallysingh/spring-batch-commons","last_synced_at":"2025-08-03T04:13:33.620Z","repository":{"id":214143050,"uuid":"735005106","full_name":"officiallysingh/spring-batch-commons","owner":"officiallysingh","description":"Spring batch common components for partitioned jobs","archived":false,"fork":false,"pushed_at":"2024-01-11T08:22:55.000Z","size":228,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-01T17:16:42.907Z","etag":null,"topics":["fault-tolerance","job","partitioning","scalability","spring-batch","spring-batch-jobs","spring-boot"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/officiallysingh.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-12-23T10:22:35.000Z","updated_at":"2024-03-22T14:53:02.000Z","dependencies_parsed_at":"2024-01-03T04:41:22.205Z","dependency_job_id":"1de271ed-6353-4b28-99e1-5ad853ed5f13","html_url":"https://github.com/officiallysingh/spring-batch-commons","commit_stats":null,"previous_names":["officiallysingh/spring-batch-commons"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/officiallysingh%2Fspring-batch-commons","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/officiallysingh%2Fspring-batch-commons/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/officiallysingh%2Fspring-batch-commons/releases","manifests_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/repositories/officiallysingh%2Fspring-batch-commons/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/officiallysingh","download_url":"https://codeload.github.com/officiallysingh/spring-batch-commons/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240018690,"owners_count":19734872,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fault-tolerance","job","partitioning","scalability","spring-batch","spring-batch-jobs","spring-boot"],"created_at":"2025-01-01T17:16:50.512Z","updated_at":"2025-02-21T12:42:44.378Z","avatar_url":"https://github.com/officiallysingh.png","language":"Java","readme":"# Spring Batch common components\n\n[**Spring Batch**](https://docs.spring.io/spring-batch/reference/index.html) is a battle-tested Java framework that makes it easy to write batch applications.\nBatch applications involve reliably and efficiently processing large volumes of data to and\nfrom various data sources (files, databases, messaging middleware, and so on).\nSpring Batch is great at doing this and provides the necessary foundation to meet the stringent requirements of batch applications.\nIt provides mechanisms for common tasks such as **task orchestration**, **partitioning**, and **restart**.\n\n![Spring Batch Architecture](https://github.com/officiallysingh/spring-batch-commons/blob/main/Spring_Batch.jpg)\n\n## Introduction\nSpring Batch jobs often require boilerplate code, which this library extracts into reusable components.\nCommon 
components of a Spring batch job are defined as Beans and can be reused across multiple jobs. \n**See usage** in Spring Batch Job implemented as [**`Spring Cloud Task`**](https://github.com/officiallysingh/spring-boot-batch-cloud-task) \nand [**`Spring Rest service`**](https://github.com/officiallysingh/spring-boot-batch-web).\n\n## Features\n* Provides common components and utility classes to easily create Spring batch jobs.\n* Provides opinionated default configurations for Spring batch jobs.\n* Supports partitioning of jobs to process data concurrently.\n* Autoconfigures fault tolerance with intelligent defaults to retry and recover from transient failures.\n* Records are processed in chunks; if the job fails midway,\n  it can be restarted from the last failed chunk without reprocessing already processed records.\n* Supports force restarting already completed jobs.\n* Supports skipping records in case of exceptions.\n* Supports logging of job and step execution events.\n\n## Classes\nThe following classes are provided by this library.\n* [`BatchConfiguration`](src/main/java/com/ksoot/spring/batch/common/BatchConfiguration.java) \nExtends [`DefaultBatchConfiguration`](https://docs.spring.io/spring-batch/docs/current/api/org/springframework/batch/core/configuration/support/DefaultBatchConfiguration.html) \nand defines default configuration for Spring batch jobs. It is autoconfigured by Spring boot.\n* [`AbstractJobExecutor`](src/main/java/com/ksoot/spring/batch/common/AbstractJobExecutor.java) \nExtended by the consumer application's job executor to execute a job with a **Run Id Incrementer**, which allows force restarting a job that completed successfully in its last execution.\n* [`AbstractPartitioner`](src/main/java/com/ksoot/spring/batch/common/AbstractPartitioner.java) Provides common implementation for partitioning Spring batch jobs. 
\nConsumer applications need to extend this class and provide an implementation of the **`partitioningList`** method that returns a `List` of partitioning candidate `String`s.\n* [`JobConfigurationSupport`](src/main/java/com/ksoot/spring/batch/common/JobConfigurationSupport.java) \nExtended by consumer applications to define new Simple and Partitioned jobs with default configurations. \nThe defaults can be overridden per job by overriding the respective methods, \nor globally in the consumer application by defining a new bean for the respective component.\n* [`LoggingJobListener`](src/main/java/com/ksoot/spring/batch/common/LoggingJobListener.java) \nProvides the default implementation of the Spring batch job listener, which only logs.\n* [`LoggingStepListener`](src/main/java/com/ksoot/spring/batch/common/LoggingStepListener.java) \nProvides the default implementation of the Spring batch step listener, which only logs.\n* [`MongoAggregationPagingItemReader`](src/main/java/com/ksoot/spring/batch/common/MongoAggregationPagingItemReader.java) \nCustom Mongo Paging Item reader using aggregation pipeline and pagination.\n* [`MongoUpsertItemWriter`](src/main/java/com/ksoot/spring/batch/common/MongoUpsertItemWriter.java) \nCustom Mongo Item writer for upsert operation.\n* [`ListFlattenerKafkaItemWriter`](src/main/java/com/ksoot/spring/batch/common/ListFlattenerKafkaItemWriter.java) \nCustom Kafka writer to write a `List` of items to Kafka. \nCan be used in cases where the last `Processor` returns a `List` of items instead of a single item.\n* [`StepStatus`](src/main/java/com/ksoot/spring/batch/common/StepStatus.java) \nUtility class that defines custom Step statuses; it can be extended with more statuses.\n* [`SkipRecordException`](src/main/java/com/ksoot/spring/batch/common/SkipRecordException.java) \nCustom exception to represent skipped records in Spring batch jobs. 
Default implementation of `SkipPolicy` includes this exception.\n* [`BatchProperties`](src/main/java/com/ksoot/spring/batch/common/BatchProperties.java) \nSpring boot configuration property class to read batch properties from `application.properties` or `application.yml` file.\n\n## Autoconfigured Components\nThe following components are autoconfigured as Beans by Spring boot with opinionated default behaviour.\nThe defaults can be customized by configurations and custom implementations in consumer application.\n\n* [`JobParametersIncrementer`](https://docs.spring.io/spring-batch/docs/current/api/org/springframework/batch/core/JobParametersIncrementer.html) \nto generate a unique run id for each job execution when force restarting jobs that have already completed successfully.\nEach Job execution is uniquely identified by the combination of its `identifying` parameters.\nIf a job is restarted with the same identifying parameters, Spring batch will throw `JobInstanceAlreadyCompleteException`. So to force restart the job, the\n[`AbstractJobExecutor#execute`](https://github.com/officiallysingh/spring-batch-commons/blob/04c4a7232f5e36ace5168c498fa96690615799f8/src/main/java/com/ksoot/spring/batch/common/AbstractJobExecutor.java#L22)\nmethod adds a unique `run.id` to the job execution parameters if the `forceRestart` argument is `true`.\nIt can be overridden by defining a new `JobParametersIncrementer` bean in consumer application.\nIt requires a database sequence, named `run_id_sequence` by default, to generate the unique run id. The sequence name can be overridden \nby setting `batch.run-id-sequence` property in `application.properties` or `application.yml` file. \n\n\u003e [!IMPORTANT]\nAn already running job cannot be restarted, as Spring batch does not allow that. 
\nThis behaviour can be overridden, but doing so is not recommended.\n\n```java\n@ConditionalOnMissingBean\n@Bean\nJobParametersIncrementer jobParametersIncrementer(\n  final DataSource dataSource, final BatchProperties batchProperties) {\n    return new DataFieldMaxValueJobParametersIncrementer(\n        new PostgresSequenceMaxValueIncrementer(dataSource, batchProperties.getRunIdSequence()));\n}\n```\n```sql\nCREATE SEQUENCE IF NOT EXISTS run_id_sequence START WITH 1 INCREMENT BY 1 NO MINVALUE NO MAXVALUE CACHE 1;\n```\n\n* [`BackOffPolicy`](https://www.javadoc.io/static/org.springframework.retry/spring-retry/2.0.5/org/springframework/retry/backoff/BackOffPolicy.html)\nto define back off policy for retrying failed steps. Default is [`ExponentialBackOffPolicy`](https://www.javadoc.io/static/org.springframework.retry/spring-retry/2.0.5/org/springframework/retry/backoff/ExponentialBackOffPolicy.html).\nBackoff delay and multiplier can be customized by setting `batch.backoff-initial-delay` and `batch.backoff-multiplier` properties in `application.properties` or `application.yml` file.\nIt can be overridden by defining a new `BackOffPolicy` bean in consumer application.\n```java\n@ConditionalOnMissingBean\n@Bean\nBackOffPolicy backOffPolicy(final BatchProperties batchProperties) {\n    return BackOffPolicyBuilder.newBuilder()\n        .delay(batchProperties.getBackoffInitialDelay().toMillis())\n        .multiplier(batchProperties.getBackoffMultiplier())\n        .build();\n}\n```\n\n* [`RetryPolicy`](https://docs.spring.io/spring-retry/docs/api/current/org/springframework/retry/RetryPolicy.html)\nto define retry policy for retrying failed steps. 
By default, it retries for `TransientDataAccessException` and `RecoverableDataAccessException` exceptions for JPA and MongoDB.\nIt works in conjunction with `BackOffPolicy`.\nIt can be overridden by defining a new `RetryPolicy` bean in consumer application \nand customized by setting `batch.max-retries` property in `application.properties` or `application.yml` file.\n```java\n@ConditionalOnMissingBean\n@Bean\nRetryPolicy retryPolicy(final BatchProperties batchProperties) {\n    CompositeRetryPolicy retryPolicy = new CompositeRetryPolicy();\n    retryPolicy.setPolicies(\n        ArrayUtils.toArray(\n            this.noRetryPolicy(batchProperties), this.daoRetryPolicy(batchProperties)));\n    return retryPolicy;\n}\n```\n\n* [`SkipPolicy`](https://docs.spring.io/spring-batch/docs/current/api/org/springframework/batch/core/step/skip/SkipPolicy.html)\nto define skip policy for skipping records in case of exceptions. By default, it skips `ConstraintViolationException` and `SkipRecordException`.\nIt can be customized by setting `batch.skip-limit` property in `application.properties` or `application.yml` file.\nIt can be defined as [AlwaysSkipItemSkipPolicy](https://docs.spring.io/spring-batch/docs/current/api/org/springframework/batch/core/step/skip/AlwaysSkipItemSkipPolicy.html) \nto skip all records in case of any exception. \nSkipped exceptions must also be specified as `noRollback` exceptions in the Step configuration, which this library handles automatically. \nIt can be overridden by defining a new `SkipPolicy` bean in consumer application. Similarly, `skippedExceptions` can also be overridden. \n```java\n@ConditionalOnMissingBean\n@Bean\nSkipPolicy skipPolicy(final BatchProperties batchProperties) {\n    Map\u003cClass\u003c? 
extends Throwable\u003e, Boolean\u003e exceptionClassifiers =\n        this.skippedExceptions().stream().collect(Collectors.toMap(ex -\u003e ex, ex -\u003e Boolean.TRUE));\n    return new LimitCheckingItemSkipPolicy(batchProperties.getSkipLimit(), exceptionClassifiers);\n}\n\n@ConditionalOnMissingBean\n@Bean\nList\u003cClass\u003c? extends Throwable\u003e\u003e skippedExceptions() {\n    return List.of(ConstraintViolationException.class, SkipRecordException.class);\n}\n```\n\n* [`JobExecutionListener`](https://docs.spring.io/spring-batch/docs/current/api/org/springframework/batch/core/JobExecutionListener.html) \ndefault implementation as [`LoggingJobListener`](src/main/java/com/ksoot/spring/batch/common/LoggingJobListener.java)\nwhich does nothing but logging only. It can be overridden by defining new `JobExecutionListener` bean in consumer application.\n```java\n@ConditionalOnMissingBean\n@Bean\nJobExecutionListener jobExecutionListener() {\n    return new LoggingJobListener();\n}\n```\n\n* [`StepExecutionListener`](https://docs.spring.io/spring-batch/docs/current/api/org/springframework/batch/core/StepExecutionListener.html) \ndefault implementation as [`LoggingStepListener`](src/main/java/com/ksoot/spring/batch/common/LoggingStepListener.java) \nwhich does nothing but logging only. 
It can be overridden by defining a new `StepExecutionListener` bean in consumer application.\n```java\n@ConditionalOnMissingBean\n@Bean\nStepExecutionListener stepExecutionListener() {\n    return new LoggingStepListener();\n}\n```\n\n## Configurations\nFollowing are the configuration properties to customize default Spring batch behaviour.\n```yaml\nbatch:\n  chunk-size: 100\n  skip-limit: 10\n  max-retries: 3\n  backoff-initial-delay: PT3S\n  backoff-multiplier: 2\n  page-size: 300\n  partition-size: 16\n  trigger-partitioning-threshold: 100\n#  task-executor: applicationTaskExecutor\n#  run-id-sequence: run_id_sequence\n```\n\n* **`batch.chunk-size`** : Number of items that are processed in a single transaction by a chunk-oriented step, Default: 100.\n* **`batch.skip-limit`** : Maximum number of items to skip as per configured Skip policy, exceeding which fails the job, Default: 10.\n* **`batch.max-retries`** : Maximum number of retry attempts as per configured Retry policy, exceeding which fails the job, Default: 3.\n* **`batch.backoff-initial-delay`** : Time duration (in java.time.Duration format) to wait before the first retry attempt is made after a failure, Default: `PT3S` (3 seconds).\n* **`batch.backoff-multiplier`** : Factor by which the delay between consecutive retries is multiplied, Default: 3.\n* **`batch.page-size`** : Number of records to be read in each page by Paging Item readers, Default: 100.\n* **`batch.partition-size`** : Number of partitions that will be used to process the data concurrently. \nShould be optimized as per available machine resources, Default: 8.\n* **`batch.trigger-partitioning-threshold`** : Minimum number of records required to trigger partitioning; below this, \npartitioning could be counterproductive, Default: 100.\n* **`batch.task-executor`** : Bean name of the Task Executor to be used for executing the jobs. By default `SyncTaskExecutor` is used. \nSet to `applicationTaskExecutor` to use `SimpleAsyncTaskExecutor` provided by Spring. 
\nOr use any other custom `TaskExecutor` and set the bean name here. Set this property only in Spring Rest applications, not in Spring cloud task applications.\n* **`batch.run-id-sequence`** : Run Id database sequence name, Default: `run_id_sequence`.\n\n\u003e [!IMPORTANT]\nTo benefit from Java 21 virtual threads with Spring boot 3.2, define a [**`VirtualThreadTaskExecutor`**](https://spring.io/blog/2023/11/23/spring-batch-5-1-ga-5-0-4-and-4-3-10-available-now/#virtual-threads-support) bean and set its bean name as `batch.task-executor`.\n\n## Usage\n\n### Installation\nBuilt on Java 21, Spring boot 3.2.0+ and Spring batch 5.1.0+. For Java 17, build from source after changing the Java version as follows.\n[**`pom.xml`**](pom.xml)\n```xml\n\u003cproperties\u003e\n    \u003cjava.version\u003e17\u003c/java.version\u003e\n\u003c/properties\u003e\n```\n\n\u003e **Current version: 1.0**\n\nAdd the `spring-batch-commons` jar to application dependencies. \n\nMaven\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003eio.github.officiallysingh\u003c/groupId\u003e\n    \u003cartifactId\u003espring-batch-commons\u003c/artifactId\u003e\n    \u003cversion\u003e1.0\u003c/version\u003e\n\u003c/dependency\u003e\n```\nGradle\n```groovy\nimplementation 'io.github.officiallysingh:spring-batch-commons:1.0'\n```\n\n### Define Jobs\nDefine jobs as Beans by extending [`JobConfigurationSupport`](src/main/java/com/ksoot/spring/batch/common/JobConfigurationSupport.java) class.\nDefault configurations can be overridden for a particular `Job` by overriding respective methods from `JobConfigurationSupport` \nsuch as `retryPolicy`, `skipPolicy` etc. 
\nTo override default beans globally, define a new bean with the same name in consumer application.\nRefer to example [`StatementJobConfiguration`](https://github.com/officiallysingh/spring-boot-batch-cloud-task/blob/main/src/main/java/com/ksoot/batch/job/StatementJobConfiguration.java).\n* Define `ItemReader`, `ItemProcessor` and `ItemWriter` beans for each job.\n* To define a simple job, use the `newSimpleJob` method in `JobConfigurationSupport` and return a `Job` bean.\n```java\n@Bean\nJob statementJob(\n    final ItemReader\u003cDailyTransaction\u003e transactionReader,\n    final ItemProcessor\u003cDailyTransaction, Statement\u003e statementProcessor,\n    final ItemWriter\u003cStatement\u003e statementWriter) {\n  return newSimpleJob(\n      AppConstants.STATEMENT_JOB_NAME,\n      transactionReader,\n      statementProcessor,\n      statementWriter);\n}\n```\n\n* To define a partitioned job, use the `newPartitionedJob` method in `JobConfigurationSupport` and return a `Job` bean.\n```java\n@Bean\nJob statementJob(\n    @Qualifier(\"statementJobPartitioner\") final AccountsPartitioner statementJobPartitioner,\n    final ItemReader\u003cDailyTransaction\u003e transactionReader,\n    final ItemProcessor\u003cDailyTransaction, Statement\u003e statementProcessor,\n    final ItemWriter\u003cStatement\u003e statementWriter)\n    throws Exception {\n  return newPartitionedJob(\n      AppConstants.STATEMENT_JOB_NAME,\n      statementJobPartitioner,\n      transactionReader,\n      statementProcessor,\n      statementWriter);\n}\n```\n\n* Partitioned jobs also require a partitioner bean to define partitioning strategy. 
\nDefine a `Partitioner` bean by extending [`AbstractPartitioner`](src/main/java/com/ksoot/spring/batch/common/AbstractPartitioner.java)\nand overriding the `partitioningList` method to return a `List` of partitioning candidate `String`s.\nRefer to example [`AccountsPartitioner`](https://github.com/officiallysingh/spring-boot-batch-cloud-task/blob/main/src/main/java/com/ksoot/batch/job/AccountsPartitioner.java).\n\u003e [!NOTE]\n\u003e Multiple partitions are created only when the total number of records returned by the `partitioningList` method is greater than the `batch.trigger-partitioning-threshold` property.\nOtherwise, all records are processed in a single partition.\n* Define a Job executor bean by extending [`AbstractJobExecutor`](src/main/java/com/ksoot/spring/batch/common/AbstractJobExecutor.java) to execute the job. \nRefer to example [`StatementJobExecutor`](https://github.com/officiallysingh/spring-boot-batch-cloud-task/blob/main/src/main/java/com/ksoot/batch/job/StatementJobExecutor.java).\n* Define a [`SkipListener`](https://docs.spring.io/spring-batch/docs/current/api/org/springframework/batch/core/SkipListener.html) bean to handle skipped records. 
\nYou may want to save skipped records in a separate collection or table and retry later.\nRefer to example [`StatementJobSkipListener`](https://github.com/officiallysingh/spring-boot-batch-cloud-task/blob/main/src/main/java/com/ksoot/batch/job/StatementJobSkipListener.java).\n\n\u003e [!IMPORTANT]\nAny component needing access to `stepExecutionContext` must be defined as `@StepScope` bean \nand to access `jobParameters` or `jobExecutionContext` must be defined as `@JobScope` bean\n\n## Author\n[**Rajveer Singh**](https://www.linkedin.com/in/rajveer-singh-589b3950/), In case you find any issues or need any support, please email me at raj14.1984@gmail.com\n\n## References\n* Refer to Spring Batch Job implemented as Spring Cloud Task [**`spring-boot-batch-cloud-task`**](https://github.com/officiallysingh/spring-boot-batch-cloud-task).\n* Refer to Spring Batch Job implemented as Spring Rest application [**`spring-boot-batch-web`**](https://github.com/officiallysingh/spring-boot-batch-web).\n* For exception handling refer to [**`spring-boot-problem-handler`**](https://github.com/officiallysingh/spring-boot-problem-handler).\n* For Spring Data MongoDB Auditing refer to [**`spring-boot-mongodb-auditing`**](https://github.com/officiallysingh/spring-boot-mongodb-auditing).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fofficiallysingh%2Fspring-batch-commons","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fofficiallysingh%2Fspring-batch-commons","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fofficiallysingh%2Fspring-batch-commons/lists"}