{"id":29363033,"url":"https://github.com/patternhelloworld/persistence-excel-bridge","last_synced_at":"2025-07-09T09:22:22.547Z","repository":{"id":241922390,"uuid":"807134107","full_name":"patternhelloworld/persistence-excel-bridge","owner":"patternhelloworld","description":"Memory-efficient millions of rows' transfer between Excel and a database using Apache POI, Spring Events, and Async Threads ","archived":false,"fork":false,"pushed_at":"2024-06-03T06:20:40.000Z","size":1157,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-10-15T04:02:30.238Z","etag":null,"topics":["apache-poi","excel-export","excel-import","java-17","jpa","microsoft-excel","querydsl","spring-boot"],"latest_commit_sha":null,"homepage":"https://mvnrepository.com/artifact/io.github.patternknife.pxb/persistence-excel-bridge","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/patternhelloworld.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-28T14:41:23.000Z","updated_at":"2024-09-01T14:55:30.000Z","dependencies_parsed_at":"2024-10-15T04:02:32.148Z","dependency_job_id":"0f935756-2ce2-4162-b6fe-3c7a19286275","html_url":"https://github.com/patternhelloworld/persistence-excel-bridge","commit_stats":null,"previous_names":["andrew-kang-g/persistence-excel-bridge","patternknife/persistence-excel-bridge","patternhelloworld/persistence-excel-bridge"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/patternhelloworld/persistence-excel-bridge","repositor
y_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patternhelloworld%2Fpersistence-excel-bridge","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patternhelloworld%2Fpersistence-excel-bridge/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patternhelloworld%2Fpersistence-excel-bridge/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patternhelloworld%2Fpersistence-excel-bridge/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/patternhelloworld","download_url":"https://codeload.github.com/patternhelloworld/persistence-excel-bridge/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patternhelloworld%2Fpersistence-excel-bridge/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264428915,"owners_count":23606722,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-poi","excel-export","excel-import","java-17","jpa","microsoft-excel","querydsl","spring-boot"],"created_at":"2025-07-09T09:22:10.432Z","updated_at":"2025-07-09T09:22:22.534Z","avatar_url":"https://github.com/patternhelloworld.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Persistence-Excel-Bridge\n\n\u003e Memory-efficient mass data transfer between Excel and database using Apache POI, Spring Event, Async Threads\n- Create an Excel file with millions of rows of data from the database with minimal impact on heap memory\n\n## Table of Contents\n- 
[Features](#Features)\n- [Requirements](#Requirements)\n- [Quick Settings](#Quick-Settings)\n- [Quick Guide on APIs](#Quick-Guide-on-APIs)\n- [Structure (TO DO)](#Structure)\n---\n\n## Features\n\n- Fetch data using pagination when transferring data from the database to Excel\n- Calculate the total data size and then create a job queue using Spring Events\n  - Set the Max ID value to ignore any data generated after this point to avoid disrupting the pagination process.\n- Utilize idle threads to perform asynchronous chunked data transfer between Excel and the database.\n\n## Requirements\n\n| Category          | Dependencies                                     |\n|-------------------|--------------------------------------------------|\n| Backend-Language  | Java 17                                          |\n| Backend-Framework | Spring-Boot 3.1.2                                |\n| Libraries         | JPA \u0026 QueryDSL are necessary... More in pom.xml. |\n\n- Considered removing JPA and using JdbcTemplate directly to minimize library usage; however, keeping JPA for now to illustrate the structure of the library from a domain perspective.\n\n## Quick Settings\n\n### Central Repository OR Build source codes\n\n1) Central Repository\n````xml\n\u003cdependency\u003e\n    \u003cgroupId\u003eio.github.patternknife.pxb\u003c/groupId\u003e\n    \u003cartifactId\u003epersistence-excel-bridge\u003c/artifactId\u003e\n    \u003cversion\u003e0.0.1\u003c/version\u003e\n\u003c/dependency\u003e\n````\n\n2) Build source codes\n- Build the 'persistence-excel-bridge' (library) by running the following at the project root (./)\n  - WIN : ``./mvnw clean install`` or ``.\\win-mvn-build.bat``\n  - Linux : After installing Maven, ``mvn clean install``\n- Build the 'persistence-excel-bridge-docs' (Sample project for testing the library) by running the same commands at './docs'\n  - Same as above.\n\n### DB Schema\n- Running ``./docs/mysql/schema.sql`` covers both **persistence-excel-bridge** and 
**persistence-excel-bridge-docs** \n- Running ``./src/main/java/com/patternknife/pxb/domain/exceldbreadtask/schema/excel-db-read-task-schema.sql``, ``./src/main/java/com/patternknife/pxb/domain/exceldbwritetask/schema/excel-db-write-task-schema.sql``, and ``./src/main/java/com/patternknife/pxb/domain/excelgrouptask/schema/excel-group-task-schema.sql`` covers only **persistence-excel-bridge**, which means the library itself requires only these three tables.\n\n### Properties\n- Add ``io.github.patternknife.pxb.dir.root.excel-group-tasks=files/private/excel-group-tasks`` to your App's ``application.properties``.\n\n### Things to Set in Your App\n- The steps below use the sample project, ``persistence-excel-bridge-docs``, as a reference.\n- Set the highlighted sources (or equivalent items applicable to your situation) in your App.\n\n\n#### 1) SpringBootApplication : ``./docs/src/main/java/com/patternknife/pxbsample/PersistenceExcelBridgeDocsApplication.java``\n\n- ![spring-boot-application.png](./docs/references/readme/spring-boot-application.png)\n\n#### 2) EnableJpaRepositories : ``./docs/src/main/java/com/patternknife/pxbsample/config/database/CommonDataSourceConfiguration.java``\n\n- ![jpa-set.png](./docs/references/readme/jpa-set.png)\n\n- ``com.patternknife.pms.domain`` should be recognized in your App.\n\n#### 3) Spring Event \u0026 Thread Pools\n\n- ![folder-tree-queue.png](./docs/references/readme/folder-tree-queue.png)\n\n- Copy and paste the folder; you can change values such as the thread pool size if you'd like.\n\n#### 4) Implementations of the Library\n\n- ![folder-tree.png](./docs/references/readme/folder-tree.png)\n\n- 4-1) ``api`` : Copy and paste the folder, then customize settings such as the REST API addresses and payload.\n- 4-2) ``cache`` : Write .java files by referring to the sample.\n- 4-3) ``factory`` : Write .java files by referring to the sample.\n- 4-4) ``processor`` : Write .java files by referring to the sample.\n\n#### 5) 
Logging\n\n- Add the following to your ``logback-spring.xml``\n\n```xml\n\u003c?xml version=\"1.0\" encoding=\"UTF-8\"?\u003e\n\u003cconfiguration scan=\"true\" scanPeriod=\"30 seconds\" debug=\"true\"\u003e\n\n  \u003c!-- Set common variables here --\u003e\n  \u003cproperty name=\"LOGS_ABSOLUTE_PATH\" value=\"logs\"/\u003e\n  \n  \n  \u003cappender name=\"PxbAsyncLogConfig\"\n            class=\"ch.qos.logback.core.rolling.RollingFileAppender\"\u003e\n      \u003cfile\u003e${LOGS_ABSOLUTE_PATH}/pxb-async-log-config/current.log\u003c/file\u003e\n      \u003cencoder\n              class=\"ch.qos.logback.classic.encoder.PatternLayoutEncoder\"\u003e\n          \u003c!--\u003cpattern\u003e%d{yyyy-MM-dd HH:mm:ss} [%thread] %-5level %logger{36} - %msg%n\u003c/pattern\u003e--\u003e\n          \u003cPattern\u003e%d %p %C{1} [%t] %m%n\u003c/Pattern\u003e\n      \u003c/encoder\u003e\n\n      \u003crollingPolicy\n              class=\"ch.qos.logback.core.rolling.TimeBasedRollingPolicy\"\u003e\n          \u003c!-- Create a log file every hour and manage it in 10MB units. 
--\u003e\n          \u003cfileNamePattern\u003e${LOGS_ABSOLUTE_PATH}/pxb-async-log-config/%d{yyyy-MM}/past-%d{yyyy-MM-dd_HH}.%i.log\n          \u003c/fileNamePattern\u003e\n          \u003cmaxHistory\u003e50\u003c/maxHistory\u003e\n          \u003ctimeBasedFileNamingAndTriggeringPolicy\n                  class=\"ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP\"\u003e\n              \u003cmaxFileSize\u003e10MB\u003c/maxFileSize\u003e\n          \u003c/timeBasedFileNamingAndTriggeringPolicy\u003e\n      \u003c/rollingPolicy\u003e\n  \u003c/appender\u003e\n\n\n  \u003clogger name=\"io.github.patternknife.pxb.config.logger.module\" level=\"TRACE\"\u003e\n  \u003cappender-ref ref=\"PxbAsyncLogConfig\" /\u003e\n  \u003c!--\u003cappender-ref ref=\"Console\" /\u003e--\u003e\n  \u003c/logger\u003e\n  \n  \n  \u003croot level=\"info\"\u003e\n    \u003cappender-ref ref=\"Console\"/\u003e\n  \u003c/root\u003e\n\n\n\u003c/configuration\u003e\n```\n\n## Quick Guide on APIs\n- APIs you need to use are in files ending with \"-Service\".\n - ExcelGroupService (Common, Group, 8 Apis)\n - ExcelDBWriteService (Excel-\u003eDB, One, 3 Apis)\n - ExcelDBReadService (Excel\u003c-DB, One, 3 Apis)\n - ExcelGroupTaskFileService (Common, File IO, 4 Apis)\n- A \"Group\" consists of the range of rows to be inserted into DB or created in an Excel file. 
\n - More information on the \"rows\" can be found in entities ``( ExcelGroupTask, ExcelDBWriteTask, ExcelDBReadTask, ExcelGroupTaskFileService )``\n- The processing of a \"Group\" should be asynchronous, leave full logs, and be memory-efficient.\n- You can create your own front-end using the provided APIs.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpatternhelloworld%2Fpersistence-excel-bridge","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpatternhelloworld%2Fpersistence-excel-bridge","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpatternhelloworld%2Fpersistence-excel-bridge/lists"}