https://github.com/patternhelloworld/persistence-excel-bridge
Memory-efficient millions of rows' transfer between Excel and a database using Apache POI, Spring Events, and Async Threads
- Host: GitHub
- URL: https://github.com/patternhelloworld/persistence-excel-bridge
- Owner: patternhelloworld
- Created: 2024-05-28T14:41:23.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-06-03T06:20:40.000Z (over 1 year ago)
- Last Synced: 2024-10-15T04:02:30.238Z (about 1 year ago)
- Topics: apache-poi, excel-export, excel-import, java-17, jpa, microsoft-excel, querydsl, spring-boot
- Language: Java
- Homepage: https://mvnrepository.com/artifact/io.github.patternknife.pxb/persistence-excel-bridge
- Size: 1.1 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Persistence-Excel-Bridge
> Memory-efficient mass data transfer between Excel and a database using Apache POI, Spring Events, and async threads
- Create an Excel file with millions of rows of data from the database with minimal impact on heap memory
## Table of Contents
- [Features](#Features)
- [Requirements](#Requirements)
- [Quick Settings](#Quick-Settings)
- [Quick Guide on APIs](#Quick-Guide-on-APIs)
- [Structure (TO DO)](#Structure)
---
## Features
- When transferring data from the database to Excel, fetch it page by page.
- Calculate the total data size, then create a job queue using Spring Events.
- Set a max ID value and ignore any data generated after that point, so that new rows do not disrupt the pagination process.
- Use idle threads to perform asynchronous, chunked data transfer between Excel and the database.
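The first three features can be sketched in plain Java. This is only an illustration of the idea, not the library's code: `PageJobPlanner`, `PageJob`, and `plan` are hypothetical names, and the real library queues its jobs through Spring Events rather than returning a list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of the library): plans fixed-size page jobs up front. */
final class PageJobPlanner {

    /** One queued job: read `limit` rows starting at `offset`, considering only ids <= maxId. */
    record PageJob(int pageIndex, long offset, int limit, long maxId) {}

    /**
     * Splits totalRows into pages of at most pageSize rows.
     * maxId is captured once, before planning, and copied into every job so that
     * rows inserted later (with larger ids) cannot shift the page boundaries.
     */
    static List<PageJob> plan(long totalRows, int pageSize, long maxId) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<PageJob> jobs = new ArrayList<>();
        int pageIndex = 0;
        for (long offset = 0; offset < totalRows; offset += pageSize) {
            // The last page may be shorter than pageSize.
            int limit = (int) Math.min(pageSize, totalRows - offset);
            jobs.add(new PageJob(pageIndex++, offset, limit, maxId));
        }
        return jobs;
    }

    public static void main(String[] args) {
        // Example values only: 2.5 million rows, pages of one million rows.
        plan(2_500_000L, 1_000_000, 2_500_000L).forEach(System.out::println);
    }
}
```

Each job would then run a query along the lines of "rows with id <= maxId, ordered by id, at this offset and limit", so the total row count and the page boundaries stay stable while the export is running.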
## Requirements
| Category | Dependencies |
|-------------------|--------------------------------------------------|
| Backend-Language | Java 17 |
| Backend-Framework | Spring-Boot 3.1.2 |
| Libraries | JPA & QueryDSL are necessary... More in pom.xml. |
- Removing JPA and using JdbcTemplate directly was considered in order to minimize library usage; JPA is kept for now because it illustrates the structure of the library from a domain perspective.
## Quick Settings
### Central Repository or Build from Source
1) Central Repository
````xml
<dependency>
    <groupId>io.github.patternknife.pxb</groupId>
    <artifactId>persistence-excel-bridge</artifactId>
    <version>0.0.1</version>
</dependency>
````
2) Build from source
- Build 'persistence-excel-bridge' (the library) by running the following at the project root (./):
    - Windows: ``./mvnw clean install`` or ``.\win-mvn-build.bat``
    - Linux: after installing Maven, ``mvn clean install``
- Build 'persistence-excel-bridge-docs' (the sample project for testing the library) by running the same commands in './docs'.
### DB Schema
- Running ``./docs/mysql/schema.sql`` creates the tables for both **persistence-excel-bridge** and **persistence-excel-bridge-docs**.
- The library itself (**persistence-excel-bridge**) requires only three tables, which are created by the following scripts:
    - ``./src/main/java/com/patternknife/pxb/domain/exceldbreadtask/schema/excel-db-read-task-schema.sql``
    - ``./src/main/java/com/patternknife/pxb/domain/exceldbwritetask/schema/excel-db-write-task-schema.sql``
    - ``./src/main/java/com/patternknife/pxb/domain/excelgrouptask/schema/excel-group-task-schema.sql``
### Properties
- Add ``io.github.patternknife.pxb.dir.root.excel-group-tasks=files/private/excel-group-tasks`` to your App's ``application.properties``.
### Things to Set in Your App
- Let me explain this with the sample project, ``persistence-excel-bridge-docs``.
- Set the highlighted sources (or equivalent items applicable to your situation) in your App.
#### 1) SpringBootApplication : ``./docs/src/main/java/com/patternknife/pxbsample/PersistenceExcelBridgeDocsApplication.java``
#### 2) EnableJpaRepositories : ``./docs/src/main/java/com/patternknife/pxbsample/config/database/CommonDataSourceConfiguration.java``
- The library's ``com.patternknife.pxb.domain`` package must be recognized (scanned) by your App.
#### 3) Spring Event & Thread Pools
- Copy and paste the folder; you can change values such as the thread pool size if you like.
#### 4) Implementations of the Library
- 4-1) ``api`` : Just copy and paste the folder, then customize the settings such as the REST API addresses and payload.
- 4-2) ``cache`` : Write .java files by referring to the sample.
- 4-3) ``factory`` : Write .java files by referring to the sample.
- 4-4) ``processor`` : Write .java files by referring to the sample.
#### 5) Logging
- Add the following to your ``logback-spring.xml``
```xml
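<!-- Element names are inferred from the values; the enclosing <appender> and <rollingPolicy> elements are not shown. -->
<file>${LOGS_ABSOLUTE_PATH}/pxb-async-log-config/current.log</file>
<pattern>%d %p %C{1} [%t] %m%n</pattern>
<fileNamePattern>${LOGS_ABSOLUTE_PATH}/pxb-async-log-config/%d{yyyy-MM}/past-%d{yyyy-MM-dd_HH}.%i.log</fileNamePattern>
<maxHistory>50</maxHistory>
<maxFileSize>10MB</maxFileSize>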
```
## Quick Guide on APIs
- The APIs you need to use are in the classes whose names end with "Service":
    - ExcelGroupService (Common, Group, 8 APIs)
    - ExcelDBWriteService (Excel->DB, One, 3 APIs)
    - ExcelDBReadService (DB->Excel, One, 3 APIs)
    - ExcelGroupTaskFileService (Common, File IO, 4 APIs)
- A "Group" consists of the range of rows to be inserted into the DB or written to an Excel file.
    - More information on the "rows" can be found in the entities ``ExcelGroupTask``, ``ExcelDBWriteTask``, ``ExcelDBReadTask``, ``ExcelGroupTaskFileService``.
- The processing of a "Group" should be asynchronous, leave full logs, and be memory-efficient.
- You can create your own front-end using the provided APIs.
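To illustrate what asynchronous, fully logged processing of a "Group" might look like, here is a minimal sketch using only the JDK. It is not the library's implementation: `GroupRunner`, `Status`, and `run` are hypothetical names, and the library itself relies on Spring Events and its own thread pool configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch (not the library's API): runs the chunk tasks of one group on a small pool. */
final class GroupRunner {

    enum Status { DONE, FAILED }

    /** Runs every chunk task on a fixed-size pool and reports one status per task, in order. */
    static List<Status> run(List<Callable<Void>> chunkTasks, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Status> statuses = new ArrayList<>();
            // invokeAll returns the futures in task order once all tasks have completed.
            for (Future<Void> future : pool.invokeAll(chunkTasks)) {
                try {
                    future.get();
                    statuses.add(Status.DONE);
                } catch (ExecutionException e) {
                    // Log the failure and keep going so one bad chunk does not hide the others.
                    System.err.println("chunk failed: " + e.getCause());
                    statuses.add(Status.FAILED);
                }
            }
            return statuses;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for chunks", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Callable<Void> chunk = () -> null; // stands in for "transfer one range of rows"
        System.out.println(run(List.of(chunk, chunk), 2));
    }
}
```

Because each task covers only one range of rows, only the chunks currently being processed need to be held in memory, and a failed chunk can be identified by its own recorded status.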