{"id":18457963,"url":"https://github.com/alokkusingh/spring-batch-pdf-parser","last_synced_at":"2025-04-08T05:33:49.721Z","repository":{"id":39672315,"uuid":"262099451","full_name":"alokkusingh/spring-batch-pdf-parser","owner":"alokkusingh","description":"Spring Batch PDF Parser - Bank Statements","archived":false,"fork":false,"pushed_at":"2023-06-26T20:30:35.000Z","size":505,"stargazers_count":3,"open_issues_count":8,"forks_count":5,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-23T06:51:08.665Z","etag":null,"topics":["bank-statement-documents","csv-parser","h2-database","java-8","jpa","logback","lombok","openpdf","pdfpassword","pdfreader","slf4j","spring-boot","spring-devtools","springbatch"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alokkusingh.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-05-07T16:16:01.000Z","updated_at":"2024-12-06T22:49:54.000Z","dependencies_parsed_at":"2023-01-30T17:45:39.375Z","dependency_job_id":null,"html_url":"https://github.com/alokkusingh/spring-batch-pdf-parser","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alokkusingh%2Fspring-batch-pdf-parser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alokkusingh%2Fspring-batch-pdf-parser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alokkusingh%2Fspring-batch-pdf-parser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alokkusingh%2Fspring-
batch-pdf-parser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alokkusingh","download_url":"https://codeload.github.com/alokkusingh/spring-batch-pdf-parser/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247785917,"owners_count":20995641,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bank-statement-documents","csv-parser","h2-database","java-8","jpa","logback","lombok","openpdf","pdfpassword","pdfreader","slf4j","spring-boot","spring-devtools","springbatch"],"created_at":"2024-11-06T08:16:16.541Z","updated_at":"2025-04-08T05:33:44.671Z","avatar_url":"https://github.com/alokkusingh.png","language":"Java","readme":"# Bank Account Statement Reader\nSpring Boot Batch Processor\n\n## Functionality\n- Reads downloaded bank PDF statements (with or without password protection) or CSV files exported from the bank's site\n- Parses them (based on pluggable parsing logic)\n- Processes the records (transaction categorization, amount extraction)\n- Writes to H2 DB\n- Finally exports to CSV format, ordered by transaction date, for import into Excel or Google Sheets\n\n### Supported Bank Statements\n1. Citi Bank Saving Account \n2. Kotak Mahindra Bank Saving Account \n\n### How to run\n````\njava -jar target/spring-batch-pdf-parser-0.0.2-SNAPSHOT.jar --file.path.base.dir=/home/alok/data/git/BankStatements\n````\n\n### Enhancements - 18 Feb 2022\n#### Current Status - as of 18 Feb 2022\n1. Parses all files on startup\n2. File polling?\n3. Manual pulling of statements from private GitHub repo\n4. 
Manual pushing of the generated report to private GitHub repo\n#### Phases\n1. Stop automatic file polling - if any\n2. Expose API to upload file to be parsed\n3. Expose API to download CSV report\n4. Implement ReactJS UI \n   \n   4.1 To upload the bank statement\n   \n   4.2 To download CSV report\n   \n   4.3 To see the detailed reports\n\n#### Build\n1. Maven Package\n   ````\n   mvn clean package\n   ````\n2. Docker Build, Push \u0026 Run\n   ````\n   docker build -t alokkusingh/statement-parser:latest -t alokkusingh/statement-parser:2.2.0 --build-arg JAR_FILE=target/spring-batch-pdf-parser-1.0.3-SNAPSHOT.jar .\n   ````\n   ````\n   docker push alokkusingh/statement-parser:latest\n   ````\n   ````\n   docker push alokkusingh/statement-parser:2.2.0\n   ````\n   ````\n   docker run -d -v /home/alok/data/git/BankStatements:/Users/aloksingh/BankStatements:rw,Z -p 8081:8081 --rm --name statement-parser alokkusingh/statement-parser\n   ````\n   \n### Manual commands\n````\ndocker run -it --entrypoint /bin/bash -v /home/alok/data/git/BankStatements:/Users/aloksingh/BankStatements:rw,Z -p 8081:8081 --rm --name statement-parser alokkusingh/statement-parser\n````\n````\njava -Djava.security.egd=file:/dev/urandom -Dspring.profiles.active=prod -Dspring.datasource.url=jdbc:mysql://192.168.0.200:32306/home-stack -Dspring.datasource.hikari.minimum-idle=5 -Dspring.datasource.hikari.connection-timeout=20000 -Dspring.datasource.hikari.maximum-pool-size=10 -Dspring.datasource.hikari.idle-timeout=10000 -Dspring.datasource.hikari.max-lifetime=1000 -Dspring.datasource.hikari.auto-commit=true -jar /opt/app.jar\n````\n````\ndocker run -v /home/alok/data/git/BankStatements:/Users/aloksingh/BankStatements:rw,Z -p 8081:8081 --rm --name statement-parser alokkusingh/statement-parser --java.security.egd=file:/dev/urandom --spring.profiles.active=prod --spring.datasource.url=jdbc:mysql://192.168.0.200:32306/home-stack --spring.datasource.hikari.minimum-idle=5 
--spring.datasource.hikari.connection-timeout=20000 --spring.datasource.hikari.maximum-pool-size=10 --spring.datasource.hikari.idle-timeout=10000 --spring.datasource.hikari.max-lifetime=1000 --spring.datasource.hikari.auto-commit=true\n````\n\n## MQTT Commands\n### Root Certificate - for client signer and domain signer\n````\nopenssl genrsa -des3 -out mqtt-signer-ca.key 2048\n````\n````\nopenssl req -x509 -new -nodes -key mqtt-signer-ca.key -sha256 -days 365 -out mqtt-signer-ca.crt -subj /C=IN/ST=KA/L=Bengaluru/O=Home/CN=alok-signer\n````\n#### Client Cert - alok\n````\nopenssl genrsa -out mqtt.client.alok.key 2048\n````\n````\nopenssl req -new -sha256 -key mqtt.client.alok.key -subj /C=IN/ST=KA/L=S=Bengaluru/O=Home/CN=alok -out mqtt.client.alok.csr\n````\n````\nopenssl x509 -req -in mqtt.client.alok.csr -CA mqtt-signer-ca.crt -CAkey mqtt-signer-ca.key -CAcreateserial -out mqtt.client.alok.crt -days 365 -sha256\n````\n\n#### Server Domain Cert - localhost\n````\nopenssl genrsa -out server.key 2048\n````\n````\nopenssl req -new -sha256 -out server.csr -key server.key -subj /C=IN/ST=KA/L=S=Bengaluru/O=Home/CN=localhost\n````\n````\nopenssl x509 -req -in server.csr -CA mqtt-signer-ca.crt -CAkey mqtt-signer-ca.key -CAcreateserial -out server.crt -days 360 -sha256\n````\n\n#### Add client alok cert to a PKCS 12 keystore - it is then imported into JKS using KeyStore Explorer\n````\nopenssl pkcs12 -export -out mqtt.client.alok.p12 -name \"alok\" -inkey mqtt.client.alok.key -in mqtt.client.alok.crt\n````\n\n#### Start Mosquitto Broker\n````\n/opt/homebrew/opt/mosquitto/sbin/mosquitto -c /opt/homebrew/etc/mosquitto/mosquitto.conf\n````\n\n#### Publish using alok cert\n````\nmosquitto_pub --cafile mqtt-signer-ca.crt --cert mqtt.client.alok.crt --key mqtt.client.alok.key -d -h localhost -p 8883 -t test -m \"Hello\" --tls-version tlsv1.2 --debug\n````\n\n#### Client Cert - rachna\n````\nopenssl genrsa -out mqtt.client.rachna.key 2048\n````\n````\nopenssl req -new -sha256 -key 
mqtt.client.rachna.key -subj /C=IN/ST=KA/L=S=Bengaluru/O=Home/CN=rachna -out mqtt.client.rachna.csr\n````\n````\nopenssl x509 -req -in mqtt.client.rachna.csr -CA mqtt-signer-ca.crt -CAkey mqtt-signer-ca.key -CAcreateserial -out mqtt.client.rachna.crt -days 365 -sha256\n````\n\n#### Publish/Subscribe using rachna cert\n````\nmosquitto_sub --cafile mqtt-signer-ca.crt --cert mqtt.client.rachna.crt --key mqtt.client.rachna.key -d -h localhost -p 8883 -t home/stack/stmt-res --tls-version tlsv1.2 --debug\n````\n````\nmosquitto_pub --cafile mqtt-signer-ca.crt --cert mqtt.client.rachna.crt --key mqtt.client.rachna.key -d -h localhost -p 8883 -t home/stack/stmt-req -m \"Hello\" --tls-version tlsv1.2 --debug\n````\n\n#### Request Topic\n````\nhome/stack/stmt-req\n````\n#### Sample Request Payload\n````\n{\n\"correlationId\": \"sdcsd1234\",\n\"httpMethod\": \"GET\",\n\"uri\": \"/fin/expense?yearMonth=current_month\",\n\"body\": \"\"\n}\n````","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falokkusingh%2Fspring-batch-pdf-parser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falokkusingh%2Fspring-batch-pdf-parser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falokkusingh%2Fspring-batch-pdf-parser/lists"}