https://github.com/hwclass/pdf-over-the-wire
A demonstrational spike to simulate uploading pdf files into S3 via AWS Lambda and LocalStack
https://github.com/hwclass/pdf-over-the-wire
Last synced: 11 months ago
JSON representation
A demonstrational spike to simulate uploading pdf files into S3 via AWS Lambda and LocalStack
- Host: GitHub
- URL: https://github.com/hwclass/pdf-over-the-wire
- Owner: hwclass
- Created: 2025-02-11T14:29:07.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-06T11:20:11.000Z (over 1 year ago)
- Last Synced: 2025-06-29T08:37:44.925Z (12 months ago)
- Language: Python
- Size: 180 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# PDF over the wire with Lambda and LocalStack
## Aim of the Proof of Concept
This proof of concept demonstrates how to upload, store, and retrieve PDF files using AWS Lambda and LocalStack. The repository provides a local development environment that simulates AWS services, allowing for testing and development without incurring costs or needing an internet connection.
## Prerequisites
Make sure you have the following installed:
- [Docker](https://docs.docker.com/get-docker/)
- [AWS CLI](https://aws.amazon.com/cli/)
- [LocalStack](https://localstack.cloud/docs/get-started/)
## Commands
### Start LocalStack
```sh
docker run --rm -it -p 4566:4566 localstack/localstack
```
### Create the local S3 bucket (pdf-upload-bucket)
```sh
aws --endpoint-url=http://localhost:4566 s3 mb s3://pdf-upload-bucket
```
### Build and run the API locally
```sh
sam build
sam local start-api --debug
```
### Upload a test file to the local S3 bucket (pdf-upload-bucket)
```sh
echo "Hello LocalStack S3" > testfile.txt
aws --endpoint-url=http://localhost:4566 s3 cp testfile.txt s3://pdf-upload-bucket/
```
### List the files in the local S3 bucket (pdf-upload-bucket)
```sh
aws --endpoint-url=http://localhost:4566 s3 ls s3://pdf-upload-bucket/
```
### Download PDF from the local S3 bucket (pdf-upload-bucket)
```sh
aws s3 cp s3://pdf-upload-bucket/.pdf ./.pdf --endpoint-url=http://localhost:4566
open .pdf
```
### Test uploading the PDF to the local S3 bucket (pdf-upload-bucket)
```sh
curl -X POST "http://127.0.0.1:3000/upload" \
-H "Content-Type: application/pdf" \
--data-binary "@api/Test_Invoice.pdf"
```
### Install diff-pdf
```sh
brew install diff-pdf
```
### Compare the original PDF with the downloaded PDF
```sh
diff-pdf api/Test_Invoice.pdf api/9fa9f0f2-eb3d-44fe-a919-f8bbe3189ba0.pdf --view
```
## Upload from the frontend
### Run the UI
```sh
npm run dev # go to http://localhost:5176 & upload a PDF file & click on Upload PDF
```
### Compare the original PDF with the downloaded PDF
```sh
diff-pdf api/Test_Invoice.pdf api/9fa9f0f2-eb3d-44fe-a919-f8bbe3189ba0.pdf --view
```
## For PDF/A-3 Convert Flow
* Compile and run the lambda function locally (under /convert directory)
* Upload a PDF file to the local S3 bucket (pdf-upload-bucket)
* Run the command for converting the PDF to PDF/A-3 (Includes output indent, mark info and adding metadata - embedding fonts are excluded now due to the GhostScript complexity)
```sh
curl -X POST "http://127.0.0.1:3000/convert" \
-H "Content-Type: application/json" \
-d '{"file_key": ".pdf"}'
```
* Download the PDF file from the local S3 bucket (pdf-upload-bucket - it should be named as converted-.pdf)
```sh
aws --endpoint-url=http://localhost:4566 s3 cp s3://pdf-upload-bucket/converted-.pdf ./converted-.pdf
```
* Compare the original PDF with the downloaded PDF (Todo)
## Notes
* You can find the profile files within the /api/convert directory (sRGB_v4_ICC_preference.icc)