Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/aeksco/aws-pdf-textract-pipeline
Data pipeline for crawling PDFs from the web and transforming their contents into structured data using AWS Textract. Built with AWS CDK and TypeScript.
aws aws-cdk aws-textract cdk cloudformation data-pipeline dynamodb jest lambda pdf puppeteer s3 serverless sns textract typescript webscraping
Last synced: 14 May 2024