{"id":20417304,"url":"https://github.com/spidy20/indian-document-extraction-aws","last_synced_at":"2025-09-24T05:32:08.922Z","repository":{"id":253118948,"uuid":"842526262","full_name":"Spidy20/Indian-Document-Extraction-AWS","owner":"Spidy20","description":"This solution is a proof of concept for an Indian Legal Document Extractor utilizing AWS Lambda, Bedrock, SQS, and S3.","archived":false,"fork":false,"pushed_at":"2024-08-14T14:32:26.000Z","size":641,"stargazers_count":7,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2024-11-15T06:29:52.601Z","etag":null,"topics":["bedrock","document-scanner","lambda","s3","sqs"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Spidy20.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-14T14:28:51.000Z","updated_at":"2024-10-29T05:42:40.000Z","dependencies_parsed_at":"2024-08-14T16:05:10.799Z","dependency_job_id":"74e1d5c9-0916-48d1-9383-73dd662454a4","html_url":"https://github.com/Spidy20/Indian-Document-Extraction-AWS","commit_stats":null,"previous_names":["spidy20/indian-document-extraction-aws"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Spidy20%2FIndian-Document-Extraction-AWS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Spidy20%2FIndian-Document-Extraction-AWS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Spidy20%2FIndian-Doc
ument-Extraction-AWS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Spidy20%2FIndian-Document-Extraction-AWS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Spidy20","download_url":"https://codeload.github.com/Spidy20/Indian-Document-Extraction-AWS/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":234045723,"owners_count":18770994,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bedrock","document-scanner","lambda","s3","sqs"],"created_at":"2024-11-15T06:25:42.711Z","updated_at":"2025-09-24T05:32:08.536Z","avatar_url":"https://github.com/Spidy20.png","language":"Python","funding_links":["https://www.buymeacoffee.com/spidy20","https://www.paypal.me/spidy1820"],"categories":[],"sub_categories":[],"readme":"# Indian🇮🇳 Legal Document Extractor using AWS Bedrock, Lambda, and SQS\n\n### [Watch this tutorial►](https://youtu.be/YWmnD_QcZQU)\n\u003cimg src=\"https://github.com/Spidy20/Indian-Document-Extraction-AWS/blob/master/Youtube_thumb.jpg\"\u003e\n\n- In this tutorial, we'll build a proof of concept for an Indian Legal Document Extractor using AWS Lambda, Bedrock, SQS, and S3.\n\n### Implementation Architecture\n\u003cimg src=\"https://github.com/Spidy20/Indian-Document-Extraction-AWS/blob/master/Indian-Doocument-Extractor-POC-V.10.png\"\u003e\n\n### Used Services\n- **AWS Lambda**: Runs the backend of the Document Extractor using the Boto3 SDK.\n- **AWS SQS**: Manages the 
scalability of the solution by maintaining a queue, so multiple documents can be processed.\n- **Bedrock Claude Sonnet Model**: Used to extract JSON data from legal documents.\n- **S3**: Stores images of legal documents and triggers SQS with a Lambda function upon upload.\n\n### Implementation Setup\n- Creating an S3 bucket.\n- Creating a Lambda function and Lambda handler.\n- Setting up the SQS queue and policy.\n- Configuring the S3 event to trigger SQS with AWS Lambda.\n- Testing the automation workflow.\n- Testing with Indian legal documents.\n\n\n\n### Give a Star⭐ to this repository, and fork it to support me. \n\n### [Buy me a Coffee☕](https://www.buymeacoffee.com/spidy20)\n### [Donate to me on PayPal (it will inspire me to do more projects)](https://www.paypal.me/spidy1820)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspidy20%2Findian-document-extraction-aws","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fspidy20%2Findian-document-extraction-aws","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspidy20%2Findian-document-extraction-aws/lists"}