{"id":42307971,"url":"https://github.com/Spidy20/Indian-Document-Extraction-AWS","last_synced_at":"2026-03-03T01:01:25.805Z","repository":{"id":253118948,"uuid":"842526262","full_name":"Spidy20/Indian-Document-Extraction-AWS","owner":"Spidy20","description":"This solution is a proof of concept for an Indian Legal Document Extractor utilizing AWS Lambda, Bedrock, SQS, and S3.","archived":false,"fork":false,"pushed_at":"2024-08-14T14:32:26.000Z","size":641,"stargazers_count":8,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-06T09:46:35.926Z","etag":null,"topics":["bedrock","document-scanner","lambda","s3","sqs"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Spidy20.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-14T14:28:51.000Z","updated_at":"2025-01-13T11:23:02.000Z","dependencies_parsed_at":"2024-08-14T16:05:10.799Z","dependency_job_id":"74e1d5c9-0916-48d1-9383-73dd662454a4","html_url":"https://github.com/Spidy20/Indian-Document-Extraction-AWS","commit_stats":null,"previous_names":["spidy20/indian-document-extraction-aws"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Spidy20/Indian-Document-Extraction-AWS","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Spidy20%2FIndian-Document-Extraction-AWS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Spidy20%2FIndian-Document-Extraction-AWS/tags","releases_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/Spidy20%2FIndian-Document-Extraction-AWS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Spidy20%2FIndian-Document-Extraction-AWS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Spidy20","download_url":"https://codeload.github.com/Spidy20/Indian-Document-Extraction-AWS/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Spidy20%2FIndian-Document-Extraction-AWS/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30028228,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-03T00:31:48.536Z","status":"ssl_error","status_checked_at":"2026-03-03T00:30:56.176Z","response_time":60,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bedrock","document-scanner","lambda","s3","sqs"],"created_at":"2026-01-27T11:12:46.287Z","updated_at":"2026-03-03T01:01:25.685Z","avatar_url":"https://github.com/Spidy20.png","language":"Python","funding_links":["https://www.buymeacoffee.com/spidy20","https://www.paypal.me/spidy1820"],"categories":["Python"],"sub_categories":[],"readme":"# Indian🇮🇳 Legal Document Extractor using AWS Bedrock, Lambda, and SQS\n\n### [Watch this tutorial►](https://youtu.be/YWmnD_QcZQU)\n\u003cimg 
src=\"https://github.com/Spidy20/Indian-Document-Extraction-AWS/blob/master/Youtube_thumb.jpg\"\u003e\n\n- In this tutorial, we'll build a proof-of-concept Indian legal document extractor using AWS Lambda, Bedrock, SQS, and S3.\n\n### Implementation Architecture\n\u003cimg src=\"https://github.com/Spidy20/Indian-Document-Extraction-AWS/blob/master/Indian-Doocument-Extractor-POC-V.10.png\"\u003e\n\n### Used Services\n- **AWS Lambda**: Runs the backend of the Document Extractor using the Boto3 SDK.\n- **AWS SQS**: Provides scalability by queuing incoming documents, so that multiple documents can be processed.\n- **Bedrock Claude Sonnet Model**: Used to extract JSON data from legal documents.\n- **S3**: Stores images of legal documents and triggers SQS with a Lambda function upon upload.\n\n### Implementation Setup\n- Creating an S3 Bucket.\n- Creating a Lambda function and Lambda handler.\n- Setting up the SQS Queue and Policy.\n- Configuring the S3 event to trigger SQS with AWS Lambda.\n- Testing the automation workflow.\n- Testing with Indian legal documents.\n\n\n\n### Give a Star⭐ to this repository, and fork it to support me.\n\n### [Buy me a Coffee☕](https://www.buymeacoffee.com/spidy20)\n### [Donate to me on PayPal (it will inspire me to do more projects)](https://www.paypal.me/spidy1820)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSpidy20%2FIndian-Document-Extraction-AWS","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FSpidy20%2FIndian-Document-Extraction-AWS","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSpidy20%2FIndian-Document-Extraction-AWS/lists"}