{"id":51320722,"url":"https://github.com/laststonedjs/smart_document_system","last_synced_at":"2026-07-01T13:02:23.414Z","repository":{"id":355717141,"uuid":"1229273873","full_name":"laststonedjs/smart_document_system","owner":"laststonedjs","description":"This project is a full-stack document processing system that ingests business documents (Invoices \u0026 Purchase Orders), extracts structured data, validates it, and provides an interactive interface for review and correction.","archived":false,"fork":false,"pushed_at":"2026-05-20T07:22:22.000Z","size":3224,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-05-20T11:11:02.703Z","etag":null,"topics":["axios","csv-parser","express","mongodb","mongodb-atlas","multer","nodejs","pdf-parser","react","react-router","tesseractjs"],"latest_commit_sha":null,"homepage":"https://smart-document-system-zeta.vercel.app","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/laststonedjs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-04T21:59:14.000Z","updated_at":"2026-05-14T12:14:45.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/laststonedjs/smart_document_system","commit_stats":null,"previous_names":["laststonedjs/smart_document_system"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/laststonedjs/smart_document_system","re
pository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/laststonedjs%2Fsmart_document_system","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/laststonedjs%2Fsmart_document_system/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/laststonedjs%2Fsmart_document_system/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/laststonedjs%2Fsmart_document_system/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/laststonedjs","download_url":"https://codeload.github.com/laststonedjs/smart_document_system/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/laststonedjs%2Fsmart_document_system/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":35007278,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-07-01T02:00:05.325Z","response_time":130,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["axios","csv-parser","express","mongodb","mongodb-atlas","multer","nodejs","pdf-parser","react","react-router","tesseractjs"],"created_at":"2026-07-01T13:02:22.638Z","updated_at":"2026-07-01T13:02:23.409Z","avatar_url":"https://github.com/laststonedjs.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Smart Document Processing 
System  \n\n---\n\n## Overview\n\nThis project is a full-stack document processing system that ingests business documents (Invoices \u0026 Purchase Orders), extracts structured data, validates it, and provides an interactive interface for review and correction.\n\nThe system is designed to handle **real-world imperfect data**, including OCR-based inputs.\n\n## Input Data\n\n- PDF documents (clean and semi-structured)\n- Images (including messy / OCR-like)\n- CSV files (structured)\n- TXT files (semi-structured)\n\n---\n\n## Tech Stack\n\n### Frontend\n- React\n- Axios\n\n### Backend\n- Node.js\n- Express\n\n### Database\n- MongoDB (Mongoose)\n\n### OCR\n- Tesseract.js\n\n---\n\n## Setup Instructions\n\n### 1. Clone the repository\n\n```bash\ngit clone https://github.com/laststonedjs/smart_document_system.git\n```\n\n### 2. Backend Setup\n\n```bash\ncd server\nnpm install\n```\n\nCreate a `.env` file:\n\n```bash\nMONGO_URI=mongodb_connection_string\nPORT=5000\n```\n\nRun the server:\n\n```bash\nnpm run dev\n```\n\n### 3. Frontend Setup\n\n```bash\ncd client/smart-document\nnpm install\nnpm run dev\n```\n\nThe frontend runs on http://localhost:5173 and the backend runs on http://localhost:5000.\n\n### API Endpoints\n\nUpload:\n\n```bash\nPOST /api/upload/pdf\nPOST /api/upload/image\nPOST /api/upload/txt\nPOST /api/upload/csv\n```\n\nDocuments:\n\n```bash\nGET /api/documents\nPOST /api/documents\nPUT /api/documents/:id\n```\n\n### Example Workflow\n- Upload document\n- System extracts raw text\n- Data is structured and validated\n- Issues are highlighted\n- User edits incorrect fields\n- Document is saved and marked as validated\n\n### Notes\n- Some test documents contain intentional errors\n- The system is designed to detect and report inconsistencies\n\n### Future Improvements\n- Due date parsing\n- Authentication system\n- Role-based review workflow\n- Better OCR accuracy tuning\n- Export (PDF, CSV, Excel)\n\n### AI Usage\n\nAI tools (ChatGPT, Gemini AI) were used for:\n\n- Debugging\n- Googling things\n- Code optimization\n\nAll implementation details are 
fully understood.\n\nLive App: (https://smart-document-system-zeta.vercel.app/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flaststonedjs%2Fsmart_document_system","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flaststonedjs%2Fsmart_document_system","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flaststonedjs%2Fsmart_document_system/lists"}