{"id":31045916,"url":"https://github.com/navdevl/chris-cred-reader","last_synced_at":"2025-09-14T17:47:41.322Z","repository":{"id":313460896,"uuid":"1047488833","full_name":"Navdevl/chris-cred-reader","owner":"Navdevl","description":null,"archived":false,"fork":false,"pushed_at":"2025-09-13T07:04:23.000Z","size":60,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-13T09:28:30.899Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Navdevl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-08-30T14:35:50.000Z","updated_at":"2025-09-13T07:04:26.000Z","dependencies_parsed_at":"2025-09-06T08:31:41.215Z","dependency_job_id":"3f8e410c-5e9d-4f7d-b1f4-060747246036","html_url":"https://github.com/Navdevl/chris-cred-reader","commit_stats":null,"previous_names":["navdevl/chris-cred-reader"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Navdevl/chris-cred-reader","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Navdevl%2Fchris-cred-reader","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Navdevl%2Fchris-cred-reader/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Navdevl%2Fchris-cred-reader/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Navdevl%2Fchris-cred-reader/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Navdevl","download_url":"https://codeload.github.com/Navdevl/chris-cred-reader/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Navdevl%2Fchris-cred-reader/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":275143585,"owners_count":25413091,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-14T02:00:10.474Z","response_time":75,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-09-14T17:47:40.043Z","updated_at":"2025-09-14T17:47:41.293Z","avatar_url":"https://github.com/Navdevl.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Credit Card PDF Statement Processing System\n\nAutomatically process credit card PDF statements from Google Drive and populate transaction data into Google Sheets.\n\n## Features\n\n- **Automated PDF Processing**: Monitors Google Drive folder for credit card statements\n- **Multi-Bank Support**: Supports Axis and ICICI bank statements (HDFC, SBI coming soon)\n- **Password Protection**: Handles password-protected PDFs using filename convention\n- **Duplicate Prevention**: Prevents duplicate transactions using hash-based detection\n- **Google Sheets Integration**: Automatically populates transactions in a single spreadsheet\n- **File Management**: Moves processed files to organized subfolders\n- **Robust Error Handling**: Comprehensive logging and retry mechanisms\n\n## Quick Start\n\n### 1. Prerequisites\n- Python 3.9+\n- Google Cloud Project with Drive and Sheets APIs enabled\n- Google service account credentials\n\n### 2. Installation\n```bash\ngit clone \u003crepository-url\u003e\ncd chris-cred-reader\npip install -r requirements.txt\n```\n\n### 3. Configuration\n1. Follow the detailed setup in `instructions.md`\n2. Copy `.env.example` to `.env` and update values:\n```bash\ncp .env.example .env\n```\n\n### 4. Run Application\n```bash\npython src/main.py\n```\n\n## File Structure\n\n```\nchris-cred-reader/\n├── requirements.md          # Detailed project specifications\n├── instructions.md          # Google API setup guide\n├── src/\n│   ├── main.py             # Application entry point\n│   ├── config.py           # Configuration management\n│   ├── models.py           # Data models\n│   ├── pdf_processor.py    # PDF processing engine\n│   ├── google_drive_client.py    # Google Drive integration\n│   ├── google_sheets_client.py   # Google Sheets integration\n│   └── bank_parsers/       # Bank-specific PDF parsers\n│       ├── base_parser.py\n│       ├── axis_parser.py\n│       ├── hdfc_parser.py\n│       ├── sbi_parser.py\n│       └── icici_parser.py\n├── tests/                  # Unit tests\n├── Dockerfile             # Docker configuration\n├── fly.toml               # Fly.io deployment config\n└── requirements.txt       # Python dependencies\n```\n\n## PDF Filename Convention\n\nCredit card PDF files must follow this naming pattern:\n```\n\u003cbank_name\u003e-\u003cpassword\u003e-\u003cidentifier\u003e.pdf\n```\n\n**Examples:**\n- `axis-mypass123-jan2024.pdf`\n- `hdfc-secretword-statement.pdf` \n- `sbi-password123-dec2023.pdf`\n- `icici-mykey456-quarterly.pdf`\n\n**Supported Bank Names:**\n- `axis` - Axis Bank\n- `hdfc` - HDFC Bank  \n- `sbi` - State Bank of India\n- `icici` - ICICI Bank\n\n## Google Sheets Format\n\nThe application creates a single sheet with these columns:\n- **Date**: Transaction date (YYYY-MM-DD)\n- **Bank**: Bank name\n- **Txn ID**: Transaction reference ID\n- **Description**: Transaction description\n- **Amount**: Amount (positive=credit, negative=debit)\n- **Category**: Manual categorization field\n- **Processed Date**: When the transaction was processed\n\n## Environment Variables\n\n| Variable | Description | Example |\n|----------|-------------|---------|\n| `GOOGLE_APPLICATION_CREDENTIALS` | Path to service account JSON | `/path/to/service-account.json` |\n| `GOOGLE_DRIVE_FOLDER_ID` | Google Drive folder ID to monitor | `1BxiMVs0XRA5nFMdKvBdBZjg...` |\n| `GOOGLE_SHEET_ID` | Google Sheets spreadsheet ID | `1BxiMVs0XRA5nFMdKvBdBZjg...` |\n| `LOG_LEVEL` | Logging level | `INFO` |\n| `POLL_INTERVAL_MINUTES` | How often to check for new files | `15` |\n\n## Deployment\n\n### Local Development\n```bash\npython src/main.py\n```\n\n### Fly.io Deployment\n1. Install fly CLI: https://fly.io/docs/getting-started/installing-flyctl/\n2. Login: `fly auth login`\n3. Deploy: `fly deploy`\n\nSet environment variables:\n```bash\nfly secrets set GOOGLE_APPLICATION_CREDENTIALS=\"$(cat service-account.json)\"\nfly secrets set GOOGLE_DRIVE_FOLDER_ID=\"your_folder_id\"\nfly secrets set GOOGLE_SHEET_ID=\"your_sheet_id\"\n```\n\n## Monitoring and Logs\n\n### View Application Logs\n```bash\n# Local\ntail -f app.log\n\n# Fly.io  \nfly logs\n```\n\n### Processing Metrics\nThe application logs:\n- Files processed per cycle\n- Transactions extracted\n- Duplicate detections\n- Error messages and stack traces\n\n## Troubleshooting\n\n### Common Issues\n\n**\"Permission denied\" on Google APIs:**\n- Verify service account has access to Drive folder and Sheet\n- Check that APIs are enabled in Google Cloud Console\n\n**\"File not found\" errors:**\n- Verify `GOOGLE_DRIVE_FOLDER_ID` and `GOOGLE_SHEET_ID` are correct\n- Check folder/sheet sharing permissions\n\n**PDF processing failures:**\n- Verify filename follows convention: `\u003cbank\u003e-\u003cpassword\u003e-\u003cid\u003e.pdf`\n- Check that PDF password is correct\n- Ensure bank name is supported (axis/hdfc/sbi/icici)\n\n**No transactions extracted:**\n- Bank statement format may have changed\n- Check logs for parser-specific errors\n- Consider updating bank parser logic\n\n### Debug Mode\nSet `LOG_LEVEL=DEBUG` for verbose logging:\n```bash\nexport LOG_LEVEL=DEBUG\npython src/main.py\n```\n\n## Architecture\n\n### Processing Flow\n1. **Monitor** → Poll Google Drive folder every 15 minutes\n2. **Download** → Get new PDF files matching naming convention  \n3. **Parse** → Extract password from filename, process PDF\n4. **Extract** → Use bank-specific parser to get transactions\n5. **Validate** → Check for duplicates in Google Sheet\n6. **Insert** → Batch insert new transactions\n7. **Archive** → Move processed file to \"processed\" subfolder\n\n### Error Handling\n- **Retry Logic**: 3 attempts for API failures\n- **Graceful Degradation**: Continue processing other files on individual failures\n- **Comprehensive Logging**: Track all operations and errors\n\n## Contributing\n\n1. Follow existing code patterns and conventions\n2. Add tests for new bank parsers\n3. Update documentation for new features\n4. Test thoroughly before deployment\n\n## Security Notes\n\n- Service account credentials stored as environment variables\n- PDF passwords visible in filenames (consider secure alternatives)\n- All API communication over HTTPS\n- No sensitive data cached locally\n\n## License\n\nMIT License","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnavdevl%2Fchris-cred-reader","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnavdevl%2Fchris-cred-reader","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnavdevl%2Fchris-cred-reader/lists"}