{"id":50390950,"url":"https://github.com/mathieubuisson/bill-ingestion","last_synced_at":"2026-05-30T18:01:29.694Z","repository":{"id":351528222,"uuid":"1210896153","full_name":"MathieuBuisson/bill-ingestion","owner":"MathieuBuisson","description":null,"archived":false,"fork":false,"pushed_at":"2026-04-23T21:02:47.000Z","size":59,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-23T22:16:31.755Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MathieuBuisson.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-14T21:40:21.000Z","updated_at":"2026-04-23T21:02:51.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/MathieuBuisson/bill-ingestion","commit_stats":null,"previous_names":["mathieubuisson/bill-ingestion"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/MathieuBuisson/bill-ingestion","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MathieuBuisson%2Fbill-ingestion","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MathieuBuisson%2Fbill-ingestion/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MathieuBuisson%2Fbill-ingestion/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MathieuBuiss
on%2Fbill-ingestion/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MathieuBuisson","download_url":"https://codeload.github.com/MathieuBuisson/bill-ingestion/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MathieuBuisson%2Fbill-ingestion/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33703065,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-30T02:00:06.278Z","response_time":92,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-30T18:01:28.257Z","updated_at":"2026-05-30T18:01:29.682Z","avatar_url":"https://github.com/MathieuBuisson.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Electricity Bill Ingestion\n\nPython-based automation for downloading, processing, and organizing electricity bills.\n\n## Overview\n\nThis project automates the ingestion of electricity bills by:\n\n1. Downloading electricity bills from Bord Gáis Energy\n2. Converting the bill PDF to Markdown\n3. Uploading the bill PDF file to a specific Google Drive folder (`Finance/Taxes/\u003cYYYY\u003e/Income Tax/Electricity Receipts`)\n4. Copying the Markdown file to a personal knowledge base\n5. 
Sending a notification email with the link and other details\n\n## Architecture\n\nThe project follows a modular architecture with clear separation of concerns:\n\n- **Downloaders**: Handle bill retrieval from utility providers\n- **Converters**: Transform file formats (PDF → Markdown)\n- **Cloud Services**: Manage Google Drive and Gmail operations\n- **Configuration**: Centralized settings and environment management\n\n## Prerequisites\n\n- Python 3.13 or higher\n- Google Drive and Gmail accounts with OAuth2 credentials\n- Bord Gáis online account credentials\n\n## Setup Instructions\n\n### 1. Clone the Repository\n\n```bash\ngit clone https://github.com/MathieuBuisson/bill-ingestion.git\ncd bill-ingestion\n```\n\n### 2. Create a Virtual Environment\n\n```bash\npython -m venv venv\nsource venv/bin/activate  # On Windows: venv\\Scripts\\activate\n```\n\n### 3. Install Dependencies\n\n```bash\npip install -r requirements.txt\npython -m playwright install chromium\n```\n\n### 4. Configure Environment Variables\n\nCreate a `.env` file in the project root:\n\n```bash\n# Bord Gáis credentials\nBORDGAIS_EMAIL=your-email@example.com\nBORDGAIS_PASSWORD=your-password\nBORDGAIS_ACCOUNT_ID=your-account-id\n\n# Google credentials\nGOOGLE_CREDENTIALS_FILE=credentials.json\nNOTIFICATION_EMAIL=your-email@gmail.com\n\n# Paths\nMARKDOWN_DESTINATION_FOLDER=/path/to/your/knowledge/base\n\n# Logging\nLOG_LEVEL=INFO\n```\n\n### 5. Set Up Google OAuth2 Credentials\n\n1. Go to [Google Cloud Console](https://console.cloud.google.com/)\n2. Create a new project\n3. Enable Google Drive API and Gmail API\n4. Create OAuth2 credentials (Desktop application)\n5. Download and save as `credentials.json` in the project root\n\n### 6. 
Run the Workflow\n\nRun the bill ingestion workflow:\n\n```bash\npython -m bill_ingestion.main\n```\n\n## Project Structure\n\n```\nbill-ingestion/\n├── .github/\n│   └── workflows/\n│       └── ci.yml                    # GitHub Actions CI pipeline\n├── .env                              # Environment variables (add to .gitignore)\n├── .gitignore\n├── README.md\n├── pyproject.toml                    # Tool configurations (pytest, black, mypy, etc.)\n├── requirements.txt\n├── setup.py\n│\n├── data/                             # Downloaded PDFs (runtime generated)\n├── logs/                             # Application logs (runtime generated)\n├── temp/                             # OAuth tokens and temporary files (runtime generated)\n│\n├── src/\n    └── bill_ingestion/\n        ├── __init__.py\n        ├── main.py                   # Entry point / orchestrator\n        ├── config.py                 # Configuration \u0026 credentials\n        │\n        ├── downloaders/\n        │   ├── __init__.py\n        │   └── bordgais.py           # Bord Gáis bill download logic\n        │\n        ├── converters/\n        │   ├── __init__.py\n        │   └── pdf_to_markdown.py    # PDF → Markdown conversion\n        │\n        ├── cloud/\n        │   ├── __init__.py\n        │   ├── google_drive.py       # Google Drive operations\n        │   └── gmail_service.py      # Email notification service\n        │\n        └── utils/\n            ├── __init__.py\n            ├── logger.py             # Logging configuration\n            └── exceptions.py         # Custom exceptions\n│\n├── tests/                            # Unit tests for the application\n```\n\n## Usage Examples\n\n### Manual Workflow Execution\n\n```python\nfrom bill_ingestion.main import ingest_bill_workflow\n\ningest_bill_workflow()\n```\n\n## Environment Variables\n\n| Variable | Description | Required |\n|----------|-------------|----------|\n| `BORDGAIS_EMAIL` | Bord Gáis account email | Yes |\n| 
`BORDGAIS_PASSWORD` | Bord Gáis account password | Yes |\n| `BORDGAIS_ACCOUNT_ID` | Bord Gáis account ID | Yes |\n| `GOOGLE_CREDENTIALS_FILE` | Path to the Google OAuth2 credentials file (e.g. `credentials.json`) | Yes |\n| `NOTIFICATION_EMAIL` | Email address that receives bill notifications | Yes |\n| `MARKDOWN_DESTINATION_FOLDER` | Destination folder for converted Markdown files | Yes |\n| `LOG_LEVEL` | Logging level (INFO, DEBUG, etc.) | No |\n\n## Security Notes\n\n- Never commit the `.env` file or `credentials.json` to version control\n- Use environment variables for all sensitive data\n- Rotate Google OAuth2 tokens regularly\n- Consider using a secrets manager for production deployments\n\n## Troubleshooting\n\n### Playwright Browser Issues\n\nIf you encounter Playwright installation issues on Windows:\n\n```bash\npython -m playwright install --with-deps chromium\n```\n\n### Google Authentication Issues\n\nEnsure your Google Cloud project has:\n- Google Drive API enabled\n- Gmail API enabled\n- Correct OAuth2 scopes in credentials\n\n### Bord Gáis Login Issues\n\nThe Bord Gáis website may change its structure. If the downloader fails:\n1. Review the error logs\n2. Inspect the website HTML structure\n3. Update selectors in `src/bill_ingestion/downloaders/bordgais.py`\n\n## Contributing\n\nWhen making changes:\n1. Follow PEP 8 style guidelines\n2. Add type hints to new functions\n3. Update tests for new functionality\n4. Update documentation as needed\n5. Run tests locally using PowerShell: `$env:PYTHONPATH=\"src\"; python -m pytest`\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmathieubuisson%2Fbill-ingestion","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmathieubuisson%2Fbill-ingestion","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmathieubuisson%2Fbill-ingestion/lists"}