{"id":50949215,"url":"https://github.com/fahadnasir13/financial_data-analyzer_tool","last_synced_at":"2026-06-17T23:32:35.053Z","repository":{"id":310534497,"uuid":"1040243580","full_name":"fahadnasir13/financial_data-analyzer_tool","owner":"fahadnasir13","description":"A Python-based framework for analyzing, cleaning, and reconciling financial data stored in Excel workbooks.","archived":false,"fork":false,"pushed_at":"2025-08-18T17:24:12.000Z","size":1417,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-08-18T19:23:06.602Z","etag":null,"topics":["data-analysis","excel","financial","python","store"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fahadnasir13.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-08-18T17:11:17.000Z","updated_at":"2025-08-18T17:26:46.000Z","dependencies_parsed_at":"2025-08-18T19:33:37.821Z","dependency_job_id":null,"html_url":"https://github.com/fahadnasir13/financial_data-analyzer_tool","commit_stats":null,"previous_names":["fahadnasir13/financial_data-analyzer_tool"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/fahadnasir13/financial_data-analyzer_tool","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fahadnasir13%2Ffinancial_data-analyzer_tool","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fahadnasir13%2Ffinancial_data-analyzer_tool/tags","releases_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/repositories/fahadnasir13%2Ffinancial_data-analyzer_tool/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fahadnasir13%2Ffinancial_data-analyzer_tool/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fahadnasir13","download_url":"https://codeload.github.com/fahadnasir13/financial_data-analyzer_tool/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fahadnasir13%2Ffinancial_data-analyzer_tool/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34470323,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-17T02:00:05.408Z","response_time":127,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","excel","financial","python","store"],"created_at":"2026-06-17T23:32:34.865Z","updated_at":"2026-06-17T23:32:35.046Z","avatar_url":"https://github.com/fahadnasir13.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":" 📊 Financial Data Analyzer \u0026 Reconciliation Tool\n\nA Python-based framework for **analyzing, cleaning, and reconciling financial data** stored in Excel workbooks.  \n\nIt not only parses complex financial formats (currencies, dates, shorthand like `1.5M`, etc.) 
but also provides advanced reconciliation features:\n\n- ✅ **Direct Matching** – Identify one-to-one matches between transactions and targets  \n- 🔢 **Subset Sum Matching** – Detect combinations of transactions that add up to a target amount  \n- 🤖 **Machine Learning \u0026 Heuristics** – Optimized dynamic programming, genetic algorithms, and fuzzy similarity scoring  \n- 📈 **Performance Benchmarking** – Compare brute force vs. optimized methods across dataset sizes  \n- 📝 **Excel Reporting** – Clean reports with matched transactions, targets, and differences  \n\n---\n\n## 🔧 Features\n\n### Data Parsing \u0026 Cleaning\n- Handles multiple currency formats: `$1,234.56`, `(2,500.00)`, `€1.234,56`, `₹1,23,456.78`\n- Understands shorthand notations: `1.5M`, `2B`, etc.\n- Parses dates in multiple formats: `MM/DD/YYYY`, `DD/MM/YYYY`, `Q4 2023`, Excel serials\n- Cleans data into **standardized floats** (amounts) and **ISO 8601 dates**\n\n### Reconciliation Engine\n- **Direct Matching**: Exact 1-to-1 matches  \n- **Subset Sum Analysis**:  \n  - *Brute Force*: Tests all combinations (small datasets only)  \n  - *Dynamic Programming*: Efficient exact matching for medium datasets  \n  - *Genetic Algorithm*: Heuristic search for large datasets  \n- **Fuzzy Matching**: String similarity + amount tolerance for approximate reconciliation  \n\n### Benchmarking\n- Compare execution time across methods (brute force vs. DP vs. GA)  \n- Visualize scaling performance with runtime plots  \n\n### Reporting\n- Generates Excel output with:  \n  - Cleaned transactions  \n  - Cleaned targets  \n  - Match reports (transactions, target IDs, match type, differences, etc.)  
\n\n---\n\n## 📂 Project Structure\n\nFinancial-data-analyzer/\n├── src/\n│ ├── parser.py # Data loading \u0026 cleaning\n│ ├── recon.py # Reconciliation engine (exact, brute, DP, GA, fuzzy)\n│ └── main.py # Main orchestrator script\n│\n├── examples/\n│ └── sample_data.xlsx # Example input workbook\n│\n├── output/\n│ ├── recon_results.xlsx # Reconciliation results\n│ └── benchmark_plots/ # Performance graphs\n│\n├── tests/\n│ ├── test_parser.py\n│ └── test_recon.py\n│\n├── requirements.txt\n└── README.md\n\n---\n\n## ⚙️ Installation\n\nClone the repository, create a virtual environment, and install the dependencies:\n\n```bash\ngit clone https://github.com/fahadnasir13/financial_data-analyzer_tool.git\ncd financial_data-analyzer_tool\n\npython -m venv venv\n# On Linux/Mac\nsource venv/bin/activate\n# On Windows\nvenv\\Scripts\\activate\n\npip install -r requirements.txt\n```\n\n## 📑 Usage\n\n### Prepare Input Excel\n\n**Sheet1 (Transactions)**\n\n- Column A: Transaction Amount (e.g., 150.00)\n- Column B: Description (e.g., \"Invoice #001\")\n\n**Sheet2 (Targets)**\n\n- Column C: Target Amount (e.g., 225.50)\n- Column D: Reference ID (e.g., \"REF001\")\n\n### Run the Analyzer\n\n`python src/main.py --input examples/sample_data.xlsx --output output/recon_results.xlsx`\n\n### View Results\n\nOpen `output/recon_results.xlsx`, which includes these sheets:\n\n- `Transactions_Clean` – standardized transactions\n- `Targets_Clean` – standardized targets\n- `Matches` – reconciliation results\n\nExample:\n\n| target\\_id | ref\\_id | target | match\\_type   | txn\\_ids        | txn\\_amounts     | sum\\_amount | diff |\n| ---------- | ------- | ------ | ------------- | --------------- | ---------------- | ----------- | ---- |\n| TGT0001    | REF001  | 225.50 | exact\\_1to1   | \\[TXN0003]      | \\[225.50]        | 225.50      | 0.0  |\n| TGT0002    | REF002  | 300.00 | brute\\_subset | \\[TXN0005,TXN7] | \\[150.00,150.00] | 300.00      | 0.0  |\n| TGT0003    | REF003  | 450.75 | ga\\_subset    | \\[TXN0010,...]  
| \\[200.25,250.50] | 450.75      | 0.0  |\n\n## 🔬 Methods Compared\n\n| Method         | Strengths                  | Weaknesses               | Best Use Case            |\n| -------------- | -------------------------- | ------------------------ | ------------------------ |\n| Exact Match    | Fast \u0026 simple              | Only 1:1 matches         | Small exact checks       |\n| Brute Force    | Guaranteed if feasible     | Exponential time         | Very small datasets      |\n| Dynamic Prog.  | Efficient, exact           | Works best with integers | Medium datasets          |\n| Genetic Algo   | Scales, finds near-exact   | Approximate, stochastic  | Large datasets           |\n| Fuzzy Matching | Handles noisy descriptions | Approximate only         | Name/desc reconciliation |\n\n## 🛠 Configuration\n\nTunable parameters in `main.py`:\n\n- `brute_max_subset_size`: max subset size for brute force (default: 4)\n- `brute_time_budget_s`: per-target time limit (default: 1.0s)\n- `dp_candidate_limit`: pool size for DP (default: 25)\n- `ga_candidate_limit`: pool size for GA (default: 30)\n\n## ✅ Roadmap\n\n- Add web dashboard for interactive reconciliation\n- Database integration (Postgres, SQL Server)\n- ML classifier for predictive reconciliation\n- Batch-processing for enterprise-scale datasets\n\n## 👨‍💻 Contributing\n\nContributions welcome!\n\n1. Fork the repo\n2. Create a feature branch\n3. Submit a PR with details\n\n## 📜 License\n\nMIT License – free to use, modify, and distribute.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffahadnasir13%2Ffinancial_data-analyzer_tool","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffahadnasir13%2Ffinancial_data-analyzer_tool","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffahadnasir13%2Ffinancial_data-analyzer_tool/lists"}