https://github.com/anudeepkaddala/bankds
This repository contains a Python-based solution for cleaning, matching, and formatting bank data. The primary goal is to match banks from two datasets based on their names and associate each bank with its respective asset size. The final output is a cleaned dataset with asset sizes in Indian-style currency format.
- Host: GitHub
- URL: https://github.com/anudeepkaddala/bankds
- Owner: anudeepkaddala
- Created: 2025-10-02T17:31:39.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-10-02T17:43:21.000Z (9 months ago)
- Last Synced: 2025-10-02T19:27:26.514Z (9 months ago)
- Topics: data-analysis, data-science, fuzzy-matching, pandas, python
- Language: Python
- Homepage:
- Size: 230 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Bank Asset Size Matching & Formatting
This repository contains a Python-based solution for cleaning, matching, and formatting bank data. The primary goal is to match banks across two datasets by name and attach each bank's asset size to the matched record. The final output is a cleaned dataset with asset sizes in Indian-style currency format, where digits are grouped in the lakh/crore pattern (for example, 12,34,567).
Key Features:
- Data Preprocessing: Standardizes bank names and removes unnecessary suffixes so that names can be matched accurately.
- Fuzzy Matching: Uses the thefuzz library for approximate string matching between bank names in the two datasets.
- Data Transformation: Formats asset sizes in the Indian currency style for presentation.
- Output: Exports the final matched and formatted dataset to an Excel file.
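The preprocessing and matching steps can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the suffix list, the function names `normalize_name` and `best_match`, the 0.85 threshold, and the sample bank names are all assumptions. To keep the sketch runnable without third-party installs it substitutes the standard library's `difflib.SequenceMatcher` for thefuzz, so scores will differ from what thefuzz would report.

```python
import re
from difflib import SequenceMatcher

# Hypothetical suffix list, chosen for illustration only.
SUFFIXES = {"ltd", "limited", "inc", "corp", "corporation", "co"}


def normalize_name(name: str) -> str:
    """Lowercase, replace punctuation with spaces, strip trailing suffixes."""
    words = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while words and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)


def best_match(name, candidates, threshold=0.85):
    """Return (candidate, score) for the closest candidate, or None.

    Scores are difflib similarity ratios in [0, 1]; this is a stand-in
    for a thefuzz scorer, not the same metric.
    """
    target = normalize_name(name)
    best, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(None, target, normalize_name(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return (best, best_score) if best_score >= threshold else None


# Hypothetical names standing in for the second dataset.
asset_names = ["State Bank of India", "HDFC Bank Limited", "ICICI Bank Ltd."]

print(best_match("HDFC Bank Ltd", asset_names))
print(best_match("Unknown Credit Union", asset_names))
```

Returning `None` below the threshold leaves unmatched banks visible for manual review instead of forcing a weak match; the right threshold depends on the data and would need tuning.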
Technologies:
- Python
- pandas
- thefuzz (for fuzzy matching)
- Excel (output format)
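The Indian-style formatting named under Key Features groups the last three digits together and every earlier pair of digits separately. One possible way to do that is sketched below. The function name `format_indian` is hypothetical, and the sketch assumes whole-number amounts with no currency symbol; the README does not say how the repository handles decimals or symbols.

```python
def format_indian(amount: int) -> str:
    """Group digits Indian-style: last three digits, then groups of two."""
    sign = "-" if amount < 0 else ""
    digits = str(abs(amount))
    if len(digits) <= 3:
        return sign + digits

    head, tail = digits[:-3], digits[-3:]
    groups = []
    # Peel two digits at a time off the right-hand end of the head.
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    groups.insert(0, head)
    return sign + ",".join(groups + [tail])


print(format_indian(100000))     # 1,00,000
print(format_indian(123456789))  # 12,34,56,789
```

With pandas, a function like this could be applied to an asset-size column before the frame is written out with `DataFrame.to_excel`.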