{"id":23907757,"url":"https://github.com/dakshdeephere/bank_eda-practice","last_synced_at":"2026-05-07T02:31:26.702Z","repository":{"id":194611312,"uuid":"691209986","full_name":"dakshdeepHERE/Bank_EDA-Practice","owner":"dakshdeepHERE","description":"EDA analysis of Bank.csv dataset","archived":false,"fork":false,"pushed_at":"2024-02-24T20:22:10.000Z","size":998,"stargazers_count":0,"open_issues_count":1,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-23T14:43:44.289Z","etag":null,"topics":["analysis","data","data-visualization","dataanalysis","matplotlib","numpy","pandas","python3","seaborn"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dakshdeepHERE.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-13T18:01:22.000Z","updated_at":"2023-09-13T18:14:00.000Z","dependencies_parsed_at":null,"dependency_job_id":"e48de569-8654-4c7a-bf66-d5faf9049148","html_url":"https://github.com/dakshdeepHERE/Bank_EDA-Practice","commit_stats":null,"previous_names":["dakshdeephere/bank_eda-practice"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dakshdeepHERE/Bank_EDA-Practice","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dakshdeepHERE%2FBank_EDA-Practice","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dakshdeepHERE%2FBank_EDA-Practice/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dakshdeepHERE%2FBank_EDA-Practice/rel
eases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dakshdeepHERE%2FBank_EDA-Practice/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dakshdeepHERE","download_url":"https://codeload.github.com/dakshdeepHERE/Bank_EDA-Practice/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dakshdeepHERE%2FBank_EDA-Practice/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32720109,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-07T02:14:30.463Z","status":"ssl_error","status_checked_at":"2026-05-07T02:14:29.405Z","response_time":62,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analysis","data","data-visualization","dataanalysis","matplotlib","numpy","pandas","python3","seaborn"],"created_at":"2025-01-05T03:14:13.676Z","updated_at":"2026-05-07T02:31:26.689Z","avatar_url":"https://github.com/dakshdeepHERE.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Bank Data Analysis Project\n\nThis repository contains a data analysis project that focuses on exploring and analyzing a dataset from a bank. 
The dataset, stored in a CSV file named `bank_data.csv`, contains various customer-related information, such as age, job, education, and financial details.\n\n## Table of Contents\n\n1. [Introduction](#introduction)\n2. [Getting Started](#getting-started)\n    - [Prerequisites](#prerequisites)\n    - [Installation](#installation)\n3. [Data Analysis](#data-analysis)\n    - [Importing Libraries](#importing-libraries)\n    - [Reading Dataset](#reading-dataset)\n    - [Data Cleaning](#data-cleaning)\n    - [Dropping Columns](#dropping-columns)\n    - [Dividing 'jobedu' Column](#dividing-jobedu-in-job-and-education)\n    - [Handling Missing Values](#handling-missing-values)\n    - [Finding Duplicates](#finding-duplicates)\n    - [Outlier Handling](#outlier-handling)\n    - [Standardizing Variables](#standarize-variable)\n    - [Univariate Analysis](#univariate-analysis-categorical-features)\n    - [Bivariate Analysis](#bivariate-analysis)\n4. [Conclusion](#conclusion)\n5. [Contributing](#contributing)\n6. [License](#license)\n\n## Introduction \u003ca name=\"introduction\"\u003e\u003c/a\u003e\n\nThis data analysis project aims to provide insights into the bank dataset, exploring various aspects of the data such as customer demographics, financial information, and the response variable. 
The project includes data cleaning, handling missing values, outlier detection, and various visualizations to help understand the data better.\n\n## Getting Started \u003ca name=\"getting-started\"\u003e\u003c/a\u003e\n\n### Prerequisites \u003ca name=\"prerequisites\"\u003e\u003c/a\u003e\n\nBefore running the code in this project, make sure you have the following Python libraries installed:\n\n- Pandas\n- NumPy\n- Matplotlib\n- Seaborn\n\n### Installation \u003ca name=\"installation\"\u003e\u003c/a\u003e\n\nYou can install the required Python libraries using pip:\n\n```bash\npip install pandas numpy matplotlib seaborn\n```\n\n## Data Analysis \u003ca name=\"data-analysis\"\u003e\u003c/a\u003e\n\nThe data analysis process is broken down into several steps, as outlined below:\n\n### Importing Libraries \u003ca name=\"importing-libraries\"\u003e\u003c/a\u003e\n\nThe project starts by importing necessary Python libraries and setting up the environment.\n\n### Reading Dataset \u003ca name=\"reading-dataset\"\u003e\u003c/a\u003e\n\nThe dataset, stored in the 'bank_data.csv' file, is read into a Pandas DataFrame, and the first few rows are displayed to get an initial overview.\n\n### Data Cleaning \u003ca name=\"data-cleaning\"\u003e\u003c/a\u003e\n\nData cleaning involves removing unwanted rows, columns, or values from the dataset to prepare it for analysis. 
In this project, some rows with missing or irrelevant data are dropped, and the 'jobedu' column is divided into separate 'job' and 'education' columns.\n\n### Dropping Columns \u003ca name=\"dropping-columns\"\u003e\u003c/a\u003e\n\nUnnecessary columns such as 'customerid' are dropped to simplify the dataset.\n\n### Dividing 'jobedu' Column \u003ca name=\"dividing-jobedu-in-job-and-education\"\u003e\u003c/a\u003e\n\nThe `jobedu` column is split into separate `job` and `education` columns.\n\n### Handling Missing Values \u003ca name=\"handling-missing-values\"\u003e\u003c/a\u003e\n\nMissing values in the `age` and `month` columns are identified and handled appropriately. In the `pdays` column, missing values are replaced with NaN.\n\n### Finding Duplicates \u003ca name=\"finding-duplicates\"\u003e\u003c/a\u003e\n\nDuplicate records based on `age` and `response` columns are identified.\n\n### Outlier Handling \u003ca name=\"outlier-handling\"\u003e\u003c/a\u003e\n\nOutliers in numerical variables like `age`, `salary`, and `balance` are analyzed using boxplots and quantiles.\n\n### Standardizing Variables \u003ca name=\"standarize-variable\"\u003e\u003c/a\u003e\n\nThe 'duration' variable is standardized to ensure uniformity.\n\n### Univariate Analysis \u003ca name=\"univariate-analysis-categorical-features\"\u003e\u003c/a\u003e\n\nUnivariate analysis explores categorical features like `marital`, `job`, `education`, `poutcome`, and the target variable `response`. Visualizations such as bar plots and pie charts provide insights.\n\n### Bivariate Analysis \u003ca name=\"bivariate-analysis\"\u003e\u003c/a\u003e\n\nBivariate analysis examines relationships between variables, including numerical-numerical, categorical-numerical, and categorical-categorical relationships. 
Correlation analysis, boxplots, and heatmaps are used to visualize these relationships.\n\n## Conclusion \u003ca name=\"conclusion\"\u003e\u003c/a\u003e\n\nThis data analysis project provides a comprehensive exploration of the bank dataset, covering data cleaning, missing value handling, outlier detection, and various visualizations. The findings and insights gained from this analysis can be valuable for making informed decisions and building predictive models.\n\n## Contributing \u003ca name=\"contributing\"\u003e\u003c/a\u003e\n\nContributions to this project are welcome. If you have suggestions, improvements, or additional analyses to add, please feel free to contribute.\n\n## License \u003ca name=\"license\"\u003e\u003c/a\u003e\n\nThis project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdakshdeephere%2Fbank_eda-practice","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdakshdeephere%2Fbank_eda-practice","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdakshdeephere%2Fbank_eda-practice/lists"}