{"id":26730767,"url":"https://github.com/gallillio/data_science-data_visualizer_tool","last_synced_at":"2026-02-17T14:01:42.890Z","repository":{"id":199710918,"uuid":"690540247","full_name":"Gallillio/Data_Science-Data_Visualizer_Tool","owner":"Gallillio","description":"## About  Supervised ML Helper is a Python application that streamlines exploratory data analysis (EDA) and preprocessing for supervised machine learning. Featuring a user-friendly Tkinter interface, it enables users to load CSV files, visualize data, and perform essential transformations, making data preparation accessible for all skill levels.","archived":false,"fork":false,"pushed_at":"2025-03-21T19:58:47.000Z","size":13287,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-11T13:32:53.257Z","etag":null,"topics":["data-analysis","data-science","data-visualization","matplotlib","numpy","pandas","seaborn","sklearn"],"latest_commit_sha":null,"homepage":"","language":"Jupyter 
Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Gallillio.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-12T11:53:10.000Z","updated_at":"2025-03-21T19:58:51.000Z","dependencies_parsed_at":"2025-03-21T20:31:33.212Z","dependency_job_id":"a2419a21-da07-40e2-a700-5293461af307","html_url":"https://github.com/Gallillio/Data_Science-Data_Visualizer_Tool","commit_stats":null,"previous_names":["gallillio/tkinter-data_visualizer","gallillio/data_science-data_visualizer_tool"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Gallillio/Data_Science-Data_Visualizer_Tool","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Gallillio%2FData_Science-Data_Visualizer_Tool","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Gallillio%2FData_Science-Data_Visualizer_Tool/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Gallillio%2FData_Science-Data_Visualizer_Tool/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Gallillio%2FData_Science-Data_Visualizer_Tool/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Gallillio","download_url":"https://codeload.github.com/Gallillio/Data_Science-Data_Visualizer_Tool/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Gallillio%2FData_Science-Data_Visualizer_Tool/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count
":29546746,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-17T13:00:00.370Z","status":"ssl_error","status_checked_at":"2026-02-17T12:57:14.072Z","response_time":100,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","data-science","data-visualization","matplotlib","numpy","pandas","seaborn","sklearn"],"created_at":"2025-03-27T23:31:34.157Z","updated_at":"2026-02-17T14:01:42.848Z","avatar_url":"https://github.com/Gallillio.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data Visualizer Tool\n\n## Overview\n\nData Visualizer Tool is a Python application designed to assist users in performing exploratory data analysis (EDA). The application provides a user-friendly interface built with Tkinter, allowing users to easily load datasets, visualize data, and apply various transformations.\n\nThis project was created during my early years at university, and while it serves its purpose, Tkinter is considered an outdated tool for building GUIs. I plan to redo it using a more modern framework such as Flask, Django, or FastAPI, and to add machine learning capabilities. 
(If I have the time, that is.)\n\n## Features\n\n- **Load CSV Files**: Users can select and load CSV files into the application for analysis.\n- **Exploratory Data Analysis (EDA)**: The application provides detailed EDA capabilities, including:\n\n  - Summary statistics of the dataset.\n  - Visualization of missing values.\n  - Detailed analysis of individual columns, including histograms and common values.\n\n  ![Detailed EDA](Results%20Pictures/Detailed%20EDA.png)\n\n- **Data Transformation**: Users can perform various data transformations, including:\n\n  - Handling missing values (mean, median, or removal).\n  - Encoding categorical columns using label encoding.\n  - Renaming and removing columns.\n  - Removing duplicates from the dataset.\n\n- **Data Visualization**: The application includes several visualization options:\n\n  - Histograms for numerical data.\n  - Stacked bar charts for categorical data.\n  - Scatter plots to visualize interactions between two columns.\n\n  ![Histogram EDA](Results%20Pictures/Histogram%20EDA.png)\n\n- **Correlation Analysis**: Users can visualize correlations between different columns in the dataset using heatmaps.\n\n- **Interaction Analysis**: Users can explore interactions between different features in the dataset through scatter plots.\n\n- **User-Friendly Interface**: The application is designed with a simple and intuitive interface, making it accessible for users with varying levels of expertise in data science.\n\n## Getting Started\n\n### Prerequisites\n\n- Python 3.x\n- Required libraries:\n  - NumPy\n  - Pandas\n  - Matplotlib\n  - Seaborn\n  - Tkinter\n  - scikit-learn\n  - Pillow\n\n### Installation\n\n1. Clone the repository:\n   ```bash\n   git clone https://github.com/Gallillio/Data_Science-Data_Visualizer_Tool.git\n   ```\n2. Navigate to the project directory:\n   ```bash\n   cd Data_Science-Data_Visualizer_Tool\n   ```\n3. Install the required libraries:\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n### Usage\n\n1. 
You can run the application using the provided Python file for the GUI:\n   ```bash\n   python Supervised_ML_Classifier_python.py\n   ```\n2. Alternatively, there is a Jupyter Notebook available for additional analysis and exploration of the dataset.\n3. Follow the on-screen instructions to load your dataset and explore the various features of the application.\n\n## Screenshots\n\nHere are some screenshots of the application in action:\n\n- **Histogram EDA**:\n  ![Histogram EDA](Results%20Pictures/Histogram%20EDA.png)\n\n- **Detailed EDA**:\n  ![Detailed EDA](Results%20Pictures/Detailed%20EDA.png)\n\n- **Correlation Heatmap**:\n  ![Correlation Heatmap](Results%20Pictures/Correlation.png)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgallillio%2Fdata_science-data_visualizer_tool","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgallillio%2Fdata_science-data_visualizer_tool","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgallillio%2Fdata_science-data_visualizer_tool/lists"}