{"id":25392605,"url":"https://github.com/radanpro/data-science","last_synced_at":"2025-09-10T23:47:55.156Z","repository":{"id":268422215,"uuid":"904243960","full_name":"radanpro/data-science","owner":"radanpro","description":null,"archived":false,"fork":false,"pushed_at":"2025-02-05T16:46:57.000Z","size":6661,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-10T01:51:50.202Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/radanpro.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-16T14:16:14.000Z","updated_at":"2025-02-05T16:47:01.000Z","dependencies_parsed_at":"2024-12-16T17:39:17.983Z","dependency_job_id":"11f3ba51-aed2-441e-9943-7cd66bbd8ca9","html_url":"https://github.com/radanpro/data-science","commit_stats":null,"previous_names":["abdulrahmanradan/data-science","radanme/data-science","radanpro/data-science"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/radanpro/data-science","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radanpro%2Fdata-science","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radanpro%2Fdata-science/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radanpro%2Fdata-science/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radanpro%2Fdata-science/manifests","owner_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/owners/radanpro","download_url":"https://codeload.github.com/radanpro/data-science/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radanpro%2Fdata-science/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274549843,"owners_count":25306360,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-10T02:00:12.551Z","response_time":83,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-02-15T16:55:46.585Z","updated_at":"2025-09-10T23:47:55.142Z","avatar_url":"https://github.com/radanpro.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data Processing and Visualization App\n\nThis is a **Python-based application** designed for data processing, visualization, and machine learning predictions. It provides a user-friendly interface built with **streamlit** for loading, processing, and analyzing datasets. 
The application also supports training and testing machine learning models, as well as making predictions based on user input.\n\n**Advanced Data Processing \u0026 Predictive Modeling Toolkit**\n\n[![Python](https://img.shields.io/badge/Python-3.9%2B-blue)](https://www.python.org/)\n[![Streamlit](https://img.shields.io/badge/Streamlit-1.28.0-FF4B4B)](https://streamlit.io/)\n[![License](https://img.shields.io/badge/License-MIT-green)](LICENSE)\n\n---\n\n## Features\n\n- **Data Loading**: Load datasets in CSV or Excel formats.\n\n- **Data Processing**:\n\n  - Handle missing values by dropping rows with null values.\n  - Encode categorical and target columns using Label Encoding.\n  - Balance datasets using Random Over-Sampling.\n  - Perform feature selection to identify the most relevant features.\n\n- **Algorithm Choices**:\n\n  - Random Forest (Customizable trees/depth)\n  - Neural Networks (Flexible architecture builder)\n\n- **Performance Metrics**:\n\n  - Accuracy, Precision, Recall, F1-Score\n  - Interactive Confusion Matrix\n  - Classification Report\n\n- **Data Visualization**:\n\n  - Plot histograms, scatter plots, and 3D scatter plots for up to 3 features.\n\n- **Machine Learning**:\n  - Train and test machine learning models (Random Forest or Deep Learning).\n  - Evaluate model performance using accuracy, precision, and recall metrics.\n- **Prediction**:\n  - **Single Prediction**: Interactive input form with real-time results\n  - **Batch Prediction**: CSV upload with downloadable results\n  - **Confidence Scores**: Probability visualization for predictions\n\n---\n\n## Requirements\n\nTo run this application, you need the following:\n\n- **Python 3.9 or later** (download and install Python from [python.org](https://www.python.org/downloads/)).\n- **Libraries**:\n  - `streamlit`\n  - `pandas`\n  - `scikit-learn`\n  - `tensorflow`\n  - `matplotlib`\n  - `seaborn`\n  - `tkinter`\n  - `imblearn`\n  - `numpy`\n\n---\n\n## Installation\n\nFollow these steps to set up and run the 
application:\n\n### 1. Clone the Repository\n\nFirst, clone the repository to your local machine:\n\n```bash\ngit clone https://github.com/radanpro/data-science.git\n```\n\n\u003c!-- ```bash\n # git clone https://github.com/Eng-Mosab-Alhopishi/Data-Sciences-Analyzer-App.git\n``` --\u003e\n\n### 2. Navigate to the Project Directory\n\nMove into the project folder:\n\n```bash\ncd data-science\n```\n\n### 3. Set Up a Virtual Environment (Optional but Recommended)\n\nTo avoid conflicts with other Python projects, create a virtual environment:\n\n```bash\npython -m venv venv\n```\n\nActivate the virtual environment:\n\n- On Windows:\n\n  ```bash\n  venv\\Scripts\\activate\n  ```\n\n- On macOS/Linux:\n\n  ```bash\n  source venv/bin/activate\n  ```\n\n### 4. Install Required Libraries\n\nInstall the required libraries using the `requirements.txt` file:\n\n```bash\npip install -r requirements.txt\n```\n\nAlternatively, you can install the libraries manually:\n\n```bash\npip install streamlit pandas scikit-learn tensorflow matplotlib seaborn imbalanced-learn numpy\n```\n\n### 5. Run the Application\n\nOnce the libraries are installed, run the application:\n\n```bash\nstreamlit run datap.py\n```\n\n---\n\n## Usage\n\n1. **Load Data**  \n   Click on the \"Load Data\" button to upload your dataset (CSV or Excel).\n\n2. **Process Data**  \n   Select the target column from the dropdown menu.  \n   Click \"Start Processing\" to clean, encode, and balance the dataset.\n\n3. **Visualize Data**  \n   Select up to 3 features from the list.  \n   Click \"Plot Data\" to generate visualizations.\n\n4. **Train Model**  \n   Choose a model (Random Forest or Deep Learning).  \n   Click \"Train Model\" to train the selected model.\n\n5. **Test Model**  \n   Click \"Test Model\" to evaluate the model's performance.\n\n6. **Make Predictions**  \n   Enter feature values in the prediction tab.  \n   Click \"Predict\" to see the prediction result.  
\n   Click \"New Prediction\" to clear the input fields and reset the result.\n\n---\n\n## File Structure\n\nHere’s an overview of the project structure:\n\n```\ndata-science/\n│\n├── datap.py               # Main application script\n├── requirements.txt       # List of required libraries\n├── README.md              # Project documentation\n├── LICENSE                # License file\n├── .gitignore             # Files to ignore in Git\n├── data/                  # Folder for sample datasets (optional)\n└── images/                # Folder for screenshots (optional)\n```\n\n---\n\n## License\n\nThis project is licensed under the MIT License. See the `LICENSE` file for details.\n\n---\n\n## Contributing\n\nContributions are welcome! If you'd like to contribute, please follow these steps:\n\n1. Fork the repository.\n2. Create a new branch for your feature or bugfix.\n3. Commit your changes.\n4. Push your changes to the branch.\n5. Submit a pull request.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fradanpro%2Fdata-science","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fradanpro%2Fdata-science","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fradanpro%2Fdata-science/lists"}