{"id":15442654,"url":"https://github.com/takk8is/datasetanalysiseda","last_synced_at":"2025-09-02T19:47:57.913Z","repository":{"id":257607228,"uuid":"858544951","full_name":"Takk8IS/DatasetAnalysisEDA","owner":"Takk8IS","description":"A robust Python tool for comprehensive dataset analysis and machine learning model evaluation. This project automates the process of data preprocessing, exploratory data analysis (EDA), and predictive modeling, with a focus on handling common data inconsistencies.","archived":false,"fork":false,"pushed_at":"2024-09-17T14:38:08.000Z","size":5896,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-10-18T13:14:26.815Z","etag":null,"topics":["analytics","analyzer","chart","csv-files","data-science","data-visualization","datascience","dataset","datasets","davidccavalcante","eda","fjallstoppur","graphics","machine-learning","python","python3","takk-ag","takk-design","takk8is","xlsx-files"],"latest_commit_sha":null,"homepage":"https://takk.ag","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"cc0-1.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Takk8IS.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":"FUNDING.yml","license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.md","dei":null,"publiccode":null,"codemeta":null},"funding":"If you have any questions or need support, please open an issue."},"created_at":"2024-09-17T04:58:59.000Z","updated_at":"2024-09-17T14:38:11.000Z","dependencies_parsed_at":"2024-09-17T18:21:30.725Z","dependency_job_id":"5792b4a2-47d9-4db9-93bb-8410618d63d2","html_url":"https://github.com/Takk8IS/DatasetAnalysisEDA","commit_stats":null,"previous_names":["takk8is/datasetanalysiseda"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Takk8IS/DatasetAnalysisEDA","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Takk8IS%2FDatasetAnalysisEDA","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Takk8IS%2FDatasetAnalysisEDA/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Takk8IS%2FDatasetAnalysisEDA/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Takk8IS%2FDatasetAnalysisEDA/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Takk8IS","download_url":"https://codeload.github.com/Takk8IS/DatasetAnalysisEDA/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Takk8IS%2FDatasetAnalysisEDA/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273341430,"owners_count":25088346,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-02T02:00:09.530Z","response_time":77,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analytics","analyzer","chart","csv-files","data-science","data-visualization","datascience","dataset","datasets","davidccavalcante","eda","fjallstoppur","graphics","machine-learning","python","python3","takk-ag","takk-design","takk8is","xlsx-files"],"created_at":"2024-10-01T19:29:07.805Z","updated_at":"2025-09-02T19:47:57.346Z","avatar_url":"https://github.com/Takk8IS.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Dataset Analysis EDA 📊\n\n[![Version](https://img.shields.io/badge/version-1.0.0-blue.svg)](https://github.com/Takk8IS/DatasetAnalysisEDA)\n[![Licence](https://img.shields.io/badge/licence-CC--BY--4.0-green.svg)](https://creativecommons.org/licenses/by/4.0/)\n[![GitHub issues](https://img.shields.io/github/issues/Takk8IS/DatasetAnalysisEDA.svg)](https://github.com/Takk8IS/DatasetAnalysisEDA/issues)\n[![GitHub stars](https://img.shields.io/github/stars/Takk8IS/DatasetAnalysisEDA.svg)](https://github.com/Takk8IS/DatasetAnalysisEDA/stargazers)\n\nDataset Analysis EDA is a Python-based tool designed for comprehensive exploratory data analysis (EDA) and machine learning model evaluation. This intelligent system processes various dataset formats, performs data preprocessing, conducts statistical analysis, and generates insightful visualizations.\n\n![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-01.png?raw=true)\n![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-02.png?raw=true)\n![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-03.png?raw=true)\n![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-04.png?raw=true)\n![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-05.png?raw=true)\n![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-06.png?raw=true)\n![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-07.png?raw=true)\n![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-08.png?raw=true)\n![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-09.png?raw=true)\n![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-10.png?raw=true)\n![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-11.png?raw=true)\n![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-12.png?raw=true)\n![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-13.png?raw=true)\n![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-14.png?raw=true)\n\n## 🌟 Key Features\n\n-   📄 **Multi-format Data Processing**: Handle various file formats including CSV and Excel.\n-   🧹 **Automated Data Preprocessing**: Includes grammar correction, handling of missing values, and feature encoding.\n-   📊 **Comprehensive EDA**: Generates statistical summaries, correlation analyses, and various visualizations.\n-   🤖 **Machine Learning Model Evaluation**: Implements Random Forest classification with cross-validation.\n-   📈 **Feature Importance Analysis**: Provides insights into the most influential features in the dataset.\n-   📉 **Advanced Visualizations**: Includes histograms, heatmaps, confusion matrices, and feature importance plots.\n-   🛠️ **Robust Error Handling**: Comprehensive error management to ensure smooth operation with various datasets.\n\n## 📦 Project Structure\n\n```plaintext\n├── AUTHORS.md\n├── DatasetAnalysis.py\n├── FUNDING.yml\n├── INFO.md\n├── LICENSE.md\n├── PRIVACY.md\n├── PlanilhaModelagem.csv\n├── PlanilhaModelagem.xlsx\n├── README.md\n├── images\n│   ├── screenshot-01.png\n│   ├── screenshot-02.png\n│   ├── screenshot-03.png\n│   ├── screenshot-04.png\n│   ├── screenshot-05.png\n│   ├── screenshot-06.png\n│   ├── screenshot-07.png\n│   ├── screenshot-08.png\n│   ├── screenshot-09.png\n│   ├── screenshot-10.png\n│   ├── screenshot-11.png\n│   ├── screenshot-12.png\n│   ├── screenshot-13.png\n│   └── screenshot-14.png\n└── requirements.txt\n```\n\n## 🏃‍♂️ How to Use\n\n1. **Clone the Repository**:\n\n    ```sh\n    git clone https://github.com/Takk8IS/DatasetAnalysisEDA.git\n    cd DatasetAnalysisEDA\n    ```\n\n2. **Install Dependencies**:\n\n    ```sh\n    pip install -r requirements.txt\n    ```\n\n3. **Run the Analysis**:\n\n    ```sh\n    python DatasetAnalysis.py PlanilhaModelagem.xlsx\n    ```\n\n4. **Review the Results**:\n    - The script will generate various plots and print analysis results in the console.\n    - Review the generated visualizations for insights about your dataset.\n\n## Contributing\n\nWe welcome contributions from the community! If you'd like to contribute, please:\n\n1. Fork the repository.\n2. Create your feature branch (`git checkout -b feature/AmazingFeature`).\n3. Commit your changes (`git commit -m 'Add some AmazingFeature'`).\n4. Push to the branch (`git push origin feature/AmazingFeature`).\n5. Open a Pull Request.\n\n## Donations\n\nIf this project has been helpful, consider making a donation:\n\n**USDT (TRC-20)**: `TGpiWetnYK2VQpxNGPR27D9vfM6Mei5vNA`\n\nYour support helps us continue to develop innovative data analysis tools.\n\n## License\n\nThis project is licensed under the CC-BY-4.0 License. See the [LICENSE](LICENSE.md) file for more details.\n\n## About Takk™ Innovate Studio\n\nLeading the Digital Revolution as the Pioneering 100% Artificial Intelligence Team.\n\n-   Author: [David C Cavalcante](mailto:davcavalcante@proton.me)\n-   LinkedIn: [linkedin.com/in/hellodav](https://www.linkedin.com/in/hellodav/)\n-   X: [@Takk8IS](https://twitter.com/takk8is/)\n-   Medium: [takk8is.medium.com](https://takk8is.medium.com/)\n-   Website: [takk.ag](https://takk.ag/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftakk8is%2Fdatasetanalysiseda","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftakk8is%2Fdatasetanalysiseda","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftakk8is%2Fdatasetanalysiseda/lists"}