Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/takk8is/datasetanalysiseda
A robust Python tool for comprehensive dataset analysis and machine learning model evaluation. This project automates the process of data preprocessing, exploratory data analysis (EDA), and predictive modeling, with a focus on handling common data inconsistencies.
- Host: GitHub
- URL: https://github.com/takk8is/datasetanalysiseda
- Owner: Takk8IS
- License: cc0-1.0
- Created: 2024-09-17T04:58:59.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2024-09-17T14:38:08.000Z (about 2 months ago)
- Last Synced: 2024-10-01T19:28:41.284Z (about 1 month ago)
- Topics: analytics, analyzer, chart, csv-files, data-science, data-visualization, datascience, dataset, datasets, davidccavalcante, eda, fjallstoppur, graphics, machine-learning, python, python3, takk-ag, takk-design, takk8is, xlsx-files
- Language: Python
- Homepage: https://takk.ag
- Size: 5.62 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: FUNDING.yml
- License: LICENSE.md
- Authors: AUTHORS.md
# Dataset Analysis EDA
[![Version](https://img.shields.io/badge/version-1.0.0-blue.svg)](https://github.com/Takk8IS/DatasetAnalysisEDA)
[![Licence](https://img.shields.io/badge/licence-CC--BY--4.0-green.svg)](https://creativecommons.org/licenses/by/4.0/)
[![GitHub issues](https://img.shields.io/github/issues/Takk8IS/DatasetAnalysisEDA.svg)](https://github.com/Takk8IS/DatasetAnalysisEDA/issues)
[![GitHub stars](https://img.shields.io/github/stars/Takk8IS/DatasetAnalysisEDA.svg)](https://github.com/Takk8IS/DatasetAnalysisEDA/stargazers)

Dataset Analysis EDA is a Python-based tool for exploratory data analysis (EDA) and machine learning model evaluation. It reads CSV and Excel datasets, preprocesses them, runs statistical analysis, and generates visualizations.
![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-01.png?raw=true)
![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-02.png?raw=true)
![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-03.png?raw=true)
![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-04.png?raw=true)
![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-05.png?raw=true)
![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-06.png?raw=true)
![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-07.png?raw=true)
![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-08.png?raw=true)
![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-09.png?raw=true)
![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-10.png?raw=true)
![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-11.png?raw=true)
![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-12.png?raw=true)
![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-13.png?raw=true)
![Dataset Analysis EDA](https://github.com/Takk8IS/DatasetAnalysisEDA/blob/main/images/screenshot-14.png?raw=true)

## Key Features
- **Multi-format Data Processing**: Handle various file formats including CSV and Excel.
- **Automated Data Preprocessing**: Includes grammar correction, handling of missing values, and feature encoding.
- **Comprehensive EDA**: Generates statistical summaries, correlation analyses, and various visualizations.
- **Machine Learning Model Evaluation**: Implements Random Forest classification with cross-validation.
- **Feature Importance Analysis**: Provides insights into the most influential features in the dataset.
- **Advanced Visualizations**: Includes histograms, heatmaps, confusion matrices, and feature importance plots.
- **Robust Error Handling**: Comprehensive error management to ensure smooth operation with various datasets.

## Project Structure
```plaintext
├── AUTHORS.md
├── DatasetAnalysis.py
├── FUNDING.yml
├── INFO.md
├── LICENSE.md
├── PRIVACY.md
├── PlanilhaModelagem.csv
├── PlanilhaModelagem.xlsx
├── README.md
├── images
│   ├── screenshot-01.png
│   ├── screenshot-02.png
│   ├── screenshot-03.png
│   ├── screenshot-04.png
│   ├── screenshot-05.png
│   ├── screenshot-06.png
│   ├── screenshot-07.png
│   ├── screenshot-08.png
│   ├── screenshot-09.png
│   ├── screenshot-10.png
│   ├── screenshot-11.png
│   ├── screenshot-12.png
│   ├── screenshot-13.png
│   └── screenshot-14.png
└── requirements.txt
```

## How to Use
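The source of `DatasetAnalysis.py` is not reproduced in this README, so the following is only a rough orientation for the preprocessing mentioned in the feature list (handling of missing values and feature encoding). It is a minimal sketch using the Python standard library, not the project's implementation. The function name `preprocess`, the inline `SAMPLE` data, the choice of mean imputation, and the use of sorted label codes are all assumptions made for illustration.

```python
import csv
import io
from statistics import mean


def preprocess(rows):
    """Hypothetical helper: fill missing numeric cells with the column mean
    and replace text columns with integer label codes."""
    columns = list(rows[0].keys())
    cleaned = [dict(row) for row in rows]
    for col in columns:
        # Treat None and whitespace-only cells as missing.
        values = [(row[col] or "").strip() for row in cleaned]
        present = [v for v in values if v != ""]
        try:
            numbers = [float(v) for v in present]
            is_numeric = bool(numbers)
        except ValueError:
            is_numeric = False
        if is_numeric:
            fill = mean(numbers)  # assumption: mean imputation
            for row, v in zip(cleaned, values):
                row[col] = float(v) if v != "" else fill
        else:
            # Assumption: simple label encoding over the sorted distinct values.
            codes = {label: i for i, label in enumerate(sorted(set(present)))}
            for row, v in zip(cleaned, values):
                row[col] = codes.get(v, -1)  # -1 marks a missing category
    return cleaned


# Tiny made-up dataset with one missing numeric cell.
SAMPLE = "age,city,label\n30,Oslo,yes\n,Lima,no\n50,Oslo,yes\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in preprocess(rows):
    print(row)
```

The sketch only illustrates the idea on an in-memory CSV string. To run the actual tool on a real file, follow the steps below.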
1. **Clone the Repository**:
```sh
git clone https://github.com/Takk8IS/DatasetAnalysisEDA.git
cd DatasetAnalysisEDA
```

2. **Install Dependencies**:
```sh
pip install -r requirements.txt
```

3. **Run the Analysis**:
```sh
python DatasetAnalysis.py PlanilhaModelagem.xlsx
```

4. **Review the Results**:
- The script will generate various plots and print analysis results in the console.
- Review the generated visualizations for insights about your dataset.

## Contributing
We welcome contributions from the community! If you'd like to contribute, please:
1. Fork the repository.
2. Create your feature branch (`git checkout -b feature/AmazingFeature`).
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`).
4. Push to the branch (`git push origin feature/AmazingFeature`).
5. Open a Pull Request.

## Donations
If this project has been helpful, consider making a donation:
**USDT (TRC-20)**: `TGpiWetnYK2VQpxNGPR27D9vfM6Mei5vNA`
Your support helps us continue to develop innovative data analysis tools.
## License
This project is licensed under the CC-BY-4.0 License. See the [LICENSE](LICENSE.md) file for more details.
## About Takk™ Innovate Studio
Leading the Digital Revolution as the Pioneering 100% Artificial Intelligence Team.
- Author: [David C Cavalcante](mailto:[email protected])
- LinkedIn: [linkedin.com/in/hellodav](https://www.linkedin.com/in/hellodav/)
- X: [@Takk8IS](https://twitter.com/takk8is/)
- Medium: [takk8is.medium.com](https://takk8is.medium.com/)
- Website: [takk.ag](https://takk.ag/)