https://github.com/faizantkhan/automated-eda
This repository showcases tools for automatic Exploratory Data Analysis (EDA) in Python. These tools help you quickly understand your datasets and generate insightful reports.
automatic automation autoviz data-analysis data-analysis-python data-science data-visualization dtale dtale-library eda exploratory-data-analysis ml pandas pandas-profiling python python-library sweetviz
- Host: GitHub
- URL: https://github.com/faizantkhan/automated-eda
- Owner: FAIZANTKHAN
- Created: 2024-08-16T14:57:03.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-08-20T18:19:58.000Z (5 months ago)
- Last Synced: 2024-11-15T09:14:04.273Z (2 months ago)
- Topics: automatic, automation, autoviz, data-analysis, data-analysis-python, data-science, data-visualization, dtale, dtale-library, eda, exploratory-data-analysis, ml, pandas, pandas-profiling, python, python-library, sweetviz
- Language: Jupyter Notebook
- Homepage:
- Size: 4.57 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
---
# Automatic EDA Tools
Welcome to the **Automatic EDA Tools** repository! This repository contains examples and documentation for various tools that automate Exploratory Data Analysis (EDA) in Python. These tools help data scientists and analysts quickly understand their datasets and generate insightful reports with minimal effort.
## Table of Contents
- [Introduction](#introduction)
- [Tools](#tools)
  - [Pandas Profiling](#pandas-profiling)
  - [AutoViz](#autoviz)
  - [Sweetviz](#sweetviz)
  - [D-Tale](#d-tale)
- [Installation](#installation)
- [Usage](#usage)
- [Contributing](#contributing)
- [License](#license)
## Introduction
Exploratory Data Analysis (EDA) is a crucial step in the data analysis process. It involves summarizing the main characteristics of a dataset, often using visual methods. Automating EDA can save time and provide a comprehensive overview of the data. This repository showcases four popular Python tools for automatic EDA: Pandas Profiling, AutoViz, Sweetviz, and D-Tale.
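To make concrete what "summarizing the main characteristics of a dataset" means, here is a rough sketch of the kind of per-column summary these tools produce automatically. It uses only the standard library so it runs without any of the tools installed; the sample data and the `summarize` helper are hypothetical and not part of this repository.

```python
import csv
import io
import statistics

# Hypothetical sample data; the empty "age" field on the second row is a missing value.
SAMPLE_CSV = """name,age,city
Ana,34,Lisbon
Ben,,Oslo
Cy,29,Lisbon
"""


def summarize(csv_text):
    """Return missing and distinct counts per column, plus the mean of numeric columns."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        present = [v for v in values if v != ""]
        stats = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
        try:
            stats["mean"] = statistics.mean(float(v) for v in present)
        except ValueError:
            # Non-numeric (or entirely empty) column: report no mean.
            pass
        summary[column] = stats
    return summary


summary = summarize(SAMPLE_CSV)
for column, stats in summary.items():
    print(column, stats)
```

The automated tools go well beyond this (plots, correlations, HTML output), but the underlying questions are the same: what is missing, how many distinct values there are, and how the numeric columns are distributed.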
## Tools
### Pandas Profiling
**Pandas Profiling** generates profile reports from a pandas DataFrame. The report includes a variety of statistics and visualizations, such as missing values, correlations, and distributions.
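- **Features**:
  - Descriptive statistics for each column
  - Visualizations for distributions and correlations
  - Interactive HTML reports
- **Example**:
```python
import pandas as pd
# Note: pandas-profiling has been renamed to ydata-profiling; with the newer
# package, use `from ydata_profiling import ProfileReport` instead.
from pandas_profiling import ProfileReport

df = pd.read_csv('your_dataset.csv')  # load the dataset into a DataFrame
profile = ProfileReport(df, title="Pandas Profiling Report")
profile.to_file("output.html")  # write the report as an HTML file
```
### AutoViz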
**AutoViz** automatically visualizes any dataset with a single line of code. It can handle both small and large datasets and provides a variety of plots to understand the data better.
- **Features**:
  - Automatic visualization of data
  - Handles large datasets efficiently
  - Variety of plots including scatter, bar, and box plots
- **Example**:
```python
from autoviz.AutoViz_Class import AutoViz_Class

AV = AutoViz_Class()
df = AV.AutoViz('your_dataset.csv')
```
### Sweetviz
**Sweetviz** creates beautiful, high-density visualizations to help you understand your data quickly. It provides a detailed analysis of each feature and compares datasets.
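The dataset comparison mentioned above could look roughly like the sketch below. It assumes Sweetviz's `compare` function, which takes each dataset as a `[DataFrame, label]` pair; the `build_comparison_report` wrapper, the labels, and the output filename are hypothetical names chosen for illustration.

```python
def build_comparison_report(train_df, test_df, output_path="sweetviz_compare.html"):
    """Write a Sweetviz report comparing two pandas DataFrames side by side."""
    # Imported lazily so this module still loads where sweetviz is not installed.
    import sweetviz as sv

    # Each dataset is passed as a [DataFrame, label] pair; the labels are arbitrary.
    report = sv.compare([train_df, "Train"], [test_df, "Test"])
    report.show_html(output_path)
    return output_path
```

For instance, with a DataFrame `df` already loaded, `build_comparison_report(df.iloc[:800], df.iloc[800:])` would compare the first 800 rows against the remainder. The split point is arbitrary here; a real train/test split would normally come from your own pipeline.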
- **Features**:
  - High-density visualizations
  - Detailed feature analysis
  - Comparison of datasets
- **Example**:
```python
import pandas as pd
import sweetviz as sv

df = pd.read_csv('your_dataset.csv')
report = sv.analyze(df)
report.show_html('sweetviz_report.html')
```
### D-Tale
**D-Tale** is a powerful tool that combines the capabilities of a pandas DataFrame with an interactive web-based interface. It allows you to explore and analyze your data in a user-friendly environment.
- **Features**:
  - Interactive web-based interface
  - Real-time data manipulation and visualization
  - Integration with pandas DataFrames
- **Example**:
```python
import dtale
import pandas as pd

df = pd.read_csv('your_dataset.csv')
d = dtale.show(df)
d.open_browser()
```
## Installation
To install these tools, you can use pip:
```bash
pip install pandas-profiling autoviz sweetviz dtale
```
## Usage
Refer to the examples provided above for each tool to get started with automatic EDA. You can also check the official documentation for more detailed usage instructions.
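Before running the examples, it can help to confirm which of the four tools are importable in the current environment. The following is a minimal sketch using only the standard library; the `installed_tools` helper is hypothetical, and the module names are the import names used in this README's examples.

```python
import importlib.util

# Import names of the four tools; "pandas_profiling" matches the import used in this README.
TOOL_MODULES = ["pandas_profiling", "autoviz", "sweetviz", "dtale"]


def installed_tools(module_names=TOOL_MODULES):
    """Map each top-level module name to True if it can be located, without importing it."""
    return {name: importlib.util.find_spec(name) is not None for name in module_names}


for name, available in installed_tools().items():
    print(f"{name}: {'installed' if available else 'missing'}")
```

Any tool reported as missing can then be installed with the `pip` command from the Installation section.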
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request or open an Issue if you have any suggestions or improvements.
## License
This repository is licensed under the MIT License. See the [LICENSE](LICENSE) file for more details.