https://github.com/jenishajustin/elevate_eda

Elevate EDA is an interactive data exploration tool designed to simplify Exploratory Data Analysis (EDA). Powered by Python and Streamlit, it offers visualizations, summary statistics, and automated reports to help users gain insights from their datasets efficiently.

plotly-express sklearn streamlit


# Elevate EDA

Elevate EDA is a comprehensive data exploration tool designed to streamline and enhance the process of performing **Exploratory Data Analysis (EDA)**. It provides users with an intuitive interface, powerful visualizations, and insights to better understand and make sense of their datasets.

## Features
- **Data Visualization:** Create a wide range of graphs and charts, including histograms, scatter plots, heatmaps, and more.
- **Summary Statistics:** Easily generate summary statistics for your datasets, including mean, median, variance, and standard deviation.
- **Correlation Analysis:** Visualize the relationships between different features with correlation matrices.
- **Interactive UI:** Built with Streamlit, offering an easy-to-use, interactive interface.
- **Automated Reports:** Export reports in a customizable format for presentations and decision-making.
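
The summary statistics and correlation analysis listed above map onto a few pandas calls. The snippet below is a minimal sketch of that idea, not the app's actual code; the DataFrame and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
        "weight_kg": [50.0, 58.0, 69.0, 77.0, 91.0],
    }
)

# Keep only numeric columns so the statistics below are well defined.
numeric = df.select_dtypes("number")

# Mean, median, sample variance, and sample standard deviation per column.
summary = numeric.agg(["mean", "median", "var", "std"])

# Pairwise Pearson correlation matrix between the numeric columns.
corr = numeric.corr()

print(summary)
print(corr)
```

The resulting `corr` DataFrame is the kind of matrix a heatmap would be drawn from.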

## Technologies Used
- **Python**: The core programming language used for data processing.
- **Streamlit**: Framework for building the user interface.
- **Pandas**: Data manipulation and analysis library.
- **Matplotlib** & **Seaborn**: Visualization libraries for creating plots.
- **NumPy**: For numerical computing.
- **Scikit-learn**: For data preprocessing and basic machine learning tasks.

## Installation

To get started with Elevate EDA, follow these steps:

1. **Clone the repository:**

```bash
git clone https://github.com/Jenishajustin/Elevate_EDA.git
```

2. **Navigate to the project directory:**

```bash
cd Elevate_EDA
```

3. **Create a virtual environment (optional but recommended):**

```bash
python -m venv venv
# Activate on Linux/macOS:
source venv/bin/activate
# Activate on Windows (run this instead of the line above):
# .\venv\Scripts\activate
```

4. **Install the required dependencies:**

```bash
pip install -r requirements.txt
```

5. **Run the Streamlit application:**

```bash
streamlit run app.py
```

6. **Access the web app:**
Open a browser and go to `http://localhost:8501/`.

## Wanna Access Elevate EDA?

Have fun with my EDA tool 👉 https://elevate-eda-360.streamlit.app/

## Usage

1. Upload your dataset (CSV format).
2. Explore summary statistics, data visualizations, and correlations.
3. Generate insights through visual tools provided in the app.
4. Optionally, download an automated EDA report summarizing the key findings.
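
For readers curious how an upload-and-summarize flow like this can be wired up in Streamlit, here is a rough sketch. It is not the project's `app.py`: the function names `summarize_csv` and `main`, the widget labels, and the `summary.csv` file name are all assumptions made for illustration.

```python
import io

import pandas as pd


def summarize_csv(file_like):
    """Read a CSV from a file-like object and return (DataFrame, numeric summary)."""
    df = pd.read_csv(file_like)
    return df, df.describe()


def main():
    # Imported here so summarize_csv can be used without Streamlit installed.
    import streamlit as st

    st.title("CSV explorer (illustrative sketch)")
    uploaded = st.file_uploader("Upload a CSV file", type="csv")
    if uploaded is not None:
        df, summary = summarize_csv(uploaded)
        st.dataframe(df.head())  # preview of the first rows
        st.write(summary)        # count, mean, std, min, quartiles, max
        st.download_button(
            "Download summary as CSV",
            data=summary.to_csv().encode("utf-8"),
            file_name="summary.csv",  # hypothetical file name
            mime="text/csv",
        )


# Quick check without Streamlit, using an in-memory CSV.
df, summary = summarize_csv(io.StringIO("a,b\n1,2\n3,4\n"))
print(summary.loc["mean"])
```

To try the UI, add a `main()` call at the end of the file and launch it with `streamlit run`.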

## Example

Here's an example of how to use Elevate EDA on a sample dataset:

1. Upload your data using the file uploader.
2. View visualizations like correlation heatmaps, box plots, and pair plots.
3. Analyze outliers and generate descriptive statistics.
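
The README does not say how the app flags outliers. One common approach, and the convention behind the whiskers of a typical box plot, is the 1.5 × IQR rule. The sketch below assumes that rule; the `iqr_outliers` helper and the sample values are hypothetical.

```python
import pandas as pd


def iqr_outliers(series, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    mask = (series < q1 - k * iqr) | (series > q3 + k * iqr)
    return series[mask]


# Hypothetical column of measurements with one obvious outlier.
values = pd.Series([10, 12, 11, 13, 12, 11, 95], name="reading")

# Descriptive statistics: count, mean, std, min, quartiles, max.
print(values.describe())

outliers = iqr_outliers(values)
print(outliers)
```

Here the quartiles are 11 and 12.5, so the fences sit at 8.75 and 14.75 and only the value 95 is flagged.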

## Screenshots

### Main Dashboard
![Screenshot 2024-10-20 175313](https://github.com/user-attachments/assets/bb602f2e-1b43-4f79-bec8-04c545b9950a)
![Screenshot 2024-10-20 175401](https://github.com/user-attachments/assets/99a6de71-b0f6-4a9c-a70c-2d3ca5b9bc28)

### Visualizations
#### Bar Chart
![Screenshot 2024-10-20 175522](https://github.com/user-attachments/assets/70fe15df-b23a-453e-b6c9-9105cae3485c)

#### Scatter Plot
![Screenshot 2024-10-20 175628](https://github.com/user-attachments/assets/957320b9-53f7-470a-8100-96e3f163b40b)

#### Density Contour
![Screenshot 2024-10-20 175718](https://github.com/user-attachments/assets/6ce76283-a050-44c9-a6d1-7765604c5d98)

### Correlation Matrix
![Screenshot 2024-10-20 181336](https://github.com/user-attachments/assets/7a2ee25b-d270-4e8c-8c8b-ac7c2f2a81b0)

### K-Means Clustering
![Screenshot 2024-10-20 181422](https://github.com/user-attachments/assets/7526634f-9830-4ad6-9396-a948e5e4e8d4)

### Cluster Summary
![Screenshot 2024-10-20 181439](https://github.com/user-attachments/assets/7efd0914-11e3-45d4-adef-719e0bd26d98)

### Data Profile
![Screenshot 2024-10-20 181540](https://github.com/user-attachments/assets/a1c844c0-1847-485f-91ca-6c2d4835d14f)

### Download Filtered Data
![Screenshot 2024-10-20 181628](https://github.com/user-attachments/assets/6304c2cd-fc39-4a07-a8d9-d1f4abc0e59c)

## Future Enhancements

- **Feature Selection and Engineering Tools:** Automated feature selection for more advanced EDA.
- **Data Preprocessing:** Clean and preprocess your data, including handling missing values and outliers, fixing formatting, and scaling features.
- **Machine Learning Integration:** Add basic machine learning model training to the tool.
- **Custom Reports:** Further customization options for automated reports.
- **More Visualization Options:** Advanced and custom charting options.

## Contribution

We welcome contributions! To contribute:

1. Fork the repository.
2. Create a new branch for your feature/bugfix.
3. Submit a pull request.

## License

This project is licensed under the MIT License - see the [LICENSE](https://github.com/Jenishajustin/Elevate_EDA/blob/main/LICENSE) file for details.

## Contact

If you have any questions or suggestions, feel free to reach out at:

- **Email**: [email protected]
- **GitHub**: [Jenishajustin](https://github.com/Jenishajustin)