https://github.com/srinibas-masanta/streamlit-dataprofile
Data Profiler is a Streamlit app designed to provide insightful data analysis and visualization. Users can upload their datasets in '.csv' or '.xlsx' format, and the app generates a comprehensive profiling report using the YData Profiling library.
- Host: GitHub
- URL: https://github.com/srinibas-masanta/streamlit-dataprofile
- Owner: srinibas-masanta
- License: mit
- Created: 2024-08-11T15:06:17.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-08-17T18:12:53.000Z (9 months ago)
- Last Synced: 2024-08-17T19:28:43.137Z (9 months ago)
- Topics: streamlit, ydata-profiling
- Language: Python
- Homepage:
- Size: 1.1 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
# **Data Profiler**
**Data Profiler** is a powerful and user-friendly web application built with Streamlit that allows you to analyze and visualize your datasets with ease. Simply upload your data in `.csv` or `.xlsx` format, and generate comprehensive profiling reports that help you detect anomalies, patterns, and trends within your data.
## **Features**
- **Automated Data Analysis:** Quickly generate detailed profiling reports by uploading your dataset.
- **Customizable Reports:** Choose between different display modes, including `Primary`, `Dark`, and `Orange`.
- **Support for Multiple Formats:** Upload `.csv` or `.xlsx` files (up to 10 MB) for analysis.
- **Interactive UI:** Easy-to-use interface with options to select specific sheets for `.xlsx` files.
- **Downloadable Reports:** Save the profiling report as an HTML file for offline analysis.
## **Installation**
To run the Data Profiler application on your local machine, follow the steps below:
### **1. Clone the Repository**
```bash
git clone https://github.com/srinibas-masanta/streamlit-dataprofile.git
cd streamlit-dataprofile
```
### **2. Set Up a Virtual Environment**
Create and activate a virtual environment to manage dependencies.
```bash
python -m venv dataprofile
.\dataprofile\Scripts\activate # On Windows
source dataprofile/bin/activate # On macOS/Linux
```
### **3. Install Dependencies**
Install the required Python packages listed in the `requirements.txt` file.
```bash
pip install -r requirements.txt
```
Alternatively, manually install the necessary packages:
```bash
pip install numpy pandas scipy matplotlib streamlit ydata-profiling streamlit-pandas-profiling openpyxl xlrd
```
### **4. Run the Application**
Start the Streamlit application by running the following command:
```bash
streamlit run app.py
```
## **Usage**
Once the application is running, follow these steps:
1. **Upload Your Data:** Use the sidebar to upload a `.csv` or `.xlsx` file (up to 10 MB).
2. **Select Options:** Choose the report mode (`Primary`, `Dark`, or `Orange`) and whether to generate a minimal or a full report.
3. **Generate Report:** Click to generate the report, which will be displayed within the app.
4. **Download Report (Optional):** If desired, save the report as an HTML file using the download button.
## **Project Structure**
- **app.py:** Main script containing the Streamlit application logic.
- **media/DP Logo.jpg:** Logo used in the welcome page of the application.
- **requirements.txt:** List of all the Python dependencies required to run the application.
## **License**
This project is licensed under the MIT License - see the [LICENSE.txt](LICENSE.txt) file for details.
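## **Example Sketch**
The workflow described under Usage (check the upload, read it into a DataFrame, render the report as HTML) can be illustrated with a short sketch. This is not the code in `app.py`. The helper names `validate_upload`, `load_dataframe`, and `build_report_html` are hypothetical, and reading "10 MB" as 10 × 1024 × 1024 bytes is an assumption. The pandas and YData Profiling calls (`read_csv`, `read_excel`, `ProfileReport`, `to_html`) are the public ones from those libraries.

```python
from pathlib import Path

# Assumption: "10 MB" is taken as 10 * 1024 * 1024 bytes; the app may count differently.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024
ALLOWED_EXTENSIONS = {".csv", ".xlsx"}


def validate_upload(filename: str, size_bytes: int) -> str:
    """Return the lowercase extension if the upload is acceptable, else raise ValueError."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {ext or 'none'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("File exceeds the 10 MB limit")
    return ext


def load_dataframe(file_obj, filename: str, sheet_name=0):
    """Read an upload into a pandas DataFrame (hypothetical helper)."""
    import pandas as pd  # imported lazily so validate_upload works without pandas

    if Path(filename).suffix.lower() == ".csv":
        return pd.read_csv(file_obj)
    # sheet_name may be a zero-based index or a sheet title
    return pd.read_excel(file_obj, sheet_name=sheet_name)


def build_report_html(df, minimal: bool = False) -> str:
    """Render a YData Profiling report as an HTML string (hypothetical helper)."""
    from ydata_profiling import ProfileReport  # lazy import: heavy dependency

    return ProfileReport(df, title="Data Profile", minimal=minimal).to_html()
```

In a Streamlit script, the string returned by `build_report_html` could be handed to `st.download_button` to offer the report as an HTML download. Whether the app does it this way is not confirmed by this README.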