https://github.com/basemax/qt-dataset-explorer
Interactive desktop GUI application for exploring datasets and performing statistical analysis. Built using Python, PyQt5, and scientific libraries (pandas, matplotlib, seaborn, scipy).
- Host: GitHub
- URL: https://github.com/basemax/qt-dataset-explorer
- Owner: BaseMax
- License: mit
- Created: 2025-12-21T22:33:32.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-12-22T09:47:40.000Z (6 months ago)
- Last Synced: 2025-12-23T20:55:58.608Z (6 months ago)
- Topics: dataset, dataset-checker, dataset-explorer, datasets, py, py3, pyqt, pyqt5, python, python3, qt, qt5
- Language: Python
- Homepage:
- Size: 15.6 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Qt Dataset Explorer
Interactive desktop GUI application for exploring datasets and performing statistical analysis. Built using Python, PyQt5, and scientific libraries (pandas, matplotlib, seaborn, scipy).



## Features
- **Dataset Loading**: Support for CSV and Excel files
- **Dataset Preview**: Interactive table view with up to 1000 rows displayed
- **Data Filtering**: Apply filters using various operators (==, !=, >, <, >=, <=, contains, startswith, endswith)
- **Descriptive Statistics**: Comprehensive statistical analysis including:
  - Mean, median, standard deviation, min, max
  - Variance, skewness, kurtosis
  - Missing value counts
  - Categorical value distributions
- **Visualization**: Multiple plot types:
  - Histograms
  - Box plots
  - Scatter plots
  - Correlation heatmaps
  - Bar plots
- **Hypothesis Testing**: Statistical tests including:
  - Independent T-Test
  - Paired T-Test
  - Chi-Square Test
  - ANOVA
  - Correlation Test (Pearson & Spearman)
- **Export Tools**:
  - Export filtered data to CSV/Excel
  - Export statistics to text file
  - Save plots as PNG/PDF
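
The filter operators listed above map naturally onto pandas boolean masks. The following is a minimal sketch of how such a filter could work, not the application's actual code: the function name `apply_filter` and the demo data are hypothetical, and coercing the typed value to a number for numeric columns is an assumption about how a text input field would be handled.

```python
import pandas as pd


def apply_filter(df: pd.DataFrame, column: str, op: str, value) -> pd.DataFrame:
    """Return the rows of df for which `column <op> value` holds (hypothetical helper)."""
    series = df[column]
    if op in {"==", "!=", ">", "<", ">=", "<="}:
        # Assumption: the value arrives as text, so convert it for numeric columns.
        # float() raises ValueError if the text is not a number.
        if pd.api.types.is_numeric_dtype(series):
            value = float(value)
        comparisons = {
            "==": series.eq, "!=": series.ne,
            ">": series.gt, "<": series.lt,
            ">=": series.ge, "<=": series.le,
        }
        mask = comparisons[op](value)
    elif op in {"contains", "startswith", "endswith"}:
        # Text operators work on the string form of every cell.
        text = series.astype(str)
        if op == "contains":
            mask = text.str.contains(str(value), regex=False)
        elif op == "startswith":
            mask = text.str.startswith(str(value))
        else:
            mask = text.str.endswith(str(value))
    else:
        raise ValueError(f"Unsupported operator: {op}")
    return df[mask]


if __name__ == "__main__":
    # Made-up rows for illustration only.
    demo = pd.DataFrame({"Age": [25, 40, 31], "Department": ["Sales", "IT", "Sales"]})
    print(apply_filter(demo, "Age", ">=", "30"))
    print(apply_filter(demo, "Department", "startswith", "Sa"))
```

Returning a new DataFrame instead of modifying the original keeps a "Reset Filter" action trivial: the unfiltered data is still available.
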
## Installation
1. Clone the repository:
```bash
git clone https://github.com/BaseMax/qt-dataset-explorer.git
cd qt-dataset-explorer
```
2. Install required dependencies:
```bash
pip install -r requirements.txt
```
## Usage
Run the application:
```bash
python dataset_explorer.py
```
### Quick Start Guide
1. **Load a Dataset**:
   - Click the "Load Dataset" button
   - Select a CSV or Excel file
   - The dataset is loaded and displayed in the "Preview & Filter" tab
2. **Filter Data**:
   - Go to the "Preview & Filter" tab
   - Select a column from the dropdown
   - Choose an operator (==, !=, >, <, contains, etc.)
   - Enter a value to filter by
   - Click "Apply Filter"
   - Click "Reset Filter" to show all data again
3. **View Statistics**:
   - Go to the "Statistics" tab
   - Statistics are calculated automatically when you load data
   - Click "Refresh Statistics" to recalculate them after filtering
4. **Generate Plots**:
   - Go to the "Plots" tab
   - Select a plot type (Histogram, Box Plot, Scatter Plot, etc.)
   - Choose columns for the X and Y axes (if applicable)
   - Click "Generate Plot"
   - Click "Save Plot" to export as PNG or PDF
5. **Run Hypothesis Tests**:
   - Go to the "Hypothesis Testing" tab
   - Select a test type
   - Choose the columns to test
   - Click "Run Test"
   - View detailed results, including p-values and conclusions
6. **Export Data**:
   - Click "Export Data" to save the filtered dataset
   - Click "Export Statistics" to save the statistical summary
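
For readers who want to reproduce the Statistics tab outside the GUI, the summary described in this guide can be approximated with plain pandas. This is a sketch under assumptions, not the application's implementation: `describe_dataset` is a hypothetical name, and the choice of pandas defaults (sample standard deviation, Fisher kurtosis) may differ from what the app reports.

```python
import pandas as pd


def describe_dataset(df: pd.DataFrame):
    """Return (numeric summary, missing counts, categorical distributions)."""
    numeric = df.select_dtypes(include="number")
    summary = pd.DataFrame({
        "mean": numeric.mean(),
        "median": numeric.median(),
        "std": numeric.std(),        # sample standard deviation (ddof=1)
        "min": numeric.min(),
        "max": numeric.max(),
        "variance": numeric.var(),   # sample variance (ddof=1)
        "skewness": numeric.skew(),
        "kurtosis": numeric.kurt(),  # Fisher definition: a normal distribution gives 0
    })
    # Missing values are counted for every column, numeric or not.
    missing = df.isna().sum()
    # One value-count table per non-numeric column.
    categorical = {
        col: df[col].value_counts()
        for col in df.select_dtypes(exclude="number").columns
    }
    return summary, missing, categorical


if __name__ == "__main__":
    # Made-up rows for illustration only.
    demo = pd.DataFrame({
        "Age": [20.0, 30.0, 40.0, None],
        "Department": ["IT", "IT", "HR", "HR"],
    })
    summary, missing, categorical = describe_dataset(demo)
    print(summary)
    print(missing)
    print(categorical["Department"])
```

Calling the function on a filtered DataFrame corresponds to the "Refresh Statistics" step: the summary reflects only the rows that passed the filter.
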
## Sample Data
A sample dataset (`sample_data.csv`) is included for testing the application. It contains employee information with columns: Name, Age, Gender, Salary, Department, and Experience.
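
The columns of the sample dataset lend themselves to the hypothesis tests listed under Features. The sketch below shows how an independent t-test and a Pearson correlation test could be run with `scipy.stats` on data shaped like `sample_data.csv`. The rows are made up and only mirror the column names, the helper names are hypothetical, and the 0.05 significance level is an assumption rather than a documented setting of the app.

```python
import pandas as pd
from scipy import stats

# Made-up rows: only the column names come from the README.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "Age": [25, 30, 35, 40, 45, 50, 28, 38],
    "Gender": ["F", "M", "F", "M", "F", "M", "F", "M"],
    "Salary": [42000, 48000, 55000, 61000, 70000, 76000, 45000, 58000],
    "Department": ["IT", "HR", "IT", "Sales", "HR", "IT", "Sales", "HR"],
    "Experience": [2, 6, 10, 15, 20, 25, 4, 13],
})


def independent_t_test(data, value_col, group_col, alpha=0.05):
    """Compare the mean of value_col between the two groups in group_col."""
    groups = [g[value_col].dropna() for _, g in data.groupby(group_col)]
    if len(groups) != 2:
        raise ValueError("An independent t-test needs exactly two groups")
    t_stat, p_value = stats.ttest_ind(groups[0], groups[1])
    return {"t": float(t_stat), "p": float(p_value), "reject_null": bool(p_value < alpha)}


def pearson_test(data, x_col, y_col, alpha=0.05):
    """Test for a linear correlation between two numeric columns."""
    pair = data[[x_col, y_col]].dropna()
    r, p_value = stats.pearsonr(pair[x_col], pair[y_col])
    return {"r": float(r), "p": float(p_value), "reject_null": bool(p_value < alpha)}


if __name__ == "__main__":
    print(independent_t_test(df, "Salary", "Gender"))
    print(pearson_test(df, "Age", "Experience"))
```
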
## Requirements
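- Python 3.9+ (the pinned pandas, numpy, and matplotlib releases need Python 3.8 or newer, and scipy 1.11.2 needs 3.9 or newer)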
- PyQt5 5.15.10
- pandas 2.0.3
- matplotlib 3.7.2
- seaborn 0.12.2
- scipy 1.11.2
- numpy 1.24.3
- openpyxl 3.1.2
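
To check whether an existing environment matches the versions listed above, a short script built on the standard library's `importlib.metadata` (Python 3.8+) can help. This is an illustrative sketch: `PINNED` simply copies the list above, `check_environment` is a hypothetical helper, and whether `requirements.txt` in the repository uses exactly these pins is an assumption.

```python
from importlib import metadata

# Versions copied from the Requirements list in this README.
PINNED = {
    "PyQt5": "5.15.10",
    "pandas": "2.0.3",
    "matplotlib": "3.7.2",
    "seaborn": "0.12.2",
    "scipy": "1.11.2",
    "numpy": "1.24.3",
    "openpyxl": "3.1.2",
}


def check_environment(pins=PINNED, get_version=metadata.version):
    """Return a list of human-readable mismatches (empty if everything matches)."""
    problems = []
    for package, wanted in pins.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            problems.append(f"{package}: not installed (README lists {wanted})")
            continue
        if installed != wanted:
            problems.append(f"{package}: installed {installed}, README lists {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["All listed versions match."]:
        print(line)
```

A mismatch is not necessarily a problem, since newer releases of these libraries may work as well; the script only reports differences from the listed versions.
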
## License
Released under the MIT License. See the LICENSE file for details.
## Author
- GitHub: [@BaseMax](https://github.com/BaseMax)
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.