https://github.com/sunnybibyan/random_data_generation

A project that generates a dataset using various statistical distributions (Normal, Uniform, Exponential, Random Integers, and Binomial) and performs data analysis. Includes visualizations and an option to export the data as a CSV file.

data-analysis data-visualization python random-data-generation statistics streamlit-webapp


# Random Data Generation and Basic Statistical Analysis

## Overview
This project generates a synthetic dataset by sampling from several statistical distributions: Normal, Uniform, Exponential, Random Integers, and Binomial. Descriptive statistics and histograms then show how the data produced by each distribution differs.

The dataset is intended for educational use as a practical example of how to generate and analyze random data.

## Dataset Generation

### Key Features
- **Data Sources:** Data is generated with NumPy and handled with Pandas.
- **Distributions:**
  - **Normal Distribution:** Simulates continuous data that follows a Gaussian distribution.
  - **Uniform Distribution:** Provides values that are equally likely anywhere within a specified range.
  - **Exponential Distribution:** Models the time between events.
  - **Random Integers:** Simulates discrete values.
  - **Binomial Distribution:** Counts successes across a fixed number of binary (success/failure) trials.
- **Statistics:** Descriptive statistics including mean, median, and standard deviation are computed.
- **Visualizations:** Histograms are created to observe the distribution patterns.
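
The features above can be sketched in a few lines of NumPy and Pandas. This is a minimal illustration, not the project's actual code: the function name `generate_dataset`, the column names, the row count, the seed, and every distribution parameter are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd


def generate_dataset(n_rows: int = 1000, seed: int = 42) -> pd.DataFrame:
    """Build one column per distribution (all parameters are illustrative assumptions)."""
    rng = np.random.default_rng(seed)  # seeded generator for reproducible output
    return pd.DataFrame({
        "Normal": rng.normal(loc=0.0, scale=1.0, size=n_rows),
        "Uniform": rng.uniform(low=0.0, high=1.0, size=n_rows),
        "Exponential": rng.exponential(scale=1.0, size=n_rows),
        # `high` is exclusive, so this yields integers from 1 to 100
        "Random_Integers": rng.integers(low=1, high=101, size=n_rows),
        # number of successes in 10 trials with success probability 0.5
        "Binomial": rng.binomial(n=10, p=0.5, size=n_rows),
    })


df = generate_dataset()

# Descriptive statistics: one row per statistic, one column per distribution
summary = df.agg(["mean", "median", "std"])
print(summary.round(3))
```

Seeding the generator keeps the output reproducible between runs, which is convenient when the dataset is used for teaching or testing.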

## Tools & Technologies
- **Python:** For data generation and analysis.
- **NumPy:** For numerical operations and random data generation.
- **Pandas:** For data manipulation and analysis.
- **Matplotlib:** For plotting visualizations.
- **Seaborn:** For enhanced data visualization.

## Dataset Information
The generated dataset includes the following columns:
- **Normal Distribution:** Values drawn from a Gaussian distribution.
- **Uniform Distribution:** Values uniformly distributed between specified limits.
- **Exponential Distribution:** Values following an exponential distribution.
- **Random Integers:** Integer values within a specified range.
- **Binomial Distribution:** Values from a binomial distribution, that is, the number of successes in a fixed number of binary trials.

## Visualizations
The project includes histograms for each type of distribution:
- **Normal Distribution Histogram:** Shows the distribution of values from the Gaussian distribution.
- **Uniform Distribution Histogram:** Displays the range and frequency of uniformly distributed values.
- **Exponential Distribution Histogram:** Illustrates the spread of values from the exponential distribution.
- **Random Integers Histogram:** Visualizes the frequency of discrete integer values.
- **Binomial Distribution Histogram:** Shows how often each binomial outcome occurs.
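
One way to draw a histogram per column is sketched below with Matplotlib alone. The column names, distribution parameters, bin counts, and output file name are assumptions for illustration; the project's own plotting code may differ, and Seaborn's `histplot` could be used in place of `ax.hist`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative data; all parameters are assumptions
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Normal": rng.normal(0.0, 1.0, 1000),
    "Uniform": rng.uniform(0.0, 1.0, 1000),
    "Exponential": rng.exponential(1.0, 1000),
    "Random_Integers": rng.integers(1, 101, 1000),
    "Binomial": rng.binomial(10, 0.5, 1000),
})

fig, axes = plt.subplots(1, len(df.columns), figsize=(4 * len(df.columns), 3))
for ax, column in zip(axes, df.columns):
    values = df[column]
    if pd.api.types.is_integer_dtype(values):
        # Discrete columns: one bin centered on each integer value
        bins = np.arange(values.min() - 0.5, values.max() + 1.5, 1)
    else:
        # Continuous columns: a fixed number of equal-width bins
        bins = 30
    ax.hist(values, bins=bins, edgecolor="black")
    ax.set_title(column)
    ax.set_xlabel("Value")
    ax.set_ylabel("Frequency")

fig.tight_layout()
fig.savefig("histograms.png")  # hypothetical output file name
```

Centering one bin on each integer avoids the uneven bars that appear when equal-width bins straddle discrete values.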

## Project Structure

### How to Use the Project
1. **Run the Script:** Run `App.py` to generate the dataset and visualizations (a Streamlit app is typically launched with `streamlit run App.py`).
2. **Explore Visualizations:** Use the Streamlit interface to select columns and view histograms.
3. **Download Data:** Use the download button to save the generated dataset as a CSV file.

## Requirements
- **Install the necessary Python libraries:**
```sh
pip install -r requirements.txt
```

## Insights and Recommendations
- **Distribution Patterns:** Analyze how different statistical distributions generate data with varying patterns.
- **Data Analysis:** Utilize the generated dataset for educational purposes, testing, and further analysis.
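
As one example of studying distribution patterns, sample statistics can be compared with the values theory predicts. The sketch below is illustrative only; the sample size, seed, and distribution parameters are assumptions, and the theoretical means and variances are the standard ones for those parameters.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000  # large sample so the estimates land close to theory

# name -> (samples, theoretical mean, theoretical variance)
samples = {
    "Normal(0, 1)": (rng.normal(0.0, 1.0, n), 0.0, 1.0),
    "Uniform(0, 1)": (rng.uniform(0.0, 1.0, n), 0.5, 1.0 / 12.0),
    "Exponential(scale=1)": (rng.exponential(1.0, n), 1.0, 1.0),
    "Binomial(10, 0.5)": (rng.binomial(10, 0.5, n), 5.0, 2.5),
}

for name, (values, mean, var) in samples.items():
    print(
        f"{name:22s} mean {values.mean():.3f} (theory {mean:.3f})  "
        f"var {values.var():.3f} (theory {var:.3f})"
    )
```

With smaller samples the gap between observed and theoretical values grows, which is itself a useful pattern to observe.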

## License
This project is licensed under the MIT License - see the [LICENSE](./LICENSE) file for details.

## Connect with Me
- **LinkedIn**: [Profile](https://www.linkedin.com/posts/sunny-bibyan)
- **Contact**: Sunny Bibyan