https://github.com/sunnybibyan/random_data_generation
A project that generates a dataset using various statistical distributions (Normal, Uniform, Exponential, Random Integers, and Binomial) and performs data analysis. Includes visualizations and an option to export the data as a CSV file.
- Host: GitHub
- URL: https://github.com/sunnybibyan/random_data_generation
- Owner: SunnyBibyan
- License: mit
- Created: 2024-09-12T12:59:57.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-09-12T14:03:12.000Z (4 months ago)
- Last Synced: 2024-11-09T19:06:15.548Z (2 months ago)
- Topics: data-analysis, data-visualization, python, random-data-generation, statistics, streamlit-webapp
- Language: Python
- Homepage: https://sunnybibyan-random-data-generation-app-mptcjw.streamlit.app/
- Size: 14.6 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Random Data Generation and Basic Statistical Analysis
## Overview
This project generates a synthetic dataset from several statistical distributions (Normal, Uniform, Exponential, Random Integers, and Binomial), so that the behavior of different kinds of random data can be compared side by side. The dataset is designed for educational purposes, offering a practical example of how to generate and analyze random data.
## Dataset Generation
### Key Features
- **Data Sources:** Data is generated using Python libraries such as NumPy and Pandas.
- **Distributions:**
  - **Normal Distribution:** Simulates continuous data with a Gaussian distribution.
  - **Uniform Distribution:** Provides values spread evenly within a specified range.
  - **Exponential Distribution:** Models the time between events.
  - **Random Integers:** Simulates discrete values.
  - **Binomial Distribution:** Represents the number of successes in a fixed number of binary (success/failure) trials.
- **Statistics:** Descriptive statistics including mean, median, and standard deviation are computed.
- **Visualizations:** Histograms are created to observe the distribution patterns.

## Tools & Technologies
- **Python:** For data generation and analysis.
- **NumPy:** For numerical operations and random data generation.
- **Pandas:** For data manipulation and analysis.
- **Matplotlib:** For plotting visualizations.
- **Seaborn:** For enhanced data visualization.

## Dataset Information
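As a rough illustration of how a dataset with the columns described in this section might be produced, here is a minimal sketch using NumPy and pandas. The function name `generate_dataset`, the column names, the row count, and every distribution parameter are assumptions chosen for the example, not values taken from the project's code.

```python
import numpy as np
import pandas as pd


def generate_dataset(n_rows: int = 1000, seed: int = 42) -> pd.DataFrame:
    """Build a DataFrame with one column per distribution.

    All parameters below are illustrative assumptions, not the project's settings.
    """
    rng = np.random.default_rng(seed)  # seeded generator so draws are reproducible
    return pd.DataFrame({
        "Normal": rng.normal(loc=0.0, scale=1.0, size=n_rows),
        "Uniform": rng.uniform(low=0.0, high=1.0, size=n_rows),
        "Exponential": rng.exponential(scale=1.0, size=n_rows),
        # `high` is exclusive, so this draws integers from 1 to 100
        "Random_Integers": rng.integers(low=1, high=101, size=n_rows),
        # number of successes in 10 trials with success probability 0.5
        "Binomial": rng.binomial(n=10, p=0.5, size=n_rows),
    })


df = generate_dataset()
csv_text = df.to_csv(index=False)  # CSV text that an export step could write out
```

In a Streamlit app, text like `csv_text` could be handed to a download widget such as `st.download_button`; whether the project does exactly this is not stated here.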
The generated dataset includes the following columns:
- **Normal Distribution:** Values drawn from a Gaussian distribution.
- **Uniform Distribution:** Values uniformly distributed between specified limits.
- **Exponential Distribution:** Values following an exponential distribution.
- **Random Integers:** Integer values within a specified range.
- **Binomial Distribution:** Values from a binomial distribution, each counting the successes across a fixed number of binary trials.

## Visualizations
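A histogram for a single column could be drawn along these lines with Matplotlib. This is a sketch under stated assumptions: the sample data, bin count, figure size, and labels are hypothetical, and the project's own plotting code may differ (for example, by using Seaborn).

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required

import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
values = rng.exponential(scale=1.0, size=1000)  # hypothetical sample column

fig, ax = plt.subplots(figsize=(6, 4))
# hist returns the per-bin counts, the bin edges, and the drawn bar patches
counts, bin_edges, _ = ax.hist(values, bins=30, edgecolor="black")
ax.set_title("Exponential Distribution")
ax.set_xlabel("Value")
ax.set_ylabel("Frequency")
fig.tight_layout()
```

Inside a Streamlit app, a figure like `fig` can be rendered with `st.pyplot(fig)`.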
The project includes histograms for each type of distribution:
- **Normal Distribution Histogram:** Shows the distribution of values from the Gaussian distribution.
- **Uniform Distribution Histogram:** Displays the range and frequency of uniformly distributed values.
- **Exponential Distribution Histogram:** Illustrates the spread of values from the exponential distribution.
- **Random Integers Histogram:** Visualizes the frequency of discrete integer values.
- **Binomial Distribution Histogram:** Represents the frequency of each binomial outcome.

## Project Structure
### How to Use the Project
1. **Run the Script:** Execute `App.py` to generate the dataset and visualizations. Because the interface is built with Streamlit, this is typically done with `streamlit run App.py`.
2. **Explore Visualizations:** Use the Streamlit interface to select columns and view histograms.
3. **Download Data:** Use the download button to save the generated dataset as a CSV file.

## Requirements
- **Install the necessary Python libraries:**
  ```sh
  pip install -r requirements.txt
  ```

## Insights and Recommendations
- **Distribution Patterns:** Analyze how different statistical distributions generate data with varying patterns.
- **Data Analysis:** Utilize the generated dataset for educational purposes, testing, and further analysis.
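To make the comparison of distribution patterns concrete, the following sketch computes the descriptive statistics named in this README (mean, median, standard deviation) for three example columns, plus skewness as an extra indicator of asymmetry. The column names, sample size, seed, and distribution parameters are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n = 5000  # hypothetical sample size

df = pd.DataFrame({
    "Normal": rng.normal(0.0, 1.0, n),
    "Exponential": rng.exponential(1.0, n),
    "Binomial": rng.binomial(10, 0.5, n),
})

# One row per column, one statistic per field
summary = df.agg(["mean", "median", "std"]).T
summary["skew"] = df.skew()  # positive values indicate a longer right tail

print(summary.round(3))
```

For a symmetric distribution such as the Normal, the mean and median should land close together, while for the right-skewed Exponential the mean is expected to sit above the median.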
## License
This project is licensed under the MIT License - see the [LICENSE](./LICENSE) file for details.

## Connect with Me
- **LinkedIn**: [Profile](https://www.linkedin.com/posts/sunny-bibyan)
- **Contact**: [Sunny Bibyan](mailto:[email protected])