Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/abhishek-patidar066/synthetic-datasets
Synthetic datasets are artificially generated data used for training machine learning models, simulating real-world data while ensuring privacy.
- Host: GitHub
- URL: https://github.com/abhishek-patidar066/synthetic-datasets
- Owner: Abhishek-Patidar066
- Created: 2024-09-19T07:18:22.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-11-20T12:24:09.000Z (2 months ago)
- Last Synced: 2024-11-20T13:28:03.422Z (2 months ago)
- Topics: clustering, datasets, jupyter-notebook, libraries, matplotlib-pyplot, numpy, pandas-dataframe, python, random, sklearn
- Language: Jupyter Notebook
- Homepage:
- Size: 1.45 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Synthetic-Datasets
Overview:
This repository contains synthetic datasets designed for machine learning and data analysis tasks. The datasets feature a variety of patterns and structures, making them ideal for testing, benchmarking, and visualizing algorithms. Each dataset simulates unique distributions or geometrical patterns, such as moons, blobs, clusters, and more.

Datasets:
Moon Pattern:
A crescent moon-shaped dataset with two distinct classes forming semi-circular shapes.
Useful for evaluating clustering and classification algorithms in non-linear decision boundary scenarios.

Blob Pattern:
A set of isotropic Gaussian blobs for clustering.
Adjustable parameters allow control over the number of clusters, cluster size, and standard deviation.

Name-Like Patterns:
Datasets resembling the shape of characters or words.
Often used for creative visualization or pattern recognition tasks.

Grid/Checkerboard Patterns:
Structured grids or alternating square patterns.
Ideal for exploring spatial relationships and segmentation tasks.

Custom Shapes:
Arbitrary shapes or patterns to simulate unique data distributions.
Can be used for specialized tasks like anomaly detection or unsupervised learning.

Features:
Customizable Parameters: Modify the size, density, noise, and dimensions to fit your requirements.
Easy Integration: Datasets are provided in formats ready for popular machine learning libraries like scikit-learn, TensorFlow, or PyTorch.
Versatility: Suitable for tasks like supervised/unsupervised learning, visualization, and algorithm prototyping.

![abhishek](https://github.com/user-attachments/assets/8d0d311a-82ef-4ee2-b134-d341500af2a3)
![moon_img](https://github.com/user-attachments/assets/b9346c39-34b7-4b59-9d1a-a5b0d248ab53)
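
A moon pattern like the one described above can be produced with scikit-learn's `make_moons`. This is a minimal sketch, not necessarily how the notebooks in this repository generate their data; the sample count, noise level, and seed are illustrative assumptions.

```python
from sklearn.datasets import make_moons

# Assumed values for illustration only; the repository's own settings are not stated.
N_SAMPLES = 500
NOISE = 0.1
SEED = 42

# X holds the 2-D coordinates, y holds the class label (0 or 1) of each point.
X_moons, y_moons = make_moons(n_samples=N_SAMPLES, noise=NOISE, random_state=SEED)

print(X_moons.shape, sorted(set(y_moons.tolist())))
```

Raising `noise` blurs the two crescents into each other, which is a simple way to make the non-linear classification task harder.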
![Blob_4img](https://github.com/user-attachments/assets/6e1a9489-0b90-49c2-a1d2-f1ceef13033f)
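
The blob pattern can be sketched in the same way with scikit-learn's `make_blobs`, which exposes the adjustable parameters mentioned above: number of clusters (`centers`), total points (`n_samples`), and spread (`cluster_std`). The concrete values below, including the choice of four clusters, are assumptions for illustration rather than the repository's actual configuration.

```python
from sklearn.datasets import make_blobs

# Assumed values for illustration only.
N_SAMPLES = 400    # total number of points across all clusters
N_CLUSTERS = 4     # number of isotropic Gaussian blobs
CLUSTER_STD = 0.8  # standard deviation of each blob
SEED = 42

# X holds the 2-D coordinates, y holds the cluster index of each point.
X_blobs, y_blobs = make_blobs(
    n_samples=N_SAMPLES,
    centers=N_CLUSTERS,
    cluster_std=CLUSTER_STD,
    random_state=SEED,
)

print(X_blobs.shape, sorted(set(y_blobs.tolist())))
```

A larger `cluster_std` makes the blobs overlap, which is useful for checking how a clustering algorithm behaves when the groups are not cleanly separated.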